We are planning to reuse more than just uniforms later,
hence let's clarify the name of these.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21995>
These were meant to explain the LDS layout, but
the actual LDS usage is better explained by:
ngg_nogs_get_culling_pervertex_lds_size().
Also add some comments there.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21995>
This builds with LLVM 12 -> 17 and a running a simple app seems to work.
I couldn't test LLVM 11 because meson fails with:
Looking for a fallback subproject for the dependency llvm (modules:
bitwriter, engine, mcdisassembler, mcjit, core, executionengine,
scalaropts, transformutils, instcombine, amdgpu, bitreader, ipo, native)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8297
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22021>
When cs allocation fails in radv_create_cmd_buffer,
radv_destroy_cmd_buffer is called before returning
VK_ERROR_OUT_OF_HOST_MEMORY. At that point, the upload list is not
initalized yet, so SIGSEGV will occur when trying to iterate through the
upload bo list. Initialize the upload list earlier to avoid this.
Signed-off-by: Benjamin Cheng <ben@bcheng.me>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22016>
With GPL it's not possible to know the primitive topology when
compiling the pre-rasterization stages. For NGG, we use the maximum
number of vertices per prim and rely on the hardware to ignore the
extra bits for points/lines.
Though, this can't work for NGG streamout because the number of
vertices per prim is used to compute a streamout offset. The only
way to solve this is to pass the number of vertices per prim through
a new user SGPR.
This fixes a bunch of streamout tests with Zink/RADV on GFX11.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21833>
Can you spot the problem?
exp param0 v6, v5, v5, v5
exp param1 v7, off, off, off
exp param1 v7, off, off, off
radeonsi uses ac_nir_optimize_outputs to eliminate output stores with
identical SSA defs (i.e. duplicated), which then causes 2 outputs to
map to the same parameter export.
This is a regression. The old LLVM code was correctly emitting each
export only once. vs_output_param_mask was supposed to be used for
this instead of vs_output_param_offset.
Fixes: 80506be31b - ac/nir/ngg,radv,radeonsi: nogs use ac_nir_export_(position|parameter)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21920>
This helps pass the mesh shader I/O tests.
Swizzled buffer addressing seems to be broken on GFX11
when the idxen bit is 0.
No Fossil DB changes on Rembrandt (GFX10.3).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21930>
This introduces tracking of the required semaphore values in pipelines,
which is then propagated to cmd_buffers on bind. Each queue also keeps
track the maximum count it has waited for, so that we can avoid the waiting
overhead once all the shaders are loaded and referenced.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
Following PAL's implementation, this patch avoids allocating shader code
buffers in BAR and use SDMA to upload them to invisible VRAM
directly.
For some games like HZD, shaders can take as much as 400MB, which exceeds
the non-resizable BAR size (256MB) and cause inconsistent spilling
behavior. The kernel will normally move these to invisible VRAM on its own,
but there are a few cases that it does not reliably happen. This patch does
the moving explicitly in the driver to ensure predictable results.
In this patch, we upload the shaders synchronously; so the shader will be
ready as soon as vkCreate*Pipeline returns. A following patch will make
this asynchronous and don't block until we see a use of the pipeline.
As a side effect, when SQTT is used we now store the shaders on a cacheable
buffer which would speed up writing the trace to the disk.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
For consistency with the sdma_copy_buffer helper that will be added next.
As a general justification, SDMA commands require little state tracking and
using radeon_cmdbuf makes it more suitable for driver internal use.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
We are about to enable pre-merge testing for radv-zink on vangogh,
which would mean the steam decks would be used for the following jobs:
* Mesa pre-merge CI:
* zink: 3 (~12 minutes)
* Mesa Post-merge CI:
* vkcts: 4 (~30 minutes)
* vkd3d: 1 (~5 minutes)
* DXVK CI: 1 (takes ~4 hours)
This means we could have 9 jobs running at the same time on steam
decks, despite only having 6 available. By reducing the number of decks
allocated for VKCTS runs from 4 to 2, we get closer to the actual
availability, and since vkd3d is so short + DXVK CI runs so
infrequently, we should never have to wait for a deck for too long!
Unfortunately, with the change of parallelism, a known flake started
failing more consistently, so I added it to the flakes list.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21873>
No one implement this intrinsic in llvm, so remove the
llvm entry too.
This will be used in TCS nir tess factor write.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
tcs_tess_lvl_(in|out)_loc may be not set if user miss tess
factor output.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
Radeonsi is going to use nir tess factor write, so need to
remap tess factor location.
RADV set tess factor driver location to be 0 and 1 in
get_linked_variable_location(). While radeonsi also set them
to be 0 and 1 in st->map_io aka. si_shader_io_get_unique_index_patch().
We could just set them to be 0 and 1 at the beginning of
ac_nir_lower_hs_outputs_to_mem(), but in order to keep the
location map at the same place, we still do this in
lower_hs_output_store().
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
It will be shared by other nir lowering too.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>