Commit graph

11560 commits

Author SHA1 Message Date
Benjamin Cheng
e57caf9893 radv: initialize cmd_buffer upload list earlier
When cs allocation fails in radv_create_cmd_buffer,
radv_destroy_cmd_buffer is called before returning
VK_ERROR_OUT_OF_HOST_MEMORY. At that point, the upload list is not
initialized yet, so SIGSEGV will occur when trying to iterate through the
upload bo list. Initialize the upload list earlier to avoid this.

Signed-off-by: Benjamin Cheng <ben@bcheng.me>
Reviewed-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22016>
2023-03-21 08:06:24 +00:00
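The failure mode described in e57caf9893 can be illustrated with a minimal sketch. All types and function names below are hypothetical, simplified stand-ins (not the real RADV structures or radv_create_cmd_buffer); the only point is the ordering: the list head is initialized before the first step that can fail and route into the destroy path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real RADV structures. */
struct upload_node {
   struct upload_node *prev, *next;
};

struct cmd_buffer {
   struct upload_node upload_list; /* circular list head */
   void *cs;                       /* command stream; allocation may fail */
};

static void list_init(struct upload_node *head)
{
   head->prev = head;
   head->next = head;
}

/* Walks the upload list. It dereferences head->next, so the head must
 * have been initialized before this is called. */
static void cmd_buffer_destroy(struct cmd_buffer *cmd)
{
   struct upload_node *it = cmd->upload_list.next;
   while (it != &cmd->upload_list) {
      struct upload_node *next = it->next;
      free(it);
      it = next;
   }
   free(cmd->cs);
   free(cmd);
}

/* Returns NULL on (simulated) command stream allocation failure. */
static struct cmd_buffer *cmd_buffer_create(bool fail_cs_alloc)
{
   struct cmd_buffer *cmd = calloc(1, sizeof(*cmd));
   if (!cmd)
      return NULL;

   /* Initialize the list before any step that can fail. */
   list_init(&cmd->upload_list);

   cmd->cs = fail_cs_alloc ? NULL : malloc(64);
   if (!cmd->cs) {
      cmd_buffer_destroy(cmd); /* safe: the list head is valid and empty */
      return NULL;
   }
   return cmd;
}
```

If list_init() were called after the cs allocation instead, the zeroed head would have next == NULL and the loop in cmd_buffer_destroy() would dereference it.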
Lang Yu
19b89c8077 amd/common: fix a typo
Fixes: 35f053ba8c ("radv: Fix corrupted mipmap copies on GFX9+")

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22000>
2023-03-20 20:30:32 +00:00
Konstantin Seurer
deb537de3c radv/rt: Handle load_constant instructions when inlining shaders
Fixes the following tests:
dEQP-VK.ray_query.builtin.rayqueryterminate.ahit.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.ahit.triangles,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.call.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.call.triangles,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.chit.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.chit.triangles,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.miss.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.miss.triangles,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.rgen.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.rgen.triangles,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.sect.aabbs,Fail
dEQP-VK.ray_query.builtin.rayqueryterminate.sect.triangles,Fail

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8647
Fixes: fda262f ("radv/rt: move Ray Tracing shader creation into separate file")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22002>
2023-03-20 19:04:34 +00:00
Samuel Pitoiset
d750ad19fd radv: fix NGG streamout with VS and GPL on GFX11
With GPL it's not possible to know the primitive topology when
compiling the pre-rasterization stages. For NGG, we use the maximum
number of vertices per prim and rely on the hardware to ignore the
extra bits for points/lines.

Though, this can't work for NGG streamout because the number of
vertices per prim is used to compute a streamout offset. The only
way to solve this is to pass the number of vertices per prim through
a new user SGPR.

This fixes a bunch of streamout tests with Zink/RADV on GFX11.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21833>
2023-03-20 17:47:03 +00:00
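A rough sketch of why d750ad19fd cannot rely on the maximum vertex count for streamout. The offset formula below is an assumed, simplified model (primitives packed back to back), not the actual NGG streamout code, and all names are hypothetical.

```c
#include <assert.h>

/* Hypothetical topology enum for the sketch. */
enum prim_topology { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

static unsigned verts_per_prim(enum prim_topology t)
{
   switch (t) {
   case PRIM_POINTS:    return 1;
   case PRIM_LINES:     return 2;
   case PRIM_TRIANGLES: return 3;
   }
   assert(!"unknown topology");
   return 0;
}

/* Assumed model: byte offset of the first vertex of a primitive when
 * primitives are written tightly packed into the streamout buffer. */
static unsigned streamout_prim_offset(unsigned prim_index,
                                      unsigned num_verts_per_prim,
                                      unsigned vertex_stride)
{
   return prim_index * num_verts_per_prim * vertex_stride;
}
```

Under this model, the third line primitive with a 16-byte stride starts at byte 64 when the real count (2) is used, but at byte 96 if the compile-time maximum (3) is assumed, which is why the count has to be supplied at draw time.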
Samuel Pitoiset
0badfd8b20 radv: add helpers for destroying various pipeline types
Much cleaner than having a single function for everything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21894>
2023-03-20 13:56:32 +00:00
Samuel Pitoiset
abfdc06b01 radv: rename RADV_PIPELINE_LIBRARY to RADV_PIPELINE_RAY_TRACING_LIB
This seems more consistent with graphics pipeline libraries and it
avoids any confusion.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21894>
2023-03-20 13:56:32 +00:00
Oleksii Bozhenko
bbde684ca0 ci: Uprev Piglit
Signed-off-by: Oleksii Bozhenko <oleksii.bozhenko@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21810>
2023-03-20 04:19:23 +00:00
Konstantin Seurer
0f18bb4076 radv: Fix inserting stack_size into the cache
Fixes: 3e03fe4 ("radv/rt: move stack_sizes into radv_ray_tracing_module")
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21969>
2023-03-18 14:57:51 +00:00
Konstantin Seurer
3887f64dc3 radv: Fix loading stack_size from the cache
Fixes: 3e03fe4 ("radv/rt: move stack_sizes into radv_ray_tracing_module")
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21969>
2023-03-18 14:57:51 +00:00
Tatsuyuki Ishi
22d6556a4b radv: Fix missing wait of GS copy shader upload for dmashaders.
Fixes: 0cde42a506 ("radv: Wait for shader uploads asynchronously.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21985>
2023-03-18 03:04:15 +00:00
Marek Olšák
6eddc6dd5a ac/nir: use plural correctly in the ac_nir_export_parameters name
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21920>
2023-03-17 23:58:28 +00:00
Marek Olšák
3626bc2daa ac/nir: don't emit duplicated parameter exports
Can you spot the problem?
    exp param0 v6, v5, v5, v5
    exp param1 v7, off, off, off
    exp param1 v7, off, off, off

radeonsi uses ac_nir_optimize_outputs to eliminate output stores with
identical SSA defs (i.e. duplicated), which then causes 2 outputs to
map to the same parameter export.

This is a regression. The old LLVM code was correctly emitting each
export only once. vs_output_param_mask was supposed to be used for
this instead of vs_output_param_offset.

Fixes: 80506be31b - ac/nir/ngg,radv,radeonsi: nogs use ac_nir_export_(position|parameter)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21920>
2023-03-17 23:58:28 +00:00
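The duplicate-export problem in 3626bc2daa can be sketched as follows. This is an illustrative helper with hypothetical names, not the code from ac_nir; it only shows the general idea of tracking already-exported parameter slots in a bitmask so that two outputs mapped to the same slot produce a single export.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: param_offset[i] is the parameter export slot
 * assigned to output i. Returns how many exports are emitted when each
 * slot is exported at most once. */
static unsigned count_param_exports(const uint8_t *param_offset,
                                    unsigned num_outputs)
{
   uint32_t exported_mask = 0;
   unsigned num_exports = 0;

   for (unsigned i = 0; i < num_outputs; i++) {
      assert(param_offset[i] < 32);
      uint32_t bit = 1u << param_offset[i];

      if (exported_mask & bit)
         continue; /* this slot was already exported; skip the duplicate */

      exported_mask |= bit;
      num_exports++; /* a real implementation would emit "exp paramN" here */
   }
   return num_exports;
}
```

For the listing in the commit message (one output in slot 0 and two outputs sharing slot 1), this yields two exports instead of three.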
Martin Roukala (né Peres)
d3c1cc9261 radv/ci: update VanGogh's expectations
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21983>
2023-03-17 22:27:01 +00:00
Rhys Perry
596f2ef361 aco: set needs_flat_scr=true for RT
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Fixes: 39c828cb9f ("aco: remove aco::rt_stack variable")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21961>
2023-03-17 16:55:57 +00:00
Rhys Perry
184cf1cb79 aco/gfx11: fix RT prolog scratch initialization
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Fixes: 6446b79168 ("aco: implement select_rt_prolog()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21961>
2023-03-17 16:55:57 +00:00
Timur Kristóf
a42c57dc01 aco: Always enable idxen for swizzled buffer access on GFX11.
This helps pass the mesh shader I/O tests.
Swizzled buffer addressing seems to be broken on GFX11
when the idxen bit is 0.

No Fossil DB changes on Rembrandt (GFX10.3).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21930>
2023-03-17 00:34:21 +00:00
Timur Kristóf
1f9e44c181 aco: Disable MUBUF/MTBUF offsets when they are zero.
Fossil DB stats on Rembrandt (GFX10.3):

Totals from 1264 (0.94% of 134920) affected shaders:
VGPRs: 69504 -> 69336 (-0.24%)
CodeSize: 6885468 -> 6886224 (+0.01%); split: -0.02%, +0.03%
MaxWaves: 24632 -> 24670 (+0.15%)
Instrs: 1287027 -> 1287209 (+0.01%); split: -0.04%, +0.05%
Latency: 6830411 -> 6831165 (+0.01%); split: -0.06%, +0.07%
InvThroughput: 1220643 -> 1220438 (-0.02%); split: -0.04%, +0.02%
VClause: 24737 -> 24751 (+0.06%); split: -0.25%, +0.30%
SClause: 42774 -> 42911 (+0.32%); split: -0.13%, +0.45%
Copies: 75408 -> 75600 (+0.25%); split: -0.62%, +0.88%
PreVGPRs: 60544 -> 59809 (-1.21%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21930>
2023-03-17 00:34:21 +00:00
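For readers unfamiliar with the stats format in 1f9e44c181: each percentage appears to be the relative change between the two totals. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double percent_change(double before, double after)
{
   assert(before != 0.0);
   return (after - before) / before * 100.0;
}
```

For example, percent_change(69504, 69336) is about -0.24, matching the VGPRs line, and percent_change(60544, 59809) is about -1.21, matching PreVGPRs.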
Timur Kristóf
40676da381 aco: Use zero for MUBUF/MTBUF when soffset is undefined.
No Fossil DB changes on Rembrandt (GFX10.3).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21930>
2023-03-17 00:34:21 +00:00
Timur Kristóf
b3933ffe60 aco: Don't add soffset to swizzled MUBUF base.
No Fossil DB changes on Rembrandt (GFX10.3).

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21930>
2023-03-17 00:34:20 +00:00
Friedrich Vock
89590c1d84 radv: Add RT shader stage names for executable properties
Now that we use raygen shaders, we also need to support RT stages for
executable properties.

Fixes: f123d65e9f ("radv/rt: use prolog for raytracing shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21960>
2023-03-16 21:28:03 +00:00
Tatsuyuki Ishi
0cde42a506 radv: Wait for shader uploads asynchronously.
This introduces tracking of the required semaphore values in pipelines,
which is then propagated to cmd_buffers on bind. Each queue also keeps
track of the maximum count it has waited for, so that we can avoid the waiting
overhead once all the shaders are loaded and referenced.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
2023-03-16 18:02:57 +00:00
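The tracking scheme described in 0cde42a506 can be sketched roughly as below. The structures and field names are hypothetical simplifications, not the real RADV types: a pipeline records the upload sequence number it depends on, a command buffer keeps the maximum over everything bound to it, and a queue skips the wait once it has already waited for that value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplified types; field names are assumptions. */
struct pipeline   { uint64_t upload_seq; };          /* value the upload signals */
struct cmd_buffer { uint64_t required_upload_seq; }; /* max over bound pipelines */
struct queue      { uint64_t waited_upload_seq; };   /* highest value waited for */

/* Propagate the pipeline's requirement to the command buffer on bind. */
static void bind_pipeline(struct cmd_buffer *cmd, const struct pipeline *p)
{
   if (p->upload_seq > cmd->required_upload_seq)
      cmd->required_upload_seq = p->upload_seq;
}

/* Returns true if this submission has to wait on the upload semaphore.
 * Once the queue has waited for a value, later submissions that need
 * the same or a lower value skip the wait. */
static bool queue_submit_needs_wait(struct queue *q, const struct cmd_buffer *cmd)
{
   if (cmd->required_upload_seq <= q->waited_upload_seq)
      return false;

   q->waited_upload_seq = cmd->required_upload_seq;
   return true;
}
```

Because sequence values only grow, a single comparison per submission is enough, and the wait disappears once all referenced shaders have been uploaded.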
Tatsuyuki Ishi
a8c5fd3b1b radv: Upload shaders to invisible VRAM on small BAR systems.
Following PAL's implementation, this patch avoids allocating shader code
buffers in BAR and uses SDMA to upload them to invisible VRAM
directly.

For some games like HZD, shaders can take as much as 400MB, which exceeds
the non-resizable BAR size (256MB) and causes inconsistent spilling
behavior. The kernel will normally move these to invisible VRAM on its own,
but there are a few cases where this does not reliably happen. This patch does
the moving explicitly in the driver to ensure predictable results.

In this patch, we upload the shaders synchronously, so the shader will be
ready as soon as vkCreate*Pipeline returns. A following patch will make
this asynchronous and not block until we see a use of the pipeline.

As a side effect, when SQTT is used we now store the shaders on a cacheable
buffer which would speed up writing the trace to the disk.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
2023-03-16 18:02:57 +00:00
Tatsuyuki Ishi
3b258ae2d9 radv: Introduce sdma_copy_buffer for GFX7+.
Helper salvaged from radeonsi (before SDMA removal).

This will be used for driver internal submissions to DMA shaders from GTT
to invisible VRAM.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
2023-03-16 18:02:56 +00:00
Tatsuyuki Ishi
d4fb3db748 radv: Use radeon_cmdbuf for sdma_copy_image.
For consistency with the sdma_copy_buffer helper that will be added next.

As a general justification, SDMA commands require little state tracking and
using radeon_cmdbuf makes it more suitable for driver internal use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16271>
2023-03-16 18:02:56 +00:00
Jesse Natalie
f8566533ea radv: Fix returning an expression from a void function
Fixes: d5de56bf ("radv: add RT shader args")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21932>
2023-03-16 16:07:14 +00:00
Tatsuyuki Ishi
9faaff4561 radv/rt: Don't upload the prolog twice.
radv_shader_create already calls radv_shader_binary_upload.

Fixes: 4b92a53285 ("radv: add radv_create_rt_prolog()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21945>
2023-03-16 12:27:21 +00:00
Martin Roukala (né Peres)
928aab57a3 radv/ci: reduce the parallelism for vkcts-vangogh
We are about to enable pre-merge testing for radv-zink on vangogh,
which would mean the steam decks would be used for the following jobs:

 * Mesa pre-merge CI:
  * zink: 3 (~12 minutes)
 * Mesa Post-merge CI:
   * vkcts: 4 (~30 minutes)
   * vkd3d: 1 (~5 minutes)
 * DXVK CI: 1 (takes ~4 hours)

This means we could have 9 jobs running at the same time on steam
decks, despite only having 6 available. By reducing the number of decks
allocated for VKCTS runs from 4 to 2, we get closer to the actual
availability, and since vkd3d is so short + DXVK CI runs so
infrequently, we should never have to wait for a deck for too long!

Unfortunately, with the change of parallelism, a known flake started
failing more consistently, so I added it to the flakes list.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21873>
2023-03-16 11:31:03 +00:00
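The job count in 928aab57a3 is a simple sum of the per-pipeline parallelism. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* Worst case: every CI pipeline runs all of its deck jobs at once. */
static unsigned worst_case_jobs(unsigned zink, unsigned vkcts,
                                unsigned vkd3d, unsigned dxvk)
{
   return zink + vkcts + vkd3d + dxvk;
}

/* Number of jobs that would have to queue for a free deck. */
static unsigned overcommit(unsigned jobs, unsigned decks)
{
   assert(decks > 0);
   return jobs > decks ? jobs - decks : 0;
}
```

With the figures from the commit message, worst_case_jobs(3, 4, 1, 1) gives 9 jobs for 6 decks (3 waiting); after reducing vkcts to 2, worst_case_jobs(3, 2, 1, 1) gives 7 (1 waiting).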
Qiang Yu
719366c2b2 ac/llvm,radeonsi: lower nir_load_ring_tess_factors_amd
No one implements this intrinsic in LLVM, so remove the
LLVM entry too.

This will be used in TCS nir tess factor write.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
2023-03-16 04:33:30 +00:00
Qiang Yu
99828e0390 ac/nir: handle tess factor output missing case
tcs_tess_lvl_(in|out)_loc may not be set if the user omits the tess
factor output.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
2023-03-16 04:33:30 +00:00
Qiang Yu
700e24941c ac/nir: init tess factor location with IO remap
Radeonsi is going to use the nir tess factor write, so the tess factor
locations need to be remapped.

RADV sets the tess factor driver locations to 0 and 1 in
get_linked_variable_location(), and radeonsi also sets them
to 0 and 1 in st->map_io, a.k.a. si_shader_io_get_unique_index_patch().

We could just set them to be 0 and 1 at the beginning of
ac_nir_lower_hs_outputs_to_mem(), but in order to keep the
location map at the same place, we still do this in
lower_hs_output_store().

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
2023-03-16 04:33:30 +00:00
Qiang Yu
c06329eb3f ac/nir: tcs write tess factor support pass by reg
For radeonsi usage.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
2023-03-16 04:33:30 +00:00
Qiang Yu
e070a9e8d0 ac/nir: move store_var_components to common place
It will be shared by other nir lowering too.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21437>
2023-03-16 04:33:30 +00:00
Daniel Schürmann
39c828cb9f aco: remove aco::rt_stack variable
Since we initialize scratch in the RT prolog,
there is no need for this variable anymore.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
f123d65e9f radv/rt: use prolog for raytracing shaders
Co-authored-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Friedrich Vock
bea022d1f6 radv/rt: Add shader config combination/postprocessing utils
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Friedrich Vock
0569b350ed radv: Emit RT shader VA user SGPR
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
a16df842a6 radv: compile rt_prolog
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
4b92a53285 radv: add radv_create_rt_prolog()
Co-authored-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
6446b79168 aco: implement select_rt_prolog()
Co-authored-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
7d35bf24f6 aco: create hw_init_scratch() function for p_init_scratch lowering
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
2fee99a36c aco: implement load_ray_launch_{id|size}
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
c7c68e1193 aco: move rt_dynamic_callable_stack_base_amd to VGPR
In the future, we will use a VGPR arg for that between RT stages.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
1f01a86b36 aco: don't set private_segment_buffer/scratch_offset on GFX9+
It is unused. Also don't initialize scratch in raytracing stages as it gets
initialized in the prolog shader.

Co-authored-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:30 +00:00
Daniel Schürmann
a33b9d43d8 aco: add RT stage enums
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:29 +00:00
Daniel Schürmann
c38b8678c9 radv: add RT shader handling to radv_postprocess_config
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:29 +00:00
Daniel Schürmann
3f03eebf04 radv: add RT stages to radv_get_shader_name()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:29 +00:00
Daniel Schürmann
650f386bdd radv: handle RT stages in radv_nir_shader_info_pass()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:29 +00:00
Daniel Schürmann
d5de56bf59 radv: add RT shader args
Co-authored-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21780>
2023-03-16 01:40:29 +00:00
Lynne
f5e5ec180c aco_validate: allow for wave32 in p_dual_src_export_gfx11
Fixes RADV_PERFTEST=pswave32

Fixes: bb90d29660 ("aco: add p_dual_src_export_gfx11 for dual source blending on GFX11")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21934>
2023-03-15 23:55:41 +00:00
Timur Kristóf
6185e4f2ff aco, radv: Remove VS IO information from ACO.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16805>
2023-03-15 14:54:28 +00:00