Konstantin Seurer
58a35647e1
radv: Fix crash if proceed comes before initialize
...
"initialize" can be NULL if the rq_proceed was visited before
rq_initialize.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14626
cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39251 >
2026-01-12 22:34:32 +00:00
Natalie Vock
473cf6046a
aco/spill_preserved: Preserve linear VGPRs even if they aren't p_spill operands
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
1ef2691221
aco/spill: Fix preserved reload operand update
...
p_logical_end is actually after p_reload_preserved, so this didn't do
anything.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
548062f10e
aco/insert_waitcnt: Don't determine linearity by reg number
...
VGPRs can be linear too, and RT function calls will add VMEM
instructions acting on linear VGPRs. Using the linear VGPR in a block
with only linear preds will cause the pass to incorrectly skip inserting
s_waitcnt.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
7c12603933
aco/lower_to_hw_instr: Preserve linearity of lowered linear VGPRs
...
So subsequent passes like waitcnt insertion can know these are linear.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
0d93e8ce54
aco: Don't insert p_reload_preserved in loops
...
This can't work.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
c816f699b2
aco/spill_preserved: Only reload linear VGPRs at end
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Natalie Vock
897c95c37e
aco: Include arbitrarily fixed registers in max_reg_demand
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39157 >
2026-01-12 21:46:50 +00:00
Georg Lehmann
daf235c607
aco/tests: don't destroy vk_device if it was never created
...
Happens if you only run one test that doesn't need a vk_device.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39268 >
2026-01-12 16:16:54 +00:00
Georg Lehmann
fad95030a7
aco/tests: test VALUMaskWriteHazard with v_cmpx
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Georg Lehmann
1d85552745
aco/tests: test VALUReadSGPRHazard with v_cmpx
...
To avoid regressing this in a future rework.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:39 +00:00
Georg Lehmann
3e10ab34e1
aco/insert_NOPs: explicitly wait for sa_sdst to resolve SALU -> VALU hazards
...
The assumption that these waits are not required has been proven incorrect
in at least some cases.
Totals from 190 (0.24% of 79825) affected shaders: (Navi31)
Instrs: 499718 -> 500491 (+0.15%)
CodeSize: 2658228 -> 2661916 (+0.14%)
Latency: 5964632 -> 5965453 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 794221 -> 794289 (+0.01%)
Totals from 17093 (21.41% of 79839) affected shaders: (Navi48)
Instrs: 22805214 -> 22854313 (+0.22%)
CodeSize: 121240428 -> 121432904 (+0.16%); split: -0.00%, +0.16%
Latency: 166500300 -> 166530529 (+0.02%); split: -0.00%, +0.02%
InvThroughput: 28770053 -> 28772870 (+0.01%); split: -0.00%, +0.01%
Fixes: 018f45f981 ("aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14516
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39252 >
2026-01-12 15:48:38 +00:00
Samuel Pitoiset
b65cc9d587
ac,radv: sample and set correct shader/memory clocks for RGP
...
These need to be the clocks at trace time. This shouldn't fix
anything, given that RADV sets profile_peak when SQTT is enabled, but
it is better to report them correctly anyway.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39208 >
2026-01-12 11:58:43 +00:00
David Rosca
0518784b62
radv/amdgpu: Only wait on queue syncobj when needed
...
Previously this always waited on the queue syncobj whenever there was
any other wait syncobj, but it should only wait after a zero submit.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39193 >
2026-01-12 10:59:03 +00:00
Dave Airlie
ab9e904f24
radv/coopmat: fix deref stride
...
This at least fixes the nir debug output to have correct values.
Fixes: 48fc8c8d1c ("radv/nir/lower_cmat: set optimal load/store alignment")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39256 >
2026-01-12 10:39:05 +00:00
David Rosca
df4220d500
radv/video: Use different dpb swizzle mode for 10 bit encode
...
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39189 >
2026-01-12 10:18:18 +00:00
David Rosca
587a7aa510
radv: Enable DCC modifiers for multi plane formats on GFX12
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39190 >
2026-01-12 09:57:56 +00:00
Samuel Pitoiset
5bcca4a832
radv/spm: use a staging buffer for faster reads on dGPUS
...
This allows us to move the SPM buffer to VRAM because I think it must
be in VRAM too.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195 >
2026-01-12 09:35:37 +00:00
Samuel Pitoiset
6863a90486
radv/spm: rework allocating the SPM buffer
...
For using a staging buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195 >
2026-01-12 09:35:37 +00:00
Samuel Pitoiset
c7d0aa6671
radv/sqtt: use a staging buffer for faster reads on dGPUS
...
This is way faster.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195 >
2026-01-12 09:35:36 +00:00
Samuel Pitoiset
5d430940d2
radv/sqtt: rework allocating the SQTT buffer
...
For using a staging buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195 >
2026-01-12 09:35:36 +00:00
Samuel Pitoiset
1c611c2dac
radv/sqtt: use VkCommandBuffer objects for SQTT start/stop sequences
...
For using a staging buffer.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39195 >
2026-01-12 09:35:35 +00:00
Samuel Pitoiset
6f4b3c7c9b
ac/perfcounter: re-order GPU perf blocks on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:33 +00:00
Samuel Pitoiset
653b39989a
ac/perfcounter: define more GPU blocks on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:33 +00:00
Samuel Pitoiset
d391cd0c4d
ac/perfcounter: fix number of scoped instances for RMI block
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:33 +00:00
Samuel Pitoiset
ea63aa3e8e
ac/perfcounter: add missing configuration for GCEA on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:33 +00:00
Samuel Pitoiset
6722a6332a
ac,radv,radeonsi: rename num_spm_counters to num_spm_modules
...
A module can have a different number of counters.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:32 +00:00
Samuel Pitoiset
fb43d7bff2
ac/perfcounter: re-order GPU perf blocks on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:31 +00:00
Samuel Pitoiset
3b6ff80d48
ac/perfcounter: define more GPU blocks on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:31 +00:00
Samuel Pitoiset
eb37d6ceb7
ac/perfcounter: fix computing number of 16-bit/32-bit SPM counters
...
Determine them only when both are explicitly 0.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:30 +00:00
Samuel Pitoiset
d1efdc7e76
ac/perfcounter: fix number of 32-bit SPM counters
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:29 +00:00
Samuel Pitoiset
9fe57d3882
ac/spm: define new per-shader engine blocks
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:29 +00:00
Samuel Pitoiset
60fac38491
ac/spm: fix typo in one GPU perf block name
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39199 >
2026-01-12 08:10:29 +00:00
Samuel Pitoiset
db02077c8a
radv: remove extra instructions after UNREACHABLE
...
Minor cleanups.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237 >
2026-01-12 07:41:08 +00:00
Samuel Pitoiset
e1e2517664
radv: use UNREACHABLE for illegal texture filter
...
Found this with a broken CTS test; crashing makes it much easier to
isolate the test case.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39237 >
2026-01-12 07:41:08 +00:00
Samuel Pitoiset
91e0f8f1e5
radv/rt: fix a compilation warning about uninitialized fields
...
Just zero-initialize the layout struct to fix the following warning
because radv_use_bvh8() might return FALSE.
../src/amd/vulkan/radv_acceleration_structure.c: In function ‘radv_update_as_gfx12’:
../src/amd/vulkan/radv_acceleration_structure.c:873:70: warning: ‘layout.bounds_offsets’ may be used uninitialized [-Wmaybe-uninitialized]
873 | .bounds = state->build_info->scratchData.deviceAddress + layout.bounds_offsets,
| ~~~~~~^~~~~~~~~~~~~~~
../src/amd/vulkan/radv_acceleration_structure.c:866:33: note: ‘layout.bounds_offsets’ was declared here
866 | struct update_scratch_layout layout;
| ^~~~~~
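The pattern behind this fix can be sketched as follows. This is a minimal illustration, not the RADV code: the struct fields other than bounds_offsets, the helper names, and the offset values are all hypothetical. It shows a layout that is only filled on one path, and how a zero-initializer gives every field a defined value on the other path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the real layout struct. */
struct scratch_layout {
   uint32_t bounds_offsets;
   uint32_t size;
};

/* Fills the layout only when the feature is in use, mimicking a helper
 * that is guarded by a check which may return false. */
static void fill_layout(bool use_feature, struct scratch_layout *layout)
{
   if (!use_feature)
      return;

   layout->bounds_offsets = 64; /* illustrative values only */
   layout->size = 128;
}

static uint32_t bounds_offset_for(bool use_feature)
{
   /* Zero-initialize every field so the read below is always defined,
    * even when fill_layout() returns early. */
   struct scratch_layout layout = {0};

   fill_layout(use_feature, &layout);
   return layout.bounds_offsets;
}
```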
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39228 >
2026-01-12 07:18:50 +00:00
Konstantin Seurer
077292f65b
radv/bvh: Use box16 nodes when bvh8 is not used
...
Using box16 nodes trades BVH quality for lower memory bandwidth, which
seems to come out roughly equal in performance.
Stats assuming box16 nodes are as expensive as box32 nodes:
Totals from 7668 (79.68% of 9624) affected BVHs:
compacted_size: 951666944 -> 742347648 (-22.00%)
max_depth: 57606 -> 57615 (+0.02%)
sah: 129114796242 -> 129998517775 (+0.68%); split: -0.00%, +0.68%
scene_sah: 188564162 -> 192063633 (+1.86%); split: -0.02%, +1.88%
box16_node_count: 0 -> 3270600 (+inf%)
box32_node_count: 3365707 -> 95100 (-97.17%)
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883 >
2026-01-10 11:36:28 +01:00
Konstantin Seurer
543a88af99
radv/bvh: Add radv_aabb16 and use it for box16 nodes
...
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883 >
2026-01-10 11:36:19 +01:00
Konstantin Seurer
fefdad9249
radv/rra: Count box16 nodes properly
...
Otherwise rra won't allocate memory when loading the capture.
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883 >
2026-01-10 11:34:18 +01:00
Konstantin Seurer
39d58a55a7
aco: Add support to f2f16 with rtpi/rtni
...
Those rounding modes are needed when computing 16-bit bounding boxes
since the bounding box must not get smaller.
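To illustrate why directed rounding matters here, the sketch below models the conversion on the CPU. It is not ACO code and does not emit any instruction: round_to_half_grid() and conservative_half_box() are hypothetical helpers that snap a float to the nearest value representable in IEEE half precision in a chosen direction, returning the result as a float. Minimum corners round toward negative infinity and maximum corners toward positive infinity, so the 16-bit box always contains the original.

```c
#include <math.h>
#include <stdbool.h>

#define HALF_MAX 65504.0f /* largest finite half-precision value */

/* Round v to a half-representable value, toward +inf or -inf. */
static float round_to_half_grid(float v, bool round_up)
{
   if (v == 0.0f || isnan(v) || isinf(v))
      return v;
   if (v > HALF_MAX)
      return round_up ? INFINITY : HALF_MAX;
   if (v < -HALF_MAX)
      return round_up ? -HALF_MAX : -INFINITY;

   /* |v| lies in [2^(exp-1), 2^exp). Half has 11 significant bits, so
    * the spacing there is 2^(exp-11); below the smallest normal half
    * (2^-14) the spacing stays fixed at 2^-24. */
   int exp;
   (void)frexpf(v, &exp);
   if (exp < -13)
      exp = -13;
   float ulp = ldexpf(1.0f, exp - 11);

   /* Dividing by a power of two is exact; truncate, then step one grid
    * point in the requested direction if truncation went the wrong way. */
   float q = v / ulp;
   long n = (long)q;
   if (round_up && (float)n < q)
      n++;
   else if (!round_up && (float)n > q)
      n--;

   return (float)n * ulp;
}

struct aabb {
   float min[3];
   float max[3];
};

/* Shrinking min and growing max keeps the original box enclosed. */
static struct aabb conservative_half_box(struct aabb box)
{
   for (int i = 0; i < 3; i++) {
      box.min[i] = round_to_half_grid(box.min[i], false);
      box.max[i] = round_to_half_grid(box.max[i], true);
   }
   return box;
}
```

With round-to-nearest, a maximum corner could round down or a minimum corner could round up, which would let geometry poke outside its 16-bit box.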
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883 >
2026-01-10 11:34:12 +01:00
Alyssa Rosenzweig
235e868ef7
ac/nir: use nir_is_shared_access
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219 >
2026-01-09 20:51:13 +00:00
Benjamin Cheng
499d9e2e98
radv/video: Allow aliasing of video images
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39109 >
2026-01-09 13:52:56 +00:00
Georg Lehmann
6d07a56c6a
ac/nir/lower_ps_late: preserve signed zero, inf, nan for exports
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187 >
2026-01-09 11:58:52 +00:00
Georg Lehmann
84ecac58a6
ac/nir/opt_pack_half: preserve fp_math_ctrl
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187 >
2026-01-09 11:58:52 +00:00
Georg Lehmann
5241343ccb
ac/nir/lower_sin_cos: preserve fp_math_ctrl
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187 >
2026-01-09 11:58:52 +00:00
Georg Lehmann
9331726157
ac/nir/lower_sin_cos: use nir_shader_alu_pass
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39187 >
2026-01-09 11:58:52 +00:00
Samuel Pitoiset
4fa20bacac
radv/ci: document a regression with transfer queue on RENOIR
...
Odd that only RENOIR fails, given that ASTC/ETC2 aren't natively
supported either.
This needs to be investigated, but it seems SDMA supports these formats
to some extent.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39230 >
2026-01-09 10:47:31 +00:00
Samuel Pitoiset
edb730f647
radv: fix flushing gang semaphore with SDMA/ACE
...
If the main CS is SDMA and the gang CS is ACE, this would emit a
SDMA_FENCE packet on ACE which just hangs.
Fixes: b1938901d0 ("radv: Use SDMA fence packet when flushing gang semaphores")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39211 >
2026-01-09 09:07:45 +00:00
Natalie Vock
60dd9d797e
aco: Swizzle ray launch IDs in the RT prolog
...
This converts from 1D workgroups to 2D ray launch IDs entirely via
shader ALU, including handling partial/cut-off workgroups optimally.
Doing this entirely in-shader means it Just Works(TM) with indirect
dispatches as well. Previous approaches manipulating various things on
CPU depending on the dispatch size couldn't handle indirect dispatches.
The swizzle implemented here also swizzles with a recursive Z-order
pattern, which should be a little more optimal than arranging
invocations linearly within the wave.
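As a rough illustration of a Z-order mapping, the sketch below decodes a 1D linear index into a 2D position by de-interleaving its bits (a Morton decode). It is written as plain C rather than ACO IR, the names launch_id, compact_even_bits and morton_decode are hypothetical, and it leaves out everything the real prolog has to deal with, such as launch dimensions and partial workgroups.

```c
#include <stdint.h>

/* Gather the even-indexed bits of v (bits 0, 2, 4, ...) into the low
 * 16 bits of the result. */
static uint32_t compact_even_bits(uint32_t v)
{
   v &= 0x55555555u;
   v = (v | (v >> 1)) & 0x33333333u;
   v = (v | (v >> 2)) & 0x0f0f0f0fu;
   v = (v | (v >> 4)) & 0x00ff00ffu;
   v = (v | (v >> 8)) & 0x0000ffffu;
   return v;
}

struct launch_id {
   uint32_t x;
   uint32_t y;
};

/* Even bits of the linear index form x, odd bits form y, so consecutive
 * indices walk a recursive Z pattern: (0,0), (1,0), (0,1), (1,1),
 * (2,0), ... */
static struct launch_id morton_decode(uint32_t linear_id)
{
   struct launch_id id;
   id.x = compact_even_bits(linear_id);
   id.y = compact_even_bits(linear_id >> 1);
   return id;
}
```

Because the decode uses only shifts, masks and ORs on the invocation's own index, it needs no information from the CPU about the dispatch size, which is what makes an in-shader approach compatible with indirect dispatches.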
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142 >
2026-01-08 19:49:55 +01:00
Natalie Vock
1f6ac3fa93
radv/rt,aco: Always dispatch 1D workgroups for RT
...
We will swizzle the workgroups ourselves in the next commit.
Removes the need for 1D dispatch workarounds.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39142 >
2026-01-08 19:49:54 +01:00