Commit graph

18142 commits

Author SHA1 Message Date
Konstantin Seurer
f33623ea5f radv/rra: Increase rra_validation_context::location
31 bytes aren't always enough for storing the string.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35166>
2025-07-22 14:40:33 +00:00
Konstantin Seurer
e62c464e4e radv/rra: Only write used BLAS
Halves the size of cp2077 captures.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35166>
2025-07-22 14:40:33 +00:00
Martin Roukala (né Peres)
e782201f5e radv/ci: add post-merge jobs for gfx1201
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36210>
2025-07-22 08:43:17 +00:00
Natalie Vock
87ebe6b31a radv/winsys: Support vm_always_valid in the NULL winsys
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
A few device features (most importantly bufferDeviceAddress) are behind
a check for has_vm_always_valid. When replaying fossilize captures using
SPIR-V capabilities like PhysicalStorageBuffer addresses (which itself
depends on bufferDeviceAddress) on a null device, these features will be
hidden and replay will fail. Claim vm_always_valid support in the null
winsys - it's not like we'll ever create any BOs anyway.

Fixes: df1224c8 ("radv: rework VM_ALWAYS_VALID handling")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36221>
2025-07-22 07:38:25 +00:00
Samuel Pitoiset
9f725cf348 radv: reject 1D block-compresed formats with mips on GFX6
GFX6 has general issues with 1D BCn formats.

Fixes recent VKCTS coverage
dEQP-VK.api.copy_and_blit.*image_to_buffer.1d_images.mip_copies_*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36259>
2025-07-22 06:09:54 +00:00
Samuel Pitoiset
1323a1194e radv: fix reporting instance/vertex_count for direct draws with RGP on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is a cursed interaction with RGP...

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36188>
2025-07-22 05:52:10 +00:00
Samuel Pitoiset
aa80a08598 radv: simplify emitting SQTT shaders relocation for GFX6-GFX11.5
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36188>
2025-07-22 05:52:10 +00:00
Samuel Pitoiset
9965a344ae radv: fix SQTT shaders relocation on GFX12
This fixes reporting ISA and instruction timing with RGP.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13525
Fixes: 098c15bfc9 ("radv: use paired shader registers for graphics on GFX12")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36188>
2025-07-22 05:52:10 +00:00
Georg Lehmann
0fce848b54 radv: vectorize 8/16bit bitfield_select
No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36141>
2025-07-21 20:42:32 +00:00
Georg Lehmann
d906fa041a ac/nir: don't lower 8/16bit bitfield_select
This is just a bitwise operation, we can support all sizes.

No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36141>
2025-07-21 20:42:32 +00:00
Georg Lehmann
c80daf934c aco: supported 64bit or vectorized bitfield_select
No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36141>
2025-07-21 20:42:32 +00:00
Georg Lehmann
14b36fb790 aco/isel: don't create literal operands for SALU bitfield_select
Let the optimizer handle this.

No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36141>
2025-07-21 20:42:32 +00:00
Rhys Perry
256a7cc4f0 aco/isel: optimize uniform vote
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
fossil-db (navi21):
Totals from 21 (0.03% of 79825) affected shaders:
Instrs: 44939 -> 44913 (-0.06%)
CodeSize: 236612 -> 236504 (-0.05%)
Latency: 509496 -> 509349 (-0.03%)
Copies: 3624 -> 3620 (-0.11%)
SALU: 5458 -> 5432 (-0.48%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36177>
2025-07-21 14:19:58 +00:00
Rhys Perry
0bceb07c03 aco: optimize uniform s_not
fossil-db (navi21):
Totals from 1442 (1.81% of 79825) affected shaders:
Instrs: 2224425 -> 2220624 (-0.17%); split: -0.18%, +0.01%
CodeSize: 11778260 -> 11763264 (-0.13%); split: -0.14%, +0.01%
Latency: 13396254 -> 13392346 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 3145007 -> 3144982 (-0.00%); split: -0.00%, +0.00%
SClause: 53037 -> 53035 (-0.00%); split: -0.01%, +0.01%
Copies: 185852 -> 184777 (-0.58%); split: -0.71%, +0.13%
Branches: 60799 -> 60805 (+0.01%)
PreSGPRs: 62940 -> 62954 (+0.02%); split: -0.01%, +0.03%
SALU: 298564 -> 294761 (-1.27%); split: -1.34%, +0.06%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36177>
2025-07-21 14:19:58 +00:00
Rhys Perry
85b31c9c4d aco/opt: add some comments
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36177>
2025-07-21 14:19:58 +00:00
Rhys Perry
2fff1db5c8 aco: don't both flip s_cselect and label uniform_bool
Otherwise, the uniform_bool could point to a temporary which might be
DCE'd, since it's not used by the s_cselect.

fossil-db (navi21):
Totals from 1 (0.00% of 79825) affected shaders:
Instrs: 1267 -> 1269 (+0.16%)
Latency: 91071 -> 91103 (+0.04%)
SALU: 283 -> 285 (+0.71%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36177>
2025-07-21 14:19:58 +00:00
Rhys Perry
2239a5e9ae aco: stop labeling first def of and(uniform_bool/uniform_bitwise, exec)
The optimizer shouldn't consider a lanemask to be a uniform boolean unless
it's either 0 or -1. Optimizations involving s_not/s_xor might not work
properly otherwise.

No fossil-db changes (navi21).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36177>
2025-07-21 14:19:58 +00:00
Alyssa Rosenzweig
b7304f180e radv: remove redundant nir->info.internal = true
this is already set when building the gs copy shader since it is a nir_builder
simple shader

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:42 +00:00
Alyssa Rosenzweig
6b34e2174e nir: introduce ergonomic tex builder
for intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. texture instrs have even more craziness
involved, but we can do the same trick. this commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making it useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.

this series only implements a subset of possible texture operations for now, but
more generalizing could be added as people have need.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:41 +00:00
David Rosca
566ea76d8e radv/video: Support DRM format modifier tiling
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36192>
2025-07-21 10:56:34 +00:00
David Rosca
e435ea78e8 ac/surface: Add ac_modifier_supports_video
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36192>
2025-07-21 10:56:34 +00:00
David Rosca
211dc09e0f radv/video: Fix encode when using layered source image
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Found by inspection.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36214>
2025-07-21 09:52:47 +00:00
Rhys Perry
d81add3dda aco: optimize s_and(s_cselect, exec)
fossil-db (navi21):
Totals from 62 (0.08% of 79825) affected shaders:
Instrs: 178887 -> 178745 (-0.08%)
CodeSize: 942980 -> 942328 (-0.07%)
Latency: 1274513 -> 1273653 (-0.07%)
InvThroughput: 213862 -> 213774 (-0.04%); split: -0.04%, +0.00%
SALU: 26446 -> 26301 (-0.55%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36005>
2025-07-21 08:27:01 +00:00
Rhys Perry
b2e5fc9451 aco/lower_phis: add bld_before_logical_end helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36005>
2025-07-21 08:27:01 +00:00
Samuel Pitoiset
49e49d7dde radv: add RADV_DEBUG=novideo to disable all video extensions
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's useful when video support has issues.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36186>
2025-07-21 07:20:12 +00:00
Samuel Pitoiset
af22d5c97d radv: use vk_optimize_depth_stencil_state() for optimal settings
For apps that aren't optimized.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36168>
2025-07-21 06:53:40 +00:00
Samuel Pitoiset
79c02a3388 radv: adjust conservative rasterization configuration on GFX12
PAL doesn't set these two registers either.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36196>
2025-07-21 06:26:48 +00:00
Konstantin Seurer
d59c22b6e1 radv/rt: Implement null acceleration structure in shader code
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The previous approach is broken with descriptor buffer capture/replay
because the address off the dummy VA used can randomly change.

Totals from 78 (20.58% of 379) affected shaders:

Instrs: 3837275 -> 3839653 (+0.06%); split: -0.01%, +0.07%
CodeSize: 20235104 -> 20251744 (+0.08%); split: -0.01%, +0.09%
SpillSGPRs: 997 -> 1007 (+1.00%)
Latency: 22305937 -> 22331551 (+0.11%); split: -0.03%, +0.15%
InvThroughput: 4232313 -> 4237341 (+0.12%); split: -0.03%, +0.15%
VClause: 97043 -> 97027 (-0.02%); split: -0.02%, +0.01%
SClause: 72169 -> 72416 (+0.34%); split: -0.00%, +0.35%
Copies: 321578 -> 322126 (+0.17%); split: -0.11%, +0.28%
Branches: 110163 -> 110444 (+0.26%); split: -0.00%, +0.26%
PreSGPRs: 7879 -> 7942 (+0.80%)
VALU: 2155040 -> 2156425 (+0.06%); split: -0.02%, +0.09%
SALU: 502292 -> 503078 (+0.16%); split: -0.00%, +0.16%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36034>
2025-07-19 21:02:42 +00:00
Konstantin Seurer
d28ff8050a radv/rt: Use inv_dir for software ray-triangle tests
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:37 +00:00
Konstantin Seurer
5494789e89 radv/rt: Optimize emulated ray-triangle tests
The imod instructions are lowered to 4 alu instructions each. We can do
better by packing the results with the values for kz.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:37 +00:00
Konstantin Seurer
d140f2a6a2 radv: Implement watertightness for emulated RT
Instead of using fp64 (Which is broken in some cases) the new approach
only uses fp32 and implements tiebreaking for edge/vertex hits. Using
fp32 is also much faster, improving performance of q2rtx by around 40%.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:36 +00:00
Konstantin Seurer
55641f9ca0 radv: Disable pointer flags and the GFX12 WA for emulated RT
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36213>
2025-07-19 16:35:36 +00:00
Konstantin Seurer
df44b353ad radv: Optimize ray tracing position fetch
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Gets rid of a lot of indirection when fetching triangle positions.
Storing the primitive address increases register pressure by a bit but
the traversal shader which should have the highest register demand
should not be affected when position fetch is not used.

Totals:
Instrs: 4021686 -> 4022435 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21235812 -> 21235832 (+0.00%); split: -0.02%, +0.02%
Latency: 23402275 -> 23412110 (+0.04%); split: -0.04%, +0.09%
InvThroughput: 4352818 -> 4352206 (-0.01%); split: -0.04%, +0.02%
VClause: 101906 -> 102058 (+0.15%); split: -0.03%, +0.18%
Copies: 342210 -> 342368 (+0.05%); split: -0.09%, +0.14%
Branches: 114988 -> 114993 (+0.00%)
PreVGPRs: 26551 -> 27111 (+2.11%)
VALU: 2249366 -> 2249524 (+0.01%); split: -0.01%, +0.02%
SALU: 529828 -> 529808 (-0.00%); split: -0.01%, +0.00%

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35533>
2025-07-19 16:07:59 +00:00
Ruijing Dong
32a2012975 radeonsi/vcn: vcn5 av1 decoding context buffer fix
In VCN5, the AV1 context buffer has changed to a bigger
one than VCN4. It fixed an AV1 decoding issue on VCN5.

Cc: mesa-stable

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36208>
2025-07-18 16:45:42 +00:00
Georg Lehmann
05ee3c6e0f ac/nir/lower_mem_access_bit_sizes: make 8/16bit access 32bit if possible
This also means we stop splitting 8/16bit vec8.

Foz-DB GFX1201:
Totals from 112 (0.14% of 80301) affected shaders:
Instrs: 219953 -> 218280 (-0.76%)
CodeSize: 1335916 -> 1325748 (-0.76%)
VGPRs: 10460 -> 10412 (-0.46%)
Latency: 1435629 -> 1432818 (-0.20%); split: -0.22%, +0.02%
InvThroughput: 733424 -> 733271 (-0.02%); split: -0.02%, +0.00%
VClause: 4178 -> 4182 (+0.10%)
SClause: 2191 -> 2196 (+0.23%)
Copies: 13911 -> 13784 (-0.91%); split: -1.06%, +0.14%
PreVGPRs: 7620 -> 7619 (-0.01%); split: -0.03%, +0.01%
VALU: 140400 -> 140167 (-0.17%); split: -0.17%, +0.01%
SALU: 18459 -> 18276 (-0.99%)
VMEM: 9219 -> 8944 (-2.98%)
VOPD: 4216 -> 4220 (+0.09%); split: +0.24%, -0.14%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36164>
2025-07-17 21:00:06 +00:00
Boyuan Zhang
9b158d0512 ci/fluster: remove 3 pass cases resulted by gaps_in_frame
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
These 3 failed tests are passing now by enabling the gaps_in_frame feature.
Therefore, remove all of them.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36056>
2025-07-17 12:44:51 +00:00
Boyuan Zhang
a63e5f015e radeon/vcn: add gaps_in_frame flag to h264 sps
Implement gaps_in_frame_num_value_allowed_flag in h264 msg buffer.
Replace hardcoded flag values with defines.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36056>
2025-07-17 12:44:51 +00:00
Alyssa Rosenzweig
2308960bed treewide: use nir_mov_scalar
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Via Coccinelle patch:

    @@
    expression builder, scalar;
    @@

    -nir_channel(builder, scalar.def, scalar.comp)
    +nir_mov_scalar(builder, scalar)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>
2025-07-16 18:59:16 +00:00
Georg Lehmann
497f607c8e radv/nir/lower_cmat: vectorize GFX11 B -> ACC conversion
Foz-DB Navi31:
Totals from 7 out of 14 FSR4 shaders:
MaxWaves: 50 -> 52 (+4.00%)
Instrs: 44951 -> 44516 (-0.97%); split: -1.00%, +0.03%
CodeSize: 309176 -> 305500 (-1.19%); split: -1.23%, +0.04%
VGPRs: 1464 -> 1416 (-3.28%)
SpillVGPRs: 188 -> 92 (-51.06%)
Scratch: 24064 -> 11776 (-51.06%)
Latency: 171318 -> 163663 (-4.47%); split: -4.51%, +0.04%
InvThroughput: 178796 -> 178956 (+0.09%); split: -0.04%, +0.13%
VClause: 769 -> 730 (-5.07%); split: -6.50%, +1.43%
Copies: 3149 -> 3261 (+3.56%); split: -1.21%, +4.76%
PreVGPRs: 1607 -> 1467 (-8.71%)
VALU: 37715 -> 37744 (+0.08%); split: -0.11%, +0.18%
SALU: 754 -> 753 (-0.13%)
VMEM: 2813 -> 2621 (-6.83%)
VOPD: 1674 -> 1685 (+0.66%); split: +1.55%, -0.90%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
7546169e1c radv/nir/lower_cmat: vectorize GFX11 ACC -> B conversion
Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 64204 -> 60749 (-5.38%)
CodeSize: 439052 -> 417668 (-4.87%)
SpillVGPRs: 186 -> 188 (+1.08%)
Scratch: 23808 -> 24064 (+1.08%)
Latency: 208878 -> 202903 (-2.86%)
InvThroughput: 232898 -> 225688 (-3.10%)
VClause: 902 -> 907 (+0.55%); split: -1.55%, +2.11%
Copies: 6418 -> 3762 (-41.38%)
Branches: 55 -> 37 (-32.73%)
PreSGPRs: 297 -> 298 (+0.34%)
PreVGPRs: 2299 -> 2303 (+0.17%)
VALU: 54762 -> 51489 (-5.98%)
SALU: 956 -> 938 (-1.88%)
VMEM: 3469 -> 3473 (+0.12%)
VOPD: 3895 -> 2126 (-45.42%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
d672737372 nir,aco: add byte_perm_amd
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
56d93c40ea radv/nir/lower_cmat: convert matrix use in smaller type
Less conversions, and less data to move around.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 65443 -> 64204 (-1.89%); split: -1.93%, +0.04%
CodeSize: 441884 -> 439052 (-0.64%); split: -1.21%, +0.57%
Latency: 213374 -> 208878 (-2.11%); split: -2.17%, +0.07%
InvThroughput: 236922 -> 232898 (-1.70%); split: -1.77%, +0.08%
VClause: 935 -> 902 (-3.53%); split: -3.74%, +0.21%
Copies: 5064 -> 6418 (+26.74%); split: -13.35%, +40.09%
Branches: 54 -> 55 (+1.85%)
VALU: 55700 -> 54762 (-1.68%); split: -1.85%, +0.16%
VOPD: 3459 -> 3895 (+12.60%); split: +16.88%, -4.28%

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
f2846b936a radv/nir/lower_cmat: use v_permlanex16_b32 instead of ds_swizzle_b32 for GFX11 ACC->B
ds_swizzle is slower than I expected.

Foz-DB Navi31:
Totals from 10 out of 14 FSR4 shaders:
Instrs: 68802 -> 65443 (-4.88%)
CodeSize: 458000 -> 441884 (-3.52%)
Latency: 218147 -> 213374 (-2.19%); split: -3.17%, +0.99%
InvThroughput: 230190 -> 236922 (+2.92%); split: -0.25%, +3.18%
VClause: 922 -> 935 (+1.41%); split: -0.98%, +2.39%
Copies: 5877 -> 5064 (-13.83%); split: -15.74%, +1.91%
Branches: 37 -> 54 (+45.95%)
VALU: 53441 -> 55700 (+4.23%); split: -0.55%, +4.77%
SALU: 872 -> 956 (+9.63%)
VOPD: 1767 -> 3459 (+95.76%)

Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:51 +00:00
Samuel Pitoiset
00a6e284c8 radv: implement DGC IB chaining when the number of sequences is too high
The maximum number of DWORDS per IB is limited by the hardware. So,
when the number of sequences is too high, it would just hang.

The solution here is to implement IB chaining inside the DGC cmdbuf
itself, so that a sequence chains the next one basically.

In practice, games only use up to 4K sequences and they aren't affected
by this change.

This fixes dEQP-VK.dgc.ext.misc.properties.maxIndirectSequenceCount.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13536
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36062>
2025-07-16 10:30:41 +00:00
Samuel Pitoiset
6d1daf51c9 ci: uprev VKCTS main to 73db56e823f8bf6b9dcab57af43b4216c3ba19b5
RADV is the only driver using VKCTS main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36106>
2025-07-16 08:54:01 +00:00
Samuel Pitoiset
ea742877f6 radv: re-run clang-format
For style consistency.

$ clang-format -i $(find src/amd/vulkan/ -name "*.h" -o -name "*.c" -o -name "*.cpp")

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36118>
2025-07-16 09:10:33 +02:00
Samuel Pitoiset
6111e40a55 radv/bvh: remove redundant definition of DIV_ROUND_UP
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36118>
2025-07-16 09:09:30 +02:00
Eric Engestrom
1b8a073e4c radv/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137>
2025-07-15 22:27:40 +00:00
Eric Engestrom
3fc6d51a03 radeonsi/ci: document recent flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36137>
2025-07-15 22:27:40 +00:00
Natalie Vock
ac96594b86 aco/isel: Use vector-aligned operands for ds_stack_push8_pop1_rtn_b32
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00