Samuel Pitoiset
0cc4e16c70
ac/spm,radv,radeonsi: configure the SPM sample interval in common code
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38577 >
2025-11-24 08:05:08 +00:00
Samuel Pitoiset
9a61eaa1e3
radv: remove the ability to create NULL devices with RADV_FORCE_FAMILY
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
On Linux, drm-shim is the replacement.
On Windows, the project to support a compile-only device has been
abandonned since a while, so it's fine to not allow creating NULL
devices for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38544 >
2025-11-24 07:44:49 +00:00
Yogesh Mohan Marimuthu
3ba6c9d0ac
winsys/amdgpu: enable userq reg shadowing for gfx11.5
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36700 >
2025-11-23 19:44:07 +00:00
Yogesh Mohan Marimuthu
9beb668d8d
winsys/amdgpu: fwm packet pre-emption for gfx 11.5
...
gfx 11.5 uses f32 firmware. f32 firmware requires COND_EXEC
packet to flush the ring buffer when pre-emption occured.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36700 >
2025-11-23 19:44:06 +00:00
Dave Airlie
3eef0c0245
radv: add support for cooperative matrix per element operations.
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36992 >
2025-11-22 13:16:20 +10:00
Samuel Pitoiset
344040c367
radv: enable RADV_THREAD_TRACE_CACHE_COUNTERS on GFX12
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38489 >
2025-11-21 11:52:58 +00:00
Samuel Pitoiset
473118b6eb
ac/spm: use hardware names for performance counters
...
Much easier to read.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38489 >
2025-11-21 11:52:58 +00:00
Samuel Pitoiset
4c21a4846c
ac/spm: adjust the granularity of SPM results on GFX12
...
It's 1, only GFX11-11.5 uses units of segment.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38489 >
2025-11-21 11:52:58 +00:00
Samuel Pitoiset
f434c5c934
ac/spm: add cache counters configuration for GFX12
...
This is for the cache counters prior to RGP 2.6.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38489 >
2025-11-21 11:52:58 +00:00
Samuel Pitoiset
da07f1ef3f
radv: allocate the SQTT BO in GTT for faster readback
...
Reading VRAM from CPU is very slow.
This is similar to the SPM BO, and generating RGP captures is now
way faster.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38551 >
2025-11-21 11:34:09 +00:00
Anna Maniscalco
3e01031f10
radv: consistently use the value in bytes for esgs_itemsize
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Previosuly this value was in bytes for vs/tes and in dwords for gs.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38514 >
2025-11-20 16:45:37 +00:00
Anna Maniscalco
5e8885a339
radv: recalculate legacy_gs_info on bind
...
Previously legacy_gs_info calculated based on
gs_info->legacy_gs_info.esgs_itemsize which is calculated based on gs
input varyings.
However, when using ESO vs/tes can have outputs not read by gs, which
leads to underestimating LDS usage.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38514 >
2025-11-20 16:45:37 +00:00
Pierre-Eric Pelloux-Prayer
9e76f5f2a2
radv: enable global BO list if vm_always_valid is supported
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529 >
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
cf4c55a20f
ac/info: get vm_always_valid support through ac_linux_drm
...
For virtio it depends on the host support in virglrenderer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529 >
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
f57993b71d
ac/virtio: fix incorrect NULL check
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529 >
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
51365585e2
ac/virtio: remove dead code
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529 >
2025-11-20 10:21:47 +00:00
Samuel Pitoiset
3889695e9f
aco/tests: switch to drm-shim
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38536 >
2025-11-20 09:53:29 +00:00
Samuel Pitoiset
b4121a30df
amd/drm-shim: export a function that allows to select a different device
...
To be used by ACO tests. Need to remove gnu_symbol_visibility for
exporting the symbol.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38536 >
2025-11-20 09:53:29 +00:00
Samuel Pitoiset
168a8d0b52
radv: fix RB+ for depth-only with unused attachments
...
When there are no color outputs in the rendering state, but color write
enable/write aren't masked out (which seems legal with
VK_EXT_dynamic_rendering_unused_attachments), the driver must emit
CB_DISABLE to disable CB rendering completely.
Otherwise, if there is also a depth/stencil attachment in the rendering
state, CB0 is always set to 32_R for RB+. That means, the pixel shader
would still export fragments but to the previously bound color
attachment.
VKCTS is missing coverage.
Fixes: 4580293ab2 ("radv: implement RB+ depth-only rendering for better perf")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14319
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38509 >
2025-11-20 07:37:17 +00:00
Marek Olšák
9e339f4b32
nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees
...
This describes better what it does.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471 >
2025-11-20 05:42:11 +00:00
Marek Olšák
65837d8289
ac,radeonsi: remove gfx11 FW-based MCBP
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's too slow to be usable. User queues could replace it.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38338 >
2025-11-20 03:31:47 +00:00
Georg Lehmann
fa66b670d4
aco/optimizer: reduce max alu_opt_info stack operands to 4
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
ALU instructions typically have a maximum of 3 operands, and even when combining
instructions, the peak count will not go above 4.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:43 +00:00
Georg Lehmann
4da74eed96
aco/tests: test packed fma opts
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:43 +00:00
Georg Lehmann
1f0293be0d
aco/optimizer: use new helpers for packed fma
...
Foz-DB Navi48:
Totals from 374 (0.45% of 82419) affected shaders:
MaxWaves: 5476 -> 5480 (+0.07%)
Instrs: 2786653 -> 2784061 (-0.09%); split: -0.11%, +0.01%
CodeSize: 15163340 -> 15153460 (-0.07%); split: -0.08%, +0.01%
VGPRs: 46884 -> 46860 (-0.05%)
SpillVGPRs: 188 -> 189 (+0.53%)
Scratch: 3207936 -> 3208192 (+0.01%)
Latency: 27352681 -> 27350006 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 5933554 -> 5932632 (-0.02%); split: -0.02%, +0.01%
VClause: 62355 -> 62359 (+0.01%); split: -0.03%, +0.04%
Copies: 290221 -> 289786 (-0.15%); split: -0.21%, +0.06%
Branches: 108566 -> 108569 (+0.00%); split: -0.01%, +0.01%
PreVGPRs: 40172 -> 40157 (-0.04%)
VALU: 1355753 -> 1353329 (-0.18%); split: -0.19%, +0.01%
SALU: 524836 -> 524831 (-0.00%); split: -0.01%, +0.01%
VMEM: 90948 -> 90950 (+0.00%)
VOPD: 10489 -> 10490 (+0.01%); split: +0.98%, -0.97%
Foz-DB Navi21:
Totals from 374 (0.45% of 82387) affected shaders:
MaxWaves: 4339 -> 4348 (+0.21%)
Instrs: 2255741 -> 2253554 (-0.10%); split: -0.10%, +0.00%
CodeSize: 12755276 -> 12744184 (-0.09%); split: -0.09%, +0.01%
VGPRs: 40376 -> 40352 (-0.06%)
Latency: 27357012 -> 27348737 (-0.03%); split: -0.07%, +0.04%
InvThroughput: 7213578 -> 7211136 (-0.03%); split: -0.07%, +0.04%
VClause: 62154 -> 62172 (+0.03%); split: -0.01%, +0.04%
Copies: 268204 -> 268048 (-0.06%); split: -0.22%, +0.16%
Branches: 107067 -> 107066 (-0.00%)
PreVGPRs: 37615 -> 37599 (-0.04%)
VALU: 1423326 -> 1421187 (-0.15%); split: -0.16%, +0.01%
SALU: 383388 -> 383390 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:43 +00:00
Georg Lehmann
fec10ea3ea
aco/optimizer: use new helpers for add16 opts
...
Foz-DB Navi48:
Totals from 164 (0.20% of 82419) affected shaders:
Instrs: 145304 -> 145335 (+0.02%); split: -0.00%, +0.02%
CodeSize: 794156 -> 794280 (+0.02%); split: -0.00%, +0.02%
Latency: 1884349 -> 1884227 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 350403 -> 350393 (-0.00%)
Foz-DB Navi21:
Totals from 164 (0.20% of 82387) affected shaders:
Instrs: 117416 -> 117414 (-0.00%)
CodeSize: 673328 -> 673312 (-0.00%)
Latency: 1896952 -> 1897094 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 638536 -> 638556 (+0.00%); split: -0.01%, +0.01%
Copies: 14579 -> 14577 (-0.01%)
VALU: 65895 -> 65893 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
e8f5b9374b
aco/optimizer: use new helpers to optimize mul(b2f(a), b)
...
Foz-DB Navi48:
Totals from 979 (1.19% of 82419) affected shaders:
Instrs: 3630560 -> 3629463 (-0.03%); split: -0.03%, +0.00%
CodeSize: 19154176 -> 19147124 (-0.04%); split: -0.04%, +0.00%
Latency: 17700546 -> 17699505 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 3143808 -> 3143254 (-0.02%); split: -0.02%, +0.01%
SClause: 76410 -> 76405 (-0.01%); split: -0.01%, +0.00%
Copies: 256544 -> 256554 (+0.00%); split: -0.02%, +0.02%
PreVGPRs: 40868 -> 40835 (-0.08%)
VALU: 2003291 -> 2002466 (-0.04%); split: -0.04%, +0.00%
SALU: 514000 -> 514006 (+0.00%)
VOPD: 3254 -> 3256 (+0.06%); split: +0.12%, -0.06%
Foz-DB Navi21:
Totals from 926 (1.12% of 82387) affected shaders:
MaxWaves: 21538 -> 21542 (+0.02%)
Instrs: 2984216 -> 2983187 (-0.03%); split: -0.04%, +0.00%
CodeSize: 16104112 -> 16097272 (-0.04%); split: -0.05%, +0.00%
VGPRs: 46864 -> 46848 (-0.03%)
Latency: 15678064 -> 15677099 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3779550 -> 3778230 (-0.03%); split: -0.04%, +0.01%
VClause: 81590 -> 81598 (+0.01%)
SClause: 70753 -> 70751 (-0.00%); split: -0.01%, +0.00%
Copies: 240446 -> 240466 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 51121 -> 51062 (-0.12%)
PreVGPRs: 38538 -> 38505 (-0.09%)
VALU: 1978847 -> 1977777 (-0.05%); split: -0.06%, +0.00%
SALU: 439184 -> 439212 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
f0e24284f5
aco/optimizer: create max3/min3/med3 with salu min/max
...
Foz-DB Navi48:
Totals from 175 (0.21% of 82419) affected shaders:
Instrs: 465863 -> 465260 (-0.13%); split: -0.13%, +0.00%
CodeSize: 2362264 -> 2360744 (-0.06%); split: -0.07%, +0.00%
Latency: 1548501 -> 1548371 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 227683 -> 227630 (-0.02%); split: -0.08%, +0.06%
Copies: 33646 -> 33648 (+0.01%)
PreSGPRs: 9996 -> 10004 (+0.08%)
VALU: 175836 -> 175850 (+0.01%)
SALU: 122094 -> 121621 (-0.39%); split: -0.39%, +0.00%
Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
InvThroughput: 74 -> 76 (+2.70%)
VALU: 57 -> 58 (+1.75%)
SALU: 61 -> 60 (-1.64%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
d21734e024
aco/optimizer: use new helper functions to create med3
...
Foz-DB Navi48:
Totals from 9659 (11.72% of 82419) affected shaders:
Instrs: 17301747 -> 17301735 (-0.00%); split: -0.00%, +0.00%
CodeSize: 93378108 -> 93378184 (+0.00%); split: -0.00%, +0.00%
Latency: 145441784 -> 145441791 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 25768777 -> 25768778 (+0.00%)
Copies: 1370123 -> 1370124 (+0.00%)
VALU: 9705655 -> 9705656 (+0.00%)
Foz-DB Navi21:
Totals from 22 (0.03% of 82387) affected shaders:
Instrs: 27433 -> 27406 (-0.10%)
CodeSize: 146440 -> 146352 (-0.06%); split: -0.06%, +0.00%
Latency: 305857 -> 305806 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 63634 -> 63580 (-0.08%)
VALU: 19109 -> 19082 (-0.14%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
6fc250fc06
aco/optimizer: use new helpers for min3/max3/minmax/maxmin
...
Foz-DB Navi48:
Totals from 10453 (12.68% of 82419) affected shaders:
Instrs: 18676282 -> 18675798 (-0.00%); split: -0.00%, +0.00%
CodeSize: 100603268 -> 100603508 (+0.00%); split: -0.00%, +0.00%
Latency: 157036823 -> 157031708 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28049331 -> 28048776 (-0.00%); split: -0.00%, +0.00%
Copies: 1452464 -> 1452503 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 458422 -> 458413 (-0.00%); split: -0.00%, +0.00%
VALU: 10429583 -> 10429353 (-0.00%); split: -0.00%, +0.00%
SALU: 2628403 -> 2628416 (+0.00%); split: -0.00%, +0.00%
VOPD: 21738 -> 21744 (+0.03%); split: +0.04%, -0.01%
Foz-DB Navi21:
Totals from 889 (1.08% of 82387) affected shaders:
MaxWaves: 15641 -> 15639 (-0.01%); split: +0.01%, -0.03%
Instrs: 2505527 -> 2505489 (-0.00%); split: -0.01%, +0.01%
CodeSize: 13975300 -> 13976516 (+0.01%); split: -0.00%, +0.01%
VGPRs: 65584 -> 65576 (-0.01%); split: -0.02%, +0.01%
Latency: 37135606 -> 37132577 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10937032 -> 10935704 (-0.01%); split: -0.01%, +0.00%
VClause: 63136 -> 63140 (+0.01%); split: -0.01%, +0.01%
Copies: 256011 -> 256073 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 51804 -> 51809 (+0.01%)
PreVGPRs: 57905 -> 57890 (-0.03%); split: -0.03%, +0.00%
VALU: 1593523 -> 1593339 (-0.01%); split: -0.02%, +0.00%
SALU: 425116 -> 425134 (+0.00%); split: -0.00%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
5d02eae052
aco/optimizer: add less agressive pattern matching option
...
Still a bit more aggresive than the classic is_used_once,
but it should still prevent most regressions for patterns
that use min/max/mul as outer instruction.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
2c05aa34aa
aco/optimizer: create fma with s_mul_f32/f16
...
Foz-DB Navi48:
Totals from 14473 (17.56% of 82419) affected shaders:
MaxWaves: 397738 -> 397720 (-0.00%); split: +0.00%, -0.01%
Instrs: 22133626 -> 21984649 (-0.67%); split: -0.68%, +0.01%
CodeSize: 117440104 -> 117111440 (-0.28%); split: -0.30%, +0.02%
VGPRs: 825820 -> 825928 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 15496 -> 15512 (+0.10%); split: -0.19%, +0.29%
Latency: 152141755 -> 152058676 (-0.05%); split: -0.07%, +0.02%
InvThroughput: 25715152 -> 25681160 (-0.13%); split: -0.14%, +0.01%
VClause: 402752 -> 400798 (-0.49%); split: -0.53%, +0.04%
SClause: 587448 -> 586772 (-0.12%); split: -0.19%, +0.07%
Copies: 1650891 -> 1661495 (+0.64%); split: -0.14%, +0.78%
Branches: 541341 -> 541334 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 748235 -> 748332 (+0.01%); split: -0.03%, +0.04%
VALU: 11754090 -> 11755396 (+0.01%); split: -0.01%, +0.02%
SALU: 3659133 -> 3536435 (-3.35%); split: -3.36%, +0.01%
VOPD: 17201 -> 17083 (-0.69%); split: +0.05%, -0.74%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
5abc961514
aco/optimizer: use new helpers to create fma
...
Foz-DB Navi48:
Totals from 25949 (31.48% of 82419) affected shaders:
Instrs: 30904250 -> 30904153 (-0.00%); split: -0.00%, +0.00%
CodeSize: 164623100 -> 164604652 (-0.01%); split: -0.01%, +0.00%
Latency: 209402611 -> 209402684 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 36622293 -> 36622236 (-0.00%); split: -0.00%, +0.00%
Copies: 2252080 -> 2251998 (-0.00%); split: -0.00%, +0.00%
VALU: 16831507 -> 16831382 (-0.00%); split: -0.00%, +0.00%
VOPD: 28252 -> 28295 (+0.15%)
Foz-DB Navi21:
Totals from 56269 (68.30% of 82387) affected shaders:
Instrs: 43751754 -> 43746463 (-0.01%); split: -0.01%, +0.00%
CodeSize: 233615096 -> 233576912 (-0.02%); split: -0.02%, +0.00%
VGPRs: 2445528 -> 2445520 (-0.00%)
Latency: 276776920 -> 276761183 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 66406450 -> 66402214 (-0.01%); split: -0.01%, +0.00%
VClause: 902951 -> 902947 (-0.00%)
Copies: 3926260 -> 3926289 (+0.00%); split: -0.01%, +0.01%
VALU: 26924056 -> 26918783 (-0.02%); split: -0.02%, +0.00%
SALU: 6938335 -> 6938321 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
1e2aea7461
aco/optimizer: add new helper functions for combining two instructions
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
87e168f223
aco/optimizer: make label_mad more generic
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
53f5e447db
aco/optimizer: add extract_float helper
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Georg Lehmann
7eccf5c745
aco/optimizer: refactor insert
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150 >
2025-11-19 10:51:42 +00:00
Samuel Pitoiset
7c9e5b4c1c
radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
CP DMA prefetches are implemented with a separate function.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449 >
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
60d438e517
radv: always use MALL for CP DMA operations on GFX12
...
CP DMA isn't coherent with L2 on GFX12, but {SRC,DST}_ADDR_TC_L2 means
MALL.
Only small buffers are using copy/fill CP DMA operations, so this
shouldn't have much effect.
Found by inspection.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449 >
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
b2a13ce92c
radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507 >
2025-11-19 07:11:05 +00:00
Boris Brezillon
ea4d4d2a77
nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering
...
Rather than adding another boolean to optionally lower PLS vars, pass
the types we want to lowers through a nir_variable_mode bitmask.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110 >
2025-11-18 20:25:42 +00:00
Natalie Vock
1243d575a5
aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
...
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.
This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290 >
2025-11-18 18:43:00 +00:00
Samuel Pitoiset
9f512d8f93
radv: advertise VK_EXT_custom_resolve
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442 >
2025-11-18 17:03:13 +00:00
Samuel Pitoiset
91469bcc30
radv: implement VK_EXT_custom_resolve
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442 >
2025-11-18 17:03:13 +00:00
Dave Airlie
ad25196d35
radv: add support for cooperative matrix reductions.
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This add support to the lowering the reduction operations.
Thanks to Georg Lehmann for a lot of the ideas and optimising in
this.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389 >
2025-11-17 23:33:59 +00:00
Georg Lehmann
3a175b54a4
aco,nir: support subdword v_permlane_b16
...
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389 >
2025-11-17 23:33:59 +00:00
Georg Lehmann
018f45f981
aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits
...
Mostly removes SALU->VALU waits if the VALU writes a sgpr.
Foz-DB GFX1201:
Totals from 18553 (22.51% of 82419) affected shaders:
Instrs: 27388414 -> 27321118 (-0.25%)
CodeSize: 145389276 -> 145118128 (-0.19%); split: -0.19%, +0.00%
Latency: 200288087 -> 200252583 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 36311237 -> 36307369 (-0.01%); split: -0.01%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445 >
2025-11-17 16:28:36 +00:00
Georg Lehmann
b1d730982e
aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits
...
This removes a lot of VALU->SALU waits.
Foz-DB Navi31:
Totals from 8908 (10.84% of 82179) affected shaders:
Instrs: 17118986 -> 17084870 (-0.20%)
CodeSize: 91057212 -> 90919300 (-0.15%); split: -0.15%, +0.00%
Latency: 154044128 -> 154036848 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 26608698 -> 26607933 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445 >
2025-11-17 16:28:36 +00:00
David Rosca
3abb2707e2
radv/video: Fix coding used_by_curr_pic_lt_flag
...
Fixes: d68a1fc0d4 ("radv/video: port hevc slice header encoding from radeonsi")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14301
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38475 >
2025-11-17 11:51:08 +00:00
Samuel Pitoiset
8d4ba81ca8
radv: remove now unused SDMA helpers
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448 >
2025-11-17 11:29:24 +00:00
Samuel Pitoiset
a4e4f13c78
ac,radv: add ac_emit_sdma_copy_t2t_sub_window()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448 >
2025-11-17 11:29:24 +00:00