Commit graph

20134 commits

Author SHA1 Message Date
Samuel Pitoiset
da07f1ef3f radv: allocate the SQTT BO in GTT for faster readback
Reading VRAM from CPU is very slow.

This is similar to the SPM BO, and generating RGP captures is now
way faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38551>
2025-11-21 11:34:09 +00:00
Anna Maniscalco
3e01031f10 radv: consistently use the value in bytes for esgs_itemsize
Previously this value was in bytes for vs/tes and in dwords for gs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38514>
2025-11-20 16:45:37 +00:00
Anna Maniscalco
5e8885a339 radv: recalculate legacy_gs_info on bind
Previously legacy_gs_info was calculated based on
gs_info->legacy_gs_info.esgs_itemsize, which is calculated based on gs
input varyings.

However, when using ESO, vs/tes can have outputs not read by gs, which
leads to underestimating LDS usage.

Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38514>
2025-11-20 16:45:37 +00:00
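To illustrate why sizing from GS inputs alone can underestimate LDS usage, here is a minimal C sketch. The helper name, the per-vertex sizes, and the vertex count are illustrative assumptions, not RADV code; the sketch only assumes that the ES-to-GS ring holds one item per ES vertex.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a RADV function): bytes of LDS needed to hold
 * one ESGS item per ES vertex in a subgroup. */
static inline uint32_t example_esgs_lds_bytes(uint32_t itemsize_bytes,
                                              uint32_t es_verts)
{
   assert(itemsize_bytes % 4 == 0); /* items are whole dwords */
   return itemsize_bytes * es_verts;
}
```

With a GS that reads 32 bytes per vertex and a VS that writes 64 bytes per vertex, 64 ES vertices need example_esgs_lds_bytes(64, 64) = 4096 bytes, while sizing from the GS inputs alone would reserve only example_esgs_lds_bytes(32, 64) = 2048 bytes.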
Pierre-Eric Pelloux-Prayer
9e76f5f2a2 radv: enable global BO list if vm_always_valid is supported
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529>
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
cf4c55a20f ac/info: get vm_always_valid support through ac_linux_drm
For virtio it depends on the host support in virglrenderer.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529>
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
f57993b71d ac/virtio: fix incorrect NULL check
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529>
2025-11-20 10:21:47 +00:00
Pierre-Eric Pelloux-Prayer
51365585e2 ac/virtio: remove dead code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38529>
2025-11-20 10:21:47 +00:00
Samuel Pitoiset
3889695e9f aco/tests: switch to drm-shim
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38536>
2025-11-20 09:53:29 +00:00
Samuel Pitoiset
b4121a30df amd/drm-shim: export a function that allows to select a different device
To be used by ACO tests. Need to remove gnu_symbol_visibility for
exporting the symbol.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38536>
2025-11-20 09:53:29 +00:00
Samuel Pitoiset
168a8d0b52 radv: fix RB+ for depth-only with unused attachments
When there are no color outputs in the rendering state, but color write
enable/write aren't masked out (which seems legal with
VK_EXT_dynamic_rendering_unused_attachments), the driver must emit
CB_DISABLE to disable CB rendering completely.

Otherwise, if there is also a depth/stencil attachment in the rendering
state, CB0 is always set to 32_R for RB+. That means, the pixel shader
would still export fragments but to the previously bound color
attachment.

VKCTS is missing coverage.

Fixes: 4580293ab2 ("radv: implement RB+ depth-only rendering for better perf")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14319
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38509>
2025-11-20 07:37:17 +00:00
Marek Olšák
9e339f4b32 nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees
This describes better what it does.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38471>
2025-11-20 05:42:11 +00:00
Marek Olšák
65837d8289 ac,radeonsi: remove gfx11 FW-based MCBP
It's too slow to be usable. User queues could replace it.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38338>
2025-11-20 03:31:47 +00:00
Georg Lehmann
fa66b670d4 aco/optimizer: reduce max alu_opt_info stack operands to 4
ALU instructions typically have a maximum of 3 operands, and even when combining
instructions, the peak count will not go above 4.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
Georg Lehmann
4da74eed96 aco/tests: test packed fma opts
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
Georg Lehmann
1f0293be0d aco/optimizer: use new helpers for packed fma
Foz-DB Navi48:
Totals from 374 (0.45% of 82419) affected shaders:
MaxWaves: 5476 -> 5480 (+0.07%)
Instrs: 2786653 -> 2784061 (-0.09%); split: -0.11%, +0.01%
CodeSize: 15163340 -> 15153460 (-0.07%); split: -0.08%, +0.01%
VGPRs: 46884 -> 46860 (-0.05%)
SpillVGPRs: 188 -> 189 (+0.53%)
Scratch: 3207936 -> 3208192 (+0.01%)
Latency: 27352681 -> 27350006 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 5933554 -> 5932632 (-0.02%); split: -0.02%, +0.01%
VClause: 62355 -> 62359 (+0.01%); split: -0.03%, +0.04%
Copies: 290221 -> 289786 (-0.15%); split: -0.21%, +0.06%
Branches: 108566 -> 108569 (+0.00%); split: -0.01%, +0.01%
PreVGPRs: 40172 -> 40157 (-0.04%)
VALU: 1355753 -> 1353329 (-0.18%); split: -0.19%, +0.01%
SALU: 524836 -> 524831 (-0.00%); split: -0.01%, +0.01%
VMEM: 90948 -> 90950 (+0.00%)
VOPD: 10489 -> 10490 (+0.01%); split: +0.98%, -0.97%

Foz-DB Navi21:
Totals from 374 (0.45% of 82387) affected shaders:
MaxWaves: 4339 -> 4348 (+0.21%)
Instrs: 2255741 -> 2253554 (-0.10%); split: -0.10%, +0.00%
CodeSize: 12755276 -> 12744184 (-0.09%); split: -0.09%, +0.01%
VGPRs: 40376 -> 40352 (-0.06%)
Latency: 27357012 -> 27348737 (-0.03%); split: -0.07%, +0.04%
InvThroughput: 7213578 -> 7211136 (-0.03%); split: -0.07%, +0.04%
VClause: 62154 -> 62172 (+0.03%); split: -0.01%, +0.04%
Copies: 268204 -> 268048 (-0.06%); split: -0.22%, +0.16%
Branches: 107067 -> 107066 (-0.00%)
PreVGPRs: 37615 -> 37599 (-0.04%)
VALU: 1423326 -> 1421187 (-0.15%); split: -0.16%, +0.01%
SALU: 383388 -> 383390 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:43 +00:00
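The percentages in the Foz-DB reports can be reproduced from the raw totals. The following is a minimal C sketch, assuming each percentage is the relative change (after - before) / before and the affected share is affected / total; the function names are illustrative and are not part of any Mesa tooling.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change in percent between a "before" and an "after" total. */
static inline double example_pct_change(uint64_t before, uint64_t after)
{
   assert(before != 0);
   return ((double)after - (double)before) * 100.0 / (double)before;
}

/* Share of affected shaders in percent. */
static inline double example_pct_share(uint64_t affected, uint64_t total)
{
   assert(total != 0);
   return (double)affected * 100.0 / (double)total;
}
```

For the Navi48 Instrs line of commit 1f0293be0d, example_pct_change(2786653, 2784061) is about -0.093, which rounds to the reported -0.09%. Likewise, example_pct_share(374, 82419) is about 0.454, matching the reported 0.45%.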
Georg Lehmann
fec10ea3ea aco/optimizer: use new helpers for add16 opts
Foz-DB Navi48:
Totals from 164 (0.20% of 82419) affected shaders:
Instrs: 145304 -> 145335 (+0.02%); split: -0.00%, +0.02%
CodeSize: 794156 -> 794280 (+0.02%); split: -0.00%, +0.02%
Latency: 1884349 -> 1884227 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 350403 -> 350393 (-0.00%)

Foz-DB Navi21:
Totals from 164 (0.20% of 82387) affected shaders:
Instrs: 117416 -> 117414 (-0.00%)
CodeSize: 673328 -> 673312 (-0.00%)
Latency: 1896952 -> 1897094 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 638536 -> 638556 (+0.00%); split: -0.01%, +0.01%
Copies: 14579 -> 14577 (-0.01%)
VALU: 65895 -> 65893 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
e8f5b9374b aco/optimizer: use new helpers to optimize mul(b2f(a), b)
Foz-DB Navi48:
Totals from 979 (1.19% of 82419) affected shaders:
Instrs: 3630560 -> 3629463 (-0.03%); split: -0.03%, +0.00%
CodeSize: 19154176 -> 19147124 (-0.04%); split: -0.04%, +0.00%
Latency: 17700546 -> 17699505 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 3143808 -> 3143254 (-0.02%); split: -0.02%, +0.01%
SClause: 76410 -> 76405 (-0.01%); split: -0.01%, +0.00%
Copies: 256544 -> 256554 (+0.00%); split: -0.02%, +0.02%
PreVGPRs: 40868 -> 40835 (-0.08%)
VALU: 2003291 -> 2002466 (-0.04%); split: -0.04%, +0.00%
SALU: 514000 -> 514006 (+0.00%)
VOPD: 3254 -> 3256 (+0.06%); split: +0.12%, -0.06%

Foz-DB Navi21:
Totals from 926 (1.12% of 82387) affected shaders:
MaxWaves: 21538 -> 21542 (+0.02%)
Instrs: 2984216 -> 2983187 (-0.03%); split: -0.04%, +0.00%
CodeSize: 16104112 -> 16097272 (-0.04%); split: -0.05%, +0.00%
VGPRs: 46864 -> 46848 (-0.03%)
Latency: 15678064 -> 15677099 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3779550 -> 3778230 (-0.03%); split: -0.04%, +0.01%
VClause: 81590 -> 81598 (+0.01%)
SClause: 70753 -> 70751 (-0.00%); split: -0.01%, +0.00%
Copies: 240446 -> 240466 (+0.01%); split: -0.01%, +0.02%
PreSGPRs: 51121 -> 51062 (-0.12%)
PreVGPRs: 38538 -> 38505 (-0.09%)
VALU: 1978847 -> 1977777 (-0.05%); split: -0.06%, +0.00%
SALU: 439184 -> 439212 (+0.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
f0e24284f5 aco/optimizer: create max3/min3/med3 with salu min/max
Foz-DB Navi48:
Totals from 175 (0.21% of 82419) affected shaders:
Instrs: 465863 -> 465260 (-0.13%); split: -0.13%, +0.00%
CodeSize: 2362264 -> 2360744 (-0.06%); split: -0.07%, +0.00%
Latency: 1548501 -> 1548371 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 227683 -> 227630 (-0.02%); split: -0.08%, +0.06%
Copies: 33646 -> 33648 (+0.01%)
PreSGPRs: 9996 -> 10004 (+0.08%)
VALU: 175836 -> 175850 (+0.01%)
SALU: 122094 -> 121621 (-0.39%); split: -0.39%, +0.00%

Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
InvThroughput: 74 -> 76 (+2.70%)
VALU: 57 -> 58 (+1.75%)
SALU: 61 -> 60 (-1.64%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
d21734e024 aco/optimizer: use new helper functions to create med3
Foz-DB Navi48:
Totals from 9659 (11.72% of 82419) affected shaders:
Instrs: 17301747 -> 17301735 (-0.00%); split: -0.00%, +0.00%
CodeSize: 93378108 -> 93378184 (+0.00%); split: -0.00%, +0.00%
Latency: 145441784 -> 145441791 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 25768777 -> 25768778 (+0.00%)
Copies: 1370123 -> 1370124 (+0.00%)
VALU: 9705655 -> 9705656 (+0.00%)

Foz-DB Navi21:
Totals from 22 (0.03% of 82387) affected shaders:
Instrs: 27433 -> 27406 (-0.10%)
CodeSize: 146440 -> 146352 (-0.06%); split: -0.06%, +0.00%
Latency: 305857 -> 305806 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 63634 -> 63580 (-0.08%)
VALU: 19109 -> 19082 (-0.14%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
6fc250fc06 aco/optimizer: use new helpers for min3/max3/minmax/maxmin
Foz-DB Navi48:
Totals from 10453 (12.68% of 82419) affected shaders:
Instrs: 18676282 -> 18675798 (-0.00%); split: -0.00%, +0.00%
CodeSize: 100603268 -> 100603508 (+0.00%); split: -0.00%, +0.00%
Latency: 157036823 -> 157031708 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 28049331 -> 28048776 (-0.00%); split: -0.00%, +0.00%
Copies: 1452464 -> 1452503 (+0.00%); split: -0.00%, +0.00%
PreVGPRs: 458422 -> 458413 (-0.00%); split: -0.00%, +0.00%
VALU: 10429583 -> 10429353 (-0.00%); split: -0.00%, +0.00%
SALU: 2628403 -> 2628416 (+0.00%); split: -0.00%, +0.00%
VOPD: 21738 -> 21744 (+0.03%); split: +0.04%, -0.01%

Foz-DB Navi21:
Totals from 889 (1.08% of 82387) affected shaders:
MaxWaves: 15641 -> 15639 (-0.01%); split: +0.01%, -0.03%
Instrs: 2505527 -> 2505489 (-0.00%); split: -0.01%, +0.01%
CodeSize: 13975300 -> 13976516 (+0.01%); split: -0.00%, +0.01%
VGPRs: 65584 -> 65576 (-0.01%); split: -0.02%, +0.01%
Latency: 37135606 -> 37132577 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 10937032 -> 10935704 (-0.01%); split: -0.01%, +0.00%
VClause: 63136 -> 63140 (+0.01%); split: -0.01%, +0.01%
Copies: 256011 -> 256073 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 51804 -> 51809 (+0.01%)
PreVGPRs: 57905 -> 57890 (-0.03%); split: -0.03%, +0.00%
VALU: 1593523 -> 1593339 (-0.01%); split: -0.02%, +0.00%
SALU: 425116 -> 425134 (+0.00%); split: -0.00%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
5d02eae052 aco/optimizer: add less aggressive pattern matching option
Still a bit more aggressive than the classic is_used_once,
but it should still prevent most regressions for patterns
that use min/max/mul as outer instruction.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
2c05aa34aa aco/optimizer: create fma with s_mul_f32/f16
Foz-DB Navi48:
Totals from 14473 (17.56% of 82419) affected shaders:
MaxWaves: 397738 -> 397720 (-0.00%); split: +0.00%, -0.01%
Instrs: 22133626 -> 21984649 (-0.67%); split: -0.68%, +0.01%
CodeSize: 117440104 -> 117111440 (-0.28%); split: -0.30%, +0.02%
VGPRs: 825820 -> 825928 (+0.01%); split: -0.01%, +0.02%
SpillSGPRs: 15496 -> 15512 (+0.10%); split: -0.19%, +0.29%
Latency: 152141755 -> 152058676 (-0.05%); split: -0.07%, +0.02%
InvThroughput: 25715152 -> 25681160 (-0.13%); split: -0.14%, +0.01%
VClause: 402752 -> 400798 (-0.49%); split: -0.53%, +0.04%
SClause: 587448 -> 586772 (-0.12%); split: -0.19%, +0.07%
Copies: 1650891 -> 1661495 (+0.64%); split: -0.14%, +0.78%
Branches: 541341 -> 541334 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 748235 -> 748332 (+0.01%); split: -0.03%, +0.04%
VALU: 11754090 -> 11755396 (+0.01%); split: -0.01%, +0.02%
SALU: 3659133 -> 3536435 (-3.35%); split: -3.36%, +0.01%
VOPD: 17201 -> 17083 (-0.69%); split: +0.05%, -0.74%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
5abc961514 aco/optimizer: use new helpers to create fma
Foz-DB Navi48:
Totals from 25949 (31.48% of 82419) affected shaders:
Instrs: 30904250 -> 30904153 (-0.00%); split: -0.00%, +0.00%
CodeSize: 164623100 -> 164604652 (-0.01%); split: -0.01%, +0.00%
Latency: 209402611 -> 209402684 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 36622293 -> 36622236 (-0.00%); split: -0.00%, +0.00%
Copies: 2252080 -> 2251998 (-0.00%); split: -0.00%, +0.00%
VALU: 16831507 -> 16831382 (-0.00%); split: -0.00%, +0.00%
VOPD: 28252 -> 28295 (+0.15%)

Foz-DB Navi21:
Totals from 56269 (68.30% of 82387) affected shaders:
Instrs: 43751754 -> 43746463 (-0.01%); split: -0.01%, +0.00%
CodeSize: 233615096 -> 233576912 (-0.02%); split: -0.02%, +0.00%
VGPRs: 2445528 -> 2445520 (-0.00%)
Latency: 276776920 -> 276761183 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 66406450 -> 66402214 (-0.01%); split: -0.01%, +0.00%
VClause: 902951 -> 902947 (-0.00%)
Copies: 3926260 -> 3926289 (+0.00%); split: -0.01%, +0.01%
VALU: 26924056 -> 26918783 (-0.02%); split: -0.02%, +0.00%
SALU: 6938335 -> 6938321 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
1e2aea7461 aco/optimizer: add new helper functions for combining two instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
87e168f223 aco/optimizer: make label_mad more generic
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
53f5e447db aco/optimizer: add extract_float helper
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Georg Lehmann
7eccf5c745 aco/optimizer: refactor insert
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38150>
2025-11-19 10:51:42 +00:00
Samuel Pitoiset
7c9e5b4c1c radv: remove unreachable code for prefetch in radv_cs_emit_cp_dma()
CP DMA prefetches are implemented with a separate function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
60d438e517 radv: always use MALL for CP DMA operations on GFX12
CP DMA isn't coherent with L2 on GFX12, but {SRC,DST}_ADDR_TC_L2 means
MALL.

Only small buffers are using copy/fill CP DMA operations, so this
shouldn't have much effect.

Found by inspection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38449>
2025-11-19 08:03:38 +00:00
Samuel Pitoiset
b2a13ce92c radv/tests: require drm-shim and use it instead of RADV_FORCE_FAMILY
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38507>
2025-11-19 07:11:05 +00:00
Boris Brezillon
ea4d4d2a77 nir: Prepare nir_lower_io_vars_to_temporaries() for optional PLS lowering
Rather than adding another boolean to optionally lower PLS vars, pass
the types we want to lower through a nir_variable_mode bitmask.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37110>
2025-11-18 20:25:42 +00:00
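The design choice described in commit ea4d4d2a77, one bitmask parameter instead of one boolean per variable class, can be sketched in C as follows. The enum and function below are hypothetical stand-ins; they do not reproduce the real nir_variable_mode values or the nir_lower_io_vars_to_temporaries() signature.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode bits; NOT the real nir_variable_mode values. */
enum example_var_mode {
   EXAMPLE_VAR_SHADER_IN  = 1 << 0,
   EXAMPLE_VAR_SHADER_OUT = 1 << 1,
   EXAMPLE_VAR_PLS        = 1 << 2,
};

/* Returns true if a variable of class `mode` is selected by `types`. */
static inline bool example_should_lower(unsigned types,
                                        enum example_var_mode mode)
{
   return (types & (unsigned)mode) != 0;
}
```

A caller that wants outputs and PLS variables lowered would pass EXAMPLE_VAR_SHADER_OUT | EXAMPLE_VAR_PLS. Adding a new variable class later only needs a new bit, not another function parameter.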
Natalie Vock
1243d575a5 aco/insert_nops: Consider s_setpc target susceptible to VALUReadSGPRHazard
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.

This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290>
2025-11-18 18:43:00 +00:00
Samuel Pitoiset
9f512d8f93 radv: advertise VK_EXT_custom_resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442>
2025-11-18 17:03:13 +00:00
Samuel Pitoiset
91469bcc30 radv: implement VK_EXT_custom_resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38442>
2025-11-18 17:03:13 +00:00
Dave Airlie
ad25196d35 radv: add support for cooperative matrix reductions.
This adds support for lowering the reduction operations.

Thanks to Georg Lehmann for a lot of the ideas and optimising in
this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>
2025-11-17 23:33:59 +00:00
Georg Lehmann
3a175b54a4 aco,nir: support subdword v_permlane_b16
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38389>
2025-11-17 23:33:59 +00:00
Georg Lehmann
018f45f981 aco/insert_NOPs: remove redundant VALUReadSGPRHazard waits
Mostly removes SALU->VALU waits if the VALU writes a sgpr.

Foz-DB GFX1201:
Totals from 18553 (22.51% of 82419) affected shaders:
Instrs: 27388414 -> 27321118 (-0.25%)
CodeSize: 145389276 -> 145118128 (-0.19%); split: -0.19%, +0.00%
Latency: 200288087 -> 200252583 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 36311237 -> 36307369 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445>
2025-11-17 16:28:36 +00:00
Georg Lehmann
b1d730982e aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits
This removes a lot of VALU->SALU waits.

Foz-DB Navi31:
Totals from 8908 (10.84% of 82179) affected shaders:
Instrs: 17118986 -> 17084870 (-0.20%)
CodeSize: 91057212 -> 90919300 (-0.15%); split: -0.15%, +0.00%
Latency: 154044128 -> 154036848 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 26608698 -> 26607933 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445>
2025-11-17 16:28:36 +00:00
David Rosca
3abb2707e2 radv/video: Fix coding used_by_curr_pic_lt_flag
Fixes: d68a1fc0d4 ("radv/video: port hevc slice header encoding from radeonsi")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14301
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38475>
2025-11-17 11:51:08 +00:00
Samuel Pitoiset
8d4ba81ca8 radv: remove now unused SDMA helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448>
2025-11-17 11:29:24 +00:00
Samuel Pitoiset
a4e4f13c78 ac,radv: add ac_emit_sdma_copy_t2t_sub_window()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448>
2025-11-17 11:29:24 +00:00
Samuel Pitoiset
f5ecc5ffd5 ac,radv,radeonsi: add ac_emit_sdma_copy_tiled_sub_window()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448>
2025-11-17 11:29:24 +00:00
Samuel Pitoiset
5f8fa6ae03 ac,radv,radeonsi: add ac_emit_sdma_copy_linear_sub_window()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38448>
2025-11-17 11:29:23 +00:00
David Rosca
3858a6a696 radv/video: Fix coding allow_screen_content_tools and force_integer_mv
This was copied from radeonsi which expected seq_force_screen_content_tools = 2
and seq_force_integer_mv = 2.

Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38371>
2025-11-17 08:43:54 +00:00
Collabora's Gfx CI Team
c319cb627f Uprev ANGLE to 127a84404b88dbc4327ffb7f831a9a36c3b111bc
e9626fbced...127a84404b

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38102>
2025-11-17 08:07:36 +00:00
Samuel Pitoiset
9666bd1245 radv: remove unnecessary handling of SDMA in radv_cs_emit_write_event_eop()
This function is only called for GFX or ACE. SDMA uses are already
handled before.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38430>
2025-11-17 08:28:38 +01:00
Samuel Pitoiset
6413651bcf ac,radv,radeonsi: add ac_emit_sdma_copy_linear()
RadeonSI wasn't considering the undocumented HW limitation apparently.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38430>
2025-11-17 08:28:37 +01:00
Samuel Pitoiset
191bf7aba6 ac,radv: add ac_emit_sdma_constant_fill()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38430>
2025-11-17 08:25:32 +01:00
Julia Zhang
0007644913 amdgpu/virtio: unmap bo in destroy_host_blob
Unmap bo in destroy_host_blob when hb->cpu_addr is not NULL.
This avoids a memory leak caused by the bo refcount not being 0 when
amdvgpu_bo_free is called.

Signed-off-by: Julia Zhang <Julia.Zhang@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38440>
2025-11-17 05:35:31 +00:00
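The leak pattern fixed by commit 0007644913 can be sketched in C under one stated assumption: that a CPU mapping holds a reference on the BO until it is unmapped. All structures and functions below are hypothetical stand-ins and are not the real amdvgpu code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real amdvgpu structures. */
struct example_bo {
   int refcount;
};

struct example_host_blob {
   struct example_bo *bo;
   void *cpu_addr; /* non-NULL while a CPU mapping exists */
};

/* Assumption: creating a CPU mapping holds a reference on the BO. */
static inline void example_map(struct example_host_blob *hb, void *addr)
{
   assert(addr != NULL && hb->cpu_addr == NULL);
   hb->bo->refcount++;
   hb->cpu_addr = addr;
}

/* Releases the mapping and the reference it held. */
static inline void example_unmap(struct example_host_blob *hb)
{
   assert(hb->cpu_addr != NULL);
   hb->bo->refcount--;
   hb->cpu_addr = NULL;
}

/* Drop any outstanding mapping before the blob goes away, so the BO's
 * reference count can reach zero when it is freed later. */
static inline void example_destroy_host_blob(struct example_host_blob *hb)
{
   if (hb->cpu_addr != NULL)
      example_unmap(hb);
}
```

Without the check in example_destroy_host_blob(), a blob destroyed while still mapped would leave the reference count above zero, and the BO would never be released.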
Timur Kristóf
0d20bdbe2c ac: Improve description of some HW workarounds
Also add references to their counterparts in old PAL code.
This makes it easier to remember whether we mitigated the
same issues as PAL did.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38304>
2025-11-15 14:25:07 +01:00