Commit graph

17520 commits

Author SHA1 Message Date
Georg Lehmann
906b7dbcec aco: replace novalidateir with novalidate debug option
The next commits will add more validation that's enabled by default.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:25 +00:00
Georg Lehmann
1540db244b aco/optimizer: store parent_instr for all temps
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:25 +00:00
Georg Lehmann
918359b41e aco/optimizer: add semantic aliases for info.instr
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:25 +00:00
Georg Lehmann
c62d7e680c aco/optimizer: remove label_mul
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
f773860a23 aco/optimizer: remove label_bitwise
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
cf3ec4a28f aco/optimizer: remove label_split
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
907e86e8fb aco/optimizer: remove label_vec
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
2c0a924521 aco/optimizer: remove label_minmax
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
dca8a7981d aco/optimizer: remove label_f2f32
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
17a973c6fa aco/optimizer: remove label_dpp8 and label_dpp16
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:24 +00:00
Georg Lehmann
dfa7e56f23 aco/optimizer: remove label_add_sub
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:23 +00:00
Georg Lehmann
345bf8a2f2 aco/optimizer: remove label_vop3p
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:23 +00:00
Georg Lehmann
6667ee66d5 aco/optimizer: remove label_vopc
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858>
2025-05-09 14:23:23 +00:00
Georg Lehmann
6f4e26e54d radv/gfx12+: enable VK_KHR_shader_bfloat16
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
GFX11 seems to have precision issues, so don't enable the extension there for now.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
a2209547db ac/nir: enable nir_op_bfdot2_bfadd
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
44be05cc45 ac/llvm: support nir_op_bfdot2_bfadd
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
f5a5905e37 aco: support nir_op_bfdot2_bfadd
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
f364303084 ac/nir: set lower_bfloat16_conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Georg Lehmann
7716e63cd6 radv/nir/lower_cmat: handle bf16 conversions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
78524837c1 radv/nir/opt_cmat: support bfloat16
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
5ca98bf99e aco: support bf16 wmma
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Georg Lehmann
e8f5c335ff radv,aco,nir: keep the A and B base type for cmat_muladd_amd
With bfloat16, and the two fp8 formats in the future, using just the bit size
to identify the types is no longer possible.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:25 +00:00
Konstantin Seurer
c21e1776b3 radv: Use build flags instead of defines
Using the meta framework makes managing shader variants much easier.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Konstantin Seurer
33ac143779 vulkan: Introduce VK_BUILD_FLAG for specializing BVH build shaders
The advantage of using spec constants is that we do not have to include
multiple spirv binaries for multiple variants of a build stage.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594>
2025-05-09 09:55:32 +00:00
Samuel Pitoiset
dae45adc9d aco: adjust an assertion in select_trap_handler_shader()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
ae6d3df139 radv,aco: dump more SQ_WAVE registers from the trap handler on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
0e73c85424 radv: fix configuring TRAP_PRESENT for compute shaders on GFX12
It no longer exists and it's been replaced by DYNAMIC_VGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
50a01a6559 radv: fix save/restore SCC in the trap handler on GFX12
SQ_WAVE_STATE_PRIV now contains SCC instead of SQ_WAVE_STATUS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
effa563bb0 radv: adjust computing the PC from the trap handler on GFX12
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840>
2025-05-09 09:04:57 +00:00
Valentine Burley
34012d5af3 ci: Remove EXTERNAL_KERNEL_TAG variable
The EXTERNAL_KERNEL_TAG variable is no longer needed.
For LAVA and bare-metal, we can override the KERNEL_TAG variable to fetch
both the kernel image and modules from a different tag than the default
mainline gfx-ci/linux kernel.

For LAVA, this also avoids the issue where jobs using EXTERNAL_KERNEL_TAG
would still have mainline kernel modules downloaded by the LAVA overlay.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34873>
2025-05-09 07:18:53 +00:00
Samuel Pitoiset
4b76d04f7f radv: ignore conditional rendering with vkCmdTraceRays*
CmdTraceRays is neither a dispatch or a draw command which means it
shouldn't be affected by conditional rendering.

Fixes recent VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34868>
2025-05-08 19:08:10 +00:00
Samuel Pitoiset
b7d2cdd2b4 radv: ignore radv_disable_dcc_stores on GFX12
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's not necessary because DCC is completely transparent to the
userspace driver. Also it's causing issues with scanout.

This fixes rendering issues with scanout in Indiana Jones.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12924
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34859>
2025-05-08 17:17:28 +02:00
Rhys Perry
2704a30df0 radv: perform nir_opt_access before the first radv_optimize_nir
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Two lowered loads might not be CSE'd after nir_lower_explicit_io if one of
them is shrinked. This doesn't happen for deref loads, but it needs the
CAN_REORDER flag first.

fossil-db (gfx1201):
Totals from 556 (0.70% of 79377) affected shaders:
MaxWaves: 14936 -> 14940 (+0.03%); split: +0.05%, -0.03%
Instrs: 2140334 -> 2140942 (+0.03%); split: -0.07%, +0.10%
CodeSize: 11137948 -> 11145416 (+0.07%); split: -0.07%, +0.13%
SpillSGPRs: 2385 -> 2527 (+5.95%); split: -0.34%, +6.29%
Latency: 12310570 -> 12305011 (-0.05%); split: -0.08%, +0.04%
InvThroughput: 2136142 -> 2135516 (-0.03%); split: -0.06%, +0.03%
VClause: 47419 -> 47420 (+0.00%); split: -0.01%, +0.01%
SClause: 58423 -> 58290 (-0.23%); split: -0.36%, +0.14%
Copies: 160626 -> 161321 (+0.43%); split: -0.25%, +0.68%
Branches: 69693 -> 69710 (+0.02%); split: -0.04%, +0.06%
PreSGPRs: 34824 -> 34945 (+0.35%); split: -0.24%, +0.58%
PreVGPRs: 28682 -> 28649 (-0.12%); split: -0.36%, +0.24%
VALU: 1080800 -> 1081171 (+0.03%); split: -0.04%, +0.08%
SALU: 353112 -> 353770 (+0.19%); split: -0.15%, +0.34%
SMEM: 81587 -> 81364 (-0.27%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
18a53230eb aco: don't check dst_bitsize in apply_load_extract
I don't think this is necessary.

fossil-db (gfx1201):
Totals from 12 (0.02% of 79377) affected shaders:
Instrs: 73041 -> 72669 (-0.51%); split: -0.51%, +0.00%
CodeSize: 417376 -> 413852 (-0.84%); split: -0.85%, +0.00%
Latency: 1301862 -> 1301533 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 599874 -> 599723 (-0.03%)
VClause: 1344 -> 1346 (+0.15%)
Copies: 15855 -> 15832 (-0.15%); split: -0.37%, +0.23%
VALU: 42138 -> 41883 (-0.61%); split: -0.61%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
eb95f7cc0e aco: support sign extension in apply_load_extract
fossil-db (gfx1201):
Totals from 10 (0.01% of 79377) affected shaders:
Instrs: 28954 -> 28938 (-0.06%)
CodeSize: 164552 -> 164472 (-0.05%)
Latency: 1249341 -> 1247037 (-0.18%)
InvThroughput: 297077 -> 296618 (-0.15%)
VALU: 15951 -> 15941 (-0.06%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
0de0fd38b4 aco: support more opcodes in apply_ds_extract
fossil-db (gfx1201):
Totals from 320 (0.40% of 79377) affected shaders:
Instrs: 3439754 -> 3432384 (-0.21%)
CodeSize: 18008696 -> 17973180 (-0.20%); split: -0.20%, +0.00%
VGPRs: 16016 -> 15404 (-3.82%)
Latency: 20246168 -> 20295740 (+0.24%); split: -0.08%, +0.33%
InvThroughput: 4462916 -> 4478546 (+0.35%); split: -0.08%, +0.43%
VClause: 87123 -> 87099 (-0.03%)
Copies: 261779 -> 261948 (+0.06%); split: -0.05%, +0.12%
Branches: 94611 -> 94601 (-0.01%); split: -0.01%, +0.00%
VALU: 1870695 -> 1865738 (-0.26%)
SALU: 488351 -> 487557 (-0.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
3b42626973 ac/nir: allow 8/16-bit smem loads
fossil-db (gfx1201):
Totals from 295 (0.37% of 79377) affected shaders:
Instrs: 314018 -> 313355 (-0.21%); split: -0.22%, +0.00%
CodeSize: 1697996 -> 1696528 (-0.09%); split: -0.11%, +0.02%
Latency: 4197719 -> 4197106 (-0.01%)
InvThroughput: 1258891 -> 1258744 (-0.01%)
PreSGPRs: 12232 -> 12230 (-0.02%)
SALU: 66762 -> 66269 (-0.74%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
5b116c4de9 ac/nir: allow vectorization of unsupported 8/16-bit loads
These can later be lowered to a vectorized 32-bit load.

fossil-db (gfx1201):
Totals from 1259 (1.59% of 79377) affected shaders:
MaxWaves: 36821 -> 36817 (-0.01%)
Instrs: 4363702 -> 4355749 (-0.18%); split: -0.23%, +0.05%
CodeSize: 22779980 -> 22706504 (-0.32%); split: -0.37%, +0.05%
VGPRs: 69672 -> 69792 (+0.17%); split: -0.02%, +0.19%
SpillSGPRs: 675 -> 673 (-0.30%)
Latency: 26684053 -> 26663819 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 5617687 -> 5614798 (-0.05%); split: -0.10%, +0.04%
VClause: 106830 -> 106654 (-0.16%); split: -0.17%, +0.00%
SClause: 75523 -> 75495 (-0.04%); split: -0.04%, +0.01%
Copies: 323199 -> 323525 (+0.10%); split: -0.10%, +0.20%
Branches: 109475 -> 109480 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 55036 -> 55040 (+0.01%)
PreVGPRs: 47538 -> 47582 (+0.09%); split: -0.12%, +0.21%
VALU: 2377777 -> 2389977 (+0.51%); split: -0.02%, +0.53%
SALU: 578272 -> 578385 (+0.02%); split: -0.02%, +0.04%
VMEM: 190065 -> 181204 (-4.66%)
SMEM: 99709 -> 99565 (-0.14%)
VOPD: 244 -> 243 (-0.41%); split: +0.41%, -0.82%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
6dbf44ad9c ac/nir: allow less than one register of overfetch
This is to allow vectorization of 8/16-bit loads, which can later be
cheaply lowered to a 32-bit load.

fossil-db (gfx1201):
Totals from 178 (0.22% of 79377) affected shaders:
MaxWaves: 4138 -> 4102 (-0.87%)
Instrs: 619714 -> 617917 (-0.29%); split: -0.32%, +0.03%
CodeSize: 3364396 -> 3352724 (-0.35%); split: -0.38%, +0.03%
VGPRs: 12896 -> 12980 (+0.65%); split: -0.19%, +0.84%
SpillSGPRs: 546 -> 545 (-0.18%)
Latency: 7589585 -> 7406076 (-2.42%); split: -2.45%, +0.04%
InvThroughput: 1926356 -> 1879866 (-2.41%); split: -2.42%, +0.00%
VClause: 12301 -> 11750 (-4.48%)
SClause: 13614 -> 13583 (-0.23%); split: -0.45%, +0.22%
Copies: 82207 -> 82265 (+0.07%); split: -0.10%, +0.17%
Branches: 19284 -> 19266 (-0.09%)
PreSGPRs: 9525 -> 9457 (-0.71%)
PreVGPRs: 12366 -> 12421 (+0.44%)
VALU: 347928 -> 348020 (+0.03%); split: -0.01%, +0.04%
SALU: 82620 -> 82519 (-0.12%); split: -0.19%, +0.07%
VMEM: 22248 -> 21430 (-3.68%)
SMEM: 17951 -> 17843 (-0.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
ddef4bddf8 ac/nir: round components when lowering 8/16-bit loads to 32-bit
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
4fa1c92862 aco/gfx12: allow 8/16-bit smem loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
75efc218f5 aco: support 8/16-bit loads in smem_combine()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
8abb787c6b radv/gfx12: use dword3 smem loads for push constants
fossil-db (gfx1201):
Totals from 5 (0.01% of 79377) affected shaders:
(no affected stats)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
13b0131edc aco/gfx12: select dwordx3 smem loads
fossil-db (gfx1201):
Totals from 47814 (60.24% of 79377) affected shaders:
MaxWaves: 1436871 -> 1436855 (-0.00%); split: +0.00%, -0.00%
Instrs: 36653621 -> 36649461 (-0.01%); split: -0.07%, +0.06%
CodeSize: 194102884 -> 194076060 (-0.01%); split: -0.06%, +0.04%
VGPRs: 2267944 -> 2269648 (+0.08%); split: -0.01%, +0.08%
SpillSGPRs: 8301 -> 8295 (-0.07%); split: -0.08%, +0.01%
Latency: 249627561 -> 249631829 (+0.00%); split: -0.04%, +0.04%
InvThroughput: 40004042 -> 40003575 (-0.00%); split: -0.02%, +0.02%
VClause: 680488 -> 680429 (-0.01%); split: -0.08%, +0.07%
SClause: 1062835 -> 1066206 (+0.32%); split: -0.20%, +0.52%
Copies: 2393981 -> 2393607 (-0.02%); split: -0.23%, +0.22%
Branches: 728117 -> 728113 (-0.00%); split: -0.00%, +0.00%
VALU: 20358269 -> 20358585 (+0.00%); split: -0.01%, +0.01%
SALU: 4737317 -> 4736411 (-0.02%); split: -0.07%, +0.06%
SMEM: 1712349 -> 1710075 (-0.13%); split: -0.13%, +0.00%
VOPD: 5808 -> 5813 (+0.09%); split: +0.12%, -0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
90a5c93ea5 aco: prepare for dwordx3 smem loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
Rhys Perry
208d62430f aco/gfx12: use s_load_dwordx3 to load ray launch sizes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:49 +00:00
Rhys Perry
cbd718506b aco: add smem opcode helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:49 +00:00
Jesse.Zhang
d8624e6a79 winsys/amdgpu: Add support for queue priority in Mesa
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This patch adds support for queue priority levels in Mesa's AMDGPU winsys layer.
The changes include:

1. Updated ac_drm_create_userqueue() to accept and pass through flags parameter
2. Modified amdgpu_userq_init() to use the flags when creating queues
3. Added flags field to amdgpu_userq struct to store priority settings
4. Updated header definitions to match kernel UAPI changes

This aligns with the kernel changes provided by Alex:
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122782.html
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122780.html
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122786.html

v2: We only need 1 normal priority queue and 1 TMZ normal priority queue.(Marek Olšák)
v3: Simplified to only support normal priority queues
v4: use a local variable instead of being in struct amdgpu_userq.(Marek Olšák)
v5: rebase the latest main branch.

Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34568>
2025-05-08 04:29:29 +00:00
Marek Olšák
870d17012a ac: adjust maximum HS workgroup size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This has no effect on triangles because max 64 patches implied max 192
threads, but it improves performance for cases when the number of threads
per patch is > 3.

This improves the score for gfxbench5 "gl_tess_off" (offscreen) by 11%
on Navi48.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>
2025-05-08 02:54:13 +00:00
Marek Olšák
b960137ebf aco: remove unused aco_shader_info::tcs_offchip_layout
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863>
2025-05-08 02:54:13 +00:00