Georg Lehmann
906b7dbcec
aco: replace novalidateir with novalidate debug option
...
The next commits will add more validation that's enabled by default.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:25 +00:00
Georg Lehmann
1540db244b
aco/optimizer: store parent_instr for all temps
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:25 +00:00
Georg Lehmann
918359b41e
aco/optimizer: add semantic aliases for info.instr
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:25 +00:00
Georg Lehmann
c62d7e680c
aco/optimizer: remove label_mul
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
f773860a23
aco/optimizer: remove label_bitwise
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
cf3ec4a28f
aco/optimizer: remove label_split
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
907e86e8fb
aco/optimizer: remove label_vec
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
2c0a924521
aco/optimizer: remove label_minmax
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
dca8a7981d
aco/optimizer: remove label_f2f32
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
17a973c6fa
aco/optimizer: remove label_dpp8 and label_dpp16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:24 +00:00
Georg Lehmann
dfa7e56f23
aco/optimizer: remove label_add_sub
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:23 +00:00
Georg Lehmann
345bf8a2f2
aco/optimizer: remove label_vop3p
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:23 +00:00
Georg Lehmann
6667ee66d5
aco/optimizer: remove label_vopc
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34858 >
2025-05-09 14:23:23 +00:00
Georg Lehmann
6f4e26e54d
radv/gfx12+: enable VK_KHR_shader_bfloat16
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
GFX11 seems to have precision issues, so don't enable the extension there for now.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:26 +00:00
Georg Lehmann
a2209547db
ac/nir: enable nir_op_bfdot2_bfadd
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:26 +00:00
Georg Lehmann
44be05cc45
ac/llvm: support nir_op_bfdot2_bfadd
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:26 +00:00
Georg Lehmann
f5a5905e37
aco: support nir_op_bfdot2_bfadd
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:26 +00:00
Georg Lehmann
f364303084
ac/nir: set lower_bfloat16_conversions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:26 +00:00
Georg Lehmann
7716e63cd6
radv/nir/lower_cmat: handle bf16 conversions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:25 +00:00
Georg Lehmann
78524837c1
radv/nir/opt_cmat: support bfloat16
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:25 +00:00
Georg Lehmann
5ca98bf99e
aco: support bf16 wmma
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:25 +00:00
Georg Lehmann
e8f5c335ff
radv,aco,nir: keep the A and B base type for cmat_muladd_amd
...
With bfloat16, and the two fp8 formats in the future, using just the bit size
to identify the types is no longer possible.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768 >
2025-05-09 11:20:25 +00:00
Konstantin Seurer
c21e1776b3
radv: Use build flags instead of defines
...
Using the meta framework makes managing shader variants much easier.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594 >
2025-05-09 09:55:32 +00:00
Konstantin Seurer
33ac143779
vulkan: Introduce VK_BUILD_FLAG for specializing BVH build shaders
...
The advantage of using spec constants is that we do not have to include
multiple spirv binaries for multiple variants of a build stage.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34594 >
2025-05-09 09:55:32 +00:00
Samuel Pitoiset
dae45adc9d
aco: adjust an assertion in select_trap_handler_shader()
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840 >
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
ae6d3df139
radv,aco: dump more SQ_WAVE registers from the trap handler on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840 >
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
0e73c85424
radv: fix configuring TRAP_PRESENT for compute shaders on GFX12
...
It no longer exists and it's been replaced by DYNAMIC_VGPR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840 >
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
50a01a6559
radv: fix save/restore SCC in the trap handler on GFX12
...
SQ_WAVE_STATE_PRIV now contains SCC instead of SQ_WAVE_STATUS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840 >
2025-05-09 09:04:57 +00:00
Samuel Pitoiset
effa563bb0
radv: adjust computing the PC from the trap handler on GFX12
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34840 >
2025-05-09 09:04:57 +00:00
Valentine Burley
34012d5af3
ci: Remove EXTERNAL_KERNEL_TAG variable
...
The EXTERNAL_KERNEL_TAG variable is no longer needed.
For LAVA and bare-metal, we can override the KERNEL_TAG variable to fetch
both the kernel image and modules from a different tag than the default
mainline gfx-ci/linux kernel.
For LAVA, this also avoids the issue where jobs using EXTERNAL_KERNEL_TAG
would still have mainline kernel modules downloaded by the LAVA overlay.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34873 >
2025-05-09 07:18:53 +00:00
Samuel Pitoiset
4b76d04f7f
radv: ignore conditional rendering with vkCmdTraceRays*
...
CmdTraceRays is neither a dispatch or a draw command which means it
shouldn't be affected by conditional rendering.
Fixes recent VKCTS coverage.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34868 >
2025-05-08 19:08:10 +00:00
Samuel Pitoiset
b7d2cdd2b4
radv: ignore radv_disable_dcc_stores on GFX12
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
It's not necessary because DCC is completely transparent to the
userspace driver. Also it's causing issues with scanout.
This fixes rendering issues with scanout in Indiana Jones.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12924
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34859 >
2025-05-08 17:17:28 +02:00
Rhys Perry
2704a30df0
radv: perform nir_opt_access before the first radv_optimize_nir
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Two lowered loads might not be CSE'd after nir_lower_explicit_io if one of
them is shrinked. This doesn't happen for deref loads, but it needs the
CAN_REORDER flag first.
fossil-db (gfx1201):
Totals from 556 (0.70% of 79377) affected shaders:
MaxWaves: 14936 -> 14940 (+0.03%); split: +0.05%, -0.03%
Instrs: 2140334 -> 2140942 (+0.03%); split: -0.07%, +0.10%
CodeSize: 11137948 -> 11145416 (+0.07%); split: -0.07%, +0.13%
SpillSGPRs: 2385 -> 2527 (+5.95%); split: -0.34%, +6.29%
Latency: 12310570 -> 12305011 (-0.05%); split: -0.08%, +0.04%
InvThroughput: 2136142 -> 2135516 (-0.03%); split: -0.06%, +0.03%
VClause: 47419 -> 47420 (+0.00%); split: -0.01%, +0.01%
SClause: 58423 -> 58290 (-0.23%); split: -0.36%, +0.14%
Copies: 160626 -> 161321 (+0.43%); split: -0.25%, +0.68%
Branches: 69693 -> 69710 (+0.02%); split: -0.04%, +0.06%
PreSGPRs: 34824 -> 34945 (+0.35%); split: -0.24%, +0.58%
PreVGPRs: 28682 -> 28649 (-0.12%); split: -0.36%, +0.24%
VALU: 1080800 -> 1081171 (+0.03%); split: -0.04%, +0.08%
SALU: 353112 -> 353770 (+0.19%); split: -0.15%, +0.34%
SMEM: 81587 -> 81364 (-0.27%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
18a53230eb
aco: don't check dst_bitsize in apply_load_extract
...
I don't think this is necessary.
fossil-db (gfx1201):
Totals from 12 (0.02% of 79377) affected shaders:
Instrs: 73041 -> 72669 (-0.51%); split: -0.51%, +0.00%
CodeSize: 417376 -> 413852 (-0.84%); split: -0.85%, +0.00%
Latency: 1301862 -> 1301533 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 599874 -> 599723 (-0.03%)
VClause: 1344 -> 1346 (+0.15%)
Copies: 15855 -> 15832 (-0.15%); split: -0.37%, +0.23%
VALU: 42138 -> 41883 (-0.61%); split: -0.61%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
eb95f7cc0e
aco: support sign extension in apply_load_extract
...
fossil-db (gfx1201):
Totals from 10 (0.01% of 79377) affected shaders:
Instrs: 28954 -> 28938 (-0.06%)
CodeSize: 164552 -> 164472 (-0.05%)
Latency: 1249341 -> 1247037 (-0.18%)
InvThroughput: 297077 -> 296618 (-0.15%)
VALU: 15951 -> 15941 (-0.06%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
0de0fd38b4
aco: support more opcodes in apply_ds_extract
...
fossil-db (gfx1201):
Totals from 320 (0.40% of 79377) affected shaders:
Instrs: 3439754 -> 3432384 (-0.21%)
CodeSize: 18008696 -> 17973180 (-0.20%); split: -0.20%, +0.00%
VGPRs: 16016 -> 15404 (-3.82%)
Latency: 20246168 -> 20295740 (+0.24%); split: -0.08%, +0.33%
InvThroughput: 4462916 -> 4478546 (+0.35%); split: -0.08%, +0.43%
VClause: 87123 -> 87099 (-0.03%)
Copies: 261779 -> 261948 (+0.06%); split: -0.05%, +0.12%
Branches: 94611 -> 94601 (-0.01%); split: -0.01%, +0.00%
VALU: 1870695 -> 1865738 (-0.26%)
SALU: 488351 -> 487557 (-0.16%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
3b42626973
ac/nir: allow 8/16-bit smem loads
...
fossil-db (gfx1201):
Totals from 295 (0.37% of 79377) affected shaders:
Instrs: 314018 -> 313355 (-0.21%); split: -0.22%, +0.00%
CodeSize: 1697996 -> 1696528 (-0.09%); split: -0.11%, +0.02%
Latency: 4197719 -> 4197106 (-0.01%)
InvThroughput: 1258891 -> 1258744 (-0.01%)
PreSGPRs: 12232 -> 12230 (-0.02%)
SALU: 66762 -> 66269 (-0.74%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
5b116c4de9
ac/nir: allow vectorization of unsupported 8/16-bit loads
...
These can later be lowered to a vectorized 32-bit load.
fossil-db (gfx1201):
Totals from 1259 (1.59% of 79377) affected shaders:
MaxWaves: 36821 -> 36817 (-0.01%)
Instrs: 4363702 -> 4355749 (-0.18%); split: -0.23%, +0.05%
CodeSize: 22779980 -> 22706504 (-0.32%); split: -0.37%, +0.05%
VGPRs: 69672 -> 69792 (+0.17%); split: -0.02%, +0.19%
SpillSGPRs: 675 -> 673 (-0.30%)
Latency: 26684053 -> 26663819 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 5617687 -> 5614798 (-0.05%); split: -0.10%, +0.04%
VClause: 106830 -> 106654 (-0.16%); split: -0.17%, +0.00%
SClause: 75523 -> 75495 (-0.04%); split: -0.04%, +0.01%
Copies: 323199 -> 323525 (+0.10%); split: -0.10%, +0.20%
Branches: 109475 -> 109480 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 55036 -> 55040 (+0.01%)
PreVGPRs: 47538 -> 47582 (+0.09%); split: -0.12%, +0.21%
VALU: 2377777 -> 2389977 (+0.51%); split: -0.02%, +0.53%
SALU: 578272 -> 578385 (+0.02%); split: -0.02%, +0.04%
VMEM: 190065 -> 181204 (-4.66%)
SMEM: 99709 -> 99565 (-0.14%)
VOPD: 244 -> 243 (-0.41%); split: +0.41%, -0.82%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
6dbf44ad9c
ac/nir: allow less than one register of overfetch
...
This is to allow vectorization of 8/16-bit loads, which can later be
cheaply lowered to a 32-bit load.
fossil-db (gfx1201):
Totals from 178 (0.22% of 79377) affected shaders:
MaxWaves: 4138 -> 4102 (-0.87%)
Instrs: 619714 -> 617917 (-0.29%); split: -0.32%, +0.03%
CodeSize: 3364396 -> 3352724 (-0.35%); split: -0.38%, +0.03%
VGPRs: 12896 -> 12980 (+0.65%); split: -0.19%, +0.84%
SpillSGPRs: 546 -> 545 (-0.18%)
Latency: 7589585 -> 7406076 (-2.42%); split: -2.45%, +0.04%
InvThroughput: 1926356 -> 1879866 (-2.41%); split: -2.42%, +0.00%
VClause: 12301 -> 11750 (-4.48%)
SClause: 13614 -> 13583 (-0.23%); split: -0.45%, +0.22%
Copies: 82207 -> 82265 (+0.07%); split: -0.10%, +0.17%
Branches: 19284 -> 19266 (-0.09%)
PreSGPRs: 9525 -> 9457 (-0.71%)
PreVGPRs: 12366 -> 12421 (+0.44%)
VALU: 347928 -> 348020 (+0.03%); split: -0.01%, +0.04%
SALU: 82620 -> 82519 (-0.12%); split: -0.19%, +0.07%
VMEM: 22248 -> 21430 (-3.68%)
SMEM: 17951 -> 17843 (-0.60%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
ddef4bddf8
ac/nir: round components when lowering 8/16-bit loads to 32-bit
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
4fa1c92862
aco/gfx12: allow 8/16-bit smem loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
75efc218f5
aco: support 8/16-bit loads in smem_combine()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
8abb787c6b
radv/gfx12: use dword3 smem loads for push constants
...
fossil-db (gfx1201):
Totals from 5 (0.01% of 79377) affected shaders:
(no affected stats)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
13b0131edc
aco/gfx12: select dwordx3 smem loads
...
fossil-db (gfx1201):
Totals from 47814 (60.24% of 79377) affected shaders:
MaxWaves: 1436871 -> 1436855 (-0.00%); split: +0.00%, -0.00%
Instrs: 36653621 -> 36649461 (-0.01%); split: -0.07%, +0.06%
CodeSize: 194102884 -> 194076060 (-0.01%); split: -0.06%, +0.04%
VGPRs: 2267944 -> 2269648 (+0.08%); split: -0.01%, +0.08%
SpillSGPRs: 8301 -> 8295 (-0.07%); split: -0.08%, +0.01%
Latency: 249627561 -> 249631829 (+0.00%); split: -0.04%, +0.04%
InvThroughput: 40004042 -> 40003575 (-0.00%); split: -0.02%, +0.02%
VClause: 680488 -> 680429 (-0.01%); split: -0.08%, +0.07%
SClause: 1062835 -> 1066206 (+0.32%); split: -0.20%, +0.52%
Copies: 2393981 -> 2393607 (-0.02%); split: -0.23%, +0.22%
Branches: 728117 -> 728113 (-0.00%); split: -0.00%, +0.00%
VALU: 20358269 -> 20358585 (+0.00%); split: -0.01%, +0.01%
SALU: 4737317 -> 4736411 (-0.02%); split: -0.07%, +0.06%
SMEM: 1712349 -> 1710075 (-0.13%); split: -0.13%, +0.00%
VOPD: 5808 -> 5813 (+0.09%); split: +0.12%, -0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
90a5c93ea5
aco: prepare for dwordx3 smem loads
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:50 +00:00
Rhys Perry
208d62430f
aco/gfx12: use s_load_dwordx3 to load ray launch sizes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:49 +00:00
Rhys Perry
cbd718506b
aco: add smem opcode helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162 >
2025-05-08 13:30:49 +00:00
Jesse.Zhang
d8624e6a79
winsys/amdgpu: Add support for queue priority in Mesa
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This patch adds support for queue priority levels in Mesa's AMDGPU winsys layer.
The changes include:
1. Updated ac_drm_create_userqueue() to accept and pass through flags parameter
2. Modified amdgpu_userq_init() to use the flags when creating queues
3. Added flags field to amdgpu_userq struct to store priority settings
4. Updated header definitions to match kernel UAPI changes
This aligns with the kernel changes provided by Alex:
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122782.html
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122780.html
https://lists.freedesktop.org/archives/amd-gfx/2025-April/122786.html
v2: We only need 1 normal priority queue and 1 TMZ normal priority queue.(Marek Olšák)
v3: Simplified to only support normal priority queues
v4: use a local variable instead of being in struct amdgpu_userq.(Marek Olšák)
v5: rebase the latest main branch.
Signed-off-by: Jesse.Zhang <Jesse.zhang@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34568 >
2025-05-08 04:29:29 +00:00
Marek Olšák
870d17012a
ac: adjust maximum HS workgroup size
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This has no effect on triangles because max 64 patches implied max 192
threads, but it improves performance for cases when the number of threads
per patch is > 3.
This improves the score for gfxbench5 "gl_tess_off" (offscreen) by 11%
on Navi48.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863 >
2025-05-08 02:54:13 +00:00
Marek Olšák
b960137ebf
aco: remove unused aco_shader_info::tcs_offchip_layout
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34863 >
2025-05-08 02:54:13 +00:00