Commit graph

19321 commits

Author SHA1 Message Date
Georg Lehmann
69b5767eee aco/optimizer: use new helpers to create v_fma_mixlo_f16
Foz-DB Navi21:
Totals from 69 (0.07% of 97591) affected shaders:
Instrs: 45091 -> 45057 (-0.08%)
CodeSize: 244016 -> 243932 (-0.03%); split: -0.12%, +0.09%
VGPRs: 1792 -> 1680 (-6.25%)
Latency: 133496 -> 133572 (+0.06%); split: -0.03%, +0.09%
InvThroughput: 35383 -> 35338 (-0.13%)
Copies: 4050 -> 4048 (-0.05%)
VALU: 30172 -> 30138 (-0.11%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:58 +00:00
Georg Lehmann
ee28801eae aco/optimizer: use new helpers to apply insert
Foz-DB Navi21:
Totals from 505 (0.52% of 97591) affected shaders:
Instrs: 1438254 -> 1436780 (-0.10%); split: -0.11%, +0.01%
CodeSize: 8063364 -> 8054192 (-0.11%); split: -0.13%, +0.01%
Latency: 18596788 -> 18597262 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 5213861 -> 5213061 (-0.02%); split: -0.02%, +0.01%
VClause: 37121 -> 37130 (+0.02%)
Copies: 174744 -> 175222 (+0.27%); split: -0.07%, +0.34%
Branches: 65722 -> 65718 (-0.01%)
VALU: 912967 -> 911074 (-0.21%); split: -0.21%, +0.00%
SALU: 251045 -> 251560 (+0.21%); split: -0.01%, +0.21%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:58 +00:00
Georg Lehmann
d60ce9ceef aco/optimizer: use new helpers to apply packed fsat
No Foz-DB changes.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:57 +00:00
Georg Lehmann
0a82c8cb13 aco/optimizer: back propagate modifiers through rcp
Foz-DB Navi21:
Totals from 5 (0.01% of 97591) affected shaders:
Instrs: 1473 -> 1468 (-0.34%)
CodeSize: 7664 -> 7660 (-0.05%)
Latency: 25897 -> 25863 (-0.13%)
InvThroughput: 2737 -> 2731 (-0.22%)
VALU: 1141 -> 1136 (-0.44%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:57 +00:00
Georg Lehmann
4442064449 aco/optimizer: use new helpers to apply neg/abs to output of instructions
Foz-DB Navi21:
Totals from 6765 (6.93% of 97591) affected shaders:
MaxWaves: 134398 -> 134408 (+0.01%)
Instrs: 9775725 -> 9768079 (-0.08%); split: -0.08%, +0.01%
CodeSize: 50785228 -> 50777880 (-0.01%); split: -0.02%, +0.01%
VGPRs: 445840 -> 445784 (-0.01%)
SpillSGPRs: 14483 -> 14476 (-0.05%)
Latency: 40232431 -> 40230284 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 10339051 -> 10329846 (-0.09%); split: -0.09%, +0.00%
VClause: 186785 -> 186788 (+0.00%); split: -0.01%, +0.01%
SClause: 157106 -> 157116 (+0.01%); split: -0.00%, +0.01%
Copies: 746817 -> 745378 (-0.19%); split: -0.26%, +0.07%
Branches: 189298 -> 189211 (-0.05%); split: -0.06%, +0.01%
PreSGPRs: 346169 -> 346158 (-0.00%)
PreVGPRs: 370712 -> 370660 (-0.01%); split: -0.02%, +0.00%
VALU: 6847295 -> 6839753 (-0.11%); split: -0.11%, +0.00%
SALU: 1139960 -> 1139942 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
58f407702d aco/optimizer: handle gfx11+ vinterp as fma special case
No effect on its own, but will be important for output modifiers.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
37d3c63a12 aco/optimizer: add new helpers for applying output modifiers
To replace the old instr_mod_labels.

Foz-DB Navi21:
Totals from 683 (0.70% of 97591) affected shaders:
Instrs: 3341288 -> 3340447 (-0.03%); split: -0.03%, +0.00%
CodeSize: 18522460 -> 18520212 (-0.01%); split: -0.01%, +0.00%
Latency: 34359519 -> 34358772 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9229621 -> 9229494 (-0.00%); split: -0.00%, +0.00%
Copies: 368383 -> 368260 (-0.03%); split: -0.04%, +0.00%
PreSGPRs: 48060 -> 48061 (+0.00%)
SALU: 543991 -> 543150 (-0.15%); split: -0.16%, +0.00%

Changes are caused by optimizing not(salu) without killed scc.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:56 +00:00
Georg Lehmann
fc29821d3b aco/optimizer: move med3 -> add_clamp opt later
Soon we will apply omod later,
when the combine_instruction reaches the multiplication with constant.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658>
2025-11-29 08:27:55 +00:00
Georg Lehmann
39a61502e5 aco/opt_postRA: allow v_cmpx to clobber exec before nop split/create vector
Kind of ugly, but I really hate seeing this in every rt traversal loop:

image_bvh64_intersect_ray v[56:59], [v40, v41, v42, v47, v48, v49, v50, v51, v52, v53, v54, v55], s[44:47]
v_cmp_class_f32_e64 s57, 0xff800000, v12
s_and_b32 exec_lo, s57, exec_lo
s_cbranch_execz BB219

Foz-DB Navi21:
Totals from 3394 (3.48% of 97591) affected shaders:
Instrs: 9536259 -> 9533592 (-0.03%)
CodeSize: 51657072 -> 51640120 (-0.03%); split: -0.03%, +0.00%
Latency: 109493553 -> 109513317 (+0.02%); split: -0.01%, +0.02%
InvThroughput: 29125525 -> 29131876 (+0.02%); split: -0.00%, +0.02%
Copies: 815888 -> 818219 (+0.29%); split: -0.01%, +0.30%
Branches: 277451 -> 277449 (-0.00%)
SALU: 1217642 -> 1214976 (-0.22%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38697>
2025-11-29 08:02:24 +00:00
Marek Olšák
e6499fa73e nir/recompute_io_bases: move color input bases after all other inputs
This is related to the FS prolog.
It should have no effect on other drivers.

v2: make it optional via io_options

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38599>
2025-11-29 05:00:40 +00:00
Marek Olšák
fa0bea5ff8 nir: remove nir_io_add_const_offset_to_base
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_constant_folding does it now.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277>
2025-11-29 00:16:38 +00:00
Marek Olšák
21cdbfa223 ac,radv: move opt_vectorize_callback to common code
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
radeonsi will use it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>
2025-11-28 20:16:10 +00:00
Marek Olšák
2c9995a94f ac/nir: move aco_nir_op_supports_packed_math_16bit here
aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common
because tests don't link with ACO, so linking would fail, but we want
to move the nir_opt_vectorize callback here that uses it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603>
2025-11-28 20:16:10 +00:00
David Rosca
38090d5be0 radv/video: Drop casts from vk_find_struct*
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The macro itself does the cast.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521>
2025-11-28 15:35:26 +00:00
David Rosca
32a02720a8 radv/video: Init session and update rate control in ControlVideoCoding
This eliminates the last state we kept in encode video session.
Also fixes changing encode resolution without reset.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521>
2025-11-28 15:35:26 +00:00
David Rosca
a7fe0188d4 radv/video: Remove tile config and skip mode from video session state
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521>
2025-11-28 15:35:25 +00:00
David Rosca
5d0d00e5f8 radv/video: Use radv_enc_aligned_coded_extent for session params overrides
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521>
2025-11-28 15:35:25 +00:00
David Rosca
0fc4ead36f radv/video: Remove enc_session from video session state
It was only used to store aligned picture size. Add helper
function to get the aligned size and use it when needed.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521>
2025-11-28 15:35:25 +00:00
Samuel Pitoiset
c3420ca932 Revert "radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3"
This reverts commit 0391902eb5.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38711>
2025-11-28 15:34:53 +01:00
Samuel Pitoiset
92a468f8f2 ci: uprev vkd3d
vkd3d-proton had an issue with its runner and few tests were excluded
by accident.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652>
2025-11-28 11:44:28 +00:00
Samuel Pitoiset
0391902eb5 radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3
Only very old MEC firmwares are concerned, so let's remove it and
disable mesh shaders with those firmwares.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38691>
2025-11-28 10:21:30 +00:00
Samuel Pitoiset
5fd7af9e42 ac/surface: do not use tile swizzle for replayable/aliased FMASK surfaces
Otherwise the VA might change.

Fixes: 2bbc7d1db6 ("radv: move more surf_index logic to use_tile_swizzle")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38696>
2025-11-28 07:39:33 +00:00
Yonggang Luo
0a32d5e6fd treewide: Use regexp to replace usage of setenv with os_set_option.
setenv\((.*), 1\);
=>
os_set_option($1, true);

setenv\((.*), 0\);
=>
os_set_option($1, false);

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>
2025-11-27 18:22:34 +00:00
Yonggang Luo
1825715623 treewide: Use regexp to replace usage of unsetenv with os_unset_option.
unsetenv\((.*)\);
=>
os_unset_option($1);

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>
2025-11-27 18:22:33 +00:00
Yonggang Luo
d277dfdd76 treewide: Replace the usage of setenv manually and #include "util/os_misc.h" when needed
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640>
2025-11-27 18:22:33 +00:00
Samuel Pitoiset
930cab7702 radv: fix fbfetch output with ESO
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This fixes a real issue when ESO uses fbfetch output because this
was determined after instead of before.

This solution isn't the most elegant one but binding graphics shaders
earlier would require more work. Let's just handle this specific corner
case for now.

This fixes
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.shader_objects.fragment_region*
on some GPUs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>
2025-11-26 17:47:07 +00:00
Samuel Pitoiset
6569acbdf2 radv: make sure to reset uses_fbfetch_output for NULL fragment shaders
To prevent useless decompression passes if a previously bound FS was
using fbfetch output.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617>
2025-11-26 17:47:07 +00:00
Timur Kristóf
29dff2fd75 radv: Check RADV_PERFTEST=sparse for image formats and sparse queue
Without this, we will report some image formats as unsupported
and the dedicated sparse binding queue won't work
when sparse support is enabled using RADV_PERFTEST=sparse

Fixes: dd90c76cea12 ("radv: Advertise sparse features pre Polaris with perftest flag")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38676>
2025-11-26 13:12:27 +00:00
Samuel Pitoiset
f14e0d9f09 radv: add radv_hide_rebar_on_dgpu and enable for Red Dead Redemption 2
RDR2 VRAM memory management when resizable BAR is enabled seems
incorrect because it keeps allocating VRAM without freeing anything.

This introduces a drirc option to emulate a fake carveout of 256MiB to
workaround this game bug. This also adjust memory budgets by
distributing it between visible and invisible because AMDGPU reports
the same value for both when REBAR is enabled.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12091
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38627>
2025-11-26 10:12:45 +00:00
Samuel Pitoiset
9cca79d8f8 radv: fix resetting descriptor pool since the new descriptor sets allocator
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
RADV uses low VAs.

This fixes rendering issues and eventually GPU hangs with Detroit.

Fixes: 849d41dbf8 ("radv: implement a new descriptor sets allocator")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38650>
2025-11-26 09:09:13 +00:00
Marek Olšák
d9d3f6703c ac,winsys/amdgpu: report why ac_query_gpu_info failed
only these case were not reporting anything

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602>
2025-11-25 21:17:35 +00:00
Marek Olšák
1c3e7e4ca0 ac: document RELEASE_MEM limitation with PS_DONE/CS_DONE on gfx6-11
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602>
2025-11-25 21:17:35 +00:00
Benjamin Cheng
6aabc3d5d2 ac/parse_ib: Implement VCN dec message parsing
This makes the IB dumps more useful for decode, as most of the actual
decode command is within the message buffers.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38631>
2025-11-25 19:17:12 +00:00
Natalie Vock
b7f011e653 radv/rt: Correctly copy culling flags when updating to separate AS
This was missing and led to the field being uninitialized.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488>
2025-11-25 15:25:21 +00:00
Natalie Vock
bc1eea90b9 radv/rt: Keep updated nodes always active
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has been long removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488>
2025-11-25 15:25:21 +00:00
Georg Lehmann
f5eb3fe9cb aco/optimizer: optimze cndmask(a, b, not(c)) to cndmask(b, a, c)
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Can happen with nir_op_bitz/b2f/b2i.

Foz-DB Navi48:
Totals from 3465 (4.20% of 82419) affected shaders:
Instrs: 7534077 -> 7527637 (-0.09%); split: -0.09%, +0.01%
CodeSize: 40017384 -> 39993008 (-0.06%); split: -0.07%, +0.01%
Latency: 38593071 -> 38582815 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 8519291 -> 8518620 (-0.01%); split: -0.01%, +0.00%
VClause: 151669 -> 151662 (-0.00%); split: -0.02%, +0.02%
SClause: 155781 -> 155772 (-0.01%); split: -0.01%, +0.01%
Copies: 628453 -> 628531 (+0.01%); split: -0.01%, +0.02%
Branches: 180429 -> 180430 (+0.00%)
PreSGPRs: 182855 -> 182801 (-0.03%)
VALU: 4315173 -> 4315241 (+0.00%); split: -0.00%, +0.00%
SALU: 992125 -> 986876 (-0.53%); split: -0.53%, +0.00%
VOPD: 15827 -> 15838 (+0.07%); split: +0.23%, -0.16%

Foz-DB Navi21:
Totals from 3341 (4.06% of 82387) affected shaders:
MaxWaves: 61924 -> 61950 (+0.04%)
Instrs: 6640276 -> 6635078 (-0.08%); split: -0.08%, +0.00%
CodeSize: 35932788 -> 35913760 (-0.05%); split: -0.06%, +0.00%
VGPRs: 205512 -> 205456 (-0.03%)
Latency: 40201463 -> 40194285 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 12379144 -> 12378028 (-0.01%); split: -0.01%, +0.00%
VClause: 151556 -> 151563 (+0.00%); split: -0.01%, +0.01%
SClause: 157470 -> 157472 (+0.00%); split: -0.00%, +0.01%
Copies: 645034 -> 644947 (-0.01%); split: -0.02%, +0.01%
Branches: 192070 -> 192071 (+0.00%)
PreSGPRs: 173368 -> 173311 (-0.03%)
VALU: 4554790 -> 4554782 (-0.00%); split: -0.00%, +0.00%
SALU: 881251 -> 876087 (-0.59%); split: -0.59%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:19 +00:00
Georg Lehmann
752f1fb4ae aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))
The next commit will optimize b2f(not(a)) and b2i(not(a)),
so handle those in other patterns to prevent regressions.

No Foz-DB changes on its own.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:19 +00:00
Georg Lehmann
c538f47f03 aco/optimizer: create ff0/bcnt0
Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
Instrs: 350 -> 347 (-0.86%)
CodeSize: 1800 -> 1788 (-0.67%)
Latency: 2427 -> 2421 (-0.25%)
SALU: 80 -> 77 (-3.75%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:18 +00:00
Georg Lehmann
0f7a1ce23e aco/optimizer: some more mul opts
Foz-DB Navi48:
Totals from 1650 (2.00% of 82419) affected shaders:
Instrs: 975716 -> 970609 (-0.52%); split: -0.53%, +0.00%
CodeSize: 4986260 -> 4982916 (-0.07%); split: -0.09%, +0.02%
Latency: 2795394 -> 2793211 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 620892 -> 620914 (+0.00%); split: -0.00%, +0.01%
VClause: 18773 -> 18729 (-0.23%)
SClause: 13219 -> 13218 (-0.01%)
Copies: 53619 -> 53620 (+0.00%); split: -0.01%, +0.01%
VALU: 592094 -> 592096 (+0.00%); split: -0.00%, +0.00%
SALU: 96586 -> 93532 (-3.16%); split: -3.17%, +0.00%

Foz-DB Navi21:
Totals from 1647 (2.00% of 82387) affected shaders:
Instrs: 1104100 -> 1100149 (-0.36%); split: -0.36%, +0.00%
CodeSize: 5631092 -> 5637668 (+0.12%); split: -0.00%, +0.12%
Latency: 3503029 -> 3501621 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 1088494 -> 1088495 (+0.00%); split: -0.00%, +0.00%
VClause: 20898 -> 20885 (-0.06%)
Copies: 72641 -> 72635 (-0.01%); split: -0.02%, +0.01%
VALU: 725593 -> 725592 (-0.00%); split: -0.00%, +0.00%
SALU: 139046 -> 135175 (-2.78%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:17 +00:00
Georg Lehmann
92dbf42379 aco/optimizer: use cndmask for neg(b2i)
Foz-DB Navi48:
Totals from 1310 (1.59% of 82419) affected shaders:
Instrs: 1337622 -> 1338677 (+0.08%); split: -0.00%, +0.08%
CodeSize: 7039828 -> 7043996 (+0.06%); split: -0.00%, +0.06%
Latency: 7783135 -> 7782526 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 1587987 -> 1586644 (-0.08%)
Branches: 24320 -> 24318 (-0.01%)

Foz-DB Navi21:
Totals from 334 (0.41% of 82387) affected shaders:
Instrs: 666102 -> 666094 (-0.00%)
CodeSize: 3599748 -> 3599724 (-0.00%)
Latency: 6873870 -> 6873868 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2151773 -> 2151780 (+0.00%); split: -0.00%, +0.00%
Branches: 17419 -> 17411 (-0.05%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:17 +00:00
Georg Lehmann
0e4d4aeef7 aco/optimizer: add some bitop combining
Foz-DB Navi48:
Totals from 53 (0.06% of 82419) affected shaders:
Instrs: 172843 -> 172769 (-0.04%); split: -0.06%, +0.01%
CodeSize: 937308 -> 936924 (-0.04%); split: -0.04%, +0.00%
Latency: 454652 -> 454823 (+0.04%); split: -0.01%, +0.05%
InvThroughput: 89833 -> 89812 (-0.02%); split: -0.06%, +0.03%
PreSGPRs: 2926 -> 2929 (+0.10%)
PreVGPRs: 2920 -> 2919 (-0.03%); split: -0.07%, +0.03%
VALU: 76638 -> 76556 (-0.11%)
SALU: 37856 -> 37859 (+0.01%); split: -0.01%, +0.01%
VOPD: 10943 -> 10936 (-0.06%)

Foz-DB Navi21:
Totals from 59 (0.07% of 82387) affected shaders:
Instrs: 1047744 -> 1047578 (-0.02%)
CodeSize: 5641948 -> 5640780 (-0.02%)
Latency: 5116816 -> 5116957 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 1274035 -> 1274023 (-0.00%); split: -0.00%, +0.00%
VClause: 30744 -> 30745 (+0.00%)
PreSGPRs: 3329 -> 3333 (+0.12%)
PreVGPRs: 4130 -> 4129 (-0.02%); split: -0.05%, +0.02%
VALU: 689731 -> 689562 (-0.02%)
SALU: 162830 -> 162833 (+0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:16 +00:00
Georg Lehmann
ee0354e0f1 aco/optimizer: use new helpers for bitwise n2 opts
Foz-DB Navi48:
Totals from 604 (0.73% of 82419) affected shaders:
Instrs: 2759878 -> 2758431 (-0.05%); split: -0.06%, +0.01%
CodeSize: 14801888 -> 14793412 (-0.06%); split: -0.06%, +0.01%
SpillSGPRs: 6237 -> 6233 (-0.06%)
Latency: 23509766 -> 23507853 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 7471297 -> 7471008 (-0.00%); split: -0.00%, +0.00%
Branches: 104979 -> 104977 (-0.00%)
PreSGPRs: 51506 -> 51408 (-0.19%); split: -0.20%, +0.01%
VALU: 1351564 -> 1351561 (-0.00%); split: -0.00%, +0.00%
SALU: 537430 -> 536266 (-0.22%); split: -0.23%, +0.01%
VOPD: 3834 -> 3833 (-0.03%)

Foz-DB Navi21:
Totals from 739 (0.90% of 82387) affected shaders:
Instrs: 2489644 -> 2488228 (-0.06%); split: -0.06%, +0.00%
CodeSize: 13930192 -> 13915972 (-0.10%); split: -0.11%, +0.00%
SpillSGPRs: 980 -> 976 (-0.41%)
Latency: 25027553 -> 25027845 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 8591377 -> 8591097 (-0.00%); split: -0.00%, +0.00%
SClause: 78380 -> 78382 (+0.00%)
Copies: 275433 -> 275393 (-0.01%); split: -0.02%, +0.01%
Branches: 113718 -> 113716 (-0.00%)
PreSGPRs: 48377 -> 48260 (-0.24%); split: -0.27%, +0.03%
VALU: 1589250 -> 1589240 (-0.00%)
SALU: 420348 -> 418962 (-0.33%); split: -0.34%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:15 +00:00
Georg Lehmann
758fe79ad5 aco/optimizer: use new helpers for v_sub opts
Foz-DB Navi48:
Totals from 1315 (1.60% of 82419) affected shaders:
Instrs: 1339446 -> 1339428 (-0.00%)
CodeSize: 7049636 -> 7049596 (-0.00%)
Latency: 7790708 -> 7790698 (-0.00%)
InvThroughput: 1588815 -> 1588807 (-0.00%)
VALU: 826831 -> 826821 (-0.00%)

Foz-DB Navi21:
Totals from 344 (0.42% of 82387) affected shaders:
Instrs: 692048 -> 692040 (-0.00%); split: -0.00%, +0.00%
Latency: 6987086 -> 6987066 (-0.00%)
InvThroughput: 2174789 -> 2174762 (-0.00%)
Copies: 57845 -> 57850 (+0.01%)
VALU: 475761 -> 475748 (-0.00%)
SALU: 93692 -> 93697 (+0.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:14 +00:00
Georg Lehmann
e42be7536c aco/optimizer: use new helpers for remaining add opts
Foz-DB Navi48:
Totals from 373 (0.45% of 82419) affected shaders:
Instrs: 542269 -> 542186 (-0.02%); split: -0.06%, +0.04%
CodeSize: 2872728 -> 2867204 (-0.19%); split: -0.21%, +0.02%
Latency: 3174435 -> 3174634 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 828783 -> 828600 (-0.02%); split: -0.03%, +0.01%
SClause: 11954 -> 11955 (+0.01%)
Copies: 49104 -> 49110 (+0.01%)
PreSGPRs: 15422 -> 15420 (-0.01%)
VALU: 262635 -> 262641 (+0.00%)

Foz-DB Navi21:
Totals from 426 (0.52% of 82387) affected shaders:
Instrs: 624744 -> 624754 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3382728 -> 3385664 (+0.09%); split: -0.00%, +0.09%
Latency: 3841693 -> 3842101 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 1132036 -> 1132065 (+0.00%); split: -0.00%, +0.00%
VClause: 14008 -> 14011 (+0.02%)
Copies: 73104 -> 73114 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 19504 -> 19502 (-0.01%)
SALU: 131431 -> 131443 (+0.01%)

Foz-DB Polaris10:
Totals from 812 (1.31% of 61894) affected shaders:
Instrs: 610178 -> 609219 (-0.16%); split: -0.21%, +0.05%
CodeSize: 3142404 -> 3147304 (+0.16%); split: -0.02%, +0.17%
VGPRs: 38380 -> 38376 (-0.01%)
Latency: 8312085 -> 8307755 (-0.05%); split: -0.12%, +0.07%
InvThroughput: 3929970 -> 3924631 (-0.14%); split: -0.15%, +0.01%
VClause: 15714 -> 15632 (-0.52%); split: -0.67%, +0.15%
SClause: 14509 -> 14510 (+0.01%); split: -0.02%, +0.03%
Copies: 70197 -> 70388 (+0.27%); split: -0.61%, +0.89%
PreSGPRs: 26409 -> 26404 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 30448 -> 30436 (-0.04%)
VALU: 408184 -> 407068 (-0.27%); split: -0.29%, +0.01%
SALU: 95726 -> 95959 (+0.24%); split: -0.30%, +0.54%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:13 +00:00
Georg Lehmann
adc55b1a1e aco/optimizer: use new helpers for v_and opt
Foz-DB Navi48:
Totals from 465 (0.56% of 82419) affected shaders:
Instrs: 372721 -> 372083 (-0.17%); split: -0.18%, +0.01%
CodeSize: 2004568 -> 2003332 (-0.06%)
Latency: 3664162 -> 3660745 (-0.09%); split: -0.10%, +0.00%
InvThroughput: 892042 -> 890994 (-0.12%); split: -0.12%, +0.01%
Copies: 35552 -> 35549 (-0.01%)
VALU: 171781 -> 171333 (-0.26%); split: -0.28%, +0.02%
SALU: 87946 -> 87949 (+0.00%)
VOPD: 48 -> 49 (+2.08%)

Foz-DB Navi21:
Totals from 191 (0.23% of 82387) affected shaders:
Instrs: 139340 -> 139178 (-0.12%); split: -0.13%, +0.02%
CodeSize: 798660 -> 798284 (-0.05%)
Latency: 1672750 -> 1673194 (+0.03%); split: -0.06%, +0.08%
InvThroughput: 634847 -> 634651 (-0.03%); split: -0.06%, +0.03%
Copies: 16372 -> 16366 (-0.04%); split: -0.04%, +0.01%
VALU: 79668 -> 79506 (-0.20%); split: -0.23%, +0.03%
SALU: 38233 -> 38236 (+0.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:13 +00:00
Georg Lehmann
7bc6d8e2ad aco/optimizer: add more v_add_lshl_u32 opts
No Foz-DB changes on Navi21.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:12 +00:00
Georg Lehmann
6a1caabd64 aco/optimizer: use new helpers for v_add_lshl_u32
Foz-DB Navi48:
Totals from 357 (0.43% of 82419) affected shaders:
Instrs: 244419 -> 243608 (-0.33%); split: -0.34%, +0.01%
CodeSize: 1302584 -> 1304188 (+0.12%); split: -0.00%, +0.13%
VGPRs: 21240 -> 21216 (-0.11%)
Latency: 1226165 -> 1225651 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 162432 -> 161940 (-0.30%); split: -0.30%, +0.00%
Copies: 16607 -> 16610 (+0.02%)
PreSGPRs: 14082 -> 14135 (+0.38%)
PreVGPRs: 15917 -> 15914 (-0.02%)
VALU: 136308 -> 135699 (-0.45%)
SALU: 24415 -> 24418 (+0.01%)
VOPD: 333 -> 334 (+0.30%)

Foz-DB Navi21:
Totals from 319 (0.39% of 82387) affected shaders:
Instrs: 255434 -> 254831 (-0.24%)
CodeSize: 1375792 -> 1378164 (+0.17%)
VGPRs: 15360 -> 15344 (-0.10%)
Latency: 1405956 -> 1405181 (-0.06%)
InvThroughput: 174402 -> 173816 (-0.34%)
Copies: 25892 -> 25891 (-0.00%)
PreSGPRs: 14129 -> 14132 (+0.02%)
PreVGPRs: 12457 -> 12454 (-0.02%)
VALU: 139630 -> 139032 (-0.43%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:12 +00:00
Georg Lehmann
7108dac637 aco/optimizer: use new helpers for s_lshl<n>_add_u32
Foz-DB Navi48:
Totals from 7654 (9.29% of 82419) affected shaders:
Instrs: 6170479 -> 6174536 (+0.07%); split: -0.07%, +0.13%
CodeSize: 32489580 -> 32500100 (+0.03%); split: -0.07%, +0.10%
SpillSGPRs: 4253 -> 4224 (-0.68%); split: -0.71%, +0.02%
Latency: 60472662 -> 60489681 (+0.03%); split: -0.02%, +0.04%
InvThroughput: 9218099 -> 9218149 (+0.00%); split: -0.01%, +0.01%
VClause: 121094 -> 121089 (-0.00%); split: -0.01%, +0.00%
SClause: 178092 -> 179830 (+0.98%); split: -0.55%, +1.53%
Copies: 424495 -> 423756 (-0.17%); split: -0.57%, +0.40%
Branches: 120352 -> 120353 (+0.00%); split: -0.01%, +0.01%
PreSGPRs: 334391 -> 333381 (-0.30%); split: -0.33%, +0.02%
VALU: 3349394 -> 3349323 (-0.00%); split: -0.00%, +0.00%
SALU: 957913 -> 957149 (-0.08%); split: -0.25%, +0.17%
VOPD: 9177 -> 9179 (+0.02%); split: +0.03%, -0.01%

Foz-DB Navi21:
Totals from 7649 (9.28% of 82387) affected shaders:
Instrs: 6144605 -> 6143005 (-0.03%); split: -0.06%, +0.04%
CodeSize: 32685976 -> 32672380 (-0.04%); split: -0.08%, +0.04%
SpillSGPRs: 3079 -> 3067 (-0.39%); split: -0.42%, +0.03%
Latency: 64979945 -> 65002741 (+0.04%); split: -0.02%, +0.05%
InvThroughput: 14754398 -> 14754230 (-0.00%); split: -0.01%, +0.01%
VClause: 132336 -> 132357 (+0.02%); split: -0.02%, +0.03%
SClause: 190229 -> 191340 (+0.58%); split: -1.01%, +1.60%
Copies: 511915 -> 511287 (-0.12%); split: -0.44%, +0.32%
Branches: 157156 -> 157154 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 345761 -> 344826 (-0.27%); split: -0.33%, +0.05%
VALU: 3856887 -> 3856928 (+0.00%); split: -0.01%, +0.01%
SALU: 1001190 -> 1000362 (-0.08%); split: -0.22%, +0.14%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:10 +00:00
Georg Lehmann
d9919c3e10 aco/optimizer: optimize add(mad_u32_u16(a, b, 0), c)
Foz-DB Navi48:
Totals from 104 (0.13% of 82419) affected shaders:
Instrs: 3554243 -> 3553555 (-0.02%); split: -0.02%, +0.00%
CodeSize: 18836004 -> 18830572 (-0.03%); split: -0.03%, +0.00%
Latency: 19288034 -> 19287208 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 3527510 -> 3526925 (-0.02%); split: -0.02%, +0.00%
VClause: 89526 -> 89522 (-0.00%); split: -0.02%, +0.01%
SClause: 62484 -> 62492 (+0.01%); split: -0.00%, +0.01%
Copies: 266415 -> 266404 (-0.00%); split: -0.04%, +0.03%
Branches: 102123 -> 102125 (+0.00%)
VALU: 1987067 -> 1986531 (-0.03%); split: -0.03%, +0.00%
SALU: 471348 -> 471346 (-0.00%); split: -0.00%, +0.00%

Foz-DB Navi21:
Totals from 228 (0.28% of 82387) affected shaders:
Instrs: 3069693 -> 3068317 (-0.04%); split: -0.05%, +0.00%
CodeSize: 16582476 -> 16574920 (-0.05%); split: -0.05%, +0.00%
Latency: 20038755 -> 20030986 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 4742546 -> 4738245 (-0.09%); split: -0.10%, +0.00%
VClause: 93157 -> 93135 (-0.02%); split: -0.03%, +0.01%
Copies: 265019 -> 264959 (-0.02%); split: -0.04%, +0.02%
VALU: 2025352 -> 2023897 (-0.07%); split: -0.07%, +0.00%
SALU: 447385 -> 447375 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:10 +00:00
Georg Lehmann
0359c8a901 aco/optimizer: use new helpers for v_add_u32 opts
Foz-DB Navi48:
Totals from 1554 (1.89% of 82419) affected shaders:
Instrs: 5154325 -> 5151499 (-0.05%); split: -0.08%, +0.02%
CodeSize: 27310012 -> 27318708 (+0.03%); split: -0.01%, +0.05%
VGPRs: 97236 -> 97200 (-0.04%); split: -0.05%, +0.01%
Latency: 34121873 -> 34120894 (-0.00%); split: -0.02%, +0.01%
InvThroughput: 6735276 -> 6730418 (-0.07%); split: -0.08%, +0.01%
VClause: 130106 -> 130090 (-0.01%); split: -0.05%, +0.04%
SClause: 90439 -> 90449 (+0.01%); split: -0.00%, +0.01%
Copies: 382920 -> 382401 (-0.14%); split: -0.18%, +0.05%
Branches: 130089 -> 130091 (+0.00%)
PreSGPRs: 67745 -> 67743 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 72710 -> 72674 (-0.05%)
VALU: 2941866 -> 2938129 (-0.13%); split: -0.13%, +0.00%
SALU: 651032 -> 651779 (+0.11%); split: -0.02%, +0.14%
VOPD: 2446 -> 2393 (-2.17%); split: +0.70%, -2.86%

Foz-DB Navi21:
Totals from 1534 (1.86% of 82387) affected shaders:
MaxWaves: 32481 -> 32479 (-0.01%)
Instrs: 4732755 -> 4730039 (-0.06%); split: -0.06%, +0.00%
CodeSize: 25305728 -> 25313148 (+0.03%); split: -0.00%, +0.03%
VGPRs: 84424 -> 84448 (+0.03%)
SpillVGPRs: 2420 -> 2419 (-0.04%)
Scratch: 180224 -> 179200 (-0.57%)
Latency: 36843383 -> 36846269 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 9252495 -> 9238142 (-0.16%); split: -0.17%, +0.02%
VClause: 146629 -> 146671 (+0.03%); split: -0.02%, +0.05%
SClause: 94502 -> 94512 (+0.01%); split: -0.00%, +0.01%
Copies: 403672 -> 403592 (-0.02%); split: -0.09%, +0.07%
Branches: 141145 -> 141137 (-0.01%)
PreSGPRs: 70003 -> 70001 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70835 -> 70800 (-0.05%)
VALU: 3114513 -> 3111338 (-0.10%); split: -0.10%, +0.00%
SALU: 651177 -> 651925 (+0.11%); split: -0.02%, +0.13%
VMEM: 271263 -> 271261 (-0.00%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530>
2025-11-25 11:49:09 +00:00