Georg Lehmann
69b5767eee
aco/optimizer: use new helpers to create v_fma_mixlo_f16
...
Foz-DB Navi21:
Totals from 69 (0.07% of 97591) affected shaders:
Instrs: 45091 -> 45057 (-0.08%)
CodeSize: 244016 -> 243932 (-0.03%); split: -0.12%, +0.09%
VGPRs: 1792 -> 1680 (-6.25%)
Latency: 133496 -> 133572 (+0.06%); split: -0.03%, +0.09%
InvThroughput: 35383 -> 35338 (-0.13%)
Copies: 4050 -> 4048 (-0.05%)
VALU: 30172 -> 30138 (-0.11%)
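The percentages in these Foz-DB totals are relative changes, (after - before) / before. A minimal Python sketch for checking a totals line follows. `parse_stat` and `relative_change` are hypothetical helper names, not part of Mesa or of any Foz-DB tooling, and the line format is assumed from the totals shown here.

```python
import re

# Matches the leading "Name: before -> after (+x.xx%)" part of a totals line.
# Any trailing "; split: ..." text is ignored.
STAT_RE = re.compile(r"^(\w+): (\d+) -> (\d+) \(([+-]\d+\.\d+)%\)")


def parse_stat(line):
    """Return (name, before, after, reported_percent) or None if not a stat line."""
    m = STAT_RE.match(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), int(m.group(3)), float(m.group(4))


def relative_change(before, after):
    """Relative change in percent, the quantity the totals appear to report."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for line in ("Instrs: 45091 -> 45057 (-0.08%)",
                 "VGPRs: 1792 -> 1680 (-6.25%)"):
        name, before, after, reported = parse_stat(line)
        print(name, reported, round(relative_change(before, after), 2))
```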
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:58 +00:00
Georg Lehmann
ee28801eae
aco/optimizer: use new helpers to apply insert
...
Foz-DB Navi21:
Totals from 505 (0.52% of 97591) affected shaders:
Instrs: 1438254 -> 1436780 (-0.10%); split: -0.11%, +0.01%
CodeSize: 8063364 -> 8054192 (-0.11%); split: -0.13%, +0.01%
Latency: 18596788 -> 18597262 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 5213861 -> 5213061 (-0.02%); split: -0.02%, +0.01%
VClause: 37121 -> 37130 (+0.02%)
Copies: 174744 -> 175222 (+0.27%); split: -0.07%, +0.34%
Branches: 65722 -> 65718 (-0.01%)
VALU: 912967 -> 911074 (-0.21%); split: -0.21%, +0.00%
SALU: 251045 -> 251560 (+0.21%); split: -0.01%, +0.21%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:58 +00:00
Georg Lehmann
d60ce9ceef
aco/optimizer: use new helpers to apply packed fsat
...
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:57 +00:00
Georg Lehmann
0a82c8cb13
aco/optimizer: back propagate modifiers through rcp
...
Foz-DB Navi21:
Totals from 5 (0.01% of 97591) affected shaders:
Instrs: 1473 -> 1468 (-0.34%)
CodeSize: 7664 -> 7660 (-0.05%)
Latency: 25897 -> 25863 (-0.13%)
InvThroughput: 2737 -> 2731 (-0.22%)
VALU: 1141 -> 1136 (-0.44%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:57 +00:00
Georg Lehmann
4442064449
aco/optimizer: use new helpers to apply neg/abs to output of instructions
...
Foz-DB Navi21:
Totals from 6765 (6.93% of 97591) affected shaders:
MaxWaves: 134398 -> 134408 (+0.01%)
Instrs: 9775725 -> 9768079 (-0.08%); split: -0.08%, +0.01%
CodeSize: 50785228 -> 50777880 (-0.01%); split: -0.02%, +0.01%
VGPRs: 445840 -> 445784 (-0.01%)
SpillSGPRs: 14483 -> 14476 (-0.05%)
Latency: 40232431 -> 40230284 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 10339051 -> 10329846 (-0.09%); split: -0.09%, +0.00%
VClause: 186785 -> 186788 (+0.00%); split: -0.01%, +0.01%
SClause: 157106 -> 157116 (+0.01%); split: -0.00%, +0.01%
Copies: 746817 -> 745378 (-0.19%); split: -0.26%, +0.07%
Branches: 189298 -> 189211 (-0.05%); split: -0.06%, +0.01%
PreSGPRs: 346169 -> 346158 (-0.00%)
PreVGPRs: 370712 -> 370660 (-0.01%); split: -0.02%, +0.00%
VALU: 6847295 -> 6839753 (-0.11%); split: -0.11%, +0.00%
SALU: 1139960 -> 1139942 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:56 +00:00
Georg Lehmann
58f407702d
aco/optimizer: handle gfx11+ vinterp as fma special case
...
No effect on its own, but will be important for output modifiers.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:56 +00:00
Georg Lehmann
37d3c63a12
aco/optimizer: add new helpers for applying output modifiers
...
To replace the old instr_mod_labels.
Foz-DB Navi21:
Totals from 683 (0.70% of 97591) affected shaders:
Instrs: 3341288 -> 3340447 (-0.03%); split: -0.03%, +0.00%
CodeSize: 18522460 -> 18520212 (-0.01%); split: -0.01%, +0.00%
Latency: 34359519 -> 34358772 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9229621 -> 9229494 (-0.00%); split: -0.00%, +0.00%
Copies: 368383 -> 368260 (-0.03%); split: -0.04%, +0.00%
PreSGPRs: 48060 -> 48061 (+0.00%)
SALU: 543991 -> 543150 (-0.15%); split: -0.16%, +0.00%
Changes are caused by optimizing not(salu) without killed scc.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:56 +00:00
Georg Lehmann
fc29821d3b
aco/optimizer: move med3 -> add_clamp opt later
...
Soon we will apply omod later,
when combine_instruction reaches the multiplication with a constant.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38658 >
2025-11-29 08:27:55 +00:00
Georg Lehmann
39a61502e5
aco/opt_postRA: allow v_cmpx to clobber exec before nop split/create vector
...
Kind of ugly, but I really hate seeing this in every rt traversal loop:
image_bvh64_intersect_ray v[56:59], [v40, v41, v42, v47, v48, v49, v50, v51, v52, v53, v54, v55], s[44:47]
v_cmp_class_f32_e64 s57, 0xff800000, v12
s_and_b32 exec_lo, s57, exec_lo
s_cbranch_execz BB219
Foz-DB Navi21:
Totals from 3394 (3.48% of 97591) affected shaders:
Instrs: 9536259 -> 9533592 (-0.03%)
CodeSize: 51657072 -> 51640120 (-0.03%); split: -0.03%, +0.00%
Latency: 109493553 -> 109513317 (+0.02%); split: -0.01%, +0.02%
InvThroughput: 29125525 -> 29131876 (+0.02%); split: -0.00%, +0.02%
Copies: 815888 -> 818219 (+0.29%); split: -0.01%, +0.30%
Branches: 277451 -> 277449 (-0.00%)
SALU: 1217642 -> 1214976 (-0.22%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38697 >
2025-11-29 08:02:24 +00:00
Marek Olšák
e6499fa73e
nir/recompute_io_bases: move color input bases after all other inputs
...
This is related to the FS prolog.
It should have no effect on other drivers.
v2: make it optional via io_options
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38599 >
2025-11-29 05:00:40 +00:00
Marek Olšák
fa0bea5ff8
nir: remove nir_io_add_const_offset_to_base
...
nir_opt_constant_folding does it now.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38277 >
2025-11-29 00:16:38 +00:00
Marek Olšák
21cdbfa223
ac,radv: move opt_vectorize_callback to common code
...
radeonsi will use it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603 >
2025-11-28 20:16:10 +00:00
Marek Olšák
2c9995a94f
ac/nir: move aco_nir_op_supports_packed_math_16bit here
...
aco_nir_op_supports_packed_math_16bit currently can't be used by amd/common
because tests don't link with ACO, so linking would fail, but we want
to move the nir_opt_vectorize callback here that uses it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38603 >
2025-11-28 20:16:10 +00:00
David Rosca
38090d5be0
radv/video: Drop casts from vk_find_struct*
...
The macro itself does the cast.
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521 >
2025-11-28 15:35:26 +00:00
David Rosca
32a02720a8
radv/video: Init session and update rate control in ControlVideoCoding
...
This eliminates the last state we kept in encode video session.
Also fixes changing encode resolution without reset.
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521 >
2025-11-28 15:35:26 +00:00
David Rosca
a7fe0188d4
radv/video: Remove tile config and skip mode from video session state
...
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521 >
2025-11-28 15:35:25 +00:00
David Rosca
5d0d00e5f8
radv/video: Use radv_enc_aligned_coded_extent for session params overrides
...
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521 >
2025-11-28 15:35:25 +00:00
David Rosca
0fc4ead36f
radv/video: Remove enc_session from video session state
...
It was only used to store aligned picture size. Add helper
function to get the aligned size and use it when needed.
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38521 >
2025-11-28 15:35:25 +00:00
Samuel Pitoiset
c3420ca932
Revert "radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3"
...
This reverts commit 0391902eb5.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38711 >
2025-11-28 15:34:53 +01:00
Samuel Pitoiset
92a468f8f2
ci: uprev vkd3d
...
vkd3d-proton had an issue with its runner and a few tests were excluded
by accident.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38652 >
2025-11-28 11:44:28 +00:00
Samuel Pitoiset
0391902eb5
radv: remove the workaround for DISPATCH_TASKMESH_INDIRECT_MULTI_ACE on GFX10.3
...
Only very old MEC firmwares are concerned, so let's remove it and
disable mesh shaders with those firmwares.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38691 >
2025-11-28 10:21:30 +00:00
Samuel Pitoiset
5fd7af9e42
ac/surface: do not use tile swizzle for replayable/aliased FMASK surfaces
...
Otherwise the VA might change.
Fixes: 2bbc7d1db6 ("radv: move more surf_index logic to use_tile_swizzle")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38696 >
2025-11-28 07:39:33 +00:00
Yonggang Luo
0a32d5e6fd
treewide: Use regexp to replace usage of setenv with os_set_option.
...
setenv\((.*), 1\);
=>
os_set_option($1, true);
setenv\((.*), 0\);
=>
os_set_option($1, false);
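The two rewrite rules above can be tried out on sample lines with a short Python sketch. It only mirrors the regular expressions quoted in the commit message, with `$1` written as `\1` for Python's `re` module. The `rewrite` helper and the sample inputs are hypothetical, and this is not the script that was actually run on the tree.

```python
import re

# The two rules from the commit message: a literal last argument of 1 or 0
# becomes true or false, everything before it is kept as captured group 1.
RULES = [
    (re.compile(r"setenv\((.*), 1\);"), r"os_set_option(\1, true);"),
    (re.compile(r"setenv\((.*), 0\);"), r"os_set_option(\1, false);"),
]


def rewrite(source: str) -> str:
    """Apply both rules in order; text that matches neither is left unchanged."""
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source


if __name__ == "__main__":
    print(rewrite('setenv("MESA_DEBUG", "silent", 1);'))
    print(rewrite('setenv("X", "y", 0);'))
    # The last argument is not a literal 0 or 1, so neither rule applies.
    print(rewrite('setenv(name, value, overwrite);'))
```

Because `.*` is greedy and the pattern is not anchored, a rule like this assumes one call per line and would also match the `setenv(` inside other identifiers, so the result of such a bulk replacement still needs review.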
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640 >
2025-11-27 18:22:34 +00:00
Yonggang Luo
1825715623
treewide: Use regexp to replace usage of unsetenv with os_unset_option.
...
unsetenv\((.*)\);
=>
os_unset_option($1);
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640 >
2025-11-27 18:22:33 +00:00
Yonggang Luo
d277dfdd76
treewide: Replace the usage of setenv manually and #include "util/os_misc.h" when needed
...
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Antonio Ospite <antonio.ospite@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38640 >
2025-11-27 18:22:33 +00:00
Samuel Pitoiset
930cab7702
radv: fix fbfetch output with ESO
...
This fixes a real issue when ESO uses fbfetch output because this
was determined after instead of before.
This solution isn't the most elegant one but binding graphics shaders
earlier would require more work. Let's just handle this specific corner
case for now.
This fixes
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.custom_resolve.shader_objects.fragment_region*
on some GPUs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617 >
2025-11-26 17:47:07 +00:00
Samuel Pitoiset
6569acbdf2
radv: make sure to reset uses_fbfetch_output for NULL fragment shaders
...
To prevent useless decompression passes if a previously bound FS was
using fbfetch output.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38617 >
2025-11-26 17:47:07 +00:00
Timur Kristóf
29dff2fd75
radv: Check RADV_PERFTEST=sparse for image formats and sparse queue
...
Without this, we will report some image formats as unsupported
and the dedicated sparse binding queue won't work
when sparse support is enabled using RADV_PERFTEST=sparse.
Fixes: dd90c76cea12 ("radv: Advertise sparse features pre Polaris with perftest flag")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38676 >
2025-11-26 13:12:27 +00:00
Samuel Pitoiset
f14e0d9f09
radv: add radv_hide_rebar_on_dgpu and enable for Red Dead Redemption 2
...
RDR2 VRAM memory management when resizable BAR is enabled seems
incorrect because it keeps allocating VRAM without freeing anything.
This introduces a drirc option to emulate a fake carveout of 256MiB to
work around this game bug. This also adjusts memory budgets by
distributing it between visible and invisible because AMDGPU reports
the same value for both when REBAR is enabled.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12091
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38627 >
2025-11-26 10:12:45 +00:00
Samuel Pitoiset
9cca79d8f8
radv: fix resetting descriptor pool since the new descriptor sets allocator
...
RADV uses low VAs.
This fixes rendering issues and eventually GPU hangs with Detroit.
Fixes: 849d41dbf8 ("radv: implement a new descriptor sets allocator")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38650 >
2025-11-26 09:09:13 +00:00
Marek Olšák
d9d3f6703c
ac,winsys/amdgpu: report why ac_query_gpu_info failed
...
Only these cases were not reporting anything.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602 >
2025-11-25 21:17:35 +00:00
Marek Olšák
1c3e7e4ca0
ac: document RELEASE_MEM limitation with PS_DONE/CS_DONE on gfx6-11
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38602 >
2025-11-25 21:17:35 +00:00
Benjamin Cheng
6aabc3d5d2
ac/parse_ib: Implement VCN dec message parsing
...
This makes the IB dumps more useful for decode, as most of the actual
decode command is within the message buffers.
Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38631 >
2025-11-25 19:17:12 +00:00
Natalie Vock
b7f011e653
radv/rt: Correctly copy culling flags when updating to separate AS
...
This was missing and led to the field being uninitialized.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488 >
2025-11-25 15:25:21 +00:00
Natalie Vock
bc1eea90b9
radv/rt: Keep updated nodes always active
...
In updateable AS, we keep all nodes active even if they're
degenerate/NaN, because too many games ignore API rules about not
making inactive nodes active (and some vendor tips outright advise this
behavior). We also need to match this by keeping everything active in
the update side. The ALWAYS_ACTIVE macro has long been removed and
replaced by VK_BVH_BUILD_FLAG, too. Since updating only happens to
updateable AS, don't even check for the flag, just implement the
always-active handling.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38488 >
2025-11-25 15:25:21 +00:00
Georg Lehmann
f5eb3fe9cb
aco/optimizer: optimize cndmask(a, b, not(c)) to cndmask(b, a, c)
...
Can happen with nir_op_bitz/b2f/b2i.
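The rewrite in the title relies on the fact that inverting the condition of a select is the same as swapping its two data operands. A minimal Python check of that identity follows, assuming a per-lane model in which `cndmask` returns its second operand when the condition is set and its first otherwise. The function is a model for illustration, not ACO code.

```python
from itertools import product


def cndmask(src0, src1, cond):
    """Per-lane select model: src1 if cond is set, else src0 (assumed semantics)."""
    return src1 if cond else src0


def identity_holds():
    """Exhaustively compare cndmask(a, b, not(c)) with cndmask(b, a, c)."""
    for a, b, c in product(("A", "B"), ("A", "B"), (False, True)):
        if cndmask(a, b, not c) != cndmask(b, a, c):
            return False
    return True


if __name__ == "__main__":
    print(identity_holds())
```

Dropping the `not` removes one instruction that would otherwise be needed only to invert the mask.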
Foz-DB Navi48:
Totals from 3465 (4.20% of 82419) affected shaders:
Instrs: 7534077 -> 7527637 (-0.09%); split: -0.09%, +0.01%
CodeSize: 40017384 -> 39993008 (-0.06%); split: -0.07%, +0.01%
Latency: 38593071 -> 38582815 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 8519291 -> 8518620 (-0.01%); split: -0.01%, +0.00%
VClause: 151669 -> 151662 (-0.00%); split: -0.02%, +0.02%
SClause: 155781 -> 155772 (-0.01%); split: -0.01%, +0.01%
Copies: 628453 -> 628531 (+0.01%); split: -0.01%, +0.02%
Branches: 180429 -> 180430 (+0.00%)
PreSGPRs: 182855 -> 182801 (-0.03%)
VALU: 4315173 -> 4315241 (+0.00%); split: -0.00%, +0.00%
SALU: 992125 -> 986876 (-0.53%); split: -0.53%, +0.00%
VOPD: 15827 -> 15838 (+0.07%); split: +0.23%, -0.16%
Foz-DB Navi21:
Totals from 3341 (4.06% of 82387) affected shaders:
MaxWaves: 61924 -> 61950 (+0.04%)
Instrs: 6640276 -> 6635078 (-0.08%); split: -0.08%, +0.00%
CodeSize: 35932788 -> 35913760 (-0.05%); split: -0.06%, +0.00%
VGPRs: 205512 -> 205456 (-0.03%)
Latency: 40201463 -> 40194285 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 12379144 -> 12378028 (-0.01%); split: -0.01%, +0.00%
VClause: 151556 -> 151563 (+0.00%); split: -0.01%, +0.01%
SClause: 157470 -> 157472 (+0.00%); split: -0.00%, +0.01%
Copies: 645034 -> 644947 (-0.01%); split: -0.02%, +0.01%
Branches: 192070 -> 192071 (+0.00%)
PreSGPRs: 173368 -> 173311 (-0.03%)
VALU: 4554790 -> 4554782 (-0.00%); split: -0.00%, +0.00%
SALU: 881251 -> 876087 (-0.59%); split: -0.59%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:19 +00:00
Georg Lehmann
752f1fb4ae
aco/optimizer: extend existing patterns to handle b2f/b2i(not(a))
...
The next commit will optimize b2f(not(a)) and b2i(not(a)),
so handle those in other patterns to prevent regressions.
No Foz-DB changes on its own.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:19 +00:00
Georg Lehmann
c538f47f03
aco/optimizer: create ff0/bcnt0
...
Foz-DB Navi21:
Totals from 1 (0.00% of 82387) affected shaders:
Instrs: 350 -> 347 (-0.86%)
CodeSize: 1800 -> 1788 (-0.67%)
Latency: 2427 -> 2421 (-0.25%)
SALU: 80 -> 77 (-3.75%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:18 +00:00
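The title of c538f47f03 suggests that the optimizer now emits the find-first-zero and count-zero-bits forms. The bit identities that would make such a rewrite valid can be checked with a small Python sketch. Whether the pass matches exactly ff1(not(x)) and bcnt1(not(x)) is an assumption made here, and the helpers below are 32-bit models, not ACO code.

```python
MASK32 = 0xFFFFFFFF


def not32(x):
    """Bitwise not, truncated to 32 bits."""
    return ~x & MASK32


def bcnt1(x):
    """Number of set bits in the low 32 bits."""
    return bin(x & MASK32).count("1")


def bcnt0(x):
    """Number of clear bits in the low 32 bits."""
    return 32 - bcnt1(x)


def ff1(x):
    """Index of the lowest set bit, or -1 if no bit is set."""
    x &= MASK32
    return (x & -x).bit_length() - 1 if x else -1


def ff0(x):
    """Index of the lowest clear bit, or -1 if all 32 bits are set."""
    for i in range(32):
        if not (x >> i) & 1:
            return i
    return -1


SAMPLES = [0, 1, 0xFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF, 0x12345678]

if __name__ == "__main__":
    for x in SAMPLES:
        # One zero-counting or zero-finding op replaces a not plus the "1" form.
        assert ff1(not32(x)) == ff0(x)
        assert bcnt1(not32(x)) == bcnt0(x)
    print("identities hold on all samples")
```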
Georg Lehmann
0f7a1ce23e
aco/optimizer: some more mul opts
...
Foz-DB Navi48:
Totals from 1650 (2.00% of 82419) affected shaders:
Instrs: 975716 -> 970609 (-0.52%); split: -0.53%, +0.00%
CodeSize: 4986260 -> 4982916 (-0.07%); split: -0.09%, +0.02%
Latency: 2795394 -> 2793211 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 620892 -> 620914 (+0.00%); split: -0.00%, +0.01%
VClause: 18773 -> 18729 (-0.23%)
SClause: 13219 -> 13218 (-0.01%)
Copies: 53619 -> 53620 (+0.00%); split: -0.01%, +0.01%
VALU: 592094 -> 592096 (+0.00%); split: -0.00%, +0.00%
SALU: 96586 -> 93532 (-3.16%); split: -3.17%, +0.00%
Foz-DB Navi21:
Totals from 1647 (2.00% of 82387) affected shaders:
Instrs: 1104100 -> 1100149 (-0.36%); split: -0.36%, +0.00%
CodeSize: 5631092 -> 5637668 (+0.12%); split: -0.00%, +0.12%
Latency: 3503029 -> 3501621 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 1088494 -> 1088495 (+0.00%); split: -0.00%, +0.00%
VClause: 20898 -> 20885 (-0.06%)
Copies: 72641 -> 72635 (-0.01%); split: -0.02%, +0.01%
VALU: 725593 -> 725592 (-0.00%); split: -0.00%, +0.00%
SALU: 139046 -> 135175 (-2.78%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:17 +00:00
Georg Lehmann
92dbf42379
aco/optimizer: use cndmask for neg(b2i)
...
Foz-DB Navi48:
Totals from 1310 (1.59% of 82419) affected shaders:
Instrs: 1337622 -> 1338677 (+0.08%); split: -0.00%, +0.08%
CodeSize: 7039828 -> 7043996 (+0.06%); split: -0.00%, +0.06%
Latency: 7783135 -> 7782526 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 1587987 -> 1586644 (-0.08%)
Branches: 24320 -> 24318 (-0.01%)
Foz-DB Navi21:
Totals from 334 (0.41% of 82387) affected shaders:
Instrs: 666102 -> 666094 (-0.00%)
CodeSize: 3599748 -> 3599724 (-0.00%)
Latency: 6873870 -> 6873868 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2151773 -> 2151780 (+0.00%); split: -0.00%, +0.00%
Branches: 17419 -> 17411 (-0.05%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:17 +00:00
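For 92dbf42379, the underlying observation is that negating a boolean-to-integer conversion yields either 0 or all ones, which a single select between two constants can produce directly. A minimal Python sketch of that equivalence follows, assuming 32-bit wraparound and a `cndmask` model that returns its second operand when the condition is set. The helper names are illustrative models only.

```python
MASK32 = 0xFFFFFFFF


def b2i(cond):
    """Boolean to integer: 1 if cond is set, else 0."""
    return 1 if cond else 0


def neg32(x):
    """Two's complement negation in 32 bits."""
    return -x & MASK32


def cndmask(src0, src1, cond):
    """Select model: src1 if cond is set, else src0 (assumed semantics)."""
    return src1 if cond else src0


if __name__ == "__main__":
    for cond in (False, True):
        # neg(b2i(c)) is 0 or 0xFFFFFFFF, the same as selecting between them.
        assert neg32(b2i(cond)) == cndmask(0, MASK32, cond)
    print("neg(b2i(c)) == cndmask(0, -1, c)")
```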
Georg Lehmann
0e4d4aeef7
aco/optimizer: add some bitop combining
...
Foz-DB Navi48:
Totals from 53 (0.06% of 82419) affected shaders:
Instrs: 172843 -> 172769 (-0.04%); split: -0.06%, +0.01%
CodeSize: 937308 -> 936924 (-0.04%); split: -0.04%, +0.00%
Latency: 454652 -> 454823 (+0.04%); split: -0.01%, +0.05%
InvThroughput: 89833 -> 89812 (-0.02%); split: -0.06%, +0.03%
PreSGPRs: 2926 -> 2929 (+0.10%)
PreVGPRs: 2920 -> 2919 (-0.03%); split: -0.07%, +0.03%
VALU: 76638 -> 76556 (-0.11%)
SALU: 37856 -> 37859 (+0.01%); split: -0.01%, +0.01%
VOPD: 10943 -> 10936 (-0.06%)
Foz-DB Navi21:
Totals from 59 (0.07% of 82387) affected shaders:
Instrs: 1047744 -> 1047578 (-0.02%)
CodeSize: 5641948 -> 5640780 (-0.02%)
Latency: 5116816 -> 5116957 (+0.00%); split: -0.00%, +0.01%
InvThroughput: 1274035 -> 1274023 (-0.00%); split: -0.00%, +0.00%
VClause: 30744 -> 30745 (+0.00%)
PreSGPRs: 3329 -> 3333 (+0.12%)
PreVGPRs: 4130 -> 4129 (-0.02%); split: -0.05%, +0.02%
VALU: 689731 -> 689562 (-0.02%)
SALU: 162830 -> 162833 (+0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:16 +00:00
Georg Lehmann
ee0354e0f1
aco/optimizer: use new helpers for bitwise n2 opts
...
Foz-DB Navi48:
Totals from 604 (0.73% of 82419) affected shaders:
Instrs: 2759878 -> 2758431 (-0.05%); split: -0.06%, +0.01%
CodeSize: 14801888 -> 14793412 (-0.06%); split: -0.06%, +0.01%
SpillSGPRs: 6237 -> 6233 (-0.06%)
Latency: 23509766 -> 23507853 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 7471297 -> 7471008 (-0.00%); split: -0.00%, +0.00%
Branches: 104979 -> 104977 (-0.00%)
PreSGPRs: 51506 -> 51408 (-0.19%); split: -0.20%, +0.01%
VALU: 1351564 -> 1351561 (-0.00%); split: -0.00%, +0.00%
SALU: 537430 -> 536266 (-0.22%); split: -0.23%, +0.01%
VOPD: 3834 -> 3833 (-0.03%)
Foz-DB Navi21:
Totals from 739 (0.90% of 82387) affected shaders:
Instrs: 2489644 -> 2488228 (-0.06%); split: -0.06%, +0.00%
CodeSize: 13930192 -> 13915972 (-0.10%); split: -0.11%, +0.00%
SpillSGPRs: 980 -> 976 (-0.41%)
Latency: 25027553 -> 25027845 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 8591377 -> 8591097 (-0.00%); split: -0.00%, +0.00%
SClause: 78380 -> 78382 (+0.00%)
Copies: 275433 -> 275393 (-0.01%); split: -0.02%, +0.01%
Branches: 113718 -> 113716 (-0.00%)
PreSGPRs: 48377 -> 48260 (-0.24%); split: -0.27%, +0.03%
VALU: 1589250 -> 1589240 (-0.00%)
SALU: 420348 -> 418962 (-0.33%); split: -0.34%, +0.01%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:15 +00:00
Georg Lehmann
758fe79ad5
aco/optimizer: use new helpers for v_sub opts
...
Foz-DB Navi48:
Totals from 1315 (1.60% of 82419) affected shaders:
Instrs: 1339446 -> 1339428 (-0.00%)
CodeSize: 7049636 -> 7049596 (-0.00%)
Latency: 7790708 -> 7790698 (-0.00%)
InvThroughput: 1588815 -> 1588807 (-0.00%)
VALU: 826831 -> 826821 (-0.00%)
Foz-DB Navi21:
Totals from 344 (0.42% of 82387) affected shaders:
Instrs: 692048 -> 692040 (-0.00%); split: -0.00%, +0.00%
Latency: 6987086 -> 6987066 (-0.00%)
InvThroughput: 2174789 -> 2174762 (-0.00%)
Copies: 57845 -> 57850 (+0.01%)
VALU: 475761 -> 475748 (-0.00%)
SALU: 93692 -> 93697 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:14 +00:00
Georg Lehmann
e42be7536c
aco/optimizer: use new helpers for remaining add opts
...
Foz-DB Navi48:
Totals from 373 (0.45% of 82419) affected shaders:
Instrs: 542269 -> 542186 (-0.02%); split: -0.06%, +0.04%
CodeSize: 2872728 -> 2867204 (-0.19%); split: -0.21%, +0.02%
Latency: 3174435 -> 3174634 (+0.01%); split: -0.01%, +0.01%
InvThroughput: 828783 -> 828600 (-0.02%); split: -0.03%, +0.01%
SClause: 11954 -> 11955 (+0.01%)
Copies: 49104 -> 49110 (+0.01%)
PreSGPRs: 15422 -> 15420 (-0.01%)
VALU: 262635 -> 262641 (+0.00%)
Foz-DB Navi21:
Totals from 426 (0.52% of 82387) affected shaders:
Instrs: 624744 -> 624754 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3382728 -> 3385664 (+0.09%); split: -0.00%, +0.09%
Latency: 3841693 -> 3842101 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 1132036 -> 1132065 (+0.00%); split: -0.00%, +0.00%
VClause: 14008 -> 14011 (+0.02%)
Copies: 73104 -> 73114 (+0.01%); split: -0.00%, +0.02%
PreSGPRs: 19504 -> 19502 (-0.01%)
SALU: 131431 -> 131443 (+0.01%)
Foz-DB Polaris10:
Totals from 812 (1.31% of 61894) affected shaders:
Instrs: 610178 -> 609219 (-0.16%); split: -0.21%, +0.05%
CodeSize: 3142404 -> 3147304 (+0.16%); split: -0.02%, +0.17%
VGPRs: 38380 -> 38376 (-0.01%)
Latency: 8312085 -> 8307755 (-0.05%); split: -0.12%, +0.07%
InvThroughput: 3929970 -> 3924631 (-0.14%); split: -0.15%, +0.01%
VClause: 15714 -> 15632 (-0.52%); split: -0.67%, +0.15%
SClause: 14509 -> 14510 (+0.01%); split: -0.02%, +0.03%
Copies: 70197 -> 70388 (+0.27%); split: -0.61%, +0.89%
PreSGPRs: 26409 -> 26404 (-0.02%); split: -0.02%, +0.00%
PreVGPRs: 30448 -> 30436 (-0.04%)
VALU: 408184 -> 407068 (-0.27%); split: -0.29%, +0.01%
SALU: 95726 -> 95959 (+0.24%); split: -0.30%, +0.54%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:13 +00:00
Georg Lehmann
adc55b1a1e
aco/optimizer: use new helpers for v_and opt
...
Foz-DB Navi48:
Totals from 465 (0.56% of 82419) affected shaders:
Instrs: 372721 -> 372083 (-0.17%); split: -0.18%, +0.01%
CodeSize: 2004568 -> 2003332 (-0.06%)
Latency: 3664162 -> 3660745 (-0.09%); split: -0.10%, +0.00%
InvThroughput: 892042 -> 890994 (-0.12%); split: -0.12%, +0.01%
Copies: 35552 -> 35549 (-0.01%)
VALU: 171781 -> 171333 (-0.26%); split: -0.28%, +0.02%
SALU: 87946 -> 87949 (+0.00%)
VOPD: 48 -> 49 (+2.08%)
Foz-DB Navi21:
Totals from 191 (0.23% of 82387) affected shaders:
Instrs: 139340 -> 139178 (-0.12%); split: -0.13%, +0.02%
CodeSize: 798660 -> 798284 (-0.05%)
Latency: 1672750 -> 1673194 (+0.03%); split: -0.06%, +0.08%
InvThroughput: 634847 -> 634651 (-0.03%); split: -0.06%, +0.03%
Copies: 16372 -> 16366 (-0.04%); split: -0.04%, +0.01%
VALU: 79668 -> 79506 (-0.20%); split: -0.23%, +0.03%
SALU: 38233 -> 38236 (+0.01%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:13 +00:00
Georg Lehmann
7bc6d8e2ad
aco/optimizer: add more v_add_lshl_u32 opts
...
No Foz-DB changes on Navi21.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:12 +00:00
Georg Lehmann
6a1caabd64
aco/optimizer: use new helpers for v_add_lshl_u32
...
Foz-DB Navi48:
Totals from 357 (0.43% of 82419) affected shaders:
Instrs: 244419 -> 243608 (-0.33%); split: -0.34%, +0.01%
CodeSize: 1302584 -> 1304188 (+0.12%); split: -0.00%, +0.13%
VGPRs: 21240 -> 21216 (-0.11%)
Latency: 1226165 -> 1225651 (-0.04%); split: -0.06%, +0.02%
InvThroughput: 162432 -> 161940 (-0.30%); split: -0.30%, +0.00%
Copies: 16607 -> 16610 (+0.02%)
PreSGPRs: 14082 -> 14135 (+0.38%)
PreVGPRs: 15917 -> 15914 (-0.02%)
VALU: 136308 -> 135699 (-0.45%)
SALU: 24415 -> 24418 (+0.01%)
VOPD: 333 -> 334 (+0.30%)
Foz-DB Navi21:
Totals from 319 (0.39% of 82387) affected shaders:
Instrs: 255434 -> 254831 (-0.24%)
CodeSize: 1375792 -> 1378164 (+0.17%)
VGPRs: 15360 -> 15344 (-0.10%)
Latency: 1405956 -> 1405181 (-0.06%)
InvThroughput: 174402 -> 173816 (-0.34%)
Copies: 25892 -> 25891 (-0.00%)
PreSGPRs: 14129 -> 14132 (+0.02%)
PreVGPRs: 12457 -> 12454 (-0.02%)
VALU: 139630 -> 139032 (-0.43%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:12 +00:00
Georg Lehmann
7108dac637
aco/optimizer: use new helpers for s_lshl<n>_add_u32
...
Foz-DB Navi48:
Totals from 7654 (9.29% of 82419) affected shaders:
Instrs: 6170479 -> 6174536 (+0.07%); split: -0.07%, +0.13%
CodeSize: 32489580 -> 32500100 (+0.03%); split: -0.07%, +0.10%
SpillSGPRs: 4253 -> 4224 (-0.68%); split: -0.71%, +0.02%
Latency: 60472662 -> 60489681 (+0.03%); split: -0.02%, +0.04%
InvThroughput: 9218099 -> 9218149 (+0.00%); split: -0.01%, +0.01%
VClause: 121094 -> 121089 (-0.00%); split: -0.01%, +0.00%
SClause: 178092 -> 179830 (+0.98%); split: -0.55%, +1.53%
Copies: 424495 -> 423756 (-0.17%); split: -0.57%, +0.40%
Branches: 120352 -> 120353 (+0.00%); split: -0.01%, +0.01%
PreSGPRs: 334391 -> 333381 (-0.30%); split: -0.33%, +0.02%
VALU: 3349394 -> 3349323 (-0.00%); split: -0.00%, +0.00%
SALU: 957913 -> 957149 (-0.08%); split: -0.25%, +0.17%
VOPD: 9177 -> 9179 (+0.02%); split: +0.03%, -0.01%
Foz-DB Navi21:
Totals from 7649 (9.28% of 82387) affected shaders:
Instrs: 6144605 -> 6143005 (-0.03%); split: -0.06%, +0.04%
CodeSize: 32685976 -> 32672380 (-0.04%); split: -0.08%, +0.04%
SpillSGPRs: 3079 -> 3067 (-0.39%); split: -0.42%, +0.03%
Latency: 64979945 -> 65002741 (+0.04%); split: -0.02%, +0.05%
InvThroughput: 14754398 -> 14754230 (-0.00%); split: -0.01%, +0.01%
VClause: 132336 -> 132357 (+0.02%); split: -0.02%, +0.03%
SClause: 190229 -> 191340 (+0.58%); split: -1.01%, +1.60%
Copies: 511915 -> 511287 (-0.12%); split: -0.44%, +0.32%
Branches: 157156 -> 157154 (-0.00%); split: -0.01%, +0.01%
PreSGPRs: 345761 -> 344826 (-0.27%); split: -0.33%, +0.05%
VALU: 3856887 -> 3856928 (+0.00%); split: -0.01%, +0.01%
SALU: 1001190 -> 1000362 (-0.08%); split: -0.22%, +0.14%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:10 +00:00
Georg Lehmann
d9919c3e10
aco/optimizer: optimize add(mad_u32_u16(a, b, 0), c)
...
Foz-DB Navi48:
Totals from 104 (0.13% of 82419) affected shaders:
Instrs: 3554243 -> 3553555 (-0.02%); split: -0.02%, +0.00%
CodeSize: 18836004 -> 18830572 (-0.03%); split: -0.03%, +0.00%
Latency: 19288034 -> 19287208 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 3527510 -> 3526925 (-0.02%); split: -0.02%, +0.00%
VClause: 89526 -> 89522 (-0.00%); split: -0.02%, +0.01%
SClause: 62484 -> 62492 (+0.01%); split: -0.00%, +0.01%
Copies: 266415 -> 266404 (-0.00%); split: -0.04%, +0.03%
Branches: 102123 -> 102125 (+0.00%)
VALU: 1987067 -> 1986531 (-0.03%); split: -0.03%, +0.00%
SALU: 471348 -> 471346 (-0.00%); split: -0.00%, +0.00%
Foz-DB Navi21:
Totals from 228 (0.28% of 82387) affected shaders:
Instrs: 3069693 -> 3068317 (-0.04%); split: -0.05%, +0.00%
CodeSize: 16582476 -> 16574920 (-0.05%); split: -0.05%, +0.00%
Latency: 20038755 -> 20030986 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 4742546 -> 4738245 (-0.09%); split: -0.10%, +0.00%
VClause: 93157 -> 93135 (-0.02%); split: -0.03%, +0.01%
Copies: 265019 -> 264959 (-0.02%); split: -0.04%, +0.02%
VALU: 2025352 -> 2023897 (-0.07%); split: -0.07%, +0.00%
SALU: 447385 -> 447375 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:10 +00:00
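For d9919c3e10, the pattern in the title can presumably be folded into a single multiply-add by moving `c` into the addend slot. The Python sketch below checks that arithmetic identity, assuming `mad_u32_u16(a, b, c)` multiplies the low 16 bits of `a` and `b` and adds a 32-bit `c` with wraparound. The exact form the optimizer emits is not stated in the commit message, so the folded form shown here is an assumption.

```python
MASK32 = 0xFFFFFFFF


def mad_u32_u16(a, b, c):
    """Assumed model: (u16)a * (u16)b + (u32)c, wrapping at 32 bits."""
    return ((a & 0xFFFF) * (b & 0xFFFF) + (c & MASK32)) & MASK32


def add_u32(x, y):
    """32-bit wrapping addition."""
    return (x + y) & MASK32


SAMPLES = [
    (0, 0, 0),
    (3, 5, 7),
    (0xFFFF, 0xFFFF, 1),
    (0xFFFF, 0xFFFF, 0xFFFFFFFF),  # exercises the wraparound
    (0x1234, 0x00FF, 0x80000000),
]

if __name__ == "__main__":
    for a, b, c in SAMPLES:
        # Two instructions on the left, one on the right.
        assert add_u32(mad_u32_u16(a, b, 0), c) == mad_u32_u16(a, b, c)
    print("add(mad_u32_u16(a, b, 0), c) == mad_u32_u16(a, b, c)")
```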
Georg Lehmann
0359c8a901
aco/optimizer: use new helpers for v_add_u32 opts
...
Foz-DB Navi48:
Totals from 1554 (1.89% of 82419) affected shaders:
Instrs: 5154325 -> 5151499 (-0.05%); split: -0.08%, +0.02%
CodeSize: 27310012 -> 27318708 (+0.03%); split: -0.01%, +0.05%
VGPRs: 97236 -> 97200 (-0.04%); split: -0.05%, +0.01%
Latency: 34121873 -> 34120894 (-0.00%); split: -0.02%, +0.01%
InvThroughput: 6735276 -> 6730418 (-0.07%); split: -0.08%, +0.01%
VClause: 130106 -> 130090 (-0.01%); split: -0.05%, +0.04%
SClause: 90439 -> 90449 (+0.01%); split: -0.00%, +0.01%
Copies: 382920 -> 382401 (-0.14%); split: -0.18%, +0.05%
Branches: 130089 -> 130091 (+0.00%)
PreSGPRs: 67745 -> 67743 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 72710 -> 72674 (-0.05%)
VALU: 2941866 -> 2938129 (-0.13%); split: -0.13%, +0.00%
SALU: 651032 -> 651779 (+0.11%); split: -0.02%, +0.14%
VOPD: 2446 -> 2393 (-2.17%); split: +0.70%, -2.86%
Foz-DB Navi21:
Totals from 1534 (1.86% of 82387) affected shaders:
MaxWaves: 32481 -> 32479 (-0.01%)
Instrs: 4732755 -> 4730039 (-0.06%); split: -0.06%, +0.00%
CodeSize: 25305728 -> 25313148 (+0.03%); split: -0.00%, +0.03%
VGPRs: 84424 -> 84448 (+0.03%)
SpillVGPRs: 2420 -> 2419 (-0.04%)
Scratch: 180224 -> 179200 (-0.57%)
Latency: 36843383 -> 36846269 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 9252495 -> 9238142 (-0.16%); split: -0.17%, +0.02%
VClause: 146629 -> 146671 (+0.03%); split: -0.02%, +0.05%
SClause: 94502 -> 94512 (+0.01%); split: -0.00%, +0.01%
Copies: 403672 -> 403592 (-0.02%); split: -0.09%, +0.07%
Branches: 141145 -> 141137 (-0.01%)
PreSGPRs: 70003 -> 70001 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70835 -> 70800 (-0.05%)
VALU: 3114513 -> 3111338 (-0.10%); split: -0.10%, +0.00%
SALU: 651177 -> 651925 (+0.11%); split: -0.02%, +0.13%
VMEM: 271263 -> 271261 (-0.00%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38530 >
2025-11-25 11:49:09 +00:00