mesa/src/amd/compiler/instruction_selection
Georg Lehmann 883b1ca364 aco: disable wqm for tex loads when not needed
By only executing VMEM loads for lanes where the result is used, we can save
bandwidth.

The NIR pass only handles tex for now, but those are the most common anyway.
We can extend it to handle image/ssbo/ubo/global loads in the future.
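
To illustrate why skipping whole quad mode (WQM) for a load saves bandwidth, here is a minimal sketch that models a 64-lane exec mask. It is not ACO code: `whole_quad_mask` and `lanes_loading` are hypothetical helpers, and the only behaviour assumed is the usual WQM definition, where every 2x2 quad (4 consecutive lanes) with at least one active lane gets all 4 lanes enabled.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper (not an ACO function): expand an exact exec mask
// to whole quad mode. Each quad is 4 consecutive lanes; if any lane of a
// quad is active, all 4 lanes of that quad are enabled as helper lanes.
uint64_t whole_quad_mask(uint64_t exact)
{
   uint64_t wqm = 0;
   for (unsigned quad = 0; quad < 16; quad++) {
      const uint64_t quad_bits = UINT64_C(0xf) << (quad * 4);
      if (exact & quad_bits)
         wqm |= quad_bits;
   }
   return wqm;
}

// Hypothetical helper: number of lanes that would issue a memory load.
// If the loaded value is needed by helper lanes (for example because it
// feeds a derivative), the load runs in WQM; otherwise only the exact
// lanes need to execute it.
unsigned lanes_loading(uint64_t exact, bool result_needed_in_wqm)
{
   const uint64_t mask = result_needed_in_wqm ? whole_quad_mask(exact) : exact;
   return static_cast<unsigned>(std::bitset<64>(mask).count());
}
```

With an exact mask of `0x121` (lanes 0, 5 and 8, one per quad), `lanes_loading` returns 12 when the result is needed in WQM and 3 when it is not, so in this toy case three quarters of the lanes fetching data are helper lanes whose results would be discarded.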

Foz-DB GFX1201:
Totals from 32633 (40.66% of 80251) affected shaders:
Instrs: 22635910 -> 23193509 (+2.46%); split: -0.00%, +2.46%
CodeSize: 122880044 -> 125093428 (+1.80%); split: -0.00%, +1.81%
VGPRs: 1481868 -> 1481712 (-0.01%)
SpillSGPRs: 3877 -> 4301 (+10.94%); split: -0.52%, +11.45%
Latency: 171480552 -> 171685219 (+0.12%); split: -0.18%, +0.30%
InvThroughput: 24364743 -> 24373441 (+0.04%); split: -0.08%, +0.12%
VClause: 388318 -> 388557 (+0.06%); split: -0.06%, +0.13%
SClause: 774781 -> 776492 (+0.22%); split: -0.29%, +0.51%
Copies: 1416586 -> 1541199 (+8.80%); split: -0.16%, +8.96%
Branches: 419591 -> 419673 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 1330303 -> 1416540 (+6.48%)
PreVGPRs: 964864 -> 964863 (-0.00%)
VALU: 12919601 -> 12920254 (+0.01%); split: -0.01%, +0.01%
SALU: 2685402 -> 3224147 (+20.06%); split: -0.00%, +20.07%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:46 +00:00
File                           Last commit                                                            Last updated
aco_instruction_selection.h    aco: use new disable_wqm for mimg                                      2025-08-15 07:03:46 +00:00
aco_isel_cfg.cpp               aco/isel: move control-flow helper functions into separate file        2025-05-16 11:01:19 +00:00
aco_isel_helpers.cpp           aco: don't restrict vmem load scheduling by inserting p_end_wqm early  2025-08-15 07:03:46 +00:00
aco_isel_setup.cpp             aco: disable wqm for tex loads when not needed                         2025-08-15 07:03:46 +00:00
aco_select_nir.cpp             aco: disable wqm for tex loads when not needed                         2025-08-15 07:03:46 +00:00
aco_select_nir_alu.cpp         build: avoid redefining unreachable() which is standard in C23         2025-07-31 17:49:42 +00:00
aco_select_nir_intrinsics.cpp  aco: use a smaller wqm section for strict_wqm sampling                 2025-08-15 07:03:46 +00:00
aco_select_ps_epilog.cpp       build: avoid redefining unreachable() which is standard in C23         2025-07-31 17:49:42 +00:00
aco_select_ps_prolog.cpp       aco/isel: move select_ps_prolog() into separate file                   2025-05-16 11:01:19 +00:00
aco_select_rt_prolog.cpp       aco/isel: move select_rt_prolog() into separate file                   2025-05-16 11:01:19 +00:00
aco_select_trap_handler.cpp    aco/isel: move select_trap_handler_shader() into separate file         2025-05-16 11:01:19 +00:00
aco_select_vs_prolog.cpp       aco/isel: move select_vs_prolog() into separate file                   2025-05-16 11:01:19 +00:00