Commit graph

3971 commits

Author SHA1 Message Date
Georg Lehmann
8343e45467 aco/lower_branches: update branch hints after changing jump targets
Fixes: 13ad3db43f ("aco/lower_branches: implement try_remove_simple_block() in lower_branches()")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37552>
2025-09-26 15:11:26 +00:00
Georg Lehmann
cc08786689 aco: use maximum RT vgpr_limit that doesn't reduce wave count
144 instead of 132 with 5 waves, in practice.

Foz-DB Navi31:
Totals from 33 (0.04% of 80273) affected shaders:
Instrs: 3266241 -> 3261329 (-0.15%)
CodeSize: 16885356 -> 16860088 (-0.15%)
VGPRs: 4356 -> 4752 (+9.09%)
SpillVGPRs: 2504 -> 1535 (-38.70%)
Scratch: 264704 -> 216320 (-18.28%)
Latency: 18445909 -> 18395904 (-0.27%)
InvThroughput: 3689182 -> 3679182 (-0.27%)
VClause: 85171 -> 84595 (-0.68%)
SClause: 59365 -> 59320 (-0.08%); split: -0.08%, +0.01%
Copies: 260528 -> 259113 (-0.54%); split: -0.59%, +0.05%
Branches: 92537 -> 92519 (-0.02%)
VALU: 1937426 -> 1935925 (-0.08%); split: -0.08%, +0.01%
SALU: 393075 -> 393047 (-0.01%); split: -0.01%, +0.01%
VMEM: 147914 -> 146003 (-1.29%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37548>
2025-09-26 08:45:05 +00:00
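
The idea behind cc08786689 (pick the largest register limit that still keeps the same number of waves resident) can be sketched generically. This is only an illustration: the function names, the register-file size of 1024 and the allocation granule of 16 are hypothetical and are not taken from the commit or from any specific GPU.

```cpp
#include <cassert>

// Hypothetical helper: the largest per-wave register limit that still lets
// `waves` waves share a register file of `total_regs` registers, when
// registers are handed out in multiples of `granule`.
inline unsigned max_limit_for_waves(unsigned total_regs, unsigned granule, unsigned waves)
{
   assert(granule > 0 && waves > 0);
   const unsigned share = total_regs / waves; // equal share per wave
   return share / granule * granule;          // round down to the granule
}

// Hypothetical helper: how many waves fit when each one uses `limit` registers.
inline unsigned waves_for_limit(unsigned total_regs, unsigned limit)
{
   assert(limit > 0);
   return total_regs / limit;
}
```

With these made-up numbers, `max_limit_for_waves(1024, 16, 5)` yields 192: five waves still fit at 192 registers each, while one granule more (208) would drop the count to four.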
Georg Lehmann
8e03505782 aco: don't insert s_sendmsg dealloc_vgprs with little vgprs allocated
Reduces message bus traffic when the benefit is small.

Foz-DB Navi31:
Totals from 3752 (4.67% of 80273) affected shaders:
Instrs: 1999755 -> 1992249 (-0.38%)
CodeSize: 10531824 -> 10501800 (-0.29%)
Latency: 14935247 -> 14935147 (-0.00%)
InvThroughput: 5976053 -> 5975262 (-0.01%)

Foz-DB Navi33:
Totals from 2614 (3.26% of 80273) affected shaders:
Instrs: 969475 -> 964247 (-0.54%)
CodeSize: 5171240 -> 5150328 (-0.40%)
Latency: 7891519 -> 7891434 (-0.00%)
InvThroughput: 4815008 -> 4814287 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
27cc6317f9 aco: dealloc vgprs if there is a pending non scratch store and no pending export
Because s_sendmsg dealloc_vgprs waits for every counter except vs_count,
and the message bus has limited throughput, we should only insert the dealloc
when we know that it's beneficial.

Foz-DB Navi31:
Totals from 5280 (6.58% of 80273) affected shaders:
Instrs: 4186851 -> 4197416 (+0.25%)
CodeSize: 21910004 -> 21952264 (+0.19%)
Latency: 31679067 -> 31679173 (+0.00%)
InvThroughput: 9182625 -> 9183417 (+0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
26e041e821 aco: remove existing dealloc_vgprs use
We didn't consider that s_sendmsg dealloc_vgprs waits for all counters
except vscnt.

Foz-DB Navi31:
Totals from 74090 (92.52% of 80084) affected shaders:
Instrs: 36031071 -> 35853573 (-0.49%)
CodeSize: 189233756 -> 188523764 (-0.38%)
Latency: 222378318 -> 222374890 (-0.00%)
InvThroughput: 33366893 -> 33362457 (-0.01%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
2025-09-26 07:51:02 +00:00
Georg Lehmann
cf30742a66 radv,aco: don't end monolithic ray tracing with unconditional terminate
The terminate requires more code and blocks us from deallocating VGPRs early.

Foz-DB Navi31:
Totals from 63 (0.08% of 80273) affected shaders:
Instrs: 3372702 -> 3372467 (-0.01%)
CodeSize: 17441676 -> 17440736 (-0.01%)
Latency: 19763447 -> 19763288 (-0.00%)
InvThroughput: 3860502 -> 3860478 (-0.00%)
Branches: 96204 -> 96141 (-0.07%)
SALU: 406648 -> 406549 (-0.02%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37542>
2025-09-25 15:35:55 +00:00
Daniel Schürmann
d041640b88 aco: remove excess offset handling for load/store_shared
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37453>
2025-09-24 14:28:25 +00:00
Daniel Schürmann
dbb20a4e23 aco/optimizer: remove DS offset optimization
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37453>
2025-09-24 14:28:24 +00:00
Natalie Vock
f0d3d0ad21 aco/scheduler: Bail early on unreorderable instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37212>
2025-09-22 11:13:50 +00:00
Rhys Perry
d6ed68212c aco: fix SGPR 8-bit nir_op_vec with mixed constant and non-constant
For example, vec2(non_const, const)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 04e3d7ad93 ("aco: improve nir_op_vec with constant operands")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13911
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37405>
2025-09-18 12:37:19 +00:00
Rhys Perry
8931672eef aco: workaround load tearing for load_shared2_amd
This probably has the same issue as load_shared.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 04956d54ce ("aco: force uniform result for LDS load with uniform address if it can be non uniform")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37417>
2025-09-17 11:29:21 +00:00
Rhys Perry
81df517553 aco: avoid unaligned offsets when selecting load_global_amd
SMEM instructions mask off the low bits for the base and offset sources
both before and after they're added. However, NIR expects ACO to only
care about the alignment of the final address.

fossil-db (gfx1201):
Totals from 21 (0.03% of 79839) affected shaders:
Instrs: 229780 -> 229876 (+0.04%)
CodeSize: 1267724 -> 1268080 (+0.03%)
Latency: 2800924 -> 2800978 (+0.00%)
InvThroughput: 520250 -> 520256 (+0.00%)
Copies: 27878 -> 27876 (-0.01%); split: -0.01%, +0.00%
SALU: 29591 -> 29643 (+0.18%)

fossil-db (polaris10):
Totals from 3 (0.00% of 62201) affected shaders:
Latency: 2651 -> 2652 (+0.04%)
InvThroughput: 662 -> 663 (+0.15%)
PreSGPRs: 51 -> 54 (+5.88%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37301>
2025-09-17 09:15:46 +00:00
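
The alignment problem described in 81df517553 can be illustrated with a small model. This is a sketch under assumptions: the commit message only says "low bits", so the two-bit (dword) mask is assumed here, and both function names are hypothetical rather than ACO code.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: two low bits are cleared (dword alignment). The commit message
// does not state how many bits are masked.
constexpr std::uint64_t kLowBitsMask = ~std::uint64_t{3};

// Hypothetical model of the described hardware behaviour: base and offset are
// masked before the add, and the sum is masked again afterwards.
constexpr std::uint64_t smem_effective_address(std::uint64_t base, std::uint64_t offset)
{
   return ((base & kLowBitsMask) + (offset & kLowBitsMask)) & kLowBitsMask;
}

// What a caller that only cares about the final address would compute.
constexpr std::uint64_t final_address_only(std::uint64_t base, std::uint64_t offset)
{
   return (base + offset) & kLowBitsMask;
}

// Aligned sources: both models agree.
static_assert(smem_effective_address(0x1000, 0x8) == final_address_only(0x1000, 0x8),
              "aligned sources give the same address");
// Unaligned sources whose sum is aligned: the models disagree.
static_assert(smem_effective_address(0x1002, 0x6) != final_address_only(0x1002, 0x6),
              "unaligned sources lose their low bits before the add");
```

In the second case the sum 0x1002 + 0x6 = 0x1008 is aligned, but masking each source first gives 0x1000 + 0x4 = 0x1004, which is why individually unaligned offsets have to be avoided.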
Rhys Perry
6d71521ecd aco: avoid wraparound for smem global loads with both offsets
fossil-db (gfx1201):
Totals from 296 (0.37% of 79839) affected shaders:
Instrs: 382593 -> 380149 (-0.64%)
CodeSize: 1981452 -> 1970988 (-0.53%); split: -0.53%, +0.00%
Latency: 1575286 -> 1574252 (-0.07%)
InvThroughput: 215839 -> 215818 (-0.01%)
SClause: 8679 -> 8677 (-0.02%); split: -0.03%, +0.01%
Copies: 19642 -> 19641 (-0.01%); split: -0.03%, +0.02%
PreSGPRs: 14521 -> 14515 (-0.04%)
SALU: 57097 -> 55718 (-2.42%)

fossil-db (polaris10):
Totals from 30 (0.05% of 62201) affected shaders:
Instrs: 23341 -> 23379 (+0.16%); split: -0.01%, +0.18%
CodeSize: 121316 -> 121516 (+0.16%); split: -0.01%, +0.17%
SGPRs: 2368 -> 2384 (+0.68%)
Latency: 235153 -> 235374 (+0.09%); split: -0.01%, +0.11%
InvThroughput: 92582 -> 92566 (-0.02%)
SClause: 616 -> 619 (+0.49%)
Copies: 2717 -> 2720 (+0.11%)
PreSGPRs: 1204 -> 1213 (+0.75%)
SALU: 3654 -> 3692 (+1.04%); split: -0.08%, +1.12%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.2
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37301>
2025-09-17 09:15:46 +00:00
Georg Lehmann
714a149396 nir: remove unsigned upper bound config
All config information is now either in nir->info or nir->options.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann
bb67dae12d nir/uub: remove max_workgroup_size from config
For most hardware, this is the same as max invocations in the workgroup.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann
f3c08c9d27 nir/uub: use shader_info subgroup size
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann
d029686e20 aco/isel: fix output args init stack buffer overflow
BITSET range functions include the end of the range.

Fixes: eb249bb18e ("aco: Only fix used variables to registers")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:03 +00:00
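
The off-by-one behind d029686e20 can be sketched with an inclusive range helper. Both functions below are hypothetical stand-ins built on `std::bitset`; they are not Mesa's BITSET API, and they only mirror the inclusive-end convention that the commit message describes.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper: sets bits [start, end], where `end` is INCLUDED.
template <std::size_t N>
void set_range_inclusive(std::bitset<N>& bits, std::size_t start, std::size_t end)
{
   assert(start <= end && end < N);
   for (std::size_t i = start; i <= end; ++i)
      bits.set(i);
}

// Marks `count` consecutive entries starting at `first`. Because the range is
// inclusive, the last index is first + count - 1, not first + count.
template <std::size_t N>
void mark_entries(std::bitset<N>& bits, std::size_t first, std::size_t count)
{
   if (count == 0)
      return;
   set_range_inclusive(bits, first, first + count - 1);
}
```

Passing `first + count` as the end would touch one bit too many, which for a range that reaches the end of a fixed-size buffer means writing past it.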
nihui
849344dc08 aco: set program->dev.fused_mad_mix=true for GFX940
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35655>
2025-09-16 07:02:32 +00:00
nihui
8c4f0b1353 aco: gfx940 has no mad f32 instruction
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35655>
2025-09-16 07:02:32 +00:00
Natalie Vock
e3460f15fa aco/opt: Work around GCC compiler issue
No functional change. Random code churn that, apparently, makes a
GCC miscompile disappear.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:21 +00:00
Natalie Vock
a06f38e5ae aco/vn: Don't combine expressions across calls
This increases live state across calls, which in turn increases spilling
and makes for slower shaders overall.

On top of RT function calls:
Totals from 7 (0.01% of 81072) affected shaders:

Instrs: 8980 -> 8955 (-0.28%); split: -0.88%, +0.60%
CodeSize: 51976 -> 51684 (-0.56%); split: -1.02%, +0.46%
SpillSGPRs: 248 -> 244 (-1.61%); split: -3.63%, +2.02%
SpillVGPRs: 367 -> 365 (-0.54%); split: -1.09%, +0.54%
Scratch: 32768 -> 31744 (-3.12%)
Latency: 135669 -> 128720 (-5.12%); split: -5.13%, +0.01%
InvThroughput: 35301 -> 34783 (-1.47%); split: -1.51%, +0.05%
VClause: 241 -> 242 (+0.41%)
SClause: 117 -> 120 (+2.56%)
Copies: 1311 -> 1338 (+2.06%); split: -0.69%, +2.75%
PreSGPRs: 899 -> 895 (-0.44%); split: -1.56%, +1.11%
PreVGPRs: 1103 -> 1099 (-0.36%)
VALU: 6143 -> 6098 (-0.73%); split: -1.22%, +0.49%
SALU: 913 -> 933 (+2.19%); split: -0.11%, +2.30%
VMEM: 989 -> 967 (-2.22%)
SMEM: 201 -> 214 (+6.47%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:21 +00:00
Natalie Vock
575d3adbf5 aco/validate: Validate call instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:21 +00:00
Natalie Vock
28dc185966 aco/sched: Handle calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
2be37a91fa aco/live_var_analysis: Handle calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
9c8a17e172 aco/lower_to_hw_instr: Lower calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
3667a7b687 aco: Add call info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
af812862b7 aco: Add call-related program/block properties
Indicates various properties about calls: Whether a program is an
indirect callee, whether a program or block contains function calls, and
whether registers used by a caller need to be preserved.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
917a98b722 aco: Add ABI and Pseudo CALL format
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
e850650f92 aco: Add function call attributes
ACO needs RADV to set certain attributes on NIR functions to help with
compilation of function calls.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
d18b438832 aco: Add RegisterDemand::operator!=
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Yonggang Luo
7db518cfe4 aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
  156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
      | ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
  105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
      | ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Rhys Perry
e2181744c2 aco/tests: add barrier-to-waitcnt tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
21332609b9 aco: don't move acquire barriers before interlock begin
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0ee1c137f9 aco: don't move release barriers after interlock end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
7c056dd473 aco: add is_atomic_or_control_instr helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
df6a3b7619 aco: reduce cost of using values defined in predecessors
For code like:
   if (cond) {
      val = load()
   }
   use(val)
The "use(val)" now has a similar cost to a use inside the IF.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Yonggang Luo
773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The document is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The json-schema at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Rhys Perry
1105f7b98f aco: fix signed integer overflow
Fix UBSan error:
runtime error: signed integer overflow: 2147483647 + 32 cannot be represented in type 'int'

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37055>
2025-09-03 11:47:00 +00:00
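
The UBSan report in 1105f7b98f comes from adding to an `int` that already holds its maximum value. The commit message does not say how the overflow was fixed; the sketch below shows one common way to avoid it, using a hypothetical `saturating_add` helper that performs the addition in 64 bits and assumes `int` is narrower than 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: adds two ints without signed overflow by widening to
// 64 bits first, then clamping the result back into the int range.
// ASSUMPTION: int is narrower than 64 bits, so the wide sum cannot overflow.
inline int saturating_add(int a, int b)
{
   const std::int64_t sum = static_cast<std::int64_t>(a) + b;
   if (sum > std::numeric_limits<int>::max())
      return std::numeric_limits<int>::max();
   if (sum < std::numeric_limits<int>::min())
      return std::numeric_limits<int>::min();
   return static_cast<int>(sum);
}
```

With this helper, `saturating_add(std::numeric_limits<int>::max(), 32)` stays at the maximum instead of invoking undefined behaviour.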
Samuel Pitoiset
decf9af472 radv/rt: only use one user SGPR for the traversal shader addr
All shaders are allocated in the 32-bit addr space. To avoid an issue
with alignment, and also for future work, there is an unused user SGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37133>
2025-09-03 05:53:41 +00:00
Daniel Schürmann
441d5aab08 aco/ra: coalesce vector affinities with tied definitions
Totals from 19310 (24.19% of 79839) affected shaders: (Navi48)

MaxWaves: 564238 -> 564542 (+0.05%); split: +0.06%, -0.01%
Instrs: 10856428 -> 10803360 (-0.49%); split: -0.53%, +0.04%
CodeSize: 56405088 -> 56189384 (-0.38%); split: -0.41%, +0.02%
VGPRs: 986120 -> 985952 (-0.02%); split: -0.50%, +0.48%
Latency: 53956142 -> 53940850 (-0.03%); split: -0.11%, +0.09%
InvThroughput: 8769260 -> 8735595 (-0.38%); split: -0.49%, +0.11%
VClause: 237471 -> 237452 (-0.01%); split: -0.05%, +0.04%
SClause: 225385 -> 225389 (+0.00%)
Copies: 799792 -> 744150 (-6.96%); split: -7.25%, +0.30%
Branches: 208574 -> 208572 (-0.00%); split: -0.00%, +0.00%
VALU: 6116920 -> 6061448 (-0.91%); split: -0.95%, +0.04%
SALU: 1442068 -> 1441990 (-0.01%); split: -0.01%, +0.00%
VOPD: 1914 -> 1744 (-8.88%); split: +0.10%, -8.99%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:27 +00:00