Commit graph

3951 commits

Author SHA1 Message Date
Natalie Vock
a06f38e5ae aco/vn: Don't combine expressions across calls
This increases live state across calls, which in turn increases spilling
and makes for slower shaders overall.

On top of RT function calls:
Totals from 7 (0.01% of 81072) affected shaders:

Instrs: 8980 -> 8955 (-0.28%); split: -0.88%, +0.60%
CodeSize: 51976 -> 51684 (-0.56%); split: -1.02%, +0.46%
SpillSGPRs: 248 -> 244 (-1.61%); split: -3.63%, +2.02%
SpillVGPRs: 367 -> 365 (-0.54%); split: -1.09%, +0.54%
Scratch: 32768 -> 31744 (-3.12%)
Latency: 135669 -> 128720 (-5.12%); split: -5.13%, +0.01%
InvThroughput: 35301 -> 34783 (-1.47%); split: -1.51%, +0.05%
VClause: 241 -> 242 (+0.41%)
SClause: 117 -> 120 (+2.56%)
Copies: 1311 -> 1338 (+2.06%); split: -0.69%, +2.75%
PreSGPRs: 899 -> 895 (-0.44%); split: -1.56%, +1.11%
PreVGPRs: 1103 -> 1099 (-0.36%)
VALU: 6143 -> 6098 (-0.73%); split: -1.22%, +0.49%
SALU: 913 -> 933 (+2.19%); split: -0.11%, +2.30%
VMEM: 989 -> 967 (-2.22%)
SMEM: 201 -> 214 (+6.47%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:21 +00:00
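To illustrate the idea behind a06f38e5ae, here is a minimal sketch of local value numbering that refuses to reuse an expression across a call. The `Op`, `Instr`, and `value_number` names are hypothetical and heavily simplified; they are not ACO's real data structures or its actual pass. The sketch only models the reused result and ignores the liveness of the operands themselves.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical, simplified IR: not ACO's real data structures.
enum class Op { Add, Mul, Call };

struct Instr {
    Op op;
    uint32_t src0; // value id of the first operand
    uint32_t src1; // value id of the second operand
    uint32_t dst;  // value id defined by this instruction
};

// Local value numbering over one straight-line block. For each instruction,
// returns the value id that should be used for its result: either its own
// dst, or the dst of an earlier identical expression.
std::vector<uint32_t> value_number(const std::vector<Instr>& block)
{
    std::map<std::tuple<Op, uint32_t, uint32_t>, uint32_t> seen;
    std::vector<uint32_t> result;
    for (const Instr& in : block) {
        if (in.op == Op::Call) {
            // Forget every expression seen before the call, so that no
            // result has to stay live across it just to be reused later.
            seen.clear();
            result.push_back(in.dst);
            continue;
        }
        auto key = std::make_tuple(in.op, in.src0, in.src1);
        // emplace keeps the existing entry if the expression was seen before.
        auto it = seen.emplace(key, in.dst).first;
        result.push_back(it->second);
    }
    return result;
}
```

In this sketch, two identical `Add` instructions before a call share one result, while an identical `Add` after the call is recomputed instead of keeping the earlier result alive across the call.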
Natalie Vock
575d3adbf5 aco/validate: Validate call instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:21 +00:00
Natalie Vock
28dc185966 aco/sched: Handle calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
2be37a91fa aco/live_var_analysis: Handle calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
9c8a17e172 aco/lower_to_hw_instr: Lower calls
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
3667a7b687 aco: Add call info
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
af812862b7 aco: Add call-related program/block properties
Indicates various properties about calls: Whether a program is an
indirect callee, whether a program or block contains function calls, and
whether registers used by a caller need to be preserved.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
917a98b722 aco: Add ABI and Pseudo CALL format
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
e850650f92 aco: Add function call attributes
ACO needs RADV to set certain attributes on NIR functions to help with
compilation of function calls.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Natalie Vock
d18b438832 aco: Add RegisterDemand::operator!=
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34531>
2025-09-15 17:16:20 +00:00
Yonggang Luo
7db518cfe4 aco: Fixes warning: function get_branch_target/to_clrx_device_name defined but not used
../../src/amd/compiler/aco_print_asm.cpp:156:1: warning: 'bool aco::{anonymous}::get_branch_target(char**, aco::Program*, const std::vector<bool>&, char**)' defined but not used [-Wunused-function]
  156 | get_branch_target(char** output, Program* program, const std::vector<bool>& referenced_blocks,
      | ^~~~~~~~~~~~~~~~~
../../src/amd/compiler/aco_print_asm.cpp:105:1: warning: 'const char* aco::{anonymous}::to_clrx_device_name(amd_gfx_level, radeon_family)' defined but not used [-Wunused-function]
  105 | to_clrx_device_name(amd_gfx_level gfx_level, radeon_family family)
      | ^~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Rhys Perry
e2181744c2 aco/tests: add barrier-to-waitcnt tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0f32b573a4 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup lds barriers
fossil-db (navi21):
Totals from 36594 (45.84% of 79825) affected shaders:
Instrs: 19922581 -> 19922563 (-0.00%)
CodeSize: 103616980 -> 103616956 (-0.00%)
Latency: 69862064 -> 69053273 (-1.16%)
InvThroughput: 14607708 -> 14606308 (-0.01%); split: -0.01%, +0.00%

fossil-db (navi31):
Totals from 1641 (2.06% of 79825) affected shaders:
Instrs: 1247591 -> 1247875 (+0.02%); split: -0.00%, +0.03%
CodeSize: 6259516 -> 6260612 (+0.02%); split: -0.00%, +0.02%
Latency: 7657224 -> 7577299 (-1.04%); split: -1.05%, +0.00%
InvThroughput: 1150669 -> 1148171 (-0.22%); split: -0.22%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
ac882985c0 aco/gfx10: skip waitcnts or use vm_vsrc(0) for workgroup vmem barriers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
145b178de2 aco: fix workgroup-scope barrier between vmem and lds
A barrier between two lds/vmem instructions needs to ensure that the
second starts after the first finishes, which means that we can't just
skip workgroup-scope vmem barriers if there is a lds instruction later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
02718fd4c5 aco: use a separate event for sendmsg_rtn
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
5812c2ea89 aco: update waitcnt events for exports
Include primitive, dual source blend and POS4 exports.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
711c023b55 aco: remove waitcnt code for POPS
We now insert barriers around these instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
005694fe1f aco: remove waitcnt code for SMEM stores
These were removed in GFX10.3 and we haven't used them in a while.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
20cd5cf5f7 aco: delay barrier waitcnt until they are needed
fossil-db (navi21):
Totals from 44 (0.06% of 79825) affected shaders:
Instrs: 16001 -> 15932 (-0.43%); split: -0.46%, +0.02%
CodeSize: 85800 -> 85548 (-0.29%); split: -0.30%, +0.01%
Latency: 190124 -> 173458 (-8.77%)
InvThroughput: 23605 -> 22756 (-3.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
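The percentages in these fossil-db reports appear to be plain relative changes of the totals, rounded to two decimals. The sketch below assumes that formula; `percent_change` is a hypothetical helper, not part of any reporting tool. It does not attempt to reproduce the "split" figures.

```cpp
#include <cmath>

// Relative change from `before` to `after` in percent, rounded to two
// decimals. Assumed formula: (after - before) / before * 100.
double percent_change(double before, double after)
{
    double pct = (after - before) / before * 100.0;
    return std::round(pct * 100.0) / 100.0;
}
```

With the totals quoted for 20cd5cf5f7, `percent_change(190124, 173458)` gives -8.77 for Latency and `percent_change(16001, 15932)` gives -0.43 for Instrs, which matches the report.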
Rhys Perry
843acfa50b aco: add a separate barrier_info for release/acquire barriers
These can wait for different sets of accesses.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
6c446c2f83 aco: refactor waitcnt pass to use barrier_info
Currently there's just barrier_info_all, but more will be added later.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
21332609b9 aco: don't move acquire barriers before interlock begin
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
0ee1c137f9 aco: don't move release barriers after interlock end
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
7c056dd473 aco: add is_atomic_or_control_instr helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Rhys Perry
df6a3b7619 aco: reduce cost of using values defined in predecessors
For code like:
   if (cond) {
      val = load()
   }
   use(val)
The "use(val)" now has a similar cost to a use inside the IF.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36491>
2025-09-09 12:34:40 +00:00
Yonggang Luo
773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The documentation is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The JSON schema is at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Rhys Perry
1105f7b98f aco: fix signed integer overflow
Fix UBSan error:
runtime error: signed integer overflow: 2147483647 + 32 cannot be represented in type 'int'

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37055>
2025-09-03 11:47:00 +00:00
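The message for 1105f7b98f does not say how the overflow was fixed. As a generic illustration only, one common way to avoid undefined behaviour in a sum such as 2147483647 + 32 is to widen to 64 bits before adding and then clamp. `add_saturating` is a hypothetical helper and is not taken from the ACO sources.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not taken from the ACO sources: adds two 32-bit
// signed integers without ever overflowing a signed type.
int32_t add_saturating(int32_t a, int32_t b)
{
    // The sum of two int32_t values always fits in int64_t.
    const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
    const int64_t lo = std::numeric_limits<int32_t>::min();
    const int64_t hi = std::numeric_limits<int32_t>::max();
    if (wide > hi)
        return static_cast<int32_t>(hi);
    if (wide < lo)
        return static_cast<int32_t>(lo);
    return static_cast<int32_t>(wide);
}
```

Whether clamping is the right behaviour depends on what the value means; in other situations an unsigned type or an explicit range check before the addition may be more appropriate.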
Samuel Pitoiset
decf9af472 radv/rt: only use one user SGPR for the traversal shader addr
All shaders are allocated in the 32-bit addr space. To avoid an issue
with alignment, and also for future work, there is an unused user SGPR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37133>
2025-09-03 05:53:41 +00:00
Daniel Schürmann
441d5aab08 aco/ra: coalesce vector affinities with tied definitions
Totals from 19310 (24.19% of 79839) affected shaders: (Navi48)

MaxWaves: 564238 -> 564542 (+0.05%); split: +0.06%, -0.01%
Instrs: 10856428 -> 10803360 (-0.49%); split: -0.53%, +0.04%
CodeSize: 56405088 -> 56189384 (-0.38%); split: -0.41%, +0.02%
VGPRs: 986120 -> 985952 (-0.02%); split: -0.50%, +0.48%
Latency: 53956142 -> 53940850 (-0.03%); split: -0.11%, +0.09%
InvThroughput: 8769260 -> 8735595 (-0.38%); split: -0.49%, +0.11%
VClause: 237471 -> 237452 (-0.01%); split: -0.05%, +0.04%
SClause: 225385 -> 225389 (+0.00%)
Copies: 799792 -> 744150 (-6.96%); split: -7.25%, +0.30%
Branches: 208574 -> 208572 (-0.00%); split: -0.00%, +0.00%
VALU: 6116920 -> 6061448 (-0.91%); split: -0.95%, +0.04%
SALU: 1442068 -> 1441990 (-0.01%); split: -0.01%, +0.00%
VOPD: 1914 -> 1744 (-8.88%); split: +0.10%, -8.99%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:27 +00:00
Daniel Schürmann
2f303636f3 aco/ra: consider precolor affinities in get_reg_vector()
No fossil changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:27 +00:00
Daniel Schürmann
6dbf8f7b90 aco/ra: don't set precolor affinities for already assigned temporaries
Also don't overwrite existing precolor affinities.

Totals from 248 (0.31% of 79839) affected shaders: (Navi48)

Instrs: 154427 -> 154401 (-0.02%); split: -0.12%, +0.10%
CodeSize: 812880 -> 812568 (-0.04%); split: -0.12%, +0.08%
VGPRs: 12432 -> 12408 (-0.19%)
Latency: 851623 -> 851801 (+0.02%); split: -0.03%, +0.05%
InvThroughput: 156569 -> 156581 (+0.01%); split: -0.04%, +0.05%
VClause: 2672 -> 2681 (+0.34%); split: -0.34%, +0.67%
Copies: 12645 -> 12660 (+0.12%); split: -0.53%, +0.65%
VALU: 82894 -> 82909 (+0.02%); split: -0.08%, +0.10%
SALU: 25406 -> 25424 (+0.07%); split: -0.07%, +0.14%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:26 +00:00
Daniel Schürmann
eb557fd090 aco/ra: add vector_info::index to indicate the Operand's index into the vector
This simplifies the code and will allow for a mismatch between the index and
the Operand's temporary.

Totals from 28 (0.04% of 79839) affected shaders: (Navi48)

Instrs: 18453 -> 18440 (-0.07%); split: -0.08%, +0.01%
CodeSize: 98588 -> 98532 (-0.06%); split: -0.06%, +0.00%
Copies: 1347 -> 1333 (-1.04%); split: -1.11%, +0.07%
VALU: 10431 -> 10417 (-0.13%); split: -0.14%, +0.01%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36851>
2025-09-02 10:24:26 +00:00
Marek Olšák
4c87d002e3 aco,radeonsi: expand 32-bit shader arg pointers to 64 bits for ACO
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Marek Olšák
7d5288b5b7 aco: check that global addresses are 64bit, apply_nuw_to_ssa to global_amd/smem
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37101>
2025-08-30 15:04:32 -04:00
Georg Lehmann
38e32e39a9 aco: never end wqm early for vmem
The remaining cases where disable_wqm isn't set are either uniform loads
or loads that influence control flow. In the first case, not ending WQM early
is free, and in the second case it's likely still better to not block scheduling.

Foz-DB GFX1201:
Totals from 483 (0.60% of 80287) affected shaders:
MaxWaves: 12654 -> 12642 (-0.09%)
Instrs: 485234 -> 484830 (-0.08%); split: -0.19%, +0.11%
CodeSize: 2630876 -> 2629184 (-0.06%); split: -0.15%, +0.08%
VGPRs: 29980 -> 30004 (+0.08%)
Latency: 4908015 -> 4813167 (-1.93%); split: -1.95%, +0.02%
InvThroughput: 751059 -> 748582 (-0.33%); split: -0.35%, +0.02%
VClause: 8723 -> 8705 (-0.21%); split: -0.30%, +0.09%
SClause: 11085 -> 10986 (-0.89%); split: -1.45%, +0.56%
Copies: 25155 -> 25183 (+0.11%); split: -0.26%, +0.37%
Branches: 6203 -> 6204 (+0.02%)
PreSGPRs: 23763 -> 23780 (+0.07%)
VALU: 296576 -> 296593 (+0.01%); split: -0.01%, +0.02%
SALU: 49095 -> 49416 (+0.65%); split: -0.04%, +0.69%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann
3d190f2e9c aco: implement skip_helpers for load_global_amd
Foz-DB GFX1201:
Totals from 119 (0.15% of 80287) affected shaders:
Instrs: 212449 -> 213452 (+0.47%)
CodeSize: 1120656 -> 1124708 (+0.36%)
Latency: 2854370 -> 2855772 (+0.05%); split: -0.02%, +0.07%
InvThroughput: 586142 -> 586210 (+0.01%); split: -0.00%, +0.01%
VClause: 3556 -> 3656 (+2.81%)
SClause: 2708 -> 2710 (+0.07%)
Copies: 14410 -> 14509 (+0.69%)
PreSGPRs: 6810 -> 6850 (+0.59%); split: -0.12%, +0.70%
VALU: 135945 -> 135942 (-0.00%); split: -0.01%, +0.01%
SALU: 22147 -> 23121 (+4.40%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann
ee7069f875 aco: implement skip_helpers for load_scratch
Foz-DB GFX1201:
Totals from 2 (0.00% of 80287) affected shaders:
Instrs: 4016 -> 4054 (+0.95%)
CodeSize: 22104 -> 22256 (+0.69%)
Latency: 17123 -> 17129 (+0.04%)
Copies: 406 -> 415 (+2.22%)
SALU: 323 -> 353 (+9.29%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:04 +00:00
Georg Lehmann
2bfd8918a5 aco: implement skip_helpers for load_ssbo/ubo/constant
Foz-DB GFX1201:
Totals from 6676 (8.32% of 80287) affected shaders:
Instrs: 8786161 -> 8829091 (+0.49%); split: -0.01%, +0.50%
CodeSize: 47141800 -> 47320480 (+0.38%); split: -0.01%, +0.39%
VGPRs: 376624 -> 376600 (-0.01%)
SpillSGPRs: 1251 -> 1250 (-0.08%)
Latency: 99716626 -> 99642361 (-0.07%); split: -0.11%, +0.04%
InvThroughput: 14893179 -> 14898323 (+0.03%); split: -0.01%, +0.04%
VClause: 149425 -> 153539 (+2.75%); split: -0.04%, +2.79%
SClause: 251247 -> 251842 (+0.24%); split: -0.06%, +0.30%
Copies: 580304 -> 586424 (+1.05%); split: -0.21%, +1.26%
Branches: 163014 -> 163013 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 356548 -> 357109 (+0.16%); split: -0.18%, +0.33%
VALU: 5149733 -> 5149797 (+0.00%); split: -0.00%, +0.00%
SALU: 1082176 -> 1122718 (+3.75%); split: -0.06%, +3.80%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:03 +00:00
Georg Lehmann
bdae511b18 aco: implement skip_helpers for image loads
Foz-DB GFX1201:
Totals from 5 (0.01% of 80287) affected shaders:
Instrs: 1406 -> 1417 (+0.78%)
CodeSize: 8012 -> 8056 (+0.55%)
Latency: 7279 -> 7282 (+0.04%)
Copies: 84 -> 85 (+1.19%)
SALU: 170 -> 180 (+5.88%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:02 +00:00
Georg Lehmann
bf453a7c6a aco/isel: add init_disable_wqm helper
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36785>
2025-08-28 06:29:01 +00:00
Georg Lehmann
635ac758c9 aco/optimizer: don't create undef copies from p_create_vector
p_create_vector allows undef operands, p_parallelcopy doesn't.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13765
Fixes: 01d20680e2 ("aco/optimizer: generalize p_create_vector of split vector opt")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963>
2025-08-25 16:47:38 +00:00
Georg Lehmann
8903bb4618 aco/optimizer: don't apply packed clamp to v_fma_mix
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13758
Fixes: 345bf8a2f2 ("aco/optimizer: remove label_vop3p")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36963>
2025-08-25 16:47:38 +00:00
Georg Lehmann
791a57805c aco: fix ra validation for flat/global/scratch/ds load sbyte_d16
Fixes: 18a53230eb ("aco: don't check dst_bitsize in apply_load_extract")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36964>
2025-08-25 10:09:16 +00:00
Konstantin Seurer
951b187b95 nir: Use nir_def_block in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:10 +00:00
Konstantin Seurer
9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Daniel Schürmann
219c53e6fc aco/ra: don't clear lateKill operands in get_reg_create_vector()
Fixes: 08f088479a ('aco/ra: set late-kill for operands of temporary p_create_vector')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36871>
2025-08-22 15:25:04 +00:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Georg Lehmann
639b91bb48 aco/isel: fix vectorized i2i16 with 8bit vec8 source
The extract index is in dwords, not bytes.

Fixes: 92d433c54a ("aco: vectorize conversions from 8bit to 16bit")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36869>
2025-08-20 10:13:22 +00:00
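The dword/byte distinction in 639b91bb48 can be illustrated with a small sketch. It assumes 8-bit components packed four to a 32-bit dword, so a vec8 of 8-bit values spans two dwords. `ByteLocation` and `locate_byte` are hypothetical names, and the actual fix in ACO's instruction selection is not shown here.

```cpp
#include <cstdint>

// Hypothetical illustration: where an 8-bit component of a packed vector
// lives when the vector is addressed in 32-bit dwords.
struct ByteLocation {
    uint32_t dword;         // index of the dword holding the component
    uint32_t byte_in_dword; // position of the component inside that dword
};

ByteLocation locate_byte(uint32_t component)
{
    // Four 8-bit components fit in one dword.
    return {component / 4u, component % 4u};
}
```

For example, component 5 of an 8-bit vec8 sits in dword 1 at byte 1, so using the byte index 5 where a dword index is expected would point past the two-dword vector.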
Daniel Schürmann
0546ecfadb aco/scheduler: small refactor of schedule_VMEM()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36599>
2025-08-19 16:59:12 +00:00