Commit graph

388 commits

Author SHA1 Message Date
Georg Lehmann
0e66f2b2cc aco: use new disable_wqm for mimg
Foz-DB GFX1201:
Totals from 88 (0.11% of 80251) affected shaders:
Instrs: 81954 -> 82218 (+0.32%); split: -0.02%, +0.34%
CodeSize: 451824 -> 452880 (+0.23%); split: -0.02%, +0.25%
Latency: 308818 -> 308746 (-0.02%); split: -0.05%, +0.02%
VClause: 1324 -> 1318 (-0.45%)
Copies: 2795 -> 2784 (-0.39%)
PreSGPRs: 4029 -> 4035 (+0.15%)
SALU: 6563 -> 6809 (+3.75%); split: -0.15%, +3.90%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35970>
2025-08-15 07:03:46 +00:00
Rhys Perry
08f088479a aco/ra: set late-kill for operands of temporary p_create_vector
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13543
Fixes: c279dd6e61 ("aco: Support vector-aligned ops fixed to defs")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36469>
2025-08-06 09:44:01 +00:00
Daniel Schürmann
4ca3cc5a1a aco/ra: propagate precolor affinities through parallelcopies and tied definitions
Totals from 214 (0.27% of 79839) affected shaders: (Navi48)

Instrs: 65339 -> 65311 (-0.04%); split: -0.05%, +0.00%
CodeSize: 352616 -> 350952 (-0.47%); split: -0.55%, +0.07%
VGPRs: 9984 -> 9960 (-0.24%)
Latency: 207556 -> 207508 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 40422 -> 40397 (-0.06%)
Copies: 3180 -> 3155 (-0.79%)
VALU: 38347 -> 38322 (-0.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36345>
2025-08-01 17:15:54 +00:00
Daniel Schürmann
a667d9a68d aco/ra: propagate precolor affinities through phis
Totals from 917 (1.15% of 79839) affected shaders: (Navi48)

Instrs: 3217861 -> 3216947 (-0.03%); split: -0.04%, +0.01%
CodeSize: 17427204 -> 17432264 (+0.03%); split: -0.06%, +0.09%
VGPRs: 65328 -> 65316 (-0.02%)
Latency: 35336268 -> 35335528 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7305032 -> 7302187 (-0.04%); split: -0.04%, +0.00%
SClause: 120537 -> 120553 (+0.01%); split: -0.01%, +0.02%
Copies: 307257 -> 306852 (-0.13%); split: -0.21%, +0.08%
Branches: 115744 -> 115743 (-0.00%)
VALU: 1572522 -> 1572183 (-0.02%); split: -0.02%, +0.00%
SALU: 574229 -> 574155 (-0.01%); split: -0.05%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36345>
2025-08-01 17:15:54 +00:00
Daniel Schürmann
2ddd8ef0a3 aco/ra: don't optimize encodings on precolor affinity mismatch
Totals from 238 (0.30% of 79839) affected shaders: (Navi48)
Instrs: 137836 -> 137176 (-0.48%); split: -0.50%, +0.02%
CodeSize: 728616 -> 728668 (+0.01%); split: -0.06%, +0.07%
Latency: 1503248 -> 1500202 (-0.20%); split: -0.56%, +0.36%
InvThroughput: 297725 -> 296715 (-0.34%); split: -0.70%, +0.36%
Copies: 9390 -> 8825 (-6.02%); split: -6.33%, +0.31%
VALU: 89861 -> 89296 (-0.63%); split: -0.66%, +0.03%
SALU: 13166 -> 13167 (+0.01%); split: -0.05%, +0.06%

Suggested-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36345>
2025-08-01 17:15:54 +00:00
Daniel Schürmann
93606a19c6 aco/ra: collect register affinities for all precolored operands.
Totals from 1280 (1.60% of 79839) affected shaders: (Navi48)

Instrs: 817363 -> 812639 (-0.58%); split: -0.58%, +0.00%
CodeSize: 4262644 -> 4243540 (-0.45%); split: -0.45%, +0.00%
VGPRs: 61692 -> 61668 (-0.04%)
Latency: 4354318 -> 4347818 (-0.15%); split: -0.15%, +0.00%
InvThroughput: 711914 -> 707698 (-0.59%); split: -0.59%, +0.00%
VClause: 14685 -> 14677 (-0.05%); split: -0.09%, +0.03%
SClause: 25623 -> 25621 (-0.01%)
Copies: 50663 -> 46242 (-8.73%); split: -8.73%, +0.00%
VALU: 427744 -> 423323 (-1.03%); split: -1.03%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36345>
2025-08-01 17:15:54 +00:00
Daniel Schürmann
e32eec52f0 aco/ra: generalize register affinities
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36345>
2025-08-01 17:15:54 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Georg Lehmann
a6a6c2f691 aco/ra: convert bitwise instruction to gfx11+ 16bit on demand
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The 32bit versions are smaller, allow more optimizations and VOPD,
so only use the 16bit opcodes if nessecary.

Foz-DB Navi31:
Totals from 84 (0.10% of 80237) affected shaders:
Instrs: 176673 -> 176347 (-0.18%); split: -0.20%, +0.01%
CodeSize: 970148 -> 969716 (-0.04%); split: -0.08%, +0.03%
VGPRs: 5876 -> 5864 (-0.20%)
Latency: 2805974 -> 2805674 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 769007 -> 768738 (-0.03%); split: -0.04%, +0.01%
VClause: 2593 -> 2597 (+0.15%)
Copies: 23749 -> 23487 (-1.10%); split: -1.11%, +0.00%
VALU: 107124 -> 106862 (-0.24%); split: -0.25%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35919>
2025-07-31 12:07:07 +00:00
Georg Lehmann
b12db991eb aco/gfx10: optimize subgroupRotate(x, 32) and subgroupShuffleXor(x, 32)
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We don't have v_permlane64_b32 yet, but we can still optimize it using
shared vgprs. Using the DPP16 row mask, we can even avoid writing exec.

With v0 input/output and v24/v25 as shared vgprs, this results in:
v_mov_b32_dpp v24, v0 quad_perm:[0,1,2,3] row_mask:0x3 bank_mask:0xf
v_mov_b32_dpp v25, v0 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf
v_mov_b32_dpp v0, v24 quad_perm:[0,1,2,3] row_mask:0xc bank_mask:0xf
v_mov_b32_dpp v0, v25 quad_perm:[0,1,2,3] row_mask:0x3 bank_mask:0xf

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36390>
2025-07-29 06:33:20 +00:00
Natalie Vock
b2a95d2133 aco/ra: Add affinities for DS vector-aligned operands
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:40 +00:00
Natalie Vock
c279dd6e61 aco: Support vector-aligned ops fixed to defs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:38 +00:00
Daniel Schürmann
47ef60cbf1 aco/ra: always use bytes for register stride requirements
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
instead of a mixture between dwords and bytes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36053>
2025-07-14 08:45:29 +00:00
Daniel Schürmann
3f35b1329e aco: allow subdword vector-definitions on some VALU instructions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35784>
2025-07-09 14:10:36 +00:00
Rhys Perry
dce1d4ad4c aco/ra: fix repeated compact_linear_vgprs() in get_reg()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b7738de4f9 ("aco/ra: rework linear VGPR allocation")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13431
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35838>
2025-07-02 09:26:04 +00:00
Daniel Schürmann
7620957193 aco/ra: always set fill_operands=true when handling operands
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This makes the behavior consistent and less prone to error.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35735>
2025-06-26 10:05:07 +00:00
Daniel Schürmann
ee8424d839 aco/ra: always fill moved operands when handling vector-operands
update_renames() assumes that killed operands are already removed from
the register file, except for precolored and copy-kill operands.
When dealing with vector-operands, however, unrelated operands might
also be moved, in order to make space.

Fixes: fb689f133e ('aco/ra: handle register assignment of vector-aligned operands')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35735>
2025-06-26 10:05:07 +00:00
Georg Lehmann
94c191e6d9 aco: remove p_v_cvt_pk_u8_f32
Now unused.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35391>
2025-06-10 07:32:04 +00:00
Daniel Schürmann
9091c3bf5b aco/ra: add affinities for MIMG vector-aligned operands
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Daniel Schürmann
fb689f133e aco/ra: handle register assignment of vector-aligned operands
Vector-aligned operands are handled by temporarily allocating
a vector-SSA value for the duration of the instruction.
On completion of the register assignment, the individual
operands are assigned to the reserved register space and,
if necessary, parallelcopies are emitted.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Daniel Schürmann
92b1154397 aco/ra: Always rename copy-kill operands, even if the temporary doesn't match
This makes it independent of whether the operand already got renamed or not.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Daniel Schürmann
4fad3514a9 aco/ra: only change registers of already handled operands in update_renames()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Daniel Schürmann
51a2e1eb94 aco/ra: don't use kill-flags as indicator in get_reg_create_vector()
We are about to re-use this function for vector-aligned operands.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34359>
2025-05-28 09:24:17 +00:00
Rhys Perry
b341a12526 aco/ra: rewrite handling of tied definitions
The old version worked by precoloring both the operand and definition to
whatever register the operand was at the time. This didn't allow moving
the operand/definition after precoloring them, or two tied operands with
the same temporary.

The new version works by temporarily making the operands late kill and
creating a copy if the temporary is live-through or used as another tied
operand. Then we can simply later make the operands early kill again and
assign the definitions to the operand's register.

This way, we can move the operand to make space and the new location will
not intersect with any other definition and won't cause the operand and
definition registers to mismatch.

fossil-db (gfx1201):
Totals from 2253 (2.84% of 79377) affected shaders:
Instrs: 1634747 -> 1630799 (-0.24%); split: -0.27%, +0.03%
CodeSize: 8688148 -> 8672348 (-0.18%); split: -0.20%, +0.02%
VGPRs: 106500 -> 106512 (+0.01%)
Latency: 11385480 -> 11382965 (-0.02%); split: -0.04%, +0.01%
InvThroughput: 1754430 -> 1754326 (-0.01%); split: -0.01%, +0.00%
SClause: 38954 -> 38964 (+0.03%); split: -0.01%, +0.04%
Copies: 110772 -> 110800 (+0.03%); split: -0.02%, +0.04%
Branches: 29093 -> 29092 (-0.00%)
VALU: 902011 -> 902008 (-0.00%)
SALU: 260175 -> 260203 (+0.01%); split: -0.01%, +0.02%

fossil-db (navi31):
Totals from 2 (0.00% of 79377) affected shaders:
Latency: 1766 -> 1765 (-0.06%)
InvThroughput: 3219 -> 3215 (-0.12%)

fossil-db (navi21):
Totals from 14 (0.02% of 79377) affected shaders:
(no affected stats)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>
2025-05-20 15:40:47 +00:00
Rhys Perry
f783f5df6d aco/ra: move optimize_encoding earlier
We have to handle tied definitions after optimize_encoding, but a rewrite
of that will need the register file from before clearing the killed
operands.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>
2025-05-20 15:40:47 +00:00
Rhys Perry
fd05181a26 aco/ra: replace skip_renaming with copy_kill
This will be used in the next commit.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>
2025-05-20 15:40:46 +00:00
Rhys Perry
c04ef28c56 aco: rename ops_fixed_to_def to tied_defs
This is less words.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34700>
2025-05-20 15:40:46 +00:00
Rhys Perry
96e49b7904 aco/ra: add ra_test_policy::use_compact_relocate
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
2025-04-29 15:15:10 +00:00
Rhys Perry
3c1dbc1d9b aco/ra: cleanup compact_relocate_vars fallback path
We don't need to duplicate these loops.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
2025-04-29 15:15:10 +00:00
Rhys Perry
a780345e01 aco: fix compact_relocate_vars fallback with scc/exec/m0 precolored regs
This probably doesn't fix anything in practice. I don't think this path is
ever taken for SGPRs except for pseudo-scalar transcendental instructions,
and those don't have any precolored operands/definitions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
2025-04-29 15:15:10 +00:00
Rhys Perry
f6581b41c4 aco/ra: don't require alignment for NPOT SGPR temporaries
Aligning these can create situations where register allocation is
impossible. Only pseudo-instructions can use these, and they don't require
any alignment.

I'm not sure if these temporaries actually happen in practice. This
probably only affects the get_reg()'s compact_relocate_vars fallback path,
which doesn't usually happen for SGPRs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
2025-04-29 15:15:10 +00:00
Rhys Perry
623230a6ef aco/ra: change sorting in compact_relocate_vars
D16 MIMG or pseudo-scalar transcendental instructions might give lower
limits to the maximum register available for their definitions, so just
try to place them earlier.

This is also part of fixing compact_relocate_vars with aligned NPOT
def/killed-op space (the second part is the later commit which changes
get_stride()).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34343>
2025-04-29 15:15:10 +00:00
Rhys Perry
3c021b79b4 aco/ra: use a correct stride for subdword get_reg_impl
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
fossil-db (gfx1201):
Totals from 3 (0.00% of 79377) affected shaders:
Instrs: 1312 -> 1308 (-0.30%)
CodeSize: 7112 -> 7096 (-0.22%)
Latency: 5381 -> 5382 (+0.02%)
InvThroughput: 753 -> 752 (-0.13%)
Copies: 69 -> 68 (-1.45%)
VALU: 836 -> 835 (-0.12%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34679>
2025-04-25 14:43:41 +00:00
Rhys Perry
ae6d4f1195 aco/ra: update_renames() before add_subdword_definition()
The register file tests here should be done after update_renames().

Normally, get_reg() wouldn't have to move anything to make space for a 1-3
byte definition. This was encountered with skip_optimistic_path=true and a
get_reg_impl() bug (fixed in a later commit) which caused suboptimal
register assignment.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34679>
2025-04-25 14:43:41 +00:00
Natalie Vock
ee0f784858 aco/ra: Don't consider precolored ops/defs in get_reg_impl
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Natalie Vock
f309d76aab aco: Add support for multiple ops fixed to defs
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34273>
2025-04-17 20:20:40 +00:00
Rhys Perry
80fef30531 aco/ra: fix free register counting when moving variables
info.bounds might be smaller than the bounds available for the moved
variables.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 626aa7b648 ("aco: workaround GFX9 hardware bug for D16 image instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34158>
2025-03-25 15:14:16 +00:00
Georg Lehmann
d1dca26941 aco/ra: disallow vcc definitions for pseudo scalar trans instrs
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Foz-DB GFX1201:
Totals from 30 (0.04% of 79600) affected shaders:
Instrs: 58843 -> 58820 (-0.04%); split: -0.10%, +0.06%
CodeSize: 302228 -> 301944 (-0.09%); split: -0.13%, +0.04%
Latency: 204566 -> 204432 (-0.07%); split: -0.09%, +0.02%
InvThroughput: 136918 -> 136919 (+0.00%); split: -0.00%, +0.00%
SClause: 1241 -> 1249 (+0.64%); split: -0.56%, +1.21%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34006>
2025-03-14 13:53:55 +00:00
Natalie Vock
d5a2666ad9 aco/ra: Assert operands only clear their own id
This is useful for debugging register assignment, as this case would
usually result in RA silently assigning the same register to multiple
temps at the same time.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>
2025-02-28 16:00:48 +00:00
Natalie Vock
b8bcc8e5c5 aco/ra: Handle temps fixed to different regs in different operands
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>
2025-02-28 16:00:48 +00:00
Natalie Vock
7a4775b396 aco/ra: Add option to skip renaming for parallelcopies
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>
2025-02-28 16:00:48 +00:00
Natalie Vock
b339bcfa38 aco/ra: Use struct for parallelcopies
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>
2025-02-28 16:00:48 +00:00
Natalie Vock
3f182bc1fa aco/ra: Use iterators for linear VGPR copy extraction
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29576>
2025-02-28 16:00:48 +00:00
Daniel Schürmann
52253da783 aco: unify get_addr_sgpr_from_waves() and get_addr_vgpr_from_waves() into one function
which returns the limit as RegisterDemand.

Also remove the unused get_extra_sgprs() from aco_ir.h.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Rhys Perry
d946a753e3 aco/ra: unconditionally call undo_renames
There's no real reason to not do this more.

fossil-db (navi21):
Totals from 2615 (3.29% of 79377) affected shaders:
Instrs: 4729505 -> 4729484 (-0.00%); split: -0.00%, +0.00%
CodeSize: 25210992 -> 25210036 (-0.00%); split: -0.00%, +0.00%
Latency: 31572966 -> 31572435 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 6918552 -> 6918560 (+0.00%); split: -0.00%, +0.00%
VClause: 132152 -> 132116 (-0.03%); split: -0.03%, +0.00%
SClause: 98595 -> 98575 (-0.02%); split: -0.04%, +0.02%

fossil-db (polaris10):
Totals from 1039 (1.68% of 61794) affected shaders:
Instrs: 708761 -> 708766 (+0.00%); split: -0.00%, +0.00%
CodeSize: 3588772 -> 3588792 (+0.00%); split: -0.00%, +0.00%
Latency: 7458892 -> 7459513 (+0.01%); split: -0.00%, +0.01%
InvThroughput: 3494669 -> 3494722 (+0.00%); split: -0.00%, +0.00%
VClause: 16754 -> 16737 (-0.10%)
SClause: 18190 -> 18156 (-0.19%); split: -0.49%, +0.31%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33444>
2025-02-11 23:10:37 +00:00
Rhys Perry
b94b4188a6 aco/ra: reverse renaming of operands outside update_renames
This lets us remove some special casing from update_renames and make it
simpler. Doing this in update_renames was also fragile, since the
parallelcopies vector is not final when update_renames is called, so it
might not have been safe to do so in the end.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33444>
2025-02-11 23:10:37 +00:00
Georg Lehmann
65506e635b aco/ra: don't write to scc/ttmp with s_fmac
Fixes: 4bd229ac50 ("aco/gfx11.5: select SOP2 float instructions")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
0b9e2a5427 aco/ra: disallow s_cmpk with scc operand
Fixes: 2d6b0a4177 ("aco/optimizer: Optimize SOPC with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Georg Lehmann
fe0c72caec aco/ra: don't write to exec/ttmp with mulk/addk/cmovk
ttmp sgprs are readonly outside of trap handlers, so the instructions were
probably skipped. RA should also never create additional exec writes.

Fixes: e06773281b ("aco/ra: Optimize some SOP2 instructions with literal to SOPK.")

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32545>
2024-12-11 12:51:18 +00:00
Daniel Schürmann
b64fff7731 aco: remove definition from Pseudo branch instructions
They are not needed anymore.

Totals from 7019 (8.84% of 79395) affected shaders: (Navi31)

Instrs: 14805400 -> 14824196 (+0.13%); split: -0.00%, +0.13%
CodeSize: 78079972 -> 78132932 (+0.07%); split: -0.01%, +0.08%
SpillSGPRs: 4485 -> 4515 (+0.67%); split: -0.76%, +1.43%
Latency: 165862000 -> 165836134 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 30061764 -> 30057781 (-0.01%); split: -0.01%, +0.00%
SClause: 392323 -> 392286 (-0.01%); split: -0.01%, +0.00%
Copies: 1012262 -> 1012234 (-0.00%); split: -0.04%, +0.04%
Branches: 365910 -> 365909 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 360167 -> 355363 (-1.33%)
VALU: 8837197 -> 8837276 (+0.00%); split: -0.00%, +0.00%
SALU: 1402593 -> 1402621 (+0.00%); split: -0.03%, +0.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32037>
2024-12-06 14:34:03 +00:00