Commit graph

3162 commits

Author SHA1 Message Date
Rhys Perry
7c995df9aa aco: fix follow_operand with combined label_extract and label_split
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
9ee24db882 aco: add missing isConstant()/isTemp() checks
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
5e1d3f571d aco: turn split(vec()) into p_parallelcopy instead of p_create_vector
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
f842bd81ca aco: use s_pack_*_b32_b16 more in p_insert/p_extract lowering
This opcode doesn't write SCC, which gives later passes more freedom to
move instructions.

fossil-db (navi21):
Totals from 727 (0.92% of 79395) affected shaders:
Latency: 14943483 -> 14942704 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 3225790 -> 3225766 (-0.00%); split: -0.00%, +0.00%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:22 +00:00
Rhys Perry
ca161a96d1 aco: combine extracts into s_pack_ll_b32_b16
fossil-db (navi21):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 45941 -> 45924 (-0.04%)
CodeSize: 241768 -> 241756 (-0.00%)
Latency: 176501 -> 176491 (-0.01%)
Copies: 6884 -> 6882 (-0.03%)
SALU: 6101 -> 6088 (-0.21%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:21 +00:00
Rhys Perry
98cb50297b aco: use s_pack_ll_b32_b16 for pack_32_2x16_split
fossil-db (navi21):
Totals from 3 (0.00% of 79395) affected shaders:
Instrs: 45963 -> 45941 (-0.05%)
CodeSize: 241908 -> 241768 (-0.06%)
Latency: 176508 -> 176501 (-0.00%)
SALU: 6123 -> 6101 (-0.36%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29912>
2024-07-01 17:34:21 +00:00
David Heidelberg
68215332a8 build: pass licensing information in SPDX form
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29972>
2024-06-29 12:42:49 -07:00
Georg Lehmann
7fc8ad2ddd aco/ir: remove unused vopc helpers
And rename get_swapped and get_inverse to show that they should only be used for VOPC.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>
2024-06-27 08:12:30 +00:00
Georg Lehmann
2225a32bb0 aco: remove ordered/unordered optimizations
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>
2024-06-27 08:12:30 +00:00
Georg Lehmann
c5ba17cd25 aco: implement ford, funord, fneo, fequ, fltu, fgeu
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>
2024-06-27 08:12:30 +00:00
Daniel Schürmann
c4a38c6583 aco/spill: don't remove spilled phis
They will be removed during register allocation.

Few changes due to different phi order.
Totals from 14 (0.02% of 79395) affected shaders: (GFX11)

Instrs: 315724 -> 315675 (-0.02%); split: -0.02%, +0.01%
CodeSize: 1673608 -> 1673268 (-0.02%); split: -0.03%, +0.00%
Latency: 3194243 -> 3189025 (-0.16%); split: -0.19%, +0.03%
InvThroughput: 638369 -> 637323 (-0.16%); split: -0.19%, +0.03%
VClause: 5716 -> 5714 (-0.03%)
Copies: 37786 -> 37748 (-0.10%); split: -0.13%, +0.03%
Branches: 10469 -> 10454 (-0.14%); split: -0.16%, +0.02%
VALU: 182498 -> 182454 (-0.02%); split: -0.03%, +0.00%
SALU: 36038 -> 36046 (+0.02%); split: -0.01%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:34 +00:00
Daniel Schürmann
634051f913 aco/live_var_analysis: ignore dead phis
Since we don't emit code for dead phis, we also don't
have to keep their operands around.

Totals from 44 (0.06% of 79395) affected shaders: (GFX11)

MaxWaves: 648 -> 650 (+0.31%)
Instrs: 449898 -> 449120 (-0.17%); split: -0.18%, +0.00%
CodeSize: 2395000 -> 2389300 (-0.24%); split: -0.24%, +0.00%
VGPRs: 5504 -> 5468 (-0.65%)
Latency: 9005058 -> 9000966 (-0.05%); split: -0.07%, +0.03%
InvThroughput: 2154567 -> 2139095 (-0.72%); split: -0.77%, +0.06%
VClause: 8362 -> 8354 (-0.10%)
SClause: 9135 -> 9134 (-0.01%)
Copies: 60678 -> 60118 (-0.92%); split: -0.93%, +0.01%
Branches: 14379 -> 14385 (+0.04%)
PreSGPRs: 3877 -> 3863 (-0.36%)
PreVGPRs: 6318 -> 6286 (-0.51%)
VALU: 266975 -> 266301 (-0.25%); split: -0.25%, +0.00%
SALU: 52741 -> 52667 (-0.14%); split: -0.15%, +0.01%
VMEM: 16140 -> 16132 (-0.05%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:34 +00:00
Daniel Schürmann
708e1a73f5 aco/live_var_analysis: slightly refactor handling of additional register demand for Operand copies
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:34 +00:00
Daniel Schürmann
5cfa5b784b aco: remove get_demand_before()
The register demand before executing an instruction is now included
in the instruction's register demand and this function is unused.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:34 +00:00
Daniel Schürmann
09f1c40f2e aco: track and use the live-in register demand per basic block
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:34 +00:00
Daniel Schürmann
001c8caae0 aco: calculate register demand per instruction as maximum necessary to execute the instruction
Previously, the register demand per instruction was calculated as the number of
live variables in the register file after executing an instruction plus additional
temporary registers, necessary during the execution of the instruction.
With this change, now it also includes all variables which are live right before
executing an instruction, i.e. killed Operands.

Care has been taken so that the invariant

register_demand[idx] = register_demand[idx - 1] - get_temp_registers(prev_instr)
                        + get_live_changes(instr) + get_temp_registers(instr)

still holds.

Slight changes in scheduling:

Totals from 316 (0.40% of 79395) affected shaders: (GFX11)

Instrs: 301329 -> 300777 (-0.18%); split: -0.31%, +0.12%
CodeSize: 1577976 -> 1576204 (-0.11%); split: -0.21%, +0.10%
SpillSGPRs: 448 -> 447 (-0.22%)
Latency: 1736349 -> 1726182 (-0.59%); split: -2.01%, +1.42%
InvThroughput: 243894 -> 243883 (-0.00%); split: -0.03%, +0.03%
VClause: 6134 -> 6280 (+2.38%); split: -1.04%, +3.42%
SClause: 6142 -> 6137 (-0.08%); split: -0.13%, +0.05%
Copies: 14037 -> 14032 (-0.04%); split: -0.56%, +0.52%
Branches: 3284 -> 3283 (-0.03%)
VALU: 182750 -> 182718 (-0.02%); split: -0.04%, +0.03%
SALU: 18522 -> 18538 (+0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:33 +00:00
Daniel Schürmann
4c2f231cc0 aco/spill: Unconditionally add 2 SGPRs to live-in demand
Due to undefined Operands, it might not be enough to check the
predecessors' register demand.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:33 +00:00
Daniel Schürmann
26c58ca9de aco/scheduler: fix register_demand validation debug code
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29804>
2024-06-26 09:42:33 +00:00
Rhys Perry
e3ffc244f5 aco: skip continue_or_break LCSSA phis when not needed
Fixes:
//exec is empty here
loop {
   %1:s[16-17] = ...
   if () {
      break
   }
   %2:s[16-17] = ...
   continue_or_break
}
%3 = phi %1, undef
//because of the undef, %2 can use s[16-17] and overwrite the address
load(%3:s[16-17])

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: bbe4652430 ("aco: create lcssa phis for continue_or_break loops when necessary")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11333
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29838>
2024-06-26 09:10:54 +00:00
Rhys Perry
17f2ebe8d2 aco: use 1.5x vgprs for gfx1151 and gfx12
From LLVM.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29839>
2024-06-24 18:39:40 +00:00
Rhys Perry
71afacff39 aco/insert_exec_mask: ensure top mask is not a temporary at loop exits
This is problematic when the successor of the loop exit is an invert
block. It assumes that the top mask is Operand(bld.lm) and doesn't change
it when entering the else branch.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11348
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29767>
2024-06-20 12:47:05 +00:00
Rhys Perry
bdc229231d aco: remove push constants
These are lowered in NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29675>
2024-06-20 12:09:29 +00:00
Rhys Perry
9fe3af1e2a aco: insert s_nop before discard early exit sendmsg(dealloc_vgpr)
Forgot about this one.

fossil-db (gfx1100):
Totals from 3920 (2.94% of 133461) affected shaders:
Instrs: 6632088 -> 6636008 (+0.06%)
CodeSize: 34165376 -> 34181056 (+0.05%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 37fbfa655a ("aco: insert s_nop before VGPR deallocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29770>
2024-06-18 20:17:38 +00:00
Georg Lehmann
c3c398d56d aco: make local functions static in files without anonymous namespace
I don't think adding an anonymous namespace in these files is worth it
given the amount of global functions

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Georg Lehmann
046414e061 aco: add more anonymous namespaces
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29740>
2024-06-18 17:53:07 +00:00
Daniel Schürmann
9b1a748b5e nir: remove nir_intrinsic_discard
The semantics of discard differ between GLSL and HLSL and
their various implementations. Subsequently, numerous application
bugs occurred and SPV_EXT_demote_to_helper_invocation was written
in order to clarify the behavior. In NIR, we now have 3 different
intrinsics for 2 things, and while demote and terminate have clear
semantics, discard still doesn't and can mean either of the two.

This patch entirely removes nir_intrinsic_discard and
nir_intrinsic_discard_if and replaces all occurences either with
nir_intrinsic_terminate{_if} or nir_intrinsic_demote{_if} in the
case that the NIR option 'discard_is_demote' is being set.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
2024-06-17 19:37:16 +00:00
Daniel Schürmann
d5821bdf7d radv: emit discard as demote by default
Also removes radv_lower_discard_to_demote debug option.

Totals from 1506 (1.90% of 79439) affected shaders: (GFX11)
MaxWaves: 46432 -> 46448 (+0.03%)
Instrs: 664515 -> 667914 (+0.51%); split: -0.15%, +0.67%
CodeSize: 3569656 -> 3583440 (+0.39%); split: -0.12%, +0.51%
VGPRs: 50100 -> 49680 (-0.84%); split: -0.96%, +0.12%
Latency: 4221359 -> 4217875 (-0.08%); split: -0.67%, +0.59%
InvThroughput: 628809 -> 625565 (-0.52%); split: -0.53%, +0.02%
VClause: 9948 -> 9965 (+0.17%); split: -0.36%, +0.53%
SClause: 19656 -> 19695 (+0.20%); split: -0.77%, +0.97%
Copies: 32113 -> 33513 (+4.36%); split: -1.59%, +5.95%
Branches: 8406 -> 8378 (-0.33%)
PreSGPRs: 42328 -> 42555 (+0.54%); split: -0.39%, +0.93%
PreVGPRs: 38451 -> 38203 (-0.64%); split: -0.78%, +0.14%
VALU: 390770 -> 390208 (-0.14%); split: -0.16%, +0.02%
SALU: 43318 -> 46374 (+7.05%); split: -0.08%, +7.14%
VMEM: 15052 -> 15051 (-0.01%)
SMEM: 37225 -> 37215 (-0.03%); split: -0.03%, +0.01%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27617>
2024-06-17 19:37:15 +00:00
Daniel Schürmann
b7982152ff aco: use aco::monotonic_allocator for IDSet
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
97fd5d3f33 aco: make aco::monotonic_buffer_resource declaration visible for aco::IDSet
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
95967c2ca0 aco/reindex_ssa: replace live_var parameter with boolean
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
a497d105e3 aco: move live var information into struct Program
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
2322ab427e aco/scheduler: remove unused register_demand parameter
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29713>
2024-06-14 14:32:35 +00:00
Daniel Schürmann
677c9d9e93 aco/assembler: fix GFX67 MTBUF opcode encoding
Fixes: 56ac6f26e0 ('aco/assembler: slightly refactor MTBUF assembly for more readability')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29708>
2024-06-13 09:18:05 +00:00
Daniel Schürmann
56ac6f26e0 aco/assembler: slightly refactor MTBUF assembly for more readability
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Daniel Schürmann
14f4906e53 aco/assembler: fix MTBUF opcode encoding on GFX11
We have accidentally set the tfe bit for some opcodes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29692>
2024-06-12 11:41:58 +00:00
Rhys Perry
7a4f121c5d aco: remove some missing label resets
In the case of:
   c = xor(a, b)
   d = not(c)
   xor(d, e)
it will be optimized to:
   d = xnor(a, b)
   xor(d, e)
because "d" would still had a label with "instr=not(c)", it would then be
further optimized to:
   d = xnor(a, b)
   xnor(c, e)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11309
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29650>
2024-06-11 09:30:16 +00:00
Friedrich Vock
15f2c9c553 aco: Limit rt stages to 128 vgprs
Totals from 35472 (7.40% of 479373) affected shaders:

MaxWaves: 206239 -> 283776 (+37.60%)
Instrs: 193922210 -> 202721106 (+4.54%)
CodeSize: 1056819972 -> 1110833680 (+5.11%); split: -0.00%, +5.11%
VGPRs: 6026704 -> 4540416 (-24.66%)
SpillSGPRs: 23742 -> 25754 (+8.47%)
SpillVGPRs: 118897 -> 2295118 (+1830.34%)
Scratch: 7201792 -> 152752128 (+2021.03%)
Latency: 2713432565 -> 3194796286 (+17.74%); split: -0.20%, +17.94%
InvThroughput: 1052131232 -> 935049835 (-11.13%); split: -16.59%, +5.46%
VClause: 6972784 -> 8716721 (+25.01%); split: -0.02%, +25.03%
SClause: 4879313 -> 4852452 (-0.55%); split: -0.88%, +0.33%
Copies: 32782141 -> 35223995 (+7.45%)
Branches: 11075847 -> 11094087 (+0.16%); split: -0.00%, +0.17%
VALU: 118525960 -> 120929058 (+2.03%)
SALU: 33924572 -> 33973293 (+0.14%); split: -0.03%, +0.17%
VMEM: 12419116 -> 17104582 (+37.73%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Friedrich Vock
ec8512ce85 aco/spill: Don't spill phis with all-undef operands
Fixes some crashes when limiting RT stages to 128 VGPRs.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29593>
2024-06-10 19:39:52 +00:00
Rhys Perry
5297896856 aco: use ac_get_hw_cache_flags()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:43 +00:00
Rhys Perry
00eccf524f aco: use GFX12 scope/temporal-hint
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
b41f0f6cc1 aco: use ac_hw_cache_flags
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
cdaf269924 aco: inline store_vmem_mubuf/emit_single_mubuf_store
Both of these are only used once.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Rhys Perry
185fa04baa aco/gfx6: set glc for buffer_store_byte/short
For the same reason we set it for image stores. GFX6 has a caching bug
which requires this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29243>
2024-06-07 13:22:42 +00:00
Daniel Schürmann
c452a4d1cc aco/ra: use round robin register allocation
Totals from 74681 (94.06% of 79395) affected shaders: (GFX11)

MaxWaves: 2265668 -> 2263546 (-0.09%); split: +0.01%, -0.10%
Instrs: 44941647 -> 44412809 (-1.18%); split: -1.23%, +0.05%
CodeSize: 234173852 -> 232009132 (-0.92%); split: -0.97%, +0.05%
VGPRs: 3033208 -> 3403000 (+12.19%); split: -0.02%, +12.22%
Latency: 305575738 -> 301100302 (-1.46%); split: -1.70%, +0.23%
InvThroughput: 49366070 -> 49020000 (-0.70%); split: -0.91%, +0.21%
VClause: 875748 -> 854930 (-2.38%); split: -2.65%, +0.27%
SClause: 1369614 -> 1327212 (-3.10%); split: -3.43%, +0.33%
Copies: 2887932 -> 2883061 (-0.17%); split: -1.93%, +1.76%
Branches: 885041 -> 885101 (+0.01%); split: -0.01%, +0.02%
VALU: 25218078 -> 25215170 (-0.01%); split: -0.20%, +0.19%
SALU: 4328640 -> 4326052 (-0.06%); split: -0.20%, +0.14%
VOPD: 9129 -> 9611 (+5.28%); split: +7.48%, -2.20%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
197943ae27 aco/ra: change heuristic to first fit
Totals from 73175 (92.17% of 79395) affected shaders: (GFX11)

MaxWaves: 2217690 -> 2217930 (+0.01%); split: +0.02%, -0.01%
Instrs: 44780731 -> 44784895 (+0.01%); split: -0.14%, +0.15%
CodeSize: 233238960 -> 233255604 (+0.01%); split: -0.11%, +0.12%
VGPRs: 3009116 -> 3007684 (-0.05%); split: -0.29%, +0.24%
Latency: 304320163 -> 304286592 (-0.01%); split: -0.31%, +0.30%
InvThroughput: 49121992 -> 49145025 (+0.05%); split: -0.20%, +0.25%
VClause: 872566 -> 873242 (+0.08%); split: -0.25%, +0.33%
SClause: 1359666 -> 1361640 (+0.15%); split: -0.11%, +0.26%
Copies: 2879649 -> 2881646 (+0.07%); split: -1.13%, +1.20%
Branches: 887102 -> 887093 (-0.00%); split: -0.01%, +0.01%
VALU: 25128240 -> 25128572 (+0.00%); split: -0.12%, +0.12%
SALU: 4328852 -> 4330559 (+0.04%); split: -0.07%, +0.11%
VOPD: 8861 -> 8992 (+1.48%); split: +2.63%, -1.15%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
d76fc005b6 aco/ra: re-use registers from killed operands
Totals from 77283 (97.34% of 79395) affected shaders: (GFX11)

MaxWaves: 2348498 -> 2348250 (-0.01%); split: +0.01%, -0.02%
Instrs: 45304558 -> 45097367 (-0.46%); split: -0.57%, +0.11%
CodeSize: 235719656 -> 234957768 (-0.32%); split: -0.43%, +0.11%
VGPRs: 3065984 -> 3073244 (+0.24%); split: -0.41%, +0.65%
Latency: 308010576 -> 307008565 (-0.33%); split: -0.85%, +0.52%
InvThroughput: 49560307 -> 49464214 (-0.19%); split: -0.54%, +0.34%
VClause: 881895 -> 879739 (-0.24%); split: -0.78%, +0.53%
SClause: 1388139 -> 1374634 (-0.97%); split: -1.12%, +0.14%
Copies: 2918583 -> 2910434 (-0.28%); split: -1.92%, +1.64%
Branches: 893947 -> 893712 (-0.03%); split: -0.06%, +0.03%
VALU: 25260728 -> 25256766 (-0.02%); split: -0.20%, +0.19%
SALU: 4377750 -> 4373595 (-0.09%); split: -0.17%, +0.07%
VOPD: 8603 -> 9163 (+6.51%); split: +8.54%, -2.03%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
b054cfe704 aco/ra: move can_write_m0() check into get_reg_specified()
This way, affinities are also covered.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
8e817cf52b aco/ra: refactor get_reg_simple() with increased stride.
This should avoid some redundant calls.

Totals from 153 (0.19% of 79395) affected shaders: (GFX11)

Instrs: 301717 -> 301687 (-0.01%); split: -0.06%, +0.05%
CodeSize: 1583080 -> 1582988 (-0.01%); split: -0.06%, +0.05%
VGPRs: 10068 -> 10348 (+2.78%)
Latency: 6685446 -> 6685475 (+0.00%); split: -0.11%, +0.11%
InvThroughput: 999241 -> 999316 (+0.01%); split: -0.01%, +0.02%
VClause: 3868 -> 3870 (+0.05%)
Copies: 23752 -> 23769 (+0.07%); split: -0.27%, +0.34%
Branches: 6479 -> 6480 (+0.02%)
VALU: 179290 -> 179307 (+0.01%); split: -0.04%, +0.04%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
1b0edf3f33 aco/ra: Fix array access when finding register for subdword variables
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:15 +00:00
Daniel Schürmann
5326e033ff aco/ra: fix handling of killed operands in compact_relocate_vars()
Found by inspection.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29235>
2024-06-06 21:02:14 +00:00