Daniel Schürmann
c8348139fd
nir: change signature of nir_src_is_divergent()
...
Now, it takes nir_src * instead of nir_src.
Also move the implementation to nir_divergence_analysis.c.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787 >
2024-10-24 10:06:17 +00:00
Daniel Schürmann
ce0a3fe645
nir/opt_uniform_atomics: don't preserve divergence information
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787 >
2024-10-24 10:06:17 +00:00
Georg Lehmann
a4b179e445
aco/ssa_elimination: don't avoid saving exec when optimizing branching sequence
...
insert_exec_mask will no longer use s_and_saveexec if there was a previous copy
from an SGPR to exec, so this code path is no longer taken.
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
Georg Lehmann
d2dcaf1f5e
aco/insert_exec: reuse old exec temp instead of using s_and_saveexec
...
This means the v_cmpx optimization in ssa_elimination no longer
needs to insert a copy to save exec.
Foz-DB Navi31:
Totals from 13816 (17.40% of 79395) affected shaders:
Instrs: 23694267 -> 23670199 (-0.10%); split: -0.11%, +0.01%
CodeSize: 124559288 -> 124457508 (-0.08%); split: -0.09%, +0.01%
SpillSGPRs: 5324 -> 5354 (+0.56%); split: -1.00%, +1.56%
Latency: 207245846 -> 207213681 (-0.02%); split: -0.03%, +0.01%
InvThroughput: 35442657 -> 35437220 (-0.02%); split: -0.02%, +0.01%
VClause: 444672 -> 444670 (-0.00%); split: -0.00%, +0.00%
SClause: 639419 -> 639373 (-0.01%); split: -0.04%, +0.03%
Copies: 1529008 -> 1515871 (-0.86%); split: -1.02%, +0.16%
Branches: 557201 -> 557701 (+0.09%); split: -0.00%, +0.09%
PreSGPRs: 682840 -> 686048 (+0.47%)
VALU: 13978010 -> 13978032 (+0.00%); split: -0.00%, +0.00%
SALU: 2214600 -> 2197061 (-0.79%); split: -0.81%, +0.02%
VOPD: 5561 -> 5560 (-0.02%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
Georg Lehmann
0471522377
aco/insert_exec: reuse old exec temp in loop pre-header
...
Avoid an exec copy.
Foz-DB Navi31:
Totals from 2315 (2.92% of 79395) affected shaders:
Instrs: 9082831 -> 9058990 (-0.26%); split: -0.27%, +0.00%
CodeSize: 48017244 -> 47858064 (-0.33%); split: -0.34%, +0.01%
SpillSGPRs: 1680 -> 1684 (+0.24%); split: -0.48%, +0.71%
Latency: 109511718 -> 109525041 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 20287085 -> 20289370 (+0.01%); split: -0.00%, +0.02%
VClause: 192259 -> 192260 (+0.00%)
SClause: 234082 -> 234124 (+0.02%); split: -0.01%, +0.03%
Copies: 667271 -> 645577 (-3.25%); split: -3.27%, +0.02%
Branches: 264086 -> 264088 (+0.00%)
PreSGPRs: 136831 -> 136966 (+0.10%)
VALU: 5234735 -> 5234740 (+0.00%); split: -0.00%, +0.00%
SALU: 949283 -> 927327 (-2.31%); split: -2.32%, +0.01%
VOPD: 1529 -> 1535 (+0.39%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
Georg Lehmann
31f62a6123
aco/insert_exec: don't always reset top exec
...
This allows reusing previous temporaries when exec was restored
from a Temp, rather than having to create a new copy of exec.
Foz-DB Navi31:
Totals from 545 (0.69% of 79395) affected shaders:
Instrs: 216563 -> 215698 (-0.40%)
CodeSize: 1183536 -> 1180076 (-0.29%)
Latency: 1135269 -> 1135294 (+0.00%); split: -0.00%, +0.00%
Copies: 11933 -> 11072 (-7.22%)
SALU: 18990 -> 18129 (-4.53%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
Georg Lehmann
4f04e6f0c4
aco/insert_exec: avoid phis for masks in exec
...
Exec always contains the same value as the top of stack, even if the
top of stack is a temporary/constant. So if the predecessors have different
top of stack operands, don't insert a phi and use exec as the new top of stack.
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
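The merge rule described in "aco/insert_exec: avoid phis for masks in exec" above can be sketched roughly as follows. This is a simplified model, not ACO's code: `MaskOperand` and `merge_top_of_stack` are hypothetical names, and the real pass works on ACO operands rather than this two-field struct. The point it illustrates is that exec already holds the current mask, so differing predecessor operands need no phi.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of an exec-mask stack entry: either the exec register
// itself or an SSA temporary that holds a copy of the mask.
struct MaskOperand {
   bool is_exec;
   uint32_t temp_id; // only meaningful when is_exec is false

   bool operator==(const MaskOperand& other) const
   {
      if (is_exec || other.is_exec)
         return is_exec == other.is_exec;
      return temp_id == other.temp_id;
   }
};

// Picks the top-of-stack operand for a merge block, given the operand each
// predecessor ends with.
inline MaskOperand merge_top_of_stack(const std::vector<MaskOperand>& preds)
{
   const MaskOperand exec_op{true, 0};
   if (preds.empty())
      return exec_op;
   for (const MaskOperand& p : preds) {
      // Operands differ: skip the phi, because exec already contains the
      // same value as every predecessor's top of stack.
      if (!(p == preds.front()))
         return exec_op;
   }
   // All predecessors agree, so the shared operand can be kept as is.
   return preds.front();
}
```

With two predecessors that both end on the same temporary, the sketch keeps that temporary; with different temporaries it falls back to exec.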
Georg Lehmann
0338bb9ae8
aco/ssa_elimination: also optimize branching sequence with s_and without saveexec
...
insert_exec will start using this in the future. Handle it the same way,
just without the path that saves exec before the v_cmpx instruction.
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31567 >
2024-10-23 19:34:53 +00:00
Daniel Schürmann
e8472d484f
aco/spill: use float division for score() calculation rather than integers
...
This was the original intention and should result in more fine-grained
and thus better decisions.
Totals from 63 (0.08% of 79395) affected shaders: (Navi31)
Instrs: 3173500 -> 3174012 (+0.02%); split: -0.01%, +0.02%
CodeSize: 16345348 -> 16349288 (+0.02%); split: -0.01%, +0.03%
Latency: 18528036 -> 18526082 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 3619125 -> 3618709 (-0.01%); split: -0.02%, +0.01%
VClause: 82654 -> 82648 (-0.01%)
SClause: 61256 -> 61257 (+0.00%); split: -0.00%, +0.01%
Copies: 250037 -> 250158 (+0.05%); split: -0.06%, +0.11%
Branches: 101302 -> 101303 (+0.00%)
VALU: 1791447 -> 1791435 (-0.00%); split: -0.00%, +0.00%
SALU: 401898 -> 402007 (+0.03%); split: -0.03%, +0.06%
VOPD: 730 -> 741 (+1.51%)
Totals from 40 (0.06% of 63053) affected shaders: (Vega10)
Instrs: 161584 -> 161567 (-0.01%); split: -0.04%, +0.03%
CodeSize: 891168 -> 891004 (-0.02%); split: -0.04%, +0.03%
Latency: 3550766 -> 3549770 (-0.03%); split: -0.05%, +0.03%
InvThroughput: 2627028 -> 2626484 (-0.02%); split: -0.03%, +0.01%
VClause: 2970 -> 2971 (+0.03%)
SClause: 4203 -> 4205 (+0.05%); split: -0.26%, +0.31%
Copies: 19923 -> 19893 (-0.15%); split: -0.44%, +0.29%
VALU: 116045 -> 116054 (+0.01%); split: -0.01%, +0.02%
SALU: 22100 -> 22066 (-0.15%); split: -0.39%, +0.24%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31769 >
2024-10-23 14:35:29 +00:00
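To illustrate why float division gives more fine-grained scores in "aco/spill: use float division for score() calculation rather than integers", here is a minimal sketch. The `Candidate` struct and the uses-over-distance formula are assumptions made for illustration; the actual score() in ACO's spiller is not shown in this log and may compute something different.

```cpp
#include <cstdint>

// Hypothetical spill candidate; the fields are illustrative only.
struct Candidate {
   uint32_t num_uses;
   uint32_t distance; // assumed to be non-zero
};

// Integer division truncates toward zero, so 3/4 and 1/4 both score 0
// and the two candidates become indistinguishable.
inline uint32_t score_int(const Candidate& c)
{
   return c.num_uses / c.distance;
}

// Float division keeps the fractional part, so the candidates can be ranked.
inline float score_float(const Candidate& c)
{
   return static_cast<float>(c.num_uses) / static_cast<float>(c.distance);
}
```

For candidates `{3, 4}` and `{1, 4}`, the integer version returns 0 for both, while the float version returns 0.75 and 0.25.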
Daniel Schürmann
30d85b23ef
aco/spill: fix faulty assertions
...
By unintentionally using integer division for score(), these
assertions were likely to be raised by accident.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31769 >
2024-10-23 14:35:29 +00:00
Daniel Schürmann
d5581b1124
aco/live_var_analysis: check isFixed() for definitions in order to set needs_vcc
...
In rare cases, it could happen that during post-RA validation,
live-var-analysis sets needs_vcc = false after it was true
before register allocation.
Fixes: bb5eace0dc ('aco/live_var_analysis: check for isPrecolored flag rather than isFixed')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31791 >
2024-10-23 09:42:26 +00:00
Georg Lehmann
5da34ebee4
aco/insert_exec: remove get_exec_op
...
We used to only store Temps in the stack, so undef meant exec.
Then the stack was changed to operands, and some places started storing exec
directly. Drop the undef handling by replacing everything with Operand(exec, lm).
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Georg Lehmann
8d148401cb
aco/ir: rework Operand equality to return true for equal fixed non-temp ops
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Georg Lehmann
6716fb08d8
aco/insert_exec: remove unused includes
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Georg Lehmann
23fb0883eb
aco/insert_exec: untangle add_branch_code control flow
...
All of the single ifs with return hide that this is effectively almost an
if-else chain, so convert it to one for real.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Georg Lehmann
de7d931962
aco/insert_exec: remove stray break_cond variable
...
This was always trivial since some discard rework.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Georg Lehmann
ade7f1a203
aco/insert_exec: replace pair with a named struct
...
.first and .second everywhere was hard to read.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
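The readability argument in "aco/insert_exec: replace pair with a named struct" can be shown with a small before/after sketch. All names here (`mask_entry`, `op_id`, `type`, `mask_type_loop`) are hypothetical and chosen for illustration; the fields of the real struct in insert_exec are not given in this log.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical flag bit marking a loop mask.
constexpr uint8_t mask_type_loop = 0x1;

// Before: the reader has to remember what .first and .second stand for.
using mask_entry_pair = std::pair<uint32_t, uint8_t>;

inline bool top_is_loop_pair(const std::vector<mask_entry_pair>& stack)
{
   return !stack.empty() && (stack.back().second & mask_type_loop);
}

// After: the field names document themselves at every use site.
struct mask_entry {
   uint32_t op_id; // hypothetical: identifies the stored mask operand
   uint8_t type;   // hypothetical: flag bits describing the mask
};

inline bool top_is_loop(const std::vector<mask_entry>& stack)
{
   return !stack.empty() && (stack.back().type & mask_type_loop);
}
```

Both functions behave identically; only the second one states what it is testing.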
Georg Lehmann
a3054499ba
aco/insert_exec: don't pretend WQMState is a bit mask
...
It's a simple enum.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31560 >
2024-10-22 17:03:26 +00:00
Daniel Schürmann
ef47cce51c
aco/ra: always block register file for precolored operands
...
so that they don't accidentally get renamed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387 >
2024-10-22 12:29:18 +00:00
Daniel Schürmann
18e7e8d8f0
aco/ra: make use of Precolored flag
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387 >
2024-10-22 12:29:18 +00:00
Daniel Schürmann
bb5eace0dc
aco/live_var_analysis: check for isPrecolored flag rather than isFixed
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387 >
2024-10-22 12:29:18 +00:00
Daniel Schürmann
e2705a9d85
aco: set Precolored flag before register allocation
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387 >
2024-10-22 12:29:18 +00:00
Daniel Schürmann
c2ed4b474a
aco: introduce 'isPrecolored' flag for Operand and Definition
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31387 >
2024-10-22 12:29:18 +00:00
Georg Lehmann
10951bb11a
aco: fix 64bit extract_i8/extract_i16
...
The old code only sign extended to 32bit, with a zero hi half.
Fixes: 1f2518ef9f ("aco: implement nir_op_extract/nir_op_insert")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31734 >
2024-10-21 07:13:57 +00:00
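The bug described in "aco: fix 64bit extract_i8/extract_i16" can be reproduced on the CPU with a small model. This is only a sketch of the arithmetic, with hypothetical function names; it does not reflect the instructions ACO actually emits.

```cpp
#include <cstdint>

// Models the old behaviour: the byte is sign-extended to 32 bits only,
// and the high 32 bits of the 64-bit result are left as zero.
inline uint64_t extract_i8_zero_hi(uint64_t src, unsigned index)
{
   uint8_t byte = static_cast<uint8_t>(src >> (index * 8));
   int32_t lo = static_cast<int8_t>(byte);
   return static_cast<uint64_t>(static_cast<uint32_t>(lo));
}

// Models the intended behaviour: the sign bit is replicated through
// all 64 bits of the result.
inline uint64_t extract_i8_64(uint64_t src, unsigned index)
{
   uint8_t byte = static_cast<uint8_t>(src >> (index * 8));
   int64_t value = static_cast<int8_t>(byte);
   return static_cast<uint64_t>(value);
}
```

For a negative byte such as 0x80, the first function yields 0x00000000FFFFFF80 while the second yields 0xFFFFFFFFFFFFFF80; for non-negative bytes the two agree.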
Georg Lehmann
aabadb30fc
aco/print_ir: use parse_depctr_wait
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Georg Lehmann
ced7a01954
aco/statistics: update branch issue cycles
...
Foz-DB Navi31:
Totals from 14319 (18.04% of 79395) affected shaders:
Instrs: 20064495 -> 20062876 (-0.01%)
CodeSize: 105334568 -> 105327704 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Georg Lehmann
ec11cfc69d
aco/insert_delay_alu: do not delay lane mask fast forwarding
...
The delay actually hurts performance in this case.
Foz-DB Navi31:
Totals from 30340 (38.21% of 79395) affected shaders:
Instrs: 30778999 -> 30726605 (-0.17%); split: -0.17%, +0.00%
CodeSize: 162380180 -> 162170808 (-0.13%); split: -0.13%, +0.00%
Latency: 228185562 -> 228186976 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 39001151 -> 39000897 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Georg Lehmann
e4889fd4b5
aco/insert_delay_alu: consider more implicit waits
...
Foz-DB Navi31:
Totals from 37961 (47.81% of 79395) affected shaders:
Instrs: 34175286 -> 33978599 (-0.58%)
CodeSize: 180059352 -> 179190076 (-0.48%); split: -0.48%, +0.00%
Latency: 259826196 -> 259798474 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 42792700 -> 42789298 (-0.01%); split: -0.01%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Georg Lehmann
840b5841d3
aco: do not track ALU delay across jumps
...
This assumes that the best case jump latency is higher than the worst case
ALU latency.
Foz-DB Navi31:
Totals from 17720 (22.32% of 79395) affected shaders:
Instrs: 26009663 -> 25929989 (-0.31%); split: -0.31%, +0.00%
CodeSize: 136571496 -> 136254420 (-0.23%); split: -0.23%, +0.00%
Latency: 215731308 -> 215722059 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 36534197 -> 36532070 (-0.01%); split: -0.01%, +0.00%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Georg Lehmann
977f435f4c
aco/ir: add function to parse depctr waits
...
No Foz-DB changes on Navi31.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31132 >
2024-10-17 11:16:16 +00:00
Rhys Perry
33eb2d7fe4
aco: skip uniformization of certain merge phis
...
If a source is a VGPR, then skip if it's safe. This fixes the regressions
from the previous commit.
fossil-db (navi31):
Totals from 5118 (6.45% of 79395) affected shaders:
MaxWaves: 159560 -> 159520 (-0.03%); split: +0.01%, -0.03%
Instrs: 2165351 -> 2138456 (-1.24%); split: -1.26%, +0.02%
CodeSize: 11260340 -> 11152460 (-0.96%); split: -0.98%, +0.02%
VGPRs: 218124 -> 225144 (+3.22%); split: -0.13%, +3.35%
Latency: 11059208 -> 11116102 (+0.51%); split: -0.18%, +0.69%
InvThroughput: 1252148 -> 1230193 (-1.75%); split: -1.77%, +0.01%
VClause: 39513 -> 39518 (+0.01%); split: -0.48%, +0.49%
SClause: 59434 -> 59378 (-0.09%); split: -0.11%, +0.02%
Copies: 165997 -> 156172 (-5.92%); split: -6.68%, +0.76%
PreSGPRs: 181203 -> 181094 (-0.06%)
PreVGPRs: 139393 -> 139731 (+0.24%)
VALU: 1244301 -> 1220769 (-1.89%); split: -1.91%, +0.02%
SALU: 200240 -> 199567 (-0.34%); split: -0.34%, +0.00%
fossil-db (navi21):
Totals from 35520 (44.74% of 79395) affected shaders:
MaxWaves: 951870 -> 951830 (-0.00%)
Instrs: 20229388 -> 20227776 (-0.01%); split: -0.01%, +0.00%
CodeSize: 105379916 -> 105513740 (+0.13%); split: -0.01%, +0.13%
VGPRs: 1375232 -> 1375400 (+0.01%)
Latency: 81046435 -> 81013986 (-0.04%); split: -0.04%, +0.00%
InvThroughput: 15269166 -> 15273295 (+0.03%); split: -0.01%, +0.04%
VClause: 354314 -> 354310 (-0.00%); split: -0.00%, +0.00%
SClause: 417049 -> 417047 (-0.00%); split: -0.00%, +0.00%
Copies: 1699445 -> 1699488 (+0.00%); split: -0.01%, +0.01%
Branches: 591274 -> 591269 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1371062 -> 1370567 (-0.04%)
PreVGPRs: 1100716 -> 1100953 (+0.02%)
VALU: 11076189 -> 11075167 (-0.01%); split: -0.01%, +0.00%
SALU: 3648002 -> 3647378 (-0.02%); split: -0.02%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211 >
2024-10-10 14:59:27 +00:00
Rhys Perry
ce33ffd03a
aco: ensure phis uniformized by divergence analysis are SGPR
...
Otherwise, they might not actually be uniform when divergence analysis
claimed they are.
fossil-db (navi31):
Totals from 5118 (6.45% of 79395) affected shaders:
MaxWaves: 159520 -> 159560 (+0.03%); split: +0.03%, -0.01%
Instrs: 2138456 -> 2165351 (+1.26%); split: -0.02%, +1.28%
CodeSize: 11152460 -> 11260340 (+0.97%); split: -0.02%, +0.98%
VGPRs: 225144 -> 218124 (-3.12%); split: -3.25%, +0.13%
Latency: 11116102 -> 11059208 (-0.51%); split: -0.69%, +0.18%
InvThroughput: 1230193 -> 1252148 (+1.78%); split: -0.01%, +1.80%
VClause: 39518 -> 39513 (-0.01%); split: -0.49%, +0.48%
SClause: 59378 -> 59434 (+0.09%); split: -0.02%, +0.11%
Copies: 156172 -> 165997 (+6.29%); split: -0.81%, +7.10%
PreSGPRs: 181094 -> 181203 (+0.06%)
PreVGPRs: 139731 -> 139393 (-0.24%)
VALU: 1220769 -> 1244301 (+1.93%); split: -0.02%, +1.95%
SALU: 199567 -> 200240 (+0.34%); split: -0.00%, +0.34%
fossil-db (navi21):
Totals from 35520 (44.74% of 79395) affected shaders:
MaxWaves: 951830 -> 951870 (+0.00%)
Instrs: 20227773 -> 20229388 (+0.01%); split: -0.00%, +0.01%
CodeSize: 105513724 -> 105379916 (-0.13%); split: -0.13%, +0.01%
VGPRs: 1375400 -> 1375232 (-0.01%)
Latency: 81013985 -> 81046435 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 15273291 -> 15269166 (-0.03%); split: -0.04%, +0.01%
VClause: 354310 -> 354314 (+0.00%); split: -0.00%, +0.00%
SClause: 417047 -> 417049 (+0.00%); split: -0.00%, +0.00%
Copies: 1699486 -> 1699445 (-0.00%); split: -0.01%, +0.01%
Branches: 591269 -> 591274 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1370567 -> 1371062 (+0.04%)
PreVGPRs: 1100953 -> 1100716 (-0.02%)
VALU: 11075164 -> 11076189 (+0.01%); split: -0.00%, +0.01%
SALU: 3647378 -> 3648002 (+0.02%); split: -0.00%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211 >
2024-10-10 14:59:26 +00:00
Rhys Perry
67ad7359ff
nir/divergence_analysis: disable phi undef optimization by default
...
If the backend does not implement this too, or some other future transform
modifies the phi so that this isn't the case (replace the phi with a
bcsel or replace undef with zero), then it will not actually be uniform.
This keeps it enabled to some degree for RADV/ACO.
fossil-db (navi31):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 195008 -> 195282 (+0.14%)
CodeSize: 1012592 -> 1015884 (+0.33%)
Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15%
InvThroughput: 460681 -> 460964 (+0.06%)
Copies: 13508 -> 13516 (+0.06%)
Branches: 5244 -> 5412 (+3.20%)
PreVGPRs: 5092 -> 5096 (+0.08%)
VALU: 116177 -> 116197 (+0.02%)
SALU: 23449 -> 23785 (+1.43%)
fossil-db (navi21):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 164471 -> 164981 (+0.31%)
CodeSize: 883988 -> 888420 (+0.50%)
Latency: 4074287 -> 4082043 (+0.19%)
InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06%
Branches: 5262 -> 5430 (+3.19%)
PreVGPRs: 5100 -> 5104 (+0.08%)
VALU: 116375 -> 116381 (+0.01%)
SALU: 23589 -> 23925 (+1.42%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211 >
2024-10-10 14:59:26 +00:00
Daniel Schürmann
19583023a2
aco/ra: remove unnecessary check for duplicate precolored operands
...
An instruction can have at most one operand precolored to the same register.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:20 +00:00
Daniel Schürmann
9b2c4c4644
aco/ra: manually fill killed operands when required
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:20 +00:00
Daniel Schürmann
b530b67c73
aco/ra: add RegisterFile::fill_killed_operands(Instruction*) helper
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:20 +00:00
Daniel Schürmann
1499848487
aco/live_var_analysis: don't test whether phis are assigned to VCC
...
This check is redundant.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:19 +00:00
Daniel Schürmann
1d3e01cd62
aco: remove Program::allocationId
...
It is a duplicate of temp_rc.size().
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:19 +00:00
Daniel Schürmann
39fc327b8f
aco/reindex_ssa: remove update_live_out parameter
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:19 +00:00
Daniel Schürmann
bc2d166b50
aco/lower_to_hw: don't allocate new temporaries
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:19 +00:00
Daniel Schürmann
30e7644e5f
aco: simplify Definition constructors
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31362 >
2024-10-07 07:00:19 +00:00
Georg Lehmann
07032102e9
aco: use s_pack_lh for bitfield_select(0xffff)
...
Foz-DB Navi31:
Totals from 13 (0.02% of 79206) affected shaders:
Instrs: 44871 -> 44838 (-0.07%)
CodeSize: 223804 -> 223608 (-0.09%)
Latency: 220186 -> 220191 (+0.00%); split: -0.01%, +0.02%
InvThroughput: 54169 -> 54186 (+0.03%); split: -0.00%, +0.03%
SALU: 5048 -> 5023 (-0.50%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31509 >
2024-10-05 17:55:08 +00:00
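The equivalence behind "aco: use s_pack_lh for bitfield_select(0xffff)" can be checked with a short model. It assumes that bitfield_select takes bits from its first value where the mask is set and from its second value elsewhere, and that s_pack_lh_b32 combines the low half of its first source with the high half of its second. Both helper functions are CPU-side stand-ins written for this sketch, not Mesa code.

```cpp
#include <cstdint>

// Assumed semantics: bits of a where mask is 1, bits of b where mask is 0.
inline uint32_t bitfield_select(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

// Assumed model of s_pack_lh_b32: low 16 bits from s0, high 16 bits from s1.
inline uint32_t pack_lh(uint32_t s0, uint32_t s1)
{
   return (s1 & 0xffff0000u) | (s0 & 0x0000ffffu);
}
```

Under these assumptions, a select with mask 0xffff needs a single pack instead of separate and/or operations, which is consistent with the small SALU reduction reported above.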
Georg Lehmann
a6f82cf16d
aco: use s_pack_hl for shfr16
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31509 >
2024-10-05 17:55:08 +00:00
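A similar check applies to "aco: use s_pack_hl for shfr16". The sketch below assumes that shfr is a funnel shift right over a 64-bit value formed from a high and a low word, and that s_pack_hl_b32 places the high half of its first source in the low 16 bits of the result and the low half of its second source in the high 16 bits. The function names and operand order are assumptions made for this example.

```cpp
#include <cstdint>

// Assumed funnel shift right: shifts the 64-bit value {hi, lo} right by
// `shift` and returns the low 32 bits.
inline uint32_t funnel_shift_right(uint32_t hi, uint32_t lo, unsigned shift)
{
   shift &= 31;
   if (shift == 0)
      return lo; // avoid an undefined 32-bit shift of hi
   return (lo >> shift) | (hi << (32 - shift));
}

// Assumed model of s_pack_hl_b32: result low half is s0's high half,
// result high half is s1's low half.
inline uint32_t pack_hl(uint32_t s0, uint32_t s1)
{
   return (s1 << 16) | (s0 >> 16);
}
```

With a shift of exactly 16, `funnel_shift_right(hi, lo, 16)` equals `pack_hl(lo, hi)` in this model.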
Rhys Perry
96e7cd89ea
aco: fix is_vector_intact for GFX11 BVH
...
fossil-db (navi31):
Totals from 44 (0.06% of 79395) affected shaders:
Instrs: 1539111 -> 1539109 (-0.00%); split: -0.00%, +0.00%
CodeSize: 7880452 -> 7880380 (-0.00%); split: -0.00%, +0.00%
Latency: 7578794 -> 7578844 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 1450872 -> 1450876 (+0.00%); split: -0.00%, +0.00%
VClause: 40014 -> 40010 (-0.01%)
Copies: 116005 -> 116001 (-0.00%); split: -0.01%, +0.01%
VALU: 854630 -> 854626 (-0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346 >
2024-10-03 17:55:56 +00:00
Rhys Perry
24c60be1ad
aco: create vector affinities for phi operands
...
fossil-db (navi21):
Totals from 2934 (3.70% of 79395) affected shaders:
Instrs: 8368484 -> 8365630 (-0.03%); split: -0.05%, +0.01%
CodeSize: 46032152 -> 45998480 (-0.07%); split: -0.09%, +0.01%
VGPRs: 200360 -> 200280 (-0.04%); split: -0.12%, +0.08%
Latency: 85556147 -> 85562615 (+0.01%); split: -0.09%, +0.10%
InvThroughput: 19066462 -> 19065173 (-0.01%); split: -0.09%, +0.09%
VClause: 209834 -> 209783 (-0.02%); split: -0.14%, +0.12%
SClause: 261811 -> 261826 (+0.01%); split: -0.00%, +0.01%
Copies: 727502 -> 724394 (-0.43%); split: -0.56%, +0.13%
Branches: 291083 -> 291120 (+0.01%); split: -0.01%, +0.03%
VALU: 5564021 -> 5560975 (-0.05%); split: -0.07%, +0.02%
SALU: 1100996 -> 1100942 (-0.00%); split: -0.02%, +0.02%
fossil-db (navi31):
Totals from 34207 (43.08% of 79395) affected shaders:
MaxWaves: 1036893 -> 1036781 (-0.01%); split: +0.01%, -0.02%
Instrs: 21977229 -> 21884600 (-0.42%); split: -0.47%, +0.05%
CodeSize: 112680884 -> 112298404 (-0.34%); split: -0.38%, +0.04%
VGPRs: 1590832 -> 1615912 (+1.58%); split: -0.25%, +1.83%
Latency: 142542601 -> 142670271 (+0.09%); split: -0.12%, +0.21%
InvThroughput: 19481055 -> 19434110 (-0.24%); split: -0.44%, +0.20%
VClause: 462865 -> 462558 (-0.07%); split: -0.20%, +0.13%
SClause: 619822 -> 619685 (-0.02%); split: -0.02%, +0.00%
Copies: 1704870 -> 1610889 (-5.51%); split: -5.89%, +0.38%
Branches: 518238 -> 518241 (+0.00%); split: -0.01%, +0.01%
VALU: 12230157 -> 12136112 (-0.77%); split: -0.82%, +0.05%
SALU: 2444075 -> 2444099 (+0.00%); split: -0.01%, +0.01%
VOPD: 3443 -> 3476 (+0.96%); split: +1.80%, -0.84%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11186
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346 >
2024-10-03 17:55:56 +00:00
Rhys Perry
1e60509135
aco: stop using instructions in ra_ctx::vectors
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31346 >
2024-10-03 17:55:56 +00:00
Rhys Perry
7f092cbd91
aco: workaround hazards in emit_long_jump
...
fossil-db (navi31):
Totals from 29 (0.04% of 79395) affected shaders:
CodeSize: 17612888 -> 17615096 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31316 >
2024-09-30 09:04:35 +00:00
Rhys Perry
9fb97085d1
aco/tests: update assembler tests for llvm
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Backport-to: 24.2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31316 >
2024-09-30 09:04:35 +00:00
Rhys Perry
93372ea9af
aco: do not use inline constants for 16-bit pseudo scalar transcendentals
...
Like https://github.com/llvm/llvm-project/pull/104395
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30729 >
2024-09-27 11:11:42 +00:00
Georg Lehmann
a9f8089240
nir: replace nir_opt_remove_phis_block with a single source version
...
This is what callers actually want, and it simplifies nir_opt_remove_phis
because we can assume dominance meta data is valid.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031 >
2024-09-27 05:19:16 +00:00