Daniel Schürmann
4c7a5b1e51
aco: don't use shared VGPRs for shaders consisting of multiple binaries
...
When using multiple binaries, we don't know the required number of VGPRs beforehand,
which means we either have to over-allocate VGPRs or avoid shared VGPRs.
As bpermute is the only instruction that needs shared VGPRs, we choose the latter.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22267 >
2023-04-04 18:35:43 +00:00
Daniel Schürmann
37df8edf34
aco/ra: adjust_max_used_regs() for fixed Operands
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22267 >
2023-04-04 18:35:43 +00:00
Daniel Schürmann
8c68aba678
aco: split ps_epilog args before exporting them
...
This avoids some unnecessary copies from extracting from the input vectors.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22267 >
2023-04-04 18:35:42 +00:00
Timur Kristóf
836204da25
aco: Better phi lowering for merge block when else-side is const.
...
Add a new special case for binary merge blocks to boolean
phi lowering. This special case benefits shaders that
have divergent branches with an empty else block,
for example all NGG culling shaders.
Fossil DB stats on Rembrandt (NGG culling enabled):
Totals from 61778 (45.79% of 134913) affected shaders:
SpillVGPRs: 2268 -> 2284 (+0.71%); split: -1.10%, +1.81%
CodeSize: 164317952 -> 162962772 (-0.82%); split: -0.83%, +0.00%
Instrs: 31249824 -> 30910686 (-1.09%); split: -1.09%, +0.00%
Latency: 154948555 -> 154781097 (-0.11%); split: -0.12%, +0.02%
InvThroughput: 30397664 -> 30370872 (-0.09%); split: -0.13%, +0.04%
VClause: 529239 -> 529229 (-0.00%); split: -0.00%, +0.00%
SClause: 783417 -> 783430 (+0.00%)
Copies: 2627570 -> 2595161 (-1.23%); split: -1.25%, +0.02%
Branches: 976506 -> 976508 (+0.00%); split: -0.00%, +0.00%
Fossil DB stats on GFX11 (NGG culling disabled):
Totals from 895 (0.66% of 134913) affected shaders:
SpillVGPRs: 2258 -> 2322 (+2.83%); split: -0.44%, +3.28%
CodeSize: 6229152 -> 6215880 (-0.21%); split: -0.37%, +0.16%
Scratch: 216576 -> 215808 (-0.35%); split: -0.47%, +0.12%
Instrs: 1202077 -> 1198396 (-0.31%); split: -0.43%, +0.13%
Latency: 15921336 -> 16000561 (+0.50%); split: -0.74%, +1.24%
InvThroughput: 7425765 -> 7474891 (+0.66%); split: -0.67%, +1.33%
VClause: 22976 -> 23008 (+0.14%); split: -0.03%, +0.17%
SClause: 38269 -> 38271 (+0.01%)
Copies: 123244 -> 123896 (+0.53%); split: -0.30%, +0.83%
Branches: 47570 -> 47574 (+0.01%); split: -0.00%, +0.01%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Timur Kristóf
81b4806d64
aco: Call dominator_tree before lower_phis.
...
This just makes it possible to use the dominator
tree information during phi lowering.
No Fossil DB changes on GFX11.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Timur Kristóf
0eb7c49c7f
aco: Pop branch operands when targets are same in SSA elimination.
...
The branch instruction is no longer conditional when the targets are the
same, so the operand is not necessary and can be removed.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Timur Kristóf
739bd03c37
aco: Don't verify branch exec read when eliminating exec writes.
...
Verifying that the branch instruction reads exec is not actually
necessary because the pattern that we look for already implies that.
This prepares for the next commit which will remove the exec operand
from branches that have the same target. These branches will no
longer read exec, but they should still get the same optimization.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Timur Kristóf
9eb04d8f96
aco: Simplify get_phi_operand using Operand::c32_or_c64.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Timur Kristóf
0211e66f65
aco: Don't remove exec writes that also write other registers.
...
Don't eliminate an instruction that writes registers other than exec and scc.
It is possible that this is e.g. an s_and_saveexec and the saved value is
used by a later branch.
Fixes: bc13049747
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
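The rule in the commit above ("aco: Don't remove exec writes that also write other registers") can be illustrated with a minimal sketch. The `Reg` enum and `only_writes_exec_and_scc` are hypothetical names invented for this illustration and are not ACO's actual data structures; the sketch only models the check that every definition of a candidate instruction is exec or scc.

```cpp
#include <cassert>
#include <vector>

// Hypothetical register kinds, only for this sketch (not ACO's types).
enum class Reg { exec, scc, sgpr, vgpr };

// Returns true when every definition of the instruction is exec or scc.
// An instruction that also writes, say, an SGPR (as s_and_saveexec does
// for the saved mask) fails this check and must be kept.
inline bool only_writes_exec_and_scc(const std::vector<Reg>& definitions)
{
   assert(!definitions.empty()); // an exec write has at least one definition
   for (Reg def : definitions) {
      if (def != Reg::exec && def != Reg::scc)
         return false;
   }
   return true;
}
```

Under these assumptions, an instruction defining `{exec, scc}` passes the check, while an s_and_saveexec-like instruction defining `{sgpr, scc, exec}` does not, so its saved value stays available to a later branch.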
Timur Kristóf
54da863956
aco: Consider p_cbranch_nz as divergent branch too.
...
A p_cbranch_nz instruction that reads exec is divergent too.
Fixes: f030b75b7d
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21493 >
2023-04-03 14:36:07 +00:00
Rhys Perry
6974e5479c
aco: fix nir_var_shader_out barriers for task shaders
...
These will be used in a future commit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22211 >
2023-04-01 14:46:50 +00:00
Rhys Perry
0f60c18f29
aco: don't optimize s_or_b64(v_cmp_u_f32(a, b), cmp(a, a))
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22214 >
2023-03-31 19:41:54 +00:00
Samuel Pitoiset
d87c813da1
aco: remove unused aco_shader_info::vb_desc_usage_mask
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22192 >
2023-03-30 11:21:19 +00:00
Georg Lehmann
dae13f3dc1
aco: add tests for neg(mul) with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:35 +00:00
Georg Lehmann
02b94037f6
aco/tests: run optimize.mad_mix.input_conv.modifiers on gfx11
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:35 +00:00
Georg Lehmann
728146b2fc
aco: add test for min/max combining with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:35 +00:00
Georg Lehmann
9499f202e8
aco: add tests for cmp ordering with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:35 +00:00
Georg Lehmann
22903bcded
aco: add tests for swap operand with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:35 +00:00
Georg Lehmann
0b29dc5c06
aco: add tests for dpp with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
f7bb794dda
aco: add tests for fma with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
bb7c2b70c1
aco/optimizer: remove to_SDWA
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
e699a4181c
aco: keep label_mul/usedef/minmax in apply_extract
...
16bit int mad/fma/minmax combining can work with opsel set.
All other optimizations should already check if the instruction uses sdwa,
because we don't check this when applying the label initially.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
7014145ab2
aco/optimizer: use opsel for VOP12C
...
Foz-DB GFX1100:
Totals from 11759 (8.72% of 134864) affected shaders:
VGPRs: 848288 -> 844556 (-0.44%); split: -0.44%, +0.00%
SpillSGPRs: 8527 -> 8543 (+0.19%)
SpillVGPRs: 1411 -> 1423 (+0.85%); split: -0.21%, +1.06%
CodeSize: 114337120 -> 113882472 (-0.40%); split: -0.40%, +0.01%
Scratch: 128768 -> 129024 (+0.20%); split: -0.20%, +0.40%
MaxWaves: 250962 -> 252014 (+0.42%)
Instrs: 22187426 -> 22062378 (-0.56%); split: -0.57%, +0.00%
Latency: 232655375 -> 232376977 (-0.12%); split: -0.20%, +0.08%
InvThroughput: 28292530 -> 28217699 (-0.26%); split: -0.45%, +0.18%
VClause: 352463 -> 352364 (-0.03%); split: -0.12%, +0.10%
SClause: 659282 -> 659354 (+0.01%); split: -0.02%, +0.04%
Copies: 1371369 -> 1342340 (-2.12%); split: -2.30%, +0.19%
Branches: 495903 -> 495941 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 867295 -> 863664 (-0.42%)
PreVGPRs: 793480 -> 790549 (-0.37%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
9650724370
aco/gfx11: allow opsel for VOP12C
...
Foz-DB GFX1100:
Totals from 515 (0.38% of 134864) affected shaders:
CodeSize: 2768228 -> 2761076 (-0.26%)
Instrs: 520301 -> 518523 (-0.34%)
Latency: 5190860 -> 5187254 (-0.07%)
InvThroughput: 2120844 -> 2119447 (-0.07%)
Copies: 57238 -> 56101 (-1.99%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
3907c54443
aco: don't label mul with opsel as abs/neg
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
ace017bba8
aco/ir: copy opsel when converting to DPP
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
a60b9313d3
aco: swap opsel when swapping VOP2/C operands
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
fc1bf9c3b4
aco: return true in usesModifiers for VOP12C with opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
82f7b3acfa
aco: support neg(mul)/abs(mul) optimization in more cases
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
9d841507e1
aco: support v_cvt_f32_f16 with opsel in combine_mad_mix
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
9d6e223a7a
aco: update match_op3_for_vop3 for VOP12C opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
0896ecec9a
aco: handle opsel in combine_constant_comparison_ordering
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
d8f07a0ddc
aco: handle opsel in combine_ordering_test
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
4db43415e5
aco: handle opsel in combine_comparison_ordering
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
8e6d79d10d
aco/optimizer: preserve opsel when fusing fma
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
32d7a11acf
aco/ra: prepare for VOP12C opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
8ee1519cee
aco/to_hw_instr: use VOP1 opsel for v_mov_b16
...
Foz-DB GFX1100:
Totals from 4661 (3.46% of 134864) affected shaders:
CodeSize: 36500568 -> 36391704 (-0.30%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
066cee0896
aco: validate VOP12C opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
2c49b7babf
aco/assembler: support VOP12C opsel
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
9b4ea9ff90
aco/vn: hash opsel for VOP12C
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Georg Lehmann
c62e5ef82e
aco/ra: don't reallocate VOP3 instruction for non-vcc lane mask
...
This would need to copy opsel soon, but we can just reuse the old instruction.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22069 >
2023-03-30 03:34:34 +00:00
Friedrich Vock
424825c6e5
aco: Un-swap addressable VGPRs/SGPRs in RT prolog
...
Fixes: 6446b79168 ("aco: implement select_rt_prolog()")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22188 >
2023-03-30 02:55:54 +00:00
Georg Lehmann
fd3ea4ffc2
aco: clean up to_mad_mix
...
These instructions are 32bit, so they don't support opsel anyway.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22103 >
2023-03-28 23:30:08 +00:00
Qiang Yu
67f295f1e2
aco: implement float16 nir_op_pack_(s|u)norm_2x16
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21552 >
2023-03-28 19:57:11 +00:00
Georg Lehmann
16c03fd756
aco/util: override default assignment operator for bitfield helpers
...
Otherwise, the default assignment operator copies the whole uint,
not just the few bits we are interested in.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: e7559da757 ("aco: add bitfield array helper classes")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22154 >
2023-03-28 10:49:07 +00:00
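The problem described in the commit above ("aco/util: override default assignment operator for bitfield helpers") can be sketched as follows. `bit_view` is a hypothetical, simplified stand-in and not the actual ACO helper class; it only assumes that the helper stores the whole underlying integer and exposes a single bit of it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration: exposes bit `offset` of a storage word.
template <typename T, unsigned offset>
struct bit_view {
   T storage;

   constexpr operator bool() const { return (storage >> offset) & T(1); }

   // Assigning a bool touches only the selected bit.
   constexpr bit_view& operator=(bool value)
   {
      assert(offset < sizeof(T) * 8);
      storage = static_cast<T>((storage & ~(T(1) << offset)) | (T(value) << offset));
      return *this;
   }

   // Without this overload, the implicitly generated copy assignment would
   // copy all of `storage`, clobbering the neighbouring bits.
   constexpr bit_view& operator=(const bit_view& other)
   {
      return *this = static_cast<bool>(other);
   }
};
```

With this overload, copying a view whose storage is `0b010` into a view of bit 1 whose storage is `0b101` leaves `0b111`: only bit 1 changes. The implicitly generated operator would have produced `0b010` instead.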
Georg Lehmann
ed03696ed9
aco/ir: fix copy paste bug in convert_to_SDWA
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 60cd3ba39f ("aco: copy abs/neg with assignment")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22154 >
2023-03-28 10:49:07 +00:00
Friedrich Vock
1979e551a8
aco: Swap operands for v_and_b32 in RT prolog
...
The second operand must be a VGPR; only the first can be a literal.
With a literal as the second operand, this code was assembled incorrectly and resulted in artifacts on GFX11.
Fixes: 6446b79168 ("aco: implement select_rt_prolog()")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8642
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22142 >
2023-03-28 09:16:56 +00:00
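The operand constraint from the commit above ("aco: Swap operands for v_and_b32 in RT prolog") can be sketched like this. `Kind`, `Vop2Operands`, `is_encodable` and `legalize_commutative` are hypothetical names made up for this illustration; the sketch only assumes what the commit message states, namely that the second source of the instruction must be a VGPR while the first may be a literal, and that the operation is commutative (as a bitwise AND is).

```cpp
#include <cassert>
#include <utility>

// Hypothetical operand kinds, only for this sketch (not ACO's types).
enum class Kind { vgpr, sgpr, literal };

struct Vop2Operands {
   Kind src0;
   Kind src1;
};

// The second source must be a VGPR for the instruction to be encodable.
inline bool is_encodable(const Vop2Operands& ops)
{
   return ops.src1 == Kind::vgpr;
}

// For a commutative operation, swap the sources when that moves the VGPR
// into the second slot. Precondition: at least one source is a VGPR.
inline Vop2Operands legalize_commutative(Vop2Operands ops)
{
   if (!is_encodable(ops) && ops.src0 == Kind::vgpr)
      std::swap(ops.src0, ops.src1);
   assert(is_encodable(ops));
   return ops;
}
```

Under these assumptions, `legalize_commutative({Kind::vgpr, Kind::literal})` yields the literal in the first slot and the VGPR in the second, which matches the ordering the fix establishes by hand.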
Georg Lehmann
dde7c5506c
aco: make .clang-format usable with tests
...
Code between BEGIN_TEST and END_TEST should be indented,
and comments used by the test itself should not be reformatted.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22122 >
2023-03-27 20:43:22 +00:00
Georg Lehmann
5e9ea15484
aco: fix p_interp_gfx11 comment
...
It no longer uses a tmp exec and scc.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22107 >
2023-03-27 15:09:21 +00:00
Georg Lehmann
b1668aedaf
aco: don't check usesModifiers for pseudo instructions
...
This can't happen.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22106 >
2023-03-27 14:22:07 +00:00