Commit graph

1437 commits

Author SHA1 Message Date
Timur Kristóf
aabe9d2f6e aco: Eliminate SALU comparison when SCC can be used instead.
For example:

s0, scc = s_and_u32 ...
scc = s_cmp_eq_u32 s0, 0
p_cbranch_sccz

is turned into:

s0, scc = s_and_u32 ...
p_cbranch_sccnz

Fossil DB results on Sienna Cichlid:

Totals from 85267 (56.91% of 149839) affected shaders:
CodeSize: 202539256 -> 202237268 (-0.15%)
Instrs: 38964493 -> 38888996 (-0.19%)
Latency: 750062328 -> 749913450 (-0.02%); split: -0.02%, +0.00%
InvThroughput: 167408952 -> 167405157 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
a93092d0ed aco: Use s_cbranch_vccz/nz in post-RA optimization.
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.

It works on the following pattern:

vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc

The result looks like this:

vcc = v_cmp ...
p_cbranch vcc

Fossil DB results on Sienna Cichlid:

Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
0e4747d3fb aco: Introduce a new, post-RA optimizer.
This commit adds the skeleton of a new ACO post-RA optimizer,
which is intended to be a simple pass called after RA, and
is meant to do code changes which can only be done
after RA.

It is currently empty, the actual optimizations will be added
in their own commits. It only has a DCE pass, which deletes
some dead code generated by the spiller.

Fossil DB results on Sienna Cichlid:

Totals from 375 (0.25% of 149839) affected shaders:
CodeSize: 2933056 -> 2907192 (-0.88%)
Instrs: 534154 -> 530706 (-0.65%)
Latency: 12088064 -> 12084907 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 4433454 -> 4432421 (-0.02%); split: -0.02%, +0.00%
Copies: 81649 -> 78203 (-4.22%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
6f3c472f2e aco: New writeout overloads for the test framework.
These will be used by future tests.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
8d37aa91d6 aco: Add Operand(Temp, PhysReg) constructor.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Timur Kristóf
4491b94d58 aco: Don't DCE instructions that write non-temps, eg. exec.
No Fossil DB changes.
This commit makes DCE usable after RA.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Samuel Pitoiset
729ebe4b17 aco: fix emitting discard when the program just ends
For fragment shaders that only contain a discard, the exec mask has
to be zero'd and everything discarded.

It seems unnecessary to emit an export here because if the FS has no
exports, the compiler already emits a null export at the end.

Fixes incorrect hair rendering in Detroit: Become Human.

fossil-db (Sienna Cichlid):
Totals from 3 (0.00% of 149839) affected shaders:
CodeSize: 2896 -> 2872 (-0.83%)
Instrs: 556 -> 553 (-0.54%)
Latency: 29266 -> 29214 (-0.18%)
InvThroughput: 3374 -> 3372 (-0.06%)

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10955>
2021-05-26 10:32:59 +00:00
Timur Kristóf
c783293e47 aco: Don't eliminate exec write when it's used by a copy later.
Fixes: bc13049747
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10920>
2021-05-25 13:50:43 +00:00
Daniel Schürmann
32c7d17120 aco: remove condition operand from branch in invert block
As value numbering only handles logical blocks, this
could lead to invalid IR until insert_exec_mask().
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10894>
2021-05-20 17:44:20 +00:00
Timur Kristóf
020c3c403f aco/util: Initialize IDSet::bits_set to zero.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10806>
2021-05-20 17:11:22 +00:00
Timur Kristóf
c4f6e4d6b0 aco/insert_exec_mask: Fixed unused variable warning in release build.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10806>
2021-05-20 17:11:22 +00:00
Samuel Pitoiset
fe2a5716ee aco: fix derivatives/intrinsics with SGPR sources
ds_swizzle_b32 requires a VGPR and DPP can't encode SGPR sources.

Fixes
dEQP-VK.graphicsfuzz.cov-derivative-uniform-vector-global-loop-count.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10840>
2021-05-20 13:24:31 +00:00
Rhys Perry
3013670dfd aco: disallow SGPRs on DPP instructions
They can't be encoded.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10841>
2021-05-19 14:25:37 +00:00
Bas Nieuwenhuizen
c7904b5b9b aco: Implement bvh64_intersect_ray_amd intrinsic.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818>
2021-05-18 23:02:25 +02:00
Bas Nieuwenhuizen
bfe2802188 aco: Add load_sbt_amd intrinsic implementation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767>
2021-05-18 18:29:36 +00:00
Timur Kristóf
bc13049747 aco: Eliminate useless exec writes in jump threading.
Eliminate exec writes which are unused by subsequent instructions.

Fossil DB results on Sienna Cichlid:

Totals from 80960 (54.03% of 149839) affected shaders:
CodeSize: 162953748 -> 161749372 (-0.74%)
Instrs: 31462273 -> 31161179 (-0.96%)
Copies: 2171239 -> 1942293 (-10.54%)
Branches: 807771 -> 807747 (-0.00%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
e230dcc30b aco: Refactor SSA elimination phi info to use vector instead of map.
No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
25a7947da7 aco: Don't use s_and_saveexec with branches when exec is constant.
When exec is constant, we can remember the constant as the old exec,
and just copy the condition and use it as the new exec. There is no
need to save the constant.

Due to using p_parallelcopy which is lowered to s_mov_b64 (or 32),
many exec restores now become copies, hence the increase in the copy
stats.

Fossil DB changes on Sienna Cichlid:

Totals from 73969 (49.37% of 149839) affected shaders:
SpillSGPRs: 1768 -> 1610 (-8.94%)
CodeSize: 99053892 -> 99047884 (-0.01%); split: -0.02%, +0.01%
Instrs: 19372852 -> 19370398 (-0.01%); split: -0.02%, +0.01%
VClause: 515154 -> 515142 (-0.00%); split: -0.00%, +0.00%
SClause: 719236 -> 718395 (-0.12%); split: -0.14%, +0.02%
Copies: 1109770 -> 1254634 (+13.05%); split: -0.07%, +13.12%
Branches: 374338 -> 374348 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1776481 -> 1653761 (-6.91%)

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
c850af936a aco: Remember when exec mask is const, and restore the const then.
Previously, we would store even the constant -1 exec mask from the
beginning of every merged shader. With this change it is no longer
necessary because we can restore to constant exec mask directly.

Hence, this frees up a register pair (single register for Wave32)
in every merged shader.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
04f90db9a0 aco: Use Operand instead of Temp for the exec mask stack.
This will enable us to store non-temporary values,
such as constant operands there.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Timur Kristóf
662bbf6ad4 aco: Determine whether a few more instructions need exec.
These don't really need the exec mask (and never have), but we haven't
needed to include them in needs_exec_mask yet.

No Fossil DB changes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10691>
2021-05-18 11:48:22 +00:00
Rhys Perry
fb31dda909 aco/ra: use flags instead of booleans for update_renames()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10459>
2021-05-17 13:31:07 +00:00
Rhys Perry
6fd6374e27 aco/ra: fix get_reg_for_operand() with vector operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10459>
2021-05-17 13:31:07 +00:00
Rhys Perry
c08bfa110c aco/ra: fix get_reg_for_operand() when the blocking var is a vector
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10459>
2021-05-17 13:31:07 +00:00
Rhys Perry
bc95d55e1f aco/ra: fix get_reg_for_operand() with no free registers
fossil-db (Sienna Cichlid):
Totals from 195 (0.13% of 149839) affected shaders:
CodeSize: 2352160 -> 2356720 (+0.19%); split: -0.00%, +0.20%
Instrs: 431976 -> 433124 (+0.27%); split: -0.00%, +0.27%
Latency: 10174434 -> 10174897 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 4044388 -> 4044425 (+0.00%); split: -0.00%, +0.00%
Copies: 67634 -> 68762 (+1.67%); split: -0.00%, +1.67%

fossil-db (Polaris):
Totals from 186 (0.12% of 151365) affected shaders:
CodeSize: 2272356 -> 2276848 (+0.20%); split: -0.00%, +0.20%
Instrs: 432390 -> 433513 (+0.26%); split: -0.00%, +0.26%
Latency: 13153394 -> 13160194 (+0.05%); split: -0.00%, +0.05%
InvThroughput: 10889509 -> 10889967 (+0.00%); split: -0.00%, +0.00%
SClause: 12745 -> 12747 (+0.02%)
Copies: 74832 -> 75945 (+1.49%); split: -0.01%, +1.50%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10459>
2021-05-17 13:31:07 +00:00
Rhys Perry
4e459df0fc aco/ra: initialize temp_in_scc earlier
We need to know if there's a temporary in SCC before the instruction, not
after.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10459>
2021-05-17 13:31:07 +00:00
Daniel Schürmann
b960169257 aco/ra: also prevent overflow register for p_create_vector operands
Fixes: d659ce0d6c ('aco/ra: prevent underflow register for p_create_vector operands')
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10832>
2021-05-17 11:18:25 +00:00
Connor Abbott
a40714abf7 nir/lower_phis_to_scalar: Add "lower_all" option
We don't want to have to deal with vector phis in freedreno, because
vectors are always split/unsplit around vectorized instructions anyways,
and the stated reason for not scalarising them (it hurting coalescing)
won't apply to us because we won't be using nir_from_ssa. Add this
option so that we don't have to do the equivalent thing while
translating from NIR.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>
2021-05-17 09:59:45 +00:00
Daniel Schürmann
d659ce0d6c aco/ra: prevent underflow register for p_create_vector operands
It could happen that we tested negative out-of-range
registers for p_create_vector operands resulting in a crash.

Fixes: 8962510e38 ('aco/ra: Conservatively refactor get_reg_specified to use PhysRegInterval')
Closes: #4697
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10799>
2021-05-14 17:26:41 +00:00
Tony Wasserka
80ee9d3947 aco/scheduler: Verify register demand invariants in debug mode
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Tony Wasserka
50ba919d37 aco/scheduler: Fix register demand computation for upwards moves
The initial value needs to be taken from the instruction that is being
moved over, not the one to be moved.

Additionally the parameter of this function was removed because it was
misleading. Setting it to any value other than source_idx would cause
register_demand to be initialized incorrectly. (Instead, the maximum
demand among the covered instructions would need to be determined.)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Tony Wasserka
c528af1076 aco/scheduler: Fix register demand computation for downwards moves
Previously, changes in total_demand_clause were not always propagated to
total_demand. For instance, clause moves do not change the local register
demand at the end of a clause, yet they may still affect the total maximum.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 8235bc6411 ("aco: try to group together VMEM loads of the same resource")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4533
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10644>
2021-05-13 15:27:57 +00:00
Daniel Schürmann
c7d679f0f7 aco: relax validation rules for p_reduce dst RegType
By exposing a subgroupSize of 64, reductions with
cluster_size 32 in wave32 might be considered divergent,
and thus, result in a VGPR.

Fixes: dEQP-VK.subgroups.clustered.graphics.subgroupclustered* with wave32

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10769>
2021-05-13 15:10:24 +00:00
Daniel Schürmann
989e9867a6 aco: fix additional register requirements for spilling
It could happen that VGPR spilling without SGPR spilling
calculated a negative spills_to_vgpr number and then
increasing the VGPR target demand above the limit.

Cc: mesa-stable

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10756>
2021-05-12 14:13:24 +00:00
Timur Kristóf
bb127c2130 radv: Use new NIR lowering of NGG GS when ACO is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
9732881729 radv: Use new NGG NIR lowering for VS/TES when ACO is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
89a76ff786 aco: Implement new NGG specific NIR intrinsics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
75a002f809 aco: Split ngg_emit_sendmsg_gs_alloc_req from the wave0 check.
This allows us to emit the gs_alloc_req independently of the
wave ID check, which is what the NIR lowering will need.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
ad8dd39bd3 aco: Fixup the NIR metadata after sanitize_cf_list.
sanitize_cf_list can in fact invalidate the dominance metadata,
which we need to use eg. nir_unsigned_upper_bound.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
00fd087f0a aco: Allow workgroup barrier and shared scope for NGG shaders.
NGG already needs to use workgroup barriers, but this commit allows
them to come from NIR as opposed to just emitting it in ACO
instruction selection.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Rhys Perry
a54f111831 radv,aco: compact vertex buffer descriptors
It seems common for there to be holes.

fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 33791 (23.10% of 146267) affected shaders:
(no statistics changed)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
20a0744e22 Revert "radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2"
This reverts commit a8a6b9fb2f.

This is no longer necessary now that we fixup the size when creating the
descriptors.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
157c6b0f33 radv,aco: use per-attribute vertex descriptors for robustness
We have to use a different num_records for each attribute to correctly
implement robust buffer access.

fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 60059 (41.06% of 146267) affected shaders:
VGPRs: 2169040 -> 2169024 (-0.00%); split: -0.02%, +0.02%
CodeSize: 79473128 -> 81156016 (+2.12%); split: -0.00%, +2.12%
MaxWaves: 1635360 -> 1635258 (-0.01%); split: +0.00%, -0.01%
Instrs: 15559040 -> 15793205 (+1.51%); split: -0.01%, +1.52%
Latency: 90954792 -> 91308768 (+0.39%); split: -0.30%, +0.69%
InvThroughput: 14937873 -> 14958761 (+0.14%); split: -0.04%, +0.18%
VClause: 444280 -> 412074 (-7.25%); split: -9.22%, +1.97%
SClause: 588545 -> 644141 (+9.45%); split: -0.54%, +9.99%
Copies: 1010395 -> 1011232 (+0.08%); split: -0.44%, +0.53%
Branches: 274279 -> 274282 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1431171 -> 1405056 (-1.82%); split: -2.89%, +1.07%
PreVGPRs: 1575253 -> 1575259 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
dfa38fa0c7 aco: group loads from the same vertex binding into the same clause
In the future, we might have vertex attribute loads from the same binding
but with different descriptors. Since they will be loading from the same
buffer, we should continue grouping them into clauses.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Tony Wasserka
741e84f554 aco/spill: Fix improper handling of exec phis
The "continue" was placed in the wrong loop, leading to exec being
counted as a spilled register when it wasn't.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: a56ddca4e8 ('aco: make all exec accesses non-temporaries')
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4533
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10486>
2021-05-03 10:31:07 +00:00
Rhys Perry
ee9b744cb5 radv,aco: use nir_address_format_vec2_index_32bit_offset
The vec2 index helps the compiler make use of SMEM's SOFFSET field when
loading descriptors.

fossil-db (GFX10.3):
Totals from 126326 (86.37% of 146267) affected shaders:
VGPRs: 4898704 -> 4899088 (+0.01%); split: -0.02%, +0.03%
SpillSGPRs: 13490 -> 14404 (+6.78%); split: -1.10%, +7.87%
CodeSize: 306442996 -> 302277700 (-1.36%); split: -1.36%, +0.01%
MaxWaves: 3277108 -> 3276624 (-0.01%); split: +0.01%, -0.02%
Instrs: 58301101 -> 57469370 (-1.43%); split: -1.43%, +0.01%
VClause: 1208270 -> 1199264 (-0.75%); split: -1.02%, +0.28%
SClause: 2517691 -> 2432744 (-3.37%); split: -3.75%, +0.38%
Copies: 3518643 -> 3161097 (-10.16%); split: -10.45%, +0.29%
Branches: 1228383 -> 1228254 (-0.01%); split: -0.12%, +0.11%
PreSGPRs: 3973880 -> 4031099 (+1.44%); split: -0.19%, +1.63%
PreVGPRs: 3831599 -> 3831707 (+0.00%)
Cycles: 1785250712 -> 1778222316 (-0.39%); split: -0.42%, +0.03%
VMEM: 52873776 -> 50663317 (-4.18%); split: +0.18%, -4.36%
SMEM: 8534270 -> 8361666 (-2.02%); split: +1.79%, -3.82%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>
2021-04-27 15:56:07 +00:00
Samuel Pitoiset
4c2add8cba aco: adjust NGG if provoking vertex mode is last
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10449>
2021-04-27 07:31:03 +00:00
James Park
1351fcf3c3 amd: Fix warnings around variable sizes
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162>
2021-04-23 10:37:22 +00:00
Timur Kristóf
74c467d988 aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7.
On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add
instruction, which clobbers the VCC register.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346>
2021-04-22 11:29:49 +00:00
Rhys Perry
776ba40115 aco: add and use Program::progress
This is used when printing the program and to avoid updating register
demand during post-RA liveness analysis.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>
2021-04-21 11:09:33 +00:00