Commit graph

645 commits

Author SHA1 Message Date
Daniel Schürmann
1e2639026f aco: Format.
Manually adjusted some comments for more intuitive line breaks.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>
2021-07-12 21:27:31 +00:00
Samuel Pitoiset
ee79b87c62 radv: lower primitive shading rate in NIR
This allows more potential compiler optimizations if the value is a
constant or from a scalar load.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11579>
2021-07-12 17:54:07 +00:00
Daniel Schürmann
0eea0e55ad aco: add 'common/' and 'llvm/' prefix to #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Daniel Schürmann
59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Samuel Pitoiset
543eb42c35 aco: use nir_ssa_def_is_unused() to determine if atomic dest is used
Instead of duplicating this chunk everywhere.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11793>
2021-07-09 12:12:57 +00:00
Samuel Pitoiset
74a221bcfd aco: fix shared_atomic_comp_swap if the second source isn't a VGPR
Only VGPRs are valid with DS instructions.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11777>
2021-07-08 10:41:14 +00:00
Rhys Perry
a9c4a31d8d aco: handle NIR loops without breaks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11626>
2021-07-01 10:01:52 +00:00
Rhys Perry
c094765a01 aco: remove resource flags
After disabling SMEM stores, nir_opt_access() now does the same analysis
and we don't need this anymore. Doing it in isel is also too late if we
want to lower descriptor loads in NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11652>
2021-06-30 19:07:12 +01:00
Timur Kristóf
e6bf5cfe59 aco/gfx10: Emit barrier at the start of NGG VS and TES.
The Navi 1x NGG hardware can hang in certain conditions when
not every wave launched before s_sendmsg(GS_ALLOC_REQ).

As a workaround, to ensure this never happens, let's emit a
workgroup barrier at the beginning of NGG VS and TES.
Note that NGG GS already has a workgroup barrier so it doesn't
need this.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837>
2021-06-22 14:32:27 +00:00
Timur Kristóf
f9447abb36 aco/gfx10: NGG zero output workaround for conservative rasterization.
Navi 1x GPUs have an issue: they can hang when the output vertex
and primitive counts are zero. The workaround is exporting a dummy
triangle.

This commit changes the dummy triangle's vertex so its positions
are all NaN. This should make sure the triangle is never rendered.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10837>
2021-06-22 14:32:27 +00:00
Jason Ekstrand
f0f713960b nir,amd: Suffix nir_op_cube_face_coord/index with _amd
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>
2021-06-21 09:03:34 -05:00
Rhys Perry
1d50ef9ca6 aco: adjust the condition for expanding vertex fetch data format
Instead of avoiding out-of-bounds access, avoid creating a load larger
than the original attribute. This should work just as well, since the only
situations expending a load helped was because we shrunk it first.

Also fixes a bug where a 3 component load (4 components with the first
component skipped) would be incorrectly expanded to 4 components because
the stride check would never be performed. Maybe we should avoid skipping
the first component in some situations, but I'm not sure if it's worth
the VGPR cost.

fossil-db (vega10):
Totals from 583 (0.39% of 149974) affected shaders:
CodeSize: 1496848 -> 1500868 (+0.27%); split: -0.03%, +0.30%
Instrs: 286155 -> 286575 (+0.15%); split: -0.07%, +0.22%
Latency: 2947101 -> 2946865 (-0.01%); split: -0.23%, +0.22%
InvThroughput: 797396 -> 797127 (-0.03%); split: -0.08%, +0.04%

fossil-db (polaris10):
Totals from 583 (0.39% of 151365) affected shaders:
SGPRs: 38880 -> 39216 (+0.86%)
VGPRs: 24440 -> 24356 (-0.34%)
CodeSize: 1506808 -> 1510876 (+0.27%); split: -0.01%, +0.28%
Instrs: 288735 -> 289167 (+0.15%); split: -0.06%, +0.21%
Latency: 2963263 -> 2961884 (-0.05%); split: -0.24%, +0.19%
InvThroughput: 802351 -> 801665 (-0.09%); split: -0.12%, +0.04%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>
2021-06-14 09:48:32 +00:00
Rhys Perry
91f8f82806 radv,aco: use all attributes in a binding to obtain an alignment for fetch
Instead of assuming scalar alignment for an attribute, we can use the
required alignment of other attributes in a binding to expect a higher
one.

This uses the alignment of all attributes in the pipeline, not just the
ones loaded. This can create slightly better code, but could break
pipelines which relied on unused (and unaligned) attributes no being
loaded. I don't think such pipelines are allowed by the spec.

fossil-db (Sienna Cichlid):
Totals from 44350 (30.32% of 146267) affected shaders:
VGPRs: 1694464 -> 1700616 (+0.36%); split: -0.08%, +0.44%
CodeSize: 60207184 -> 58093836 (-3.51%); split: -3.51%, +0.00%
MaxWaves: 1175998 -> 1174948 (-0.09%); split: +0.02%, -0.11%
Instrs: 11763444 -> 11458952 (-2.59%); split: -2.60%, +0.01%
Latency: 70679612 -> 67062215 (-5.12%); split: -5.27%, +0.15%
InvThroughput: 11482495 -> 11362911 (-1.04%); split: -1.20%, +0.16%
VClause: 359459 -> 343248 (-4.51%); split: -6.36%, +1.85%
SClause: 422404 -> 419229 (-0.75%); split: -1.17%, +0.42%
Copies: 754384 -> 764368 (+1.32%); split: -1.74%, +3.06%
Branches: 197472 -> 197474 (+0.00%); split: -0.03%, +0.03%
PreVGPRs: 1215348 -> 1215503 (+0.01%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9007>
2021-06-14 09:48:32 +00:00
Rhys Perry
9162963f0a aco: fix emit_mbcnt() with a VGPR mask
Found by inspection. Should be possible with nir_intrinsic_mbcnt_amd.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11295>
2021-06-10 11:21:47 +00:00
Timur Kristóf
18337fbcf2 aco: Use as_vgpr for the second source of mbcnt_amd.
Fixes: 1e49018ced
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11292>
2021-06-10 10:13:02 +00:00
Timur Kristóf
1e49018ced amd: Add extra source to the mbcnt_amd NIR intrinsic.
The v_mbcnt instructions can take an extra source that they add to
the result. This is not exposed in SPIR-V but we now expose it in NIR.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
ce141e4c5f aco: Implement byte and lane permute intrinsics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
fd6605367d aco: Implement nir_op_sad_u8x4.
Fix up the operand size for v_sad instructions, and implement
the new NIR horizontal add. There is no viable way to do this
in SALU, so let's always use a VGPR destination.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Rhys Perry
c129ede523 aco: use ds_read_{u8,u16}_d16
This allows partial writes and writes to the upper half of the destination.

fossil-db (Sienna Cichlid):
Totals from 135 (0.09% of 149839) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Rhys Perry
6334d73fc9 aco: don't ever widen 8/16-bit sgpr load_shared
Doesn't seem to create incorrect code, but it is suboptimal.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Rhys Perry
4870d7d829 aco: use v1b/v2b for ds_read_u8/ds_read_u16
The p_extract_vector isn't necessary.

For ds_read_u8 and ds_read_u16, we used a 32-bit regclass, but did't load
32 bits, and used dst_hint for vector loads when we shouldn't have.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Samuel Pitoiset
3761d994f6 aco: fix range checking for SSBO loads/stores with SGPR offset on GFX6-7
GFX6-7 are affected by a hw bug that prevents address clamping to work
correctly when the SGPR offset is used. Use the VGPR offset to fix it.

Fixes various hangs with dEQP-VK.robustness.robustness2.* on Bonaire.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11238>
2021-06-09 06:40:16 +00:00
Rhys Perry
daa329f664 aco: use byte/word extract pseudo-instructions
fossil-db (Sienna Cichild):
Totals from 1890 (1.26% of 149839) affected shaders:
CodeSize: 5104196 -> 5104300 (+0.00%); split: -0.00%, +0.01%
Latency: 11572943 -> 11572880 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4876941 -> 4876982 (+0.00%); split: -0.00%, +0.00%
SClause: 26774 -> 26775 (+0.00%)
Copies: 125778 -> 125813 (+0.03%)
PreSGPRs: 56452 -> 56451 (-0.00%)

fossil-db (Polaris):
Totals from 1884 (1.24% of 151365) affected shaders:
CodeSize: 3849340 -> 3849312 (-0.00%); split: -0.00%, +0.00%
Instrs: 741391 -> 741382 (-0.00%)
Latency: 13533815 -> 13533439 (-0.00%)
InvThroughput: 12058777 -> 12058500 (-0.00%)
Copies: 120890 -> 120891 (+0.00%)
PreSGPRs: 48940 -> 48939 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
1f2518ef9f aco: implement nir_op_extract/nir_op_insert
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Caio Marcelo de Oliveira Filho
c8a7bd0dc8 nir: Rename WORK_GROUP (and similar) to WORKGROUP
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Daniel Schürmann
dc807dff3e radv,aco: scalarize all phis via nir_lower_phis_to_scalar()
This allows to remove some ACO code which did so previously.

Totals from 93 (0.06% of 149839) affected shaders (Navi2):
CodeSize: 582424 -> 582348 (-0.01%); split: -0.10%, +0.08%
Instrs: 107083 -> 107011 (-0.07%); split: -0.08%, +0.01%
Latency: 483338 -> 484881 (+0.32%); split: -0.09%, +0.40%
InvThroughput: 101129 -> 101532 (+0.40%); split: -0.03%, +0.42%
Copies: 9893 -> 9774 (-1.20%); split: -1.28%, +0.08%
Branches: 2862 -> 2858 (-0.14%)
PreSGPRs: 3342 -> 3339 (-0.09%)
PreVGPRs: 4567 -> 4565 (-0.04%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11181>
2021-06-04 16:47:01 +00:00
Rhys Perry
903f814b78 aco: don't create 4 and 5 dword NSA instructions on GFX10
"stability issues", apparently: https://reviews.llvm.org/D103348

fossil-db (Navi10):
Totals from 4512 (3.01% of 149839) affected shaders:
VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c92 ("aco: use non-sequential addressing")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10898>
2021-06-03 03:49:07 +00:00
Timur Kristóf
a93092d0ed aco: Use s_cbranch_vccz/nz in post-RA optimization.
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.

It works on the following pattern:

vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc

The result looks like this:

vcc = v_cmp ...
p_cbranch vcc

Fossil DB results on Sienna Cichlid:

Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
2021-05-28 12:14:53 +00:00
Samuel Pitoiset
729ebe4b17 aco: fix emitting discard when the program just ends
For fragment shaders that only contain a discard, the exec mask has
to be zero'd and everything discarded.

It seems unnecessary to emit an export here because if the FS has no
exports, the compiler already emits a null export at the end.

Fixes incorrect hair rendering in Detroit: Become Human.

fossil-db (Sienna Cichlid):
Totals from 3 (0.00% of 149839) affected shaders:
CodeSize: 2896 -> 2872 (-0.83%)
Instrs: 556 -> 553 (-0.54%)
Latency: 29266 -> 29214 (-0.18%)
InvThroughput: 3374 -> 3372 (-0.06%)

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10955>
2021-05-26 10:32:59 +00:00
Daniel Schürmann
32c7d17120 aco: remove condition operand from branch in invert block
As value numbering only handles logical blocks, this
could lead to invalid IR until insert_exec_mask().
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10894>
2021-05-20 17:44:20 +00:00
Samuel Pitoiset
fe2a5716ee aco: fix derivatives/intrinsics with SGPR sources
ds_swizzle_b32 requires a VGPR and DPP can't encode SGPR sources.

Fixes
dEQP-VK.graphicsfuzz.cov-derivative-uniform-vector-global-loop-count.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10840>
2021-05-20 13:24:31 +00:00
Bas Nieuwenhuizen
c7904b5b9b aco: Implement bvh64_intersect_ray_amd intrinsic.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818>
2021-05-18 23:02:25 +02:00
Bas Nieuwenhuizen
bfe2802188 aco: Add load_sbt_amd intrinsic implementation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767>
2021-05-18 18:29:36 +00:00
Timur Kristóf
bb127c2130 radv: Use new NIR lowering of NGG GS when ACO is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
9732881729 radv: Use new NGG NIR lowering for VS/TES when ACO is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
89a76ff786 aco: Implement new NGG specific NIR intrinsics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
75a002f809 aco: Split ngg_emit_sendmsg_gs_alloc_req from the wave0 check.
This allows us to emit the gs_alloc_req independently of the
wave ID check, which is what the NIR lowering will need.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Timur Kristóf
00fd087f0a aco: Allow workgroup barrier and shared scope for NGG shaders.
NGG already needs to use workgroup barriers, but this commit allows
them to come from NIR as opposed to just emitting it in ACO
instruction selection.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>
2021-05-12 13:47:04 +00:00
Rhys Perry
a54f111831 radv,aco: compact vertex buffer descriptors
It seems common for there to be holes.

fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 33791 (23.10% of 146267) affected shaders:
(no statistics changed)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
20a0744e22 Revert "radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2"
This reverts commit a8a6b9fb2f.

This is no longer necessary now that we fixup the size when creating the
descriptors.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
157c6b0f33 radv,aco: use per-attribute vertex descriptors for robustness
We have to use a different num_records for each attribute to correctly
implement robust buffer access.

fossil-db (GFX10.3, robustBufferAccess enabled):
Totals from 60059 (41.06% of 146267) affected shaders:
VGPRs: 2169040 -> 2169024 (-0.00%); split: -0.02%, +0.02%
CodeSize: 79473128 -> 81156016 (+2.12%); split: -0.00%, +2.12%
MaxWaves: 1635360 -> 1635258 (-0.01%); split: +0.00%, -0.01%
Instrs: 15559040 -> 15793205 (+1.51%); split: -0.01%, +1.52%
Latency: 90954792 -> 91308768 (+0.39%); split: -0.30%, +0.69%
InvThroughput: 14937873 -> 14958761 (+0.14%); split: -0.04%, +0.18%
VClause: 444280 -> 412074 (-7.25%); split: -9.22%, +1.97%
SClause: 588545 -> 644141 (+9.45%); split: -0.54%, +9.99%
Copies: 1010395 -> 1011232 (+0.08%); split: -0.44%, +0.53%
Branches: 274279 -> 274282 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 1431171 -> 1405056 (-1.82%); split: -2.89%, +1.07%
PreVGPRs: 1575253 -> 1575259 (+0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
dfa38fa0c7 aco: group loads from the same vertex binding into the same clause
In the future, we might have vertex attribute loads from the same binding
but with different descriptors. Since they will be loading from the same
buffer, we should continue grouping them into clauses.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>
2021-05-10 12:09:14 +00:00
Rhys Perry
ee9b744cb5 radv,aco: use nir_address_format_vec2_index_32bit_offset
The vec2 index helps the compiler make use of SMEM's SOFFSET field when
loading descriptors.

fossil-db (GFX10.3):
Totals from 126326 (86.37% of 146267) affected shaders:
VGPRs: 4898704 -> 4899088 (+0.01%); split: -0.02%, +0.03%
SpillSGPRs: 13490 -> 14404 (+6.78%); split: -1.10%, +7.87%
CodeSize: 306442996 -> 302277700 (-1.36%); split: -1.36%, +0.01%
MaxWaves: 3277108 -> 3276624 (-0.01%); split: +0.01%, -0.02%
Instrs: 58301101 -> 57469370 (-1.43%); split: -1.43%, +0.01%
VClause: 1208270 -> 1199264 (-0.75%); split: -1.02%, +0.28%
SClause: 2517691 -> 2432744 (-3.37%); split: -3.75%, +0.38%
Copies: 3518643 -> 3161097 (-10.16%); split: -10.45%, +0.29%
Branches: 1228383 -> 1228254 (-0.01%); split: -0.12%, +0.11%
PreSGPRs: 3973880 -> 4031099 (+1.44%); split: -0.19%, +1.63%
PreVGPRs: 3831599 -> 3831707 (+0.00%)
Cycles: 1785250712 -> 1778222316 (-0.39%); split: -0.42%, +0.03%
VMEM: 52873776 -> 50663317 (-4.18%); split: +0.18%, -4.36%
SMEM: 8534270 -> 8361666 (-2.02%); split: +1.79%, -3.82%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>
2021-04-27 15:56:07 +00:00
Samuel Pitoiset
4c2add8cba aco: adjust NGG if provoking vertex mode is last
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10449>
2021-04-27 07:31:03 +00:00
James Park
1351fcf3c3 amd: Fix warnings around variable sizes
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162>
2021-04-23 10:37:22 +00:00
Timur Kristóf
74c467d988 aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7.
On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add
instruction, which clobbers the VCC register.

Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346>
2021-04-22 11:29:49 +00:00
Rhys Perry
0eaa5dfac0 aco: remove image parameter from get_sampler_desc()
We can just check whether tex_instr is NULL instead.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>
2021-04-20 17:42:21 +00:00
Rhys Perry
3cbe9894f7 aco: set TRUNC_COORD=0 for nir_texop_tg4
Fixes black squares in Assassin's Creed: Valhalla and rendering of
FidelityFX-CACAO demo.

fossil-db (sienna cichlid):
Totals from 3052 (2.09% of 146267) affected shaders:
SpillSGPRs: 8437 -> 8646 (+2.48%)
CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40%
Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29%
Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05%
InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03%
VClause: 92114 -> 92132 (+0.02%)
SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01%
Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61%
Branches: 219629 -> 219635 (+0.00%)
PreSGPRs: 248970 -> 249366 (+0.16%)

fossil-db (polaris10):
Totals from 3050 (2.06% of 147787) affected shaders:
SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02%
VGPRs: 242572 -> 242612 (+0.02%)
SpillSGPRs: 10387 -> 10675 (+2.77%)
CodeSize: 31872460 -> 31996128 (+0.39%)
MaxWaves: 10924 -> 10925 (+0.01%)
Instrs: 6222217 -> 6239072 (+0.27%)
Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09%
InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06%
VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01%
SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00%
Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41%
Branches: 219698 -> 219703 (+0.00%)
PreSGPRs: 244251 -> 244644 (+0.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 58f25098a0 ("radv: Use TRUNC_COORD on samplers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>
2021-04-20 17:42:21 +00:00
Samuel Pitoiset
9434675d60 aco: fix opquantize2f16 on GFX6-7
Make sure to preserve signed zeroes.

Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
on GFX6 (Pitcairn). Untested on GFX7.

Fixes: 54a09545ec ("aco: optimize a*0.0")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319>
2021-04-19 16:33:37 +00:00
Marek Olšák
ec1ddb976a amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>
2021-04-17 02:37:49 +00:00