Commit graph

7119 commits

Author SHA1 Message Date
Daniel Schürmann
18ba93e673 aco/cssa: rewrite lower_to_cssa pass
The previous pass was based on misconceptions and
rounded up with bug fixes. The new pass is entirely
rewritten and basically just one-to-one from the paper:
 "Revisiting Out-of-SSA Translation for Correctness, CodeQuality, and Efficiency"
 by B. Boissinot et al.
It also incorporates the value-equality testing.

The regressions are mainly due to creating parallelcopies for
exec phis at loop headers (mitigated in the next commit).

Totals from 4933 (3.61% of 136546) affected shaders (Raven):
SpillSGPRs: 16249 -> 16527 (+1.71%); split: -0.28%, +1.99%
SpillVGPRs: 1771 -> 1595 (-9.94%)
CodeSize: 57544436 -> 58280304 (+1.28%); split: -0.00%, +1.28%
Scratch: 176128 -> 179200 (+1.74%)
Instrs: 11265783 -> 11445884 (+1.60%); split: -0.00%, +1.60%
Latency: 552596156 -> 555880540 (+0.59%); split: -0.53%, +1.13%
InvThroughput: 271431862 -> 273097423 (+0.61%); split: -0.18%, +0.79%
VClause: 160240 -> 160241 (+0.00%); split: -0.02%, +0.02%
SClause: 386863 -> 386685 (-0.05%); split: -0.07%, +0.02%
Copies: 1180801 -> 1345633 (+13.96%); split: -0.02%, +13.98%
Branches: 379129 -> 393052 (+3.67%); split: -0.01%, +3.69%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
9d73a4a412 aco: add new reindex_ssa() pass
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
d75c73e6a6 aco: fix kill flags on phi operands
Fossil-db changes are likely due to how the CSSA pass works.
Totals from 1782 (1.31% of 136546) affected shaders (Raven):
CodeSize: 25333292 -> 25294020 (-0.16%); split: -0.16%, +0.00%
Instrs: 4916059 -> 4908218 (-0.16%); split: -0.16%, +0.00%
Latency: 282860167 -> 282707176 (-0.05%); split: -0.08%, +0.03%
InvThroughput: 136487564 -> 136394958 (-0.07%); split: -0.12%, +0.05%
VClause: 74791 -> 74795 (+0.01%)
Copies: 542115 -> 534280 (-1.45%); split: -1.48%, +0.04%
Branches: 168977 -> 168966 (-0.01%); split: -0.01%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
13e4fed01f aco: lower p_spill with constants correctly
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
4a57787006 aco/spill: use correct next_use_distances at loop header
To decide which variables to spill, we must use the distances at the
beginning of the loop-header, and not the distances at the end of the
loop-preheader. The difference are that the former includes phis which are
viable to be spilled as opposed to the phi operands which would be reloaded
by add_coupling_code(), ending up in potentially too high register pressure
before the loop.

Totals from 206 (0.15% of 136546) affected shaders (Raven):
SpillSGPRs: 5154 -> 5000 (-2.99%)
CodeSize: 3654072 -> 3647184 (-0.19%); split: -0.19%, +0.00%
Instrs: 701482 -> 700526 (-0.14%); split: -0.14%, +0.00%
Latency: 40988780 -> 40872506 (-0.28%); split: -0.29%, +0.00%
InvThroughput: 20364560 -> 20306006 (-0.29%)
SClause: 20192 -> 20198 (+0.03%)
Copies: 77732 -> 77688 (-0.06%); split: -0.08%, +0.03%
Branches: 24204 -> 24050 (-0.64%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
b56ea19111 aco/spill: refactor live-in registerDemand calculation
This also fixes some hypothetical issue for loops without phis
and for loops with higher register pressure at the end of the
loop preheader.

No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
282eacc3e0 aco/spill: refactor some more spill decision taking
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
dfb10e4f4b aco/spill: don't count phis as variable access
This increases the chance of evicting phis
if these have longer next-use distances.

Totals from 6 (0.00% of 146267) affected shaders (Navi10):
CodeSize: 476992 -> 464388 (-2.64%)
Instrs: 81785 -> 79952 (-2.24%)
VClause: 2380 -> 2374 (-0.25%)
Copies: 26836 -> 25131 (-6.35%)
Branches: 2494 -> 2492 (-0.08%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
b2a6346df7 aco/spill: spill phi constants and exec directly to VGPR
This lets us avoid some CSSA copies.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
99936d7142 aco/spill: reload spilled exec masks directly to exec
This handles the case of
   exec = p_linear_phi %a, %b
where %a or %b might have been spilled.
By directly reloading these variables into the exec mask register,
we can avoid additional CSSA parallelcopies.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Daniel Schürmann
beb292343a aco/spill: refactor spill decision taking
No fossil-db changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Bas Nieuwenhuizen
61a1a385d3 radv: Re-enable retiling.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10037>
2021-04-13 12:08:24 +00:00
Samuel Pitoiset
3720c6a6f6 radv: fix needed dynamic state for VRS
If the pipeline struct isn't found, the state might still be dynamic.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10193>
2021-04-13 08:48:42 +02:00
Marek Olšák
3120113ee7 radeonsi: implement DCC MSAA 4x/8x fast clear using DCC equations on gfx9
MSAA 4x and 8x should only clear the first 2 samples because other samples
are uncompressed. The compute shader only clears that subset of DCC.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
7e68fae25f ac,radeonsi: rewrite DCC retiling without the DCC retile map
The retile map is removed and replaced by direct DCC address computations
in the retile shader using the new function ac_nir_dcc_addr_from_coord.

The RADV code is disabled.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
35adf91de7 ac/surface: limit the number of swizzle modes that can have displayable DCC
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
5ce8c440dd ac/surface: add a test of DccAddrFromCoord prototype outside of addrlib
The test takes over 2 minutes on a 12C/24T CPU with OpenMP.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
df2cbdd2e3 amd/addrlib: expose DCC address equations to drivers
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
8771d45a74 ac/surface/tests: fix a random segfault in the modifier test
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
23b2cf032a ac/surface/tests: test Sienna Cichlid and Navy Flounder
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák
1b3dbde3b9 ac/surface: only apply the 3D swizzle mode tuning to gfx10+
This fixes an addrlib failure on gfx9.

Fixes: b43f40166c "ac/surface: select best swizzle mode for 3D sampler performance"

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
James Park
f3bd5f9590 amd: Hide drm_fourcc.h on Windows
Declare missing definitions instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9708>
2021-04-13 00:24:02 +00:00
James Park
7dfc9e4431 amd: Hide amdgpu_drm.h on Windows
Declare missing definitions instead.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9708>
2021-04-13 00:24:02 +00:00
James Park
1aeebac4e6 ac/rgp: BSD elf library compatibility
Allow compilation on Windows using modified BSD elf library.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9210>
2021-04-12 22:50:52 +00:00
Marek Olšák
faf10bd49d ac/surface: use named "color and "zs" structures in unions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
468836317b ac/surface: unify htile_* and dcc_* fields as meta_* fields
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
34fd07aa56 ac/surface: unify htile_levels and dcc_levels as meta_levels
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
b16c7b706f ac/surface: pack radeon_surf better
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
988f148db3 ac/surface: overlap color and Z/S fields using a union in gfx9_surf_layout
to save space

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
ab00557685 ac/surface: pack alignments by storing log2 in radeon_surf
Only one bit is set in alignments, so store the bit offset (log2) and
change the type from uint32_t to uint8_t.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
cb016bb600 ac/surface: pack radeon_surf::num_htile_levels better
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
8837d1d833 ac/surface: pack gfx9_surf_layout:resource_type better to save 8 bytes
Yes, this saves 8 bytes. See pahole for yourself.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
4bd5f62966 ac/surface: pack gfx9_surf_meta_flags better
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
3f074d2f45 ac/surface: inline and reorder gfx9_surf_flags for better packing
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
48dbdc62bf ac/surface: change legacy_surf_level::offset to 32-bit offset_256B shifted by 8
Images are always aligned to 256B (enforced by register and descriptor
fields) and limited to 40-bit addresses. This saves some space.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
bbda20bf29 ac/surface: overlap color and Z/S fields using a union in legacy_surf_layout
to save space

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák
2fd8018845 ac/surface: split dcc level info from surface_info to save space
stencil level info doesn't have DCC

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Leo Liu
75a725e4c5 ac: add function for querying video capabilities
It will be used to query caps of decode and encode
for hardware AMDGPU supports

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10095>
2021-04-12 17:33:32 +00:00
Rhys Perry
723b000d27 radv: don't use fp16 for 8-bit division lowering before GFX9
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Rhys Perry
d8f12fd421 aco: fix 16-bit f2{u8,i8} on GFX6/7
Not really tested.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Rhys Perry
d0e15b8c22 aco: fix 16-bit u2f32
This shouldn't sign-extend.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Rhys Perry
a2619b97f5 nir/lower_idiv: add options to use fp32 for 8-bit division lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:46 +00:00
Rhys Perry
7db8d307bc radv: remove second nir_lower_idiv
nir_lower_idiv now lowers 8/16-bit divisions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
2021-04-12 16:19:45 +00:00
Bas Nieuwenhuizen
fcd8aaf07a radv: Update editorconfig.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10091>
2021-04-10 03:31:58 +02:00
Bas Nieuwenhuizen
59c501ca35 radv: Format.
Using

find ./src/amd/vulkan -regex '.*/.*\.\(c\|h\|cpp\)' | xargs -P8 -n1 clang-format --style=file -i

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10091>
2021-04-10 03:31:58 +02:00
Bas Nieuwenhuizen
8451b41022 radv: Add clang-format for AMD code.
Copied from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8883
with increased colum width as 80 really made a mess.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10091>
2021-04-10 03:31:32 +02:00
Samuel Pitoiset
1ad295ed6f radv: allow to force VRS rates on GFX10.3 with RADV_FORCE_VRS
This allows to force the VRS rates via RADV_FORCE_VRS, the supported
values are 2x2, 1x2 and 2x1. This supports the primitive shading rate
mode for non GUI elements.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7794>
2021-04-09 14:47:53 +02:00
Samuel Pitoiset
549f41754a radv: use explicit VRS mode when configuring PA_CL_VRS_CNTL
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7794>
2021-04-09 14:47:52 +02:00
Samuel Pitoiset
b4a66b29cd radv: add MSAA support to CopyImage() on compute queue
CopyImage supports copying MSAA images if the number of samples match.
Found by inspection because this is untested by CTS for some reasons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10055>
2021-04-09 07:05:24 +00:00
Samuel Pitoiset
4caf19eb7d radv: do not clamp framebuffer dimensions to the minimum dimension
This shouldn't be needed and this is going to be wrong with VRS
attachments because dimensions are divided by the VRS texel size.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10111>
2021-04-09 08:32:05 +02:00