Commit graph

7447 commits

Author SHA1 Message Date
Daniel Stone
d0e5203855 ci/lava: Use per-job rootfs overlay for environment
Trying to get arbitrary strings suitably quoted for shell, embedded in a
YAML file, processed by Python templating, is like seven bad ideas all
embedded into one big can of bees.

Reuse the same script we use for bare-metal to generate the environment,
tar that up into a per-job overlay which is added to the
inter-pipeline-reusable rootfs built by the container jobs and the
intra-pipeline-reusable overlay built by the build jobs.

@anholt wrote a chunk of this - replacing the $ENV_VARS GitLab CI
variable with a Python loop across the POSIX job environment - in
!11192, but this still had YAML quoting nightmares, and was more
needless duplication between LAVA and bare-metal.

The diff is large and annoying, but is mostly a sed job to get
ENV_VARS="FOO=bar BAZ=quux" into FOO: bar\nBAZ: quux.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Co-authored-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
2021-06-11 12:13:00 +00:00
Daniel Schürmann
bb1c06343d aco/ra: refactor register assignment for vector operands
No functional changes.

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>
2021-06-11 12:35:46 +02:00
Daniel Schürmann
09b99f1b7c aco/ra: refactor affinity coalescing
Also adds v_interp_p2_f32 to the list of
affinity-related instructions.

Totals from 68 (0.05% of 149839) affected shaders (GFX10.3):
CodeSize: 792928 -> 792056 (-0.11%)
Instrs: 152843 -> 152625 (-0.14%)
Latency: 1235353 -> 1235278 (-0.01%)
InvThroughput: 224087 -> 224049 (-0.02%)
Copies: 9218 -> 9000 (-2.36%)

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>
2021-06-11 12:35:31 +02:00
Daniel Schürmann
3a98f484d1 aco/ra: only create phi-affinities for killed operands
If a phi-operand is not killed, it must be copied anyway.
The additional affinity would only overwrite any potential
better affinity that was already created

Totals from 1067 (0.71% of 149839) affected shaders (GFX10.3):
VGPRs: 68072 -> 68064 (-0.01%)
CodeSize: 8252588 -> 8245220 (-0.09%); split: -0.12%, +0.03%
Instrs: 1596146 -> 1593941 (-0.14%); split: -0.16%, +0.02%
Latency: 18828176 -> 18823914 (-0.02%); split: -0.08%, +0.06%
InvThroughput: 3575063 -> 3574787 (-0.01%); split: -0.05%, +0.04%
VClause: 24345 -> 24325 (-0.08%); split: -0.16%, +0.07%
Copies: 88712 -> 87398 (-1.48%); split: -1.77%, +0.29%
Branches: 52067 -> 51364 (-1.35%); split: -1.38%, +0.03%

Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8764>
2021-06-11 12:35:12 +02:00
Georg Lehmann
d3f735a249 ac: Enable 32bit predication on gfx9 with fw feature version 52.
Amdvlk does this as well and it passes the vulkan CTS on renoir.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>
2021-06-11 06:07:10 +00:00
Georg Lehmann
fc437ef944 ac: Enable 32bit predication on gfx10.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>
2021-06-11 06:07:10 +00:00
Georg Lehmann
a41ba20cbd ac: Check me_fw_feature for 32bit predication on gfx10.3
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11297>
2021-06-11 06:07:10 +00:00
Samuel Pitoiset
4026a07e74 radv: fix aligning the image offset by using align64()
This doesn't fix anything known. Found by inspection.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11302>
2021-06-11 07:35:32 +02:00
Rhys Perry
6204e17b44 radv: increase maxComputeSharedMemorySize
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11262>
2021-06-10 12:55:53 +00:00
Rhys Perry
9162963f0a aco: fix emit_mbcnt() with a VGPR mask
Found by inspection. Should be possible with nir_intrinsic_mbcnt_amd.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11295>
2021-06-10 11:21:47 +00:00
Timur Kristóf
18337fbcf2 aco: Use as_vgpr for the second source of mbcnt_amd.
Fixes: 1e49018ced
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11292>
2021-06-10 10:13:02 +00:00
Samuel Pitoiset
3a643a9ce1 ci: add expected list of failures for Bonaire (RADV)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11239>
2021-06-10 09:36:55 +00:00
Samuel Pitoiset
cfe7e81214 radv/winsys: add a small comment explaining the CHAIN bit
Without it the hardware launches an IB2 which might hang in some
rare situations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11214>
2021-06-10 08:31:11 +02:00
Samuel Pitoiset
a234840e60 radv: do not launch an IB2 for secondary cmdbuf with INDIRECT_MULTI on GFX7
It's illegal to emit DRAW_{INDEX}_INDIRECT_MULTI from an IB2 on GFX7.

PAL applies this workaround for indirect dispatches and also on
GFX8-9 but it doesn't seem needed.

This fixes various GPU hangs on Bonaire (GFX7).

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11214>
2021-06-10 08:31:08 +02:00
Timur Kristóf
1e49018ced amd: Add extra source to the mbcnt_amd NIR intrinsic.
The v_mbcnt instructions can take an extra source that they add to
the result. This is not exposed in SPIR-V but we now expose it in NIR.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
f6b2db298f ac/nir: Refactor and optimize the repacking sequence.
According to feedback, the terminology with "exclusive scan"
and "reduction" is difficult. Change it to use "repack" instead,
which better fits what this sequence is actually used for.

The new sequence stores only 1 byte / wave to LDS, and uses packed
instructions to produce the results. This has lower latency and
fewer instructions than what we previously had.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
b4e22eb482 aco: Keep VGPR destinations for uniform shared loads when beneficial.
When the result of these loads is only used by cross-lane instructions,
it is beneficial to use a VGPR destination. This is because this allows
to put the s_waitcnt further down, which decreases latency.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
ce141e4c5f aco: Implement byte and lane permute intrinsics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
5713e059ea aco: Add validation for v_permlane instructions.
Previously there hasn't been any validation for these instructions,
but after shooting myself in the leg with it a few times, I decided
to add the validation now.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
fd6605367d aco: Implement nir_op_sad_u8x4.
Fix up the operand size for v_sad instructions, and implement
the new NIR horizontal add. There is no viable way to do this
in SALU, so let's always use a VGPR destination.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Timur Kristóf
228169c87c aco: Add note about v_alignbyte in the ISA README.
We tried to use this instruction for a more optimal sequence,
but it turned out that it doesn't exactly work as it was
supposed to. This note is to help others who want to use it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>
2021-06-09 16:48:51 +00:00
Rhys Perry
c129ede523 aco: use ds_read_{u8,u16}_d16
This allows partial writes and writes to the upper half of the destination.

fossil-db (Sienna Cichlid):
Totals from 135 (0.09% of 149839) affected shaders:

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Rhys Perry
6334d73fc9 aco: don't ever widen 8/16-bit sgpr load_shared
Doesn't seem to create incorrect code, but it is suboptimal.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Rhys Perry
d2b9c7e982 radv: improve LDS alignment check for load/store vectorization
Previously, this could vectorize two scalar 16-bit loads into a u8vec4
load.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Rhys Perry
4870d7d829 aco: use v1b/v2b for ds_read_u8/ds_read_u16
The p_extract_vector isn't necessary.

For ds_read_u8 and ds_read_u16, we used a 32-bit regclass, but did't load
32 bits, and used dst_hint for vector loads when we shouldn't have.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4863
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11113>
2021-06-09 12:06:50 +00:00
Samuel Pitoiset
2fb436e92a ci: update list of expected failures for Pitcairn/Oland (RADV)
The robustness2 failures were a mistake because they are actually
not supported (no VK_EXT_scalar_block_layout on GFX6).

The sparse related failures are no longer supported since sparse
is only enabled for Polaris10+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11243>
2021-06-09 11:27:44 +00:00
Samuel Pitoiset
d169dad393 aco: fix emitting literal offsets with SMEM on GFX7
When the offset is negative, reg() isn't 255. Fix this by splitting
SGPR and literal emission. While we are at it, adjust a comment
saying that literals are also accepted on GFX6 which is wrong.

Fixes another batch of robustness tests.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11247>
2021-06-09 11:10:38 +00:00
Samuel Pitoiset
13efad3086 radv: dump SPIR-V instead of using spirv-dis when generating a hang report
Useful when spirv-dis isn't found.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11034>
2021-06-09 10:07:17 +00:00
Georg Lehmann
3149eccc1c radv: Implement VK_EXT_global_priority_query.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11215>
2021-06-09 08:25:25 +00:00
Samuel Pitoiset
3761d994f6 aco: fix range checking for SSBO loads/stores with SGPR offset on GFX6-7
GFX6-7 are affected by a hw bug that prevents address clamping to work
correctly when the SGPR offset is used. Use the VGPR offset to fix it.

Fixes various hangs with dEQP-VK.robustness.robustness2.* on Bonaire.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11238>
2021-06-09 06:40:16 +00:00
Caio Marcelo de Oliveira Filho
8af6766062 nir: Move workgroup_size and workgroup_variable_size into common shader_info
Move it out the "cs" sub-struct, since these will be used for other
shader stages in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
2021-06-08 09:23:55 -07:00
Caio Marcelo de Oliveira Filho
b5f6fc442c nir: Move zero_initialize_shared_memory into common shader_info
Move it out the "cs" sub-struct, since the bit will be used for other
shader stages in the future.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
2021-06-08 09:23:55 -07:00
Tony Wasserka
3b81f53e34 aco/ra: Split print_regs by lines of 64 registers
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517>
2021-06-08 17:03:08 +02:00
Tony Wasserka
69584478c9 aco/ra: Clean up print_regs output and support byte-allocated variables
Example output:
       00 03 06 09 12 15 18 21 24 27 30 33 36 39 42
sgprs: ·▉█▉███▉▉█··████···········▉████············

       00 03 06 09 12 15 18 21 24 27 30 33 36 39 42
vgprs: ▉▉··▉▉▉▉▘▀▉▉▉···▉▘▘▉▉▉▉···▉▉▉▀▀▉············

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517>
2021-06-08 17:03:08 +02:00
Tony Wasserka
5bfef2de66 aco/ra: Fix off-by-one-error in print_regs
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 3675aefa84 ("aco/ra: Fix build with print_regs enabled")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10517>
2021-06-08 17:03:08 +02:00
Leo Liu
43c04ab2b4 radeonsi: separate video hw info based on HW engine individually
This removes previous "has_hw_decode" and "uvd_enc_supported" and
makes information more accuate for cases where HW decode, HW encode,
and HW JPEG decode might partially available.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11201>
2021-06-08 09:32:48 -04:00
Samuel Pitoiset
9f7e63e12a ac/debug: fix color printing PKT3 when count in header is too low
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11211>
2021-06-08 11:19:00 +02:00
Rhys Perry
c768d7d8f2 aco/tests: add SDWA tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
24418304b0 aco/tests: add tests for p_extract/p_insert lowering
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
8e0c6e196e aco: disallow literals with some instruction formats
Because isVOPn() is true for many VOP3, SDWA and DPP instructions, this
would often not complain.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
cf22eabc68 aco: make validate_ir() output usable in tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
54292e99c7 aco: optimize 32-bit extracts and inserts using SDWA
Still need to use dst_u=preserve field to optimize packs

fossil-db (Sienna Cichlid):
Totals from 15974 (10.66% of 149839) affected shaders:
VGPRs: 1009064 -> 1008968 (-0.01%); split: -0.03%, +0.02%
SpillSGPRs: 7959 -> 7964 (+0.06%)
CodeSize: 101716436 -> 101159568 (-0.55%); split: -0.55%, +0.01%
MaxWaves: 284464 -> 284490 (+0.01%); split: +0.02%, -0.01%
Instrs: 19334216 -> 19224241 (-0.57%); split: -0.57%, +0.00%
Latency: 375465295 -> 375230478 (-0.06%); split: -0.14%, +0.08%
InvThroughput: 79006105 -> 78860705 (-0.18%); split: -0.25%, +0.07%

fossil-db (Polaris):
Totals from 11369 (7.51% of 151365) affected shaders:
SGPRs: 787920 -> 787680 (-0.03%); split: -0.04%, +0.01%
VGPRs: 681056 -> 681040 (-0.00%); split: -0.01%, +0.00%
CodeSize: 68127288 -> 67664120 (-0.68%); split: -0.69%, +0.01%
MaxWaves: 54370 -> 54371 (+0.00%)
Instrs: 13294638 -> 13214109 (-0.61%); split: -0.62%, +0.01%
Latency: 373515759 -> 373214571 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 166529524 -> 166275291 (-0.15%); split: -0.20%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
63659fc15c radv: use byte/word extract/insert instructions
ACO doesn't yet combine extract/insert into instructions, but it seems to
already generate less instructions because NIR optimizes shift+and to
these instructions. Code size is worse in some cases though because we
have to always use a literal when masking.

fossil-db (Sienna Cichlid):
Totals from 14361 (9.58% of 149839) affected shaders:
VGPRs: 850152 -> 850304 (+0.02%); split: -0.02%, +0.04%
SpillSGPRs: 7979 -> 7989 (+0.13%); split: -0.03%, +0.15%
CodeSize: 88031216 -> 88162520 (+0.15%); split: -0.01%, +0.16%
MaxWaves: 269414 -> 269426 (+0.00%)
Instrs: 16695182 -> 16662852 (-0.19%); split: -0.21%, +0.01%
Latency: 375592693 -> 375544364 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 75627700 -> 75607720 (-0.03%); split: -0.07%, +0.04%

fossil-db (Polaris):
Totals from 13816 (9.13% of 151365) affected shaders:
SGPRs: 984896 -> 982512 (-0.24%); split: -0.29%, +0.05%
VGPRs: 809220 -> 809112 (-0.01%); split: -0.02%, +0.01%
SpillSGPRs: 9181 -> 9185 (+0.04%); split: -0.04%, +0.09%
CodeSize: 82017952 -> 82123484 (+0.13%); split: -0.01%, +0.14%
MaxWaves: 65721 -> 65723 (+0.00%)
Instrs: 16008744 -> 15988007 (-0.13%); split: -0.18%, +0.05%
Latency: 439911623 -> 439869622 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 185898770 -> 185841742 (-0.03%); split: -0.08%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
7d76b07d6b ac/llvm: implement byte/word extract/insert instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
daa329f664 aco: use byte/word extract pseudo-instructions
fossil-db (Sienna Cichild):
Totals from 1890 (1.26% of 149839) affected shaders:
CodeSize: 5104196 -> 5104300 (+0.00%); split: -0.00%, +0.01%
Latency: 11572943 -> 11572880 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 4876941 -> 4876982 (+0.00%); split: -0.00%, +0.00%
SClause: 26774 -> 26775 (+0.00%)
Copies: 125778 -> 125813 (+0.03%)
PreSGPRs: 56452 -> 56451 (-0.00%)

fossil-db (Polaris):
Totals from 1884 (1.24% of 151365) affected shaders:
CodeSize: 3849340 -> 3849312 (-0.00%); split: -0.00%, +0.00%
Instrs: 741391 -> 741382 (-0.00%)
Latency: 13533815 -> 13533439 (-0.00%)
InvThroughput: 12058777 -> 12058500 (-0.00%)
Copies: 120890 -> 120891 (+0.00%)
PreSGPRs: 48940 -> 48939 (-0.00%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
1f2518ef9f aco: implement nir_op_extract/nir_op_insert
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:43 +00:00
Rhys Perry
2f94353735 aco: add p_extract/p_insert
These will let us make the SDWA optimizer much simpler than if we were to
recognize combinations of shift/and/bfe.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Rhys Perry
e9d1643288 aco: disallow SDWA for instructions with 64-bit definitions/operands
For example, v_cvt_f64_i32. LLVM doesn't seem to allow this either and it
doesn't seem to work correctly.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Rhys Perry
1cbcfb8b38 nir, nir/algebraic: add byte/word insertion instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Samuel Pitoiset
736893060f radv: emit PA_SC_CONSERVATIVE_RASTERIZATION_CNTL only on GFX9+
This context register doesn't exist on older generations.

Cc: 21.1 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11210>
2021-06-08 05:58:01 +00:00