Samuel Pitoiset
23ef0fb277
radv: do not allocate a clear value for images that support comp-to-single
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12565 >
2021-08-30 07:18:19 +00:00
Samuel Pitoiset
df688e6941
radv: do not load/store the clear value for comp-to-single images
...
Fast clearing an image in comp-to-single mode sets DCC to 0x10,
which tells the hardware to read the clear value from the main
surface instead of from the register.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12565 >
2021-08-30 07:18:19 +00:00
Samuel Pitoiset
0c550a5fe6
radv: disable DCC image stores on Navi12-14 for displayable DCC corruption
...
DCC image stores require 128B but 64B is used for displayable DCC.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5265
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5106
Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12521 >
2021-08-30 08:28:37 +02:00
Timur Kristóf
cfb0d931f2
aco: Emit zero for the derivatives of uniforms.
...
Observed in a shader from Resident Evil Village.
This also helps prevent emitting invalid IR.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12599 >
2021-08-27 20:34:22 +00:00
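Note on cfb0d931f2: a screen-space derivative is a difference between
neighbouring lanes of a 2x2 quad, so a value that is the same in every lane
differentiates to zero for any finite input. The sketch below only
illustrates that reasoning in plain C; it is not ACO code, and the quad
layout and the names ddx_coarse and ddx_of_uniform are assumptions made for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: coarse horizontal derivative of a 2x2 quad stored as
 * { top-left, top-right, bottom-left, bottom-right }. */
static float ddx_coarse(const float quad[4])
{
   assert(quad != NULL);
   return quad[1] - quad[0];
}

/* A uniform holds the same value in every lane, so for finite values the
 * difference is 0 and a compiler can emit the constant directly. */
static float ddx_of_uniform(float value)
{
   const float quad[4] = {value, value, value, value};
   return ddx_coarse(quad);
}
```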
Daniel Schürmann
2eeaaabb8e
aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678 >
2021-08-27 19:57:59 +00:00
Daniel Schürmann
be16ebc5ca
aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64
...
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678 >
2021-08-27 19:57:59 +00:00
Daniel Schürmann
8e27ca9953
aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16
...
Totals from 192 (0.13% of 150170) affected shaders: (GFX10.3)
CodeSize: 1027224 -> 1019872 (-0.72%)
Instrs: 174784 -> 173863 (-0.53%)
Latency: 4235742 -> 4232177 (-0.08%); split: -0.11%, +0.03%
InvThroughput: 1777026 -> 1775945 (-0.06%); split: -0.09%, +0.03%
Copies: 34098 -> 34099 (+0.00%); split: -0.03%, +0.03%
PreVGPRs: 4920 -> 4850 (-1.42%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678 >
2021-08-27 19:57:59 +00:00
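Note on 8e27ca9953: combining a multiply and an add into one multiply-add
is only safe if both forms produce the same 16-bit result. The sketch below
models the three operations with wrapping unsigned arithmetic to show why
that holds; the C function names mirror the opcode names, but the exact
hardware semantics are assumed here rather than taken from the ISA manual.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit multiply keeping the low half of the product. */
static uint16_t mul_lo_u16(uint16_t a, uint16_t b)
{
   /* Widen to 32 bits first to avoid signed overflow from integer promotion. */
   return (uint16_t)((uint32_t)a * b);
}

/* 16-bit add with wraparound. */
static uint16_t add_u16(uint16_t a, uint16_t b)
{
   return (uint16_t)((uint32_t)a + b);
}

/* Fused form: one multiply-add with the result truncated to 16 bits. */
static uint16_t mad_u16(uint16_t a, uint16_t b, uint16_t c)
{
   return (uint16_t)((uint32_t)a * b + c);
}

/* Checks that the fused and unfused forms agree for one set of inputs. */
static void check_fusion(uint16_t a, uint16_t b, uint16_t c)
{
   assert(mad_u16(a, b, c) == add_u16(mul_lo_u16(a, b), c));
}
```

Because truncation to 16 bits commutes with addition modulo 2^16, the check
passes for every input, including values whose product overflows 16 bits.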
Daniel Schürmann
23d5865f42
aco: refactor nir_op_imul selection
...
Previously, the optimization to use v_mul_lo_u16 for
32-bit multiplications was done in instruction_selection.
It has been moved to the optimizer to simplify some case distinctions.
The mixed results are due to increased use of SDWA.
Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3)
VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01%
CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01%
Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01%
Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07%
InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06%
VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03%
SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01%
Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678 >
2021-08-27 19:57:59 +00:00
Daniel Schürmann
d8eef134d8
aco: only apply extract if not used more than 4 times
...
Totals from 61 (0.04% of 150170) affected shaders: (GFX10.3)
CodeSize: 1087732 -> 1087380 (-0.03%); split: -0.22%, +0.18%
Instrs: 192343 -> 192205 (-0.07%); split: -0.16%, +0.09%
Latency: 7231670 -> 7148073 (-1.16%); split: -1.19%, +0.04%
InvThroughput: 3436715 -> 3394926 (-1.22%); split: -1.25%, +0.04%
VClause: 4831 -> 4833 (+0.04%)
Copies: 50130 -> 49934 (-0.39%); split: -0.67%, +0.28%
Branches: 5945 -> 5948 (+0.05%)
PreSGPRs: 3486 -> 3472 (-0.40%)
PreVGPRs: 5154 -> 5152 (-0.04%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678 >
2021-08-27 19:57:59 +00:00
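Note on the fossil-db totals: the percentage in parentheses after each
"before -> after" pair appears consistent with a plain relative change. The
helper below is a hypothetical illustration of that arithmetic, not the
fossil-db report script.

```c
#include <assert.h>

/* Relative change in percent between two totals. */
static double percent_change(double before, double after)
{
   assert(before != 0.0);
   return (after - before) / before * 100.0;
}
```

For the Latency line of d8eef134d8, percent_change(7231670, 7148073) is
about -1.16, which matches the reported value.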
Timur Kristóf
589ccf3d77
aco: Consider maximum number of workgroups per CU/WGP on Navi.
...
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12517 >
2021-08-27 16:41:08 +00:00
Timur Kristóf
c8698199a1
aco: Consider LDS usage by PS inputs in MaxWaves calculation.
...
Before PS waves are launched, PS inputs are moved from PC to LDS
and the corresponding part of the PC is deallocated.
Each PS input occupies 3 * vec4 (3 * 16 = 48 bytes) of LDS space.
See Figure 10.3 in the GCN3 ISA manual.
These limit occupancy the same way as other stages' LDS usage does.
Note that PS can request additional LDS space via EXTRA_LDS_SIZE,
so that also must be taken into account here.
No Fossil DB changes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12517 >
2021-08-27 16:41:08 +00:00
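Note on c8698199a1: the LDS accounting described in the message reduces to
48 bytes per PS input plus whatever extra LDS the shader requests. The
sketch below is a minimal illustration under that reading; ps_lds_bytes is
a hypothetical name, and extra_lds_bytes is assumed to be already converted
from the units of the EXTRA_LDS_SIZE field into bytes.

```c
#include <assert.h>
#include <stdint.h>

/* 3 * vec4 per PS input, as stated in the commit message: 3 * 16 = 48. */
#define PS_INPUT_LDS_BYTES 48u

/* Hypothetical helper: LDS bytes accounted to PS inputs plus extra LDS. */
static uint32_t ps_lds_bytes(uint32_t num_inputs, uint32_t extra_lds_bytes)
{
   /* Guard against 32-bit overflow in the arithmetic below. */
   assert(num_inputs <= (UINT32_MAX - extra_lds_bytes) / PS_INPUT_LDS_BYTES);
   return num_inputs * PS_INPUT_LDS_BYTES + extra_lds_bytes;
}
```

The result would then feed the same LDS-based occupancy limit that other
stages already use.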
Samuel Pitoiset
d90a8c79df
radv: remove unnecessary radv_finishme() for invalid color formats
...
Something has gone badly wrong (likely a driver bug) if this is triggered.
Replace it with assertions to catch any such issue in debug builds.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12556 >
2021-08-27 07:29:17 +00:00
Samuel Pitoiset
df90bb3f88
radv: remove useless check about number of samples in the HW resolve path
...
Although this can likely hang, it is invalid usage and should be caught
by the validation layers. There are many ways to hang the GPU with Vulkan,
so this check alone is useless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12556 >
2021-08-27 07:29:17 +00:00
Samuel Pitoiset
b05c2023cc
radv: remove outdated radv_finishme() in the HW resolve path
...
Resolving layered MSAA images is actually implemented by the HW
resolve path but never used because the driver uses the compute path.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12556 >
2021-08-27 07:29:17 +00:00
Timur Kristóf
626b125857
aco: Use workgroup size from input shader info.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321 >
2021-08-26 09:46:18 +00:00
Timur Kristóf
c4ca08548b
radv: Remove superfluous workgroup size calculations.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321 >
2021-08-26 09:46:18 +00:00
Timur Kristóf
9fd36bbacd
radv: Calculate workgroup sizes in radv_pipeline.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321 >
2021-08-26 09:46:18 +00:00
Timur Kristóf
395c0c52c7
ac: Calculate workgroup sizes of HW stages that operate in workgroups.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321 >
2021-08-26 09:46:18 +00:00
Samuel Pitoiset
66b5f05727
ci: update the list of skipped tests for Fiji/RADV
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12553 >
2021-08-26 09:09:47 +00:00
Timur Kristóf
5b7446d74c
radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+.
...
Previously, indices 0, 2, 4 were used.
This worked, but it was somewhat unintuitive.
This commit changes it to use indices 0, 1, 2 instead, which
makes the code easier to understand.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12511 >
2021-08-26 05:20:15 +00:00
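Note on 5b7446d74c: the change amounts to replacing the sparse indices
0, 2, 4 with the dense indices 0, 1, 2. The remap below is a hypothetical
helper that only illustrates the correspondence; the driver changes the
array accesses directly rather than calling a function like this.

```c
#include <assert.h>

/* Hypothetical remap from the old sparse argument index (0, 2, 4) to the
 * new dense one (0, 1, 2). */
static unsigned gs_vtx_offset_new_index(unsigned old_index)
{
   assert(old_index <= 4 && (old_index % 2) == 0);
   return old_index / 2;
}
```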
Samuel Pitoiset
0f05c84bba
radv: allow storage images with VK_FORMAT_E5B9G9R9_UFLOAT_PACK32 on GFX10.3+
...
It should be supported.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12543 >
2021-08-25 16:27:46 +00:00
Daniel Schürmann
cd489e5388
aco: remove redundant s_and exec after nir_op_inot
...
Totals from 22585 (15.04% of 150170) affected shaders: (GFX10.3)
VGPRs: 1474048 -> 1473904 (-0.01%)
CodeSize: 155238876 -> 155187688 (-0.03%); split: -0.06%, +0.03%
MaxWaves: 385086 -> 385122 (+0.01%)
Instrs: 29297735 -> 29284442 (-0.05%); split: -0.08%, +0.04%
Latency: 675841742 -> 675764151 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 174859037 -> 174854796 (-0.00%); split: -0.01%, +0.01%
VClause: 479790 -> 479781 (-0.00%); split: -0.01%, +0.00%
SClause: 1106900 -> 1106615 (-0.03%); split: -0.03%, +0.01%
Copies: 1829037 -> 1828042 (-0.05%); split: -0.09%, +0.03%
Branches: 859971 -> 859967 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 1341850 -> 1342356 (+0.04%); split: -0.01%, +0.04%
PreVGPRs: 1327322 -> 1327034 (-0.02%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573 >
2021-08-25 12:43:50 +00:00
Timur Kristóf
abcc83e713
aco: Fix to_uniform_bool_instr when operands are not suitable.
...
Don't attempt to transform uniform boolean instructions when
their operands are unsuitable. This can happen, e.g., when other
optimizations combine SALU instructions and clear out
the uniform instruction labels.
Cc: mesa-stable
Fixes: 8a32f57fff
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573 >
2021-08-25 12:43:50 +00:00
Samuel Pitoiset
e0a703af11
ci: update the list of expected failures/skips for RADV
...
Against CTS 1.2.7.0.
Tested chips are Pitcairn, Polaris10, Navi14 and Sienna Cichlid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12539 >
2021-08-25 13:00:07 +02:00
Daniel Schürmann
a3110c308f
radv: call nir_lower_flrp() after the first radv_optimize_nir()
...
instead of inside the optimization loop
Totals from 2504 (1.67% of 150170) affected shaders: (GFX10.3)
VGPRs: 162592 -> 162416 (-0.11%); split: -0.12%, +0.01%
CodeSize: 18399756 -> 18383552 (-0.09%); split: -0.10%, +0.01%
MaxWaves: 42654 -> 42748 (+0.22%)
Instrs: 3499404 -> 3497075 (-0.07%); split: -0.08%, +0.01%
Latency: 87087238 -> 87064270 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 21159621 -> 21150546 (-0.04%); split: -0.05%, +0.01%
VClause: 56653 -> 56667 (+0.02%); split: -0.00%, +0.03%
Copies: 226332 -> 226423 (+0.04%); split: -0.15%, +0.19%
Branches: 110027 -> 110025 (-0.00%); split: -0.05%, +0.04%
PreSGPRs: 168087 -> 168076 (-0.01%); split: -0.01%, +0.00%
PreVGPRs: 160814 -> 160705 (-0.07%)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12061 >
2021-08-24 16:10:30 +00:00
Rhys Perry
94ed2ab3a1
radv: use nir_vector_insert_imm in lower_intrinsics
...
This creates a single nir_op_vecn instead of a nir_op_vecn and several
copies.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12469 >
2021-08-24 10:35:19 +00:00
Rhys Perry
b23a9dd1f6
aco/scheduler: allow moving down VMEM stores to below VMEM loads
...
fossil-db (Vega10):
Totals from 93 (0.06% of 150305) affected shaders:
SGPRs: 4832 -> 4768 (-1.32%)
VGPRs: 4084 -> 4144 (+1.47%)
CodeSize: 316080 -> 317208 (+0.36%); split: -0.11%, +0.47%
MaxWaves: 589 -> 580 (-1.53%)
Instrs: 60229 -> 60511 (+0.47%); split: -0.15%, +0.61%
Latency: 636477 -> 540029 (-15.15%); split: -15.26%, +0.10%
InvThroughput: 293027 -> 283043 (-3.41%); split: -4.21%, +0.80%
VClause: 2557 -> 2716 (+6.22%); split: -0.35%, +6.57%
SClause: 1381 -> 1395 (+1.01%); split: -0.14%, +1.16%
Copies: 9424 -> 9728 (+3.23%); split: -0.74%, +3.97%
fossil-db (Sienna Cichlid):
Totals from 88 (0.06% of 150170) affected shaders:
VGPRs: 3840 -> 3872 (+0.83%)
CodeSize: 300544 -> 300960 (+0.14%); split: -0.09%, +0.23%
Instrs: 53714 -> 53871 (+0.29%); split: -0.05%, +0.35%
Latency: 489854 -> 462001 (-5.69%); split: -6.30%, +0.61%
InvThroughput: 100307 -> 95142 (-5.15%); split: -5.50%, +0.35%
VClause: 2322 -> 2564 (+10.42%); split: -0.39%, +10.81%
SClause: 1345 -> 1358 (+0.97%); split: -0.15%, +1.12%
Copies: 4113 -> 4351 (+5.79%); split: -0.66%, +6.44%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12211 >
2021-08-23 16:48:31 +00:00
Rhys Perry
2201f5a58c
aco: remove label_extract if the extract is used by a non-VALU
...
If an extract is used by a non-VALU instruction, it can't be applied to
all instructions, so it's not beneficial to try to apply it.
This check isn't needed because can_apply_extract()/can_use_SDWA() should
already handle non-VALU instructions.
fossil-db (Sienna Cichlid):
Totals from 1020 (0.68% of 150170) affected shaders:
SpillSGPRs: 1577 -> 1571 (-0.38%)
CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00%
Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01%
Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01%
SClause: 49072 -> 49071 (-0.00%)
Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06%
Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01%
PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02%
PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01%
fossil-db (Polaris10):
Totals from 654 (0.43% of 151696) affected shaders:
CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00%
Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00%
Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00%
Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00%
PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01%
PreVGPRs: 43813 -> 43809 (-0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212 >
2021-08-23 14:56:37 +01:00
Samuel Pitoiset
e0353296da
radv: allocate shaders to 32-bit address to skip PGM_HI
...
This reduces the number of emitted registers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466 >
2021-08-23 11:28:21 +00:00
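Note on e0353296da: if all shaders are allocated inside one fixed 32-bit
address window, the upper address bits are identical for every shader, so
the register holding them does not need to be written per shader. The
sketch below illustrates this under an assumed encoding (256-byte aligned
addresses, PGM_LO holding bits 8..39 and PGM_HI the bits from 40 upward);
the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for this sketch: the shader address is 256-byte aligned and
 * PGM_LO holds address bits 8..39. */
static uint32_t pgm_lo(uint64_t va)
{
   assert((va & 0xff) == 0);
   return (uint32_t)(va >> 8);
}

/* Assumed for this sketch: PGM_HI holds the address bits from 40 upward. */
static uint32_t pgm_hi(uint64_t va)
{
   return (uint32_t)(va >> 40);
}

/* Two shaders in the same 4 GiB window share all bits above bit 31, so
 * their PGM_HI values are equal and only PGM_LO differs. */
static int same_pgm_hi(uint64_t va_a, uint64_t va_b)
{
   return pgm_hi(va_a) == pgm_hi(va_b);
}
```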
Samuel Pitoiset
2dc90ca8a4
radv: don't use SQ_NON_EVENT before GE_PC_ALLOC for better perf on Navi1x
...
It seems to make performance worse.
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12466 >
2021-08-23 11:28:21 +00:00
Daniel Schürmann
77ffdf41b1
aco: add more validation rules for SDWA operands
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
077776a866
aco/opcodes: remove definition_size[]
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
f6b281a1c2
aco/validate: simplify get_subdword_bytes_written()
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
ec1bbfa608
aco/ra: refactor subdword operand stride
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
c75138ed64
aco/ra: refactor subdword definition info
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
e11b23f7cd
aco: add instr_is_16bit() helper function
...
to indicate whether an instruction writes only part of a register.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
3d6ca41e44
aco: use VOPC_SDWA on GFX9+
...
Totals from 5138 (3.42% of 150170) affected shaders: (GFX10.3)
VGPRs: 409520 -> 409416 (-0.03%); split: -0.03%, +0.00%
CodeSize: 43056360 -> 43035696 (-0.05%); split: -0.06%, +0.02%
MaxWaves: 69296 -> 69310 (+0.02%)
Instrs: 8161016 -> 8153365 (-0.09%); split: -0.10%, +0.01%
Latency: 109397002 -> 109756208 (+0.33%); split: -0.05%, +0.38%
InvThroughput: 23238920 -> 23310761 (+0.31%); split: -0.11%, +0.42%
VClause: 135141 -> 135100 (-0.03%); split: -0.05%, +0.02%
SClause: 349511 -> 349489 (-0.01%); split: -0.01%, +0.00%
Copies: 388107 -> 387754 (-0.09%); split: -0.48%, +0.38%
Branches: 184629 -> 184503 (-0.07%); split: -0.08%, +0.01%
PreSGPRs: 258807 -> 258839 (+0.01%)
PreVGPRs: 372561 -> 372184 (-0.10%); split: -0.10%, +0.00%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Daniel Schürmann
60e171af06
aco/print_ir: fix printing of VOPC_SDWA definitions
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12364 >
2021-08-23 10:31:40 +00:00
Rhys Perry
8852c5448d
aco: fix vectorized 16-bit load_input/load_interpolated_input
...
It seems we haven't encountered this before because
nir_lower_io_to_scalar_early usually scalarizes these loads.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12486 >
2021-08-23 10:11:36 +00:00
Samuel Pitoiset
e4e2d45cc6
radv: remove useless DISABLE_{ZMASK,SMEM}_EXPCLEAR_OPTIMIZATION state
...
This has no effect without enabling EXPCLEAR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326 >
2021-08-23 09:52:51 +02:00
Samuel Pitoiset
98d10eed48
radv: remove unused fast depth-stencil gfx clear path with expclear
...
This has never been used because it requires knowing the previous
clear values, which is not really possible in Vulkan.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12326 >
2021-08-23 09:52:48 +02:00
Samuel Pitoiset
be6bdb0918
radv: fix copying depth+stencil images on compute
...
Using separate aspects is required.
Fixes a few CTS failures (dEQP-VK.api.copy_and_blit.*) when the compute
path is forced in the driver. Note that CTS coverage of the compute queue
is rather limited.
Cc: 21.2 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12287 >
2021-08-20 16:43:22 +00:00
Samuel Pitoiset
067599f8bc
radv: remove incorrect comment about compressed writes to HTILE on GFX10+
...
This seems to be unsupported.
COMPRESSION_EN=1 and WRITE_COMPRESS_ENABLE=1 don't update HTILE
with image stores.
Note that there is no issue because depth/stencil images will be
decompressed for image stores, and TC-compat HTILE is disabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12450 >
2021-08-20 15:53:32 +00:00
Samuel Pitoiset
1c26751969
radv: remove unnecessary check in radv_layout_is_htile_compressed()
...
The driver doesn't enable TC-compat HTILE for storage images, so this
was actually always TRUE.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12450 >
2021-08-20 15:53:32 +00:00
Marek Olšák
556c10c02c
ac/surface: allow arbitrary swizzle modes for displayable DCC
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430 >
2021-08-20 14:28:36 +00:00
Marek Olšák
94d261029e
radv: allow arbitrary swizzle modes for displayable DCC
...
by adding retile pipeline variants
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430 >
2021-08-20 14:28:36 +00:00
Rhys Perry
4a7714ab7b
aco/tests: add tests for post-RA DPP combining
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924 >
2021-08-19 18:17:33 +00:00
Rhys Perry
12be7c8feb
aco/tests: add tests for pre-RA DPP combining
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924 >
2021-08-19 18:17:33 +00:00
Rhys Perry
4ac47ad1cd
aco: combine DPP into VALU after RA
...
Mostly helps a bunch of Cyberpunk 2077 shaders.
fossil-db (Sienna Cichlid):
Totals from 26 (0.02% of 150170) affected shaders:
CodeSize: 83208 -> 81528 (-2.02%)
Instrs: 14728 -> 14308 (-2.85%)
Latency: 48041 -> 47793 (-0.52%)
InvThroughput: 10836 -> 10578 (-2.38%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924 >
2021-08-19 18:17:33 +00:00
Rhys Perry
2e6834d4f6
aco: combine DPP into VALU before RA
...
Mostly helps a bunch of Cyberpunk 2077 shaders. Catches some of the cases
that the post-RA pass can't optimize because of register assignment.
fossil-db (Sienna Cichlid):
Totals from 25 (0.02% of 150170) affected shaders:
CodeSize: 78808 -> 75764 (-3.86%)
Instrs: 14311 -> 13547 (-5.34%)
Latency: 278697 -> 277885 (-0.29%)
InvThroughput: 63428 -> 62754 (-1.06%)
Copies: 1348 -> 1349 (+0.07%); split: -0.07%, +0.15%
PreVGPRs: 1035 -> 1011 (-2.32%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924 >
2021-08-19 18:17:33 +00:00