Georg Lehmann
533d0008c7
aco: remove stale TODOs about v_interp opsel
...
These are already handled correctly according to the ISA docs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21096 >
2023-02-10 12:01:56 +00:00
Rhys Perry
b4383821e7
aco: don't modify exec in p_interp_gfx11
...
The RDNA3 ISA docs say that lds_param_load write the entire quad
regardless of exec, so this isn't needed.
fossil-db (gfx1100):
Totals from 5291 (3.93% of 134574) affected shaders:
Instrs: 4891396 -> 4789628 (-2.08%)
CodeSize: 25519032 -> 25111960 (-1.60%)
Latency: 36122982 -> 36074300 (-0.13%); split: -0.14%, +0.00%
InvThroughput: 4162436 -> 4161424 (-0.02%); split: -0.02%, +0.00%
Copies: 263862 -> 263838 (-0.01%)
PreSGPRs: 225012 -> 224179 (-0.37%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21171 >
2023-02-08 19:35:54 +00:00
Georg Lehmann
6e4598f7b9
aco: support omod/imod for v_fmac_f16
...
Only matters for post-RA DPP16.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
2deda5c0be
aco: don't list imod/omod support v_fmaak_f32/v_fmamk_f32
...
We can never use them anyway because these opcodes don't support VOP3/DPP16/SDWA
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
4c9ac73064
aco: allow output modifiers for ldexp_f16
...
It also supports imod for the first operand, but we cannot express that at
moment.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
b63aa2bb8e
aco: don't allow output modifiers for v_cvt_pkrtz_f16_f32
...
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21174 >
2023-02-08 18:52:28 +00:00
Georg Lehmann
5038a049f1
aco: add mov/cndmask opcodes to does_fp_op_flush_denorms
...
For completeness sake also add v_mov_b32, even if we don't use imod for it
because it's only supported since gfx10.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21170 >
2023-02-08 13:07:46 +00:00
Georg Lehmann
c8adf16278
aco: fix imod/omod for gfx11 VOP3 opcodes
...
Fixes: d8d99c3c4f ("aco: add GFX11 opcode numbers")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21170 >
2023-02-08 13:07:46 +00:00
Rhys Perry
fad1f716dd
aco: fix out-of-bounds access when moving s_mem(real)time across SMEM
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8224
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21138 >
2023-02-07 14:50:43 +00:00
Rhys Perry
24618721d3
ac: move ring_offsets to ac_shader_args
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19202 >
2023-02-06 14:25:15 +00:00
Qiang Yu
ed419f46aa
aco: remove early_rast wait insert
...
It's done in nir position export.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
f6b194b648
nir,ac/llvm,aco,radv,radeonsi: remove nir_export_vertex_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
f44872c7b6
nir,ac/llvm,aco: remove nir_export_primitive_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:44 +00:00
Qiang Yu
601ad9e0a9
amd,radeonsi: implement nir_load_force_vrs_rates_amd in driver abi
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:43 +00:00
Qiang Yu
8331842258
aco: implement nir_export_amd
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20691 >
2023-02-03 12:27:43 +00:00
Georg Lehmann
3c25edfdb7
aco: Improve wave64 cycle estimates.
...
Reviewed-By: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20507 >
2023-02-02 17:59:23 +01:00
Rhys Perry
bfd4ac4581
aco: limit VALUPartialForwardingHazard search
...
Complicated CFG and lots of SALU can cause this to take an extremely long
time to finish.
Fixes
dEQP-VK.graphicsfuzz.cov-value-tracking-selection-dag-negation-clamp-loop
and Monster Hunter Rise demo compile times.
fossil-db (gfx1100):
Totals from 57 (0.04% of 134574) affected shaders:
Instrs: 170919 -> 171165 (+0.14%)
CodeSize: 860144 -> 861128 (+0.11%)
Latency: 961466 -> 961505 (+0.00%)
InvThroughput: 127598 -> 127608 (+0.01%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8153
Fixes: 5806f0246f ("aco/gfx11: workaround VALUPartialForwardingHazard")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20941 >
2023-02-01 18:52:40 +00:00
Georg Lehmann
2b264455b5
aco: use s_pack_ll_b32_b16 for constant copies
...
Totals from 2 (0.00% of 134913) affected shaders:
CodeSize: 28636 -> 28628 (-0.03%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970 >
2023-02-01 17:07:25 +00:00
Georg Lehmann
9ee9b0859b
aco: use s_bfm_64 for constant copies
...
Foz-DB Navi21:
Totals from 1025 (0.76% of 134913) affected shaders:
CodeSize: 1436752 -> 1432412 (-0.30%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20970 >
2023-02-01 17:07:25 +00:00
Rhys Perry
bbc5247bf7
aco/spill: always end spill vgpr after control flow
...
To fix a hypothetical issue:
v0 = start_linear_vgpr
if (...) {
} else {
use_linear_vgpr(v0)
}
v0 = phi
We need a p_end_linear_vgpr to ensure that the phi does not use the same
VGPR as the linear VGPR.
This is also much simpler.
fossil-db (gfx1100):
Totals from 1195 (0.89% of 134574) affected shaders:
Instrs: 4123856 -> 4123826 (-0.00%); split: -0.00%, +0.00%
CodeSize: 21461256 -> 21461100 (-0.00%); split: -0.00%, +0.00%
Latency: 62816001 -> 62812999 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 9339049 -> 9338564 (-0.01%); split: -0.01%, +0.00%
Copies: 304028 -> 304005 (-0.01%); split: -0.02%, +0.01%
PreVGPRs: 115761 -> 115762 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
850d945baf
aco/tests: add setup_reduce_temp.divergent_if_phi
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
44fdd2ebcb
aco: end reduce tmp after control flow, when used within control flow
...
In the case of:
v0 = start_linear_vgpr
if (...) {
} else {
use_linear_vgpr(v0)
}
v0 = phi
We need a p_end_linear_vgpr to ensure that the phi does not use the same
VGPR as the linear VGPR.
fossil-db (gfx1100):
Totals from 3763 (2.80% of 134574) affected shaders:
MaxWaves: 90296 -> 90164 (-0.15%)
Instrs: 6857726 -> 6856608 (-0.02%); split: -0.03%, +0.01%
CodeSize: 35382188 -> 35377688 (-0.01%); split: -0.02%, +0.01%
VGPRs: 234864 -> 235692 (+0.35%); split: -0.01%, +0.36%
Latency: 47471923 -> 47474965 (+0.01%); split: -0.03%, +0.04%
InvThroughput: 5640320 -> 5639736 (-0.01%); split: -0.04%, +0.03%
VClause: 93098 -> 93107 (+0.01%); split: -0.01%, +0.02%
SClause: 214137 -> 214130 (-0.00%); split: -0.00%, +0.00%
Copies: 369895 -> 369305 (-0.16%); split: -0.31%, +0.15%
Branches: 164996 -> 164504 (-0.30%); split: -0.30%, +0.00%
PreVGPRs: 210655 -> 211438 (+0.37%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20621 >
2023-02-01 15:45:22 +00:00
Rhys Perry
695cf75266
aco: set has_color_exports with GPL
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 192486b7aa ("aco/gfx11: export mrtz in discard early exit for non-color shaders")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20937 >
2023-01-27 16:51:56 +00:00
Erik Faye-Lund
d54c8a47c6
meson: avoid using deprecated build_root() method
...
The meson.build_root() method has been deprecated, so let's switch to
meson.project_build_root(), which usually means the same thing. The case
where it doesn't do the same thing is if Mesa is a subproject to some
other project, but in that case I believe we want the build root of Mesa,
not of the parent project anyway.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20907 >
2023-01-27 11:35:50 +00:00
Timur Kristóf
c644461b71
radv, aco, ac: Implement pack_half_2x16_rtz_split.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15838 >
2023-01-26 12:24:24 +00:00
Timur Kristóf
9fc5d8d211
aco: Remove dynamic VS input loads.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20733 >
2023-01-26 02:43:11 +00:00
Timur Kristóf
81620fc7b0
aco: Enable constant exec mask based optimization on compute shaders.
...
We know for sure exec is initially -1 when the shader always has full subgroups.
Fossil DB stats on GFX11:
Totals from 3884 (2.88% of 134913) affected shaders:
SpillSGPRs: 1673 -> 1697 (+1.43%); split: -1.67%, +3.11%
SpillVGPRs: 2316 -> 2310 (-0.26%); split: -0.65%, +0.39%
CodeSize: 19584436 -> 19567156 (-0.09%); split: -0.13%, +0.04%
Scratch: 217088 -> 216832 (-0.12%)
Instrs: 3784596 -> 3780303 (-0.11%); split: -0.15%, +0.03%
Latency: 39971204 -> 39794967 (-0.44%); split: -0.47%, +0.03%
InvThroughput: 7885552 -> 7801247 (-1.07%); split: -1.14%, +0.07%
VClause: 74654 -> 74611 (-0.06%); split: -0.07%, +0.01%
SClause: 103139 -> 103043 (-0.09%); split: -0.13%, +0.04%
Copies: 279864 -> 281995 (+0.76%); split: -0.72%, +1.48%
Branches: 92082 -> 92084 (+0.00%); split: -0.03%, +0.03%
PreSGPRs: 155637 -> 149491 (-3.95%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670 >
2023-01-26 01:59:26 +00:00
Timur Kristóf
39448c8e9c
radv, aco: Add uses_full_subgroups to compute shader info.
...
Allow the compiler to assume that the shader always has full subgroups,
meaning that the initial EXEC mask is -1 in all waves (all lanes enabled).
This assumption is incorrect for ray tracing and internal (meta) shaders
because they can use unaligned dispatch.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20670 >
2023-01-26 01:59:26 +00:00
Georg Lehmann
e527f686ca
Revert "aco: Combine v_cvt_u32_f32 with insert to v_cvt_pk_u8_f32."
...
This reverts commit 6d02054047 .
v_cvt_pk_u8_f32 returns 0xff instead of v_cvt_u32_f32 & 0xff if the input is
larger than 255.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8128
Cc: mesa-stable
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20829 >
2023-01-23 16:22:55 +00:00
Rhys Perry
26e4621fa2
aco/tests: update assembler tests for latest LLVM 16
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20747 >
2023-01-23 12:30:28 +00:00
Rhys Perry
b0fa106dc6
aco/tests: fix assembler.gfx11.vop12c_v128 with LLVM 15
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8089
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20747 >
2023-01-23 12:30:28 +00:00
Dylan Baker
3c5e969144
meson: replace uses of ExternalProgram.path with .full_path
...
The former is deprecated
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20409 >
2023-01-19 16:29:03 +00:00
Rhys Perry
068c84f275
aco: add support for fp32 addition atomics
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19810 >
2023-01-17 17:39:15 +00:00
Timur Kristóf
faba30a8f3
aco/optimizer: Optimize p_extract + v_mul_u32_u24 to v_mad_u32_u16.
...
This should perform the same but removes SDWA from the address
calculations in NGG culling shaders for example.
This is done because SDWA is no longer available on GFX11.
Fossil DB stats on GFX1100:
Totals from 36 (0.03% of 134913) affected shaders:
CodeSize: 300968 -> 300884 (-0.03%); split: -0.04%, +0.01%
Instrs: 60955 -> 60863 (-0.15%); split: -0.15%, +0.00%
Latency: 426809 -> 426819 (+0.00%); split: -0.06%, +0.06%
InvThroughput: 39076 -> 39025 (-0.13%); split: -0.14%, +0.01%
VClause: 1440 -> 1443 (+0.21%)
Copies: 5714 -> 5725 (+0.19%)
Fossil DB stats on GFX1100 with NGG culling enabled:
Totals from 60953 (45.18% of 134913) affected shaders:
VGPRs: 2273172 -> 2273160 (-0.00%)
CodeSize: 186401864 -> 186403036 (+0.00%); split: -0.00%, +0.00%
Instrs: 37038048 -> 36977353 (-0.16%); split: -0.16%, +0.00%
Latency: 146466770 -> 146350172 (-0.08%); split: -0.08%, +0.00%
InvThroughput: 15342790 -> 15228585 (-0.74%); split: -0.74%, +0.00%
VClause: 669662 -> 669665 (+0.00%)
Copies: 2972380 -> 2972482 (+0.00%); split: -0.01%, +0.01%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17924 >
2023-01-16 19:27:39 +00:00
Timur Kristóf
171d76ded1
aco/optimizer: Add missing v_lshlrev condition to can_apply_extract.
...
This was already handled by apply_extract but missing from
can_apply_extract, therefore may not be properly applied everywhere.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17924 >
2023-01-16 19:27:39 +00:00
Rhys Perry
39c214769b
aco: restore semantic_can_reorder for GS output stores
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20296 >
2023-01-16 17:25:51 +00:00
Rhys Perry
18d3e4fecd
radv,aco: use ac_nir_lower_legacy_gs
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20296 >
2023-01-16 17:25:51 +00:00
Bas Nieuwenhuizen
edca10e9c9
aco: Pass correct number of coords to Vega 1D LOD instruction.
...
If we pass a physical 2D texture descriptor we should also pass 2
coords. Otherwise it just uses the random content in the second
register which ends up funny sometimes.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20696 >
2023-01-13 16:55:06 +01:00
Samuel Pitoiset
e11e68b56b
radv,aco: fix enable_mrt_output_nan_fixup for RAGE2 again
...
Driver workarounds for game bugs can be easily broken. This one
shouldn't be applied to meta shaders and this restores previous logic.
Fixes: da32cbb5c6 ("aco: fix missing uses of MRT output flags")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20637 >
2023-01-11 15:55:32 +00:00
Georg Lehmann
2b28983c5d
aco: Use NSA on GFX11 with more than 5 vaddr registers.
...
On GFX11 the first 4 vaddr are single registers and the last contains the remaining vector.
image_bvh64_intersect_ray has a special NSA layout.
Foz-DB GFX1100:
Totals from 2763 (2.05% of 134913) affected shaders:
VGPRs: 145884 -> 145056 (-0.57%); split: -1.03%, +0.46%
CodeSize: 18406864 -> 18326136 (-0.44%); split: -0.47%, +0.04%
MaxWaves: 76030 -> 76146 (+0.15%)
Instrs: 3559785 -> 3525287 (-0.97%); split: -0.97%, +0.00%
Latency: 44278460 -> 43303419 (-2.20%); split: -2.33%, +0.13%
InvThroughput: 4966295 -> 4914927 (-1.03%); split: -1.04%, +0.01%
VClause: 51755 -> 51991 (+0.46%); split: -0.05%, +0.50%
SClause: 105241 -> 105267 (+0.02%); split: -0.08%, +0.10%
Copies: 214141 -> 182419 (-14.81%); split: -14.82%, +0.01%
Branches: 69525 -> 69521 (-0.01%)
PreVGPRs: 120910 -> 120256 (-0.54%); split: -0.56%, +0.02%
No changes on Navi21.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20370 >
2023-01-11 00:00:38 +00:00
Georg Lehmann
9538d523b6
aco: Validate GFX11 NSA correctly.
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20370 >
2023-01-11 00:00:38 +00:00
Georg Lehmann
9abe4850ba
aco: Handle NSA with vectors in get_mimg_nsa_dwords.
...
No Foz-DB changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20370 >
2023-01-11 00:00:38 +00:00
Rhys Perry
b1e59646de
aco/gfx11: increase vgpr_limit to 256
...
fossil-db (gfx1100):
Totals from 280 (0.21% of 134574) affected shaders:
MaxWaves: 3124 -> 2846 (-8.90%); split: +3.46%, -12.36%
Instrs: 1139038 -> 1091407 (-4.18%); split: -4.18%, +0.00%
CodeSize: 5809332 -> 5486812 (-5.55%); split: -5.55%, +0.00%
VGPRs: 35004 -> 42864 (+22.45%); split: -1.85%, +24.31%
SpillSGPRs: 1896 -> 1865 (-1.64%); split: -2.37%, +0.74%
SpillVGPRs: 17807 -> 2382 (-86.62%)
Scratch: 2573312 -> 736256 (-71.39%)
Latency: 27470485 -> 17981296 (-34.54%); split: -34.54%, +0.00%
InvThroughput: 5606102 -> 6527051 (+16.43%); split: -4.19%, +20.61%
VClause: 32319 -> 19927 (-38.34%); split: -39.13%, +0.78%
SClause: 15014 -> 14897 (-0.78%); split: -0.95%, +0.17%
Copies: 102977 -> 93511 (-9.19%); split: -9.93%, +0.74%
Branches: 15164 -> 14969 (-1.29%)
PreSGPRs: 19132 -> 19014 (-0.62%)
PreVGPRs: 30494 -> 37460 (+22.84%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
6872f8d861
aco/gfx11: allow true 16-bit instructions to access v128+
...
It looks like the LLVM assembler promotes true 16-bit instructions to VOP3
in this case.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
254b178d5b
aco: disallow SGPRS/constants with interpolation instructions
...
https://reviews.llvm.org/D137575
The VINTRP format cannot encode anything except VGPRs.
Reading VINTERPInstructions.td, looks like it's the same for GFX11.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
5af891a747
aco: add more opcodes to can_use_DPP()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
c3dd1931d9
aco: allow Builder::Result to be dereferenced
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
e386523380
aco/gfx11: fix discard early exit removal optimization
...
This optimization never happened because the NULL target was removed in
GFX11.
fossil-db (gfx1100):
Totals from 5439 (4.04% of 134574) affected shaders:
Instrs: 407865 -> 387123 (-5.09%)
CodeSize: 2163340 -> 2060644 (-4.75%)
Latency: 3432378 -> 3327802 (-3.05%)
InvThroughput: 270133 -> 262980 (-2.65%)
Branches: 8524 -> 3085 (-63.81%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20513 >
2023-01-10 14:01:29 +00:00
Rhys Perry
810ced93f3
aco: align scratch size during assembly
...
This lets us use less scratch if both VGPR spilling and scratch intrinsics
are used.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20534 >
2023-01-09 21:46:13 +00:00
Rhys Perry
c9846158cd
aco/gfx11: reduce scratch allocation alignment
...
fossil-db (gfx1100):
Totals from 112 (0.08% of 134574) affected shaders:
Scratch: 1513472 -> 1455360 (-3.84%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20534 >
2023-01-09 21:46:13 +00:00