mirror of https://gitlab.freedesktop.org/mesa/mesa.git (synced 2025-12-27 08:20:12 +01:00)
radv: use byte/word extract/insert instructions
ACO doesn't yet combine extract/insert into instructions, but it seems to already generate fewer instructions because NIR optimizes shift+and to these instructions.

Code size is worse in some cases though because we have to always use a literal when masking.

fossil-db (Sienna Cichlid):
Totals from 14361 (9.58% of 149839) affected shaders:
VGPRs: 850152 -> 850304 (+0.02%); split: -0.02%, +0.04%
SpillSGPRs: 7979 -> 7989 (+0.13%); split: -0.03%, +0.15%
CodeSize: 88031216 -> 88162520 (+0.15%); split: -0.01%, +0.16%
MaxWaves: 269414 -> 269426 (+0.00%)
Instrs: 16695182 -> 16662852 (-0.19%); split: -0.21%, +0.01%
Latency: 375592693 -> 375544364 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 75627700 -> 75607720 (-0.03%); split: -0.07%, +0.04%

fossil-db (Polaris):
Totals from 13816 (9.13% of 151365) affected shaders:
SGPRs: 984896 -> 982512 (-0.24%); split: -0.29%, +0.05%
VGPRs: 809220 -> 809112 (-0.01%); split: -0.02%, +0.01%
SpillSGPRs: 9181 -> 9185 (+0.04%); split: -0.04%, +0.09%
CodeSize: 82017952 -> 82123484 (+0.13%); split: -0.01%, +0.14%
MaxWaves: 65721 -> 65723 (+0.00%)
Instrs: 16008744 -> 15988007 (-0.13%); split: -0.18%, +0.05%
Latency: 439911623 -> 439869622 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 185898770 -> 185841742 (-0.03%); split: -0.08%, +0.05%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
parent 7d76b07d6b
commit 63659fc15c
1 changed file with 0 additions and 4 deletions
@@ -72,10 +72,6 @@ static const struct nir_shader_compiler_options nir_options = {
    .lower_unpack_unorm_2x16 = true,
    .lower_unpack_unorm_4x8 = true,
    .lower_unpack_half_2x16 = true,
-   .lower_extract_byte = true,
-   .lower_extract_word = true,
-   .lower_insert_byte = true,
-   .lower_insert_word = true,
    .lower_ffma16 = true,
    .lower_ffma32 = true,
    .lower_ffma64 = true,
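For readers unfamiliar with the operations named in the commit message, the following is a minimal sketch of what byte/word extract and insert compute, and why each one is equivalent to a shift combined with a mask. The helper names and signatures are hypothetical and chosen for illustration only; they are not Mesa or NIR API, and the sketch assumes 32-bit sources with an index of 0..3 for bytes and 0..1 for words.

```c
#include <stdint.h>

/* Hypothetical helpers modelling byte/word extract and insert on a
 * 32-bit value. These are illustrative assumptions, not Mesa functions. */

/* Zero-extended byte number idx (0..3) of v: a right shift plus a mask. */
static inline uint32_t extract_u8(uint32_t v, unsigned idx)
{
   return (v >> (idx * 8)) & 0xffu;
}

/* Sign-extended byte number idx (0..3) of v.
 * The xor/subtract pair sign-extends without relying on
 * implementation-defined narrowing conversions. */
static inline int32_t extract_i8(uint32_t v, unsigned idx)
{
   uint32_t b = (v >> (idx * 8)) & 0xffu;
   return (int32_t)(b ^ 0x80u) - 0x80;
}

/* Zero-extended 16-bit word number idx (0..1) of v. */
static inline uint32_t extract_u16(uint32_t v, unsigned idx)
{
   return (v >> (idx * 16)) & 0xffffu;
}

/* Low byte of v placed at byte position idx (0..3), other bits zero:
 * a mask plus a left shift. */
static inline uint32_t insert_u8(uint32_t v, unsigned idx)
{
   return (v & 0xffu) << (idx * 8);
}

/* Low 16-bit word of v placed at word position idx (0..1), other bits zero. */
static inline uint32_t insert_u16(uint32_t v, unsigned idx)
{
   return (v & 0xffffu) << (idx * 16);
}
```

For example, `extract_u8(0xAABBCCDD, 1)` yields `0xCC` and `insert_u16(0x12345678, 1)` yields `0x56780000`. The masks `0xff` and `0xffff` in these expressions are presumably the values the commit message has in mind when it says a literal is always needed when masking.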