Timur Kristóf
2c40215ab9
aco/optimizer: Change v_cmp with subgroup invocation to constant.
...
When a shader has a comparison with the subgroup invocation id,
we can use a constant instead, saving a VALU instruction.
When the constant can't be represented as a 64-bit literal,
use the s_bfm_b64 instruction to generate it instead, which
is still a win.
Fossil DB stats on GFX11:
Totals from 300 (0.22% of 134913) affected shaders:
CodeSize: 2223052 -> 2214336 (-0.39%); split: -0.43%, +0.04%
Instrs: 430216 -> 429882 (-0.08%); split: -0.14%, +0.06%
Latency: 5881180 -> 5878181 (-0.05%); split: -0.05%, +0.00%
InvThroughput: 731846 -> 729293 (-0.35%)
Copies: 31662 -> 31847 (+0.58%); split: -0.03%, +0.61%
Branches: 8241 -> 8100 (-1.71%)
PreVGPRs: 15788 -> 15786 (-0.01%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20843 >
2023-02-18 21:16:58 +01:00
Georg Lehmann
281a505ef0
aco: new 16bit VOP3 opcodes can use opsel
...
No Foz-DB changes on gfx11.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20705 >
2023-02-14 16:14:55 +00:00
Georg Lehmann
533d0008c7
aco: remove stale TODOs about v_interp opsel
...
These are already handled correctly according to the ISA docs.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21096 >
2023-02-10 12:01:56 +00:00
Rhys Perry
b1e59646de
aco/gfx11: increase vgpr_limit to 256
...
fossil-db (gfx1100):
Totals from 280 (0.21% of 134574) affected shaders:
MaxWaves: 3124 -> 2846 (-8.90%); split: +3.46%, -12.36%
Instrs: 1139038 -> 1091407 (-4.18%); split: -4.18%, +0.00%
CodeSize: 5809332 -> 5486812 (-5.55%); split: -5.55%, +0.00%
VGPRs: 35004 -> 42864 (+22.45%); split: -1.85%, +24.31%
SpillSGPRs: 1896 -> 1865 (-1.64%); split: -2.37%, +0.74%
SpillVGPRs: 17807 -> 2382 (-86.62%)
Scratch: 2573312 -> 736256 (-71.39%)
Latency: 27470485 -> 17981296 (-34.54%); split: -34.54%, +0.00%
InvThroughput: 5606102 -> 6527051 (+16.43%); split: -4.19%, +20.61%
VClause: 32319 -> 19927 (-38.34%); split: -39.13%, +0.78%
SClause: 15014 -> 14897 (-0.78%); split: -0.95%, +0.17%
Copies: 102977 -> 93511 (-9.19%); split: -9.93%, +0.74%
Branches: 15164 -> 14969 (-1.29%)
PreSGPRs: 19132 -> 19014 (-0.62%)
PreVGPRs: 30494 -> 37460 (+22.84%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
6872f8d861
aco/gfx11: allow true 16-bit instructions to access v128+
...
It looks like the LLVM assembler promotes true 16-bit instructions to VOP3
in this case.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
5af891a747
aco: add more opcodes to can_use_DPP()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20251 >
2023-01-10 16:01:38 +00:00
Rhys Perry
c9846158cd
aco/gfx11: reduce scratch allocation alignment
...
fossil-db (gfx1100):
Totals from 112 (0.08% of 134574) affected shaders:
Scratch: 1513472 -> 1455360 (-3.84%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20534 >
2023-01-09 21:46:13 +00:00
Georg Lehmann
c241980751
aco: Mark more instructions as 16bit on GFX10.
...
p_cvt_f16_f32_rtne will be lowered to v_cvt_f16_f32 and we already know that
preserves the high bits.
I tested the others on GFX1036.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20574 >
2023-01-09 18:54:35 +00:00
Bas Nieuwenhuizen
352e492c7b
aco: Add isTrans helper.
...
For the s_delay_alu tracking.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19743 >
2022-12-07 22:05:25 +00:00
Marek Olšák
8956682810
amd: rename enums ARCTURUS -> MI100, ALDEBARAN -> MI200
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19477 >
2022-11-06 17:20:39 -05:00
Rhys Perry
50073d6135
aco/gfx11: increase gfx1100/gfx1101 physical vgprs
...
https://reviews.llvm.org/D134522
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18825 >
2022-11-02 17:09:32 +00:00
Illia Abernikhin
aa4ac5ff8b
utils: Merge util/debug.* into util/u_debug.* and remove util/debug.*
...
Rename env_var_as_unsigned() -> debug_get_num_option(), because duplicate
Rename env_var_as_bool() -> debug_get_bool_option(), because duplicate
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7177
Signed-off-by: Illia Abernikhin <illia.abernikhin@globallogic.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19336 >
2022-11-02 07:25:39 +00:00
Daniel Schürmann
a36e27e507
aco: change thread_local memory resource to pointer
...
Apparently the TLS constructor doesn't work well if RADV
is instantiated multiple times and/or used by a program with
already existing threads.
Fixes: a128d444cb ('aco: use monotonic_buffer_resource for instructions')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19219 >
2022-10-25 09:08:08 +00:00
Timur Kristóf
a17e801a9c
aco: Add ACO_DEBUG=novalidateir option.
...
This disables IR validation in debug/debugoptimized builds.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103 >
2022-10-24 20:14:16 +00:00
Georg Lehmann
79a8a7662b
aco: fmaak/fmamk can't use SDWA.
...
Found by inspection.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19255 >
2022-10-24 18:04:53 +00:00
Rhys Perry
36703a60bf
aco: add ACO_DEBUG=force-waitdeps
...
GFX11 has a lot of complicated data dependency hazards.
For debugging GFX10+ data dependency hazards. This creates an excessive
amount of s_waitcnt_depctr.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18273 >
2022-10-19 02:46:03 +00:00
Rhys Perry
2930317cea
aco/gfx11: deallocate VGPRs at the end of the shader
...
fossil-db (gfx1100):
Totals from 65987 (40.81% of 161689) affected shaders:
Instrs: 57123207 -> 57199947 (+0.13%)
CodeSize: 308402500 -> 308709460 (+0.10%)
Latency: 680527139 -> 680527160 (+0.00%)
InvThroughput: 131620026 -> 131620045 (+0.00%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710 >
2022-09-30 20:57:02 +00:00
Rhys Perry
7b21af3f51
aco: improve wait_imm unpack
...
Add GFX11 support and use wait_imm::unset_counter. Looping in the waitcnt
pass was probably broken on GFX11 because of this.
fossil-db (gfx1100):
Totals from 899 (0.56% of 161689) affected shaders:
Instrs: 1319368 -> 1319179 (-0.01%)
CodeSize: 7124640 -> 7123884 (-0.01%)
Latency: 26554304 -> 26404606 (-0.56%)
InvThroughput: 9032485 -> 8978773 (-0.59%); split: -0.59%, +0.00%
No navi10 fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17710 >
2022-09-30 20:57:02 +00:00
Daniel Schürmann
a128d444cb
aco: use monotonic_buffer_resource for instructions
...
As monotonic_buffer_resource is not thread-safe,
we use a thread_local instance which gets allocated once.
This change reduces the compile time spent in ACO by
approximately 10%.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18112 >
2022-09-28 09:25:20 +00:00
Rhys Perry
826ed52174
aco/tests: add GFX11 assembly tests
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333 >
2022-09-26 14:49:57 +00:00
Rhys Perry
aadb7aef01
aco: add VINTERP instruction format
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333 >
2022-09-26 14:49:56 +00:00
Rhys Perry
55cd74d468
aco: add LDSDIR instruction format
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333 >
2022-09-26 14:49:56 +00:00
Rhys Perry
a7a9aad14d
aco: limit GFX11 to 128 VGPRs for now
...
See https://reviews.llvm.org/D128054
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333 >
2022-09-26 14:49:56 +00:00
Rhys Perry
d8d99c3c4f
aco: add GFX11 opcode numbers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17333 >
2022-09-26 14:49:56 +00:00
Rhys Perry
dd105f7c1e
aco: fix assembly of vopc_sdwa writing exec
...
We would assemble an instruction writing vcc instead.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: 5ffc73896f ("aco/assembler: Fix v_cmpx with SDWA.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077 >
2022-08-16 17:31:33 +00:00
Rhys Perry
f60cb8d0af
aco: rename is_cmp to is_fp_cmp
...
The old name is no longer accurate.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18077 >
2022-08-16 17:31:33 +00:00
Georg Lehmann
22d860fe4a
aco/ir: Add swapped opcode for v_cmp_u/v_cmp_o.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Georg Lehmann
8f7ceff774
aco/ir: Add v_cmp_class to get_cmp_info.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Georg Lehmann
578d0a1934
aco/ir: Add vcmpx opcode to get_cmp_info.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Georg Lehmann
369b8c031a
aco/ir: Fix swapped nle.
...
Cc: mesa-stable
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Georg Lehmann
9c727b958e
aco/ir: Add integer get_cmp_info.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Georg Lehmann
590b93ae65
aco/ir: Generalize (un)ordered_swapped.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17763 >
2022-07-27 14:52:29 +00:00
Rhys Perry
d2d94b62f2
aco: initialize scratch base registers on GFX9-GFX10.3
...
fossil-db (navi21):
Totals from 1142 (0.70% of 162293) affected shaders:
Instrs: 271636 -> 271974 (+0.12%)
CodeSize: 1532020 -> 1533792 (+0.12%)
Latency: 7484066 -> 7485698 (+0.02%)
InvThroughput: 4048824 -> 4049579 (+0.02%)
SClause: 4171 -> 4212 (+0.98%)
PreSGPRs: 11203 -> 12276 (+9.58%)
fossil-db (vega10):
Totals from 3327 (2.06% of 161355) affected shaders:
Instrs: 257413 -> 257601 (+0.07%)
CodeSize: 1424244 -> 1425372 (+0.08%)
Latency: 8598402 -> 8600466 (+0.02%)
InvThroughput: 7906335 -> 7908234 (+0.02%)
SClause: 4932 -> 4973 (+0.83%)
PreSGPRs: 22010 -> 25405 (+15.42%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Rhys Perry
cbeb25ce91
aco: make FLAT_instruction::offset signed
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17079 >
2022-07-08 14:49:03 +00:00
Daniel Schürmann
c298ab0d23
aco: correctly validate v_fma_mixhi_f16 register assignment
...
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15176 >
2022-06-27 15:07:27 +00:00
Rhys Perry
dae1629778
aco: disable sdwa on gfx11
...
Instead of SDWA v_mov_b32/v_xor_b32, we can use a combination of
v_add_u16/v_sub_u16 (add/sub swap, similar to xor swap) and v_perm_b32
with a literal.
I don't know yet if GFX11 adds any new instructions which makes this
easier, but this approach should have full functionality.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16595 >
2022-05-31 18:07:34 +00:00
Marek Olšák
39800f0fa3
amd: change chip_class naming to "enum amd_gfx_level gfx_level"
...
This aligns the naming with PAL.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pellou-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469 >
2022-05-13 14:56:22 -04:00
Samuel Pitoiset
7647097d3d
aco: update waitcnt on GFX11
...
Not sure if the vmcnt field can use more than 0x3f bits.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
765aa36b9d
aco: update LDS allocation granularity for PS on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Samuel Pitoiset
52952f51cd
aco: do not align VGPRS to 8 or 16 on GFX11
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16369 >
2022-05-12 15:46:20 +00:00
Dave Airlie
04c07a2413
aco/radv: convert to aco shader info at the radv level.
...
This removes the radv shader info type from aco completely.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
87df607ff5
aco: move to a minimal aco shader info struct.
...
This should be kept to only things aco uses, and expanded when
radeonsi support is added. Things should be removed if lowered in NIR.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Dave Airlie
a2701bfdb8
aco: move info pointer to a copy.
...
This is just setup to move this to a different struct later.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16342 >
2022-05-11 19:07:11 +00:00
Daniel Schürmann
2d1e6b756e
aco: remove 'high' parameter from can_use_opsel()
...
No fossil-db changes.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15551 >
2022-03-25 22:02:50 +00:00
Rhys Perry
e12bee3cb7
aco: improve support for v_fma_mix
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14769 >
2022-03-17 19:04:17 +00:00
Daniel Schürmann
825cd696dc
aco/insert_exec_mask: stay in WQM while helper lanes are still needed
...
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.
Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Rhys Perry
16e0c312fa
aco: preserve pass_flags during format conversions
...
This helps the "vopc() & exec" optimization.
fossil-db (Sienna Cichlid):
Totals from 1638 (1.21% of 134913) affected shaders:
CodeSize: 3331804 -> 3327520 (-0.13%); split: -0.19%, +0.06%
Instrs: 611807 -> 610096 (-0.28%)
Latency: 5579326 -> 5574874 (-0.08%)
InvThroughput: 936782 -> 936731 (-0.01%); split: -0.01%, +0.00%
Copies: 43324 -> 43302 (-0.05%); split: -0.06%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
f68797ead7
aco: create v_mac_legacy_f32/v_fmac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Tatsuyuki Ishi
da0412e55b
aco: support DPP8
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971 >
2021-12-31 20:56:39 +00:00
Daniel Schürmann
1502c22e2c
aco: don't allow SDWA on VOP3P instructions
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576 >
2021-12-31 14:52:14 +00:00