Rhys Perry
1210e0bd62
aco: create 16-bit input and output modifiers
...
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4552 -> 4540 (-0.26%)
Instrs: 863 -> 861 (-0.23%)
Cycles: 3452 -> 3444 (-0.23%)
VMEM: 490 -> 489 (-0.20%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
f5a5674178
aco: update comment about preserving fp16/fp64 denormals
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
7f511efa16
aco: create 16-bit mad/fma
...
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4868 -> 4552 (-6.49%)
Instrs: 956 -> 863 (-9.73%)
Cycles: 3824 -> 3452 (-9.73%)
VMEM: 504 -> 490 (-2.78%)
SMEM: 109 -> 107 (-1.83%)
VClause: 19 -> 20 (+5.26%)
Copies: 54 -> 58 (+7.41%)
PreVGPRs: 43 -> 41 (-4.65%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
1b10764e50
aco: try to use fma instead of mad when denormals are enabled
...
v_mad_f32 doesn't support denormals but v_fma_f32 does.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
6cb42cdd8f
aco: create mads when signed zeros should be preserved
...
This check was added because I thought v_mad_f32 didn't preserve the
signess of zero, but I can't reproduce that and this isn't mentioned
anywhere in LLVM.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
1b6a319c15
aco: add and set precise flag
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
a8f800a836
aco: use p_as_uniform in emit_vop1_instruction
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
b6d9e45f47
aco: improve code for f2{i,u}{8,16}
...
Use sub-dword definitions so that the RA can use SDWA
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Rhys Perry
34d481fd1f
aco: use num_opcodes instead of last_opcode
...
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245 >
2020-06-15 18:24:22 +00:00
Samuel Pitoiset
013d096d15
ac: add ac_choose_spi_color_formats() to common code
...
It's similar between RADV and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5436 >
2020-06-15 08:16:07 +02:00
Erik Faye-Lund
cd01d0dee1
radv: update internal reference
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4630 >
2020-06-13 10:42:01 +00:00
Daniel Schürmann
1f98d8c804
aco: fix shared subdword loads
...
Shared subdword loads don't need byte alignment as they are split
into multiple loads if necessary.
Fixes: 5cde4989d3 ('aco: remove unnecessary split- and create_vector instructions for subdword loads')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441 >
2020-06-12 13:56:12 +00:00
Samuel Pitoiset
54e7ae569b
radv/llvm: implement radv_enable_mrt_output_nan_fixup workaround
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359 >
2020-06-12 14:43:58 +02:00
Samuel Pitoiset
7b44f549b3
aco: implement radv_enable_mrt_output_nan_fixup workaround
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359 >
2020-06-12 14:43:57 +02:00
Samuel Pitoiset
6f21995f98
radv: add new drirc option radv_enable_mrt_output_nan_fixup
...
To replace NaN from FS with zeros to fix game bugs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359 >
2020-06-12 14:43:31 +02:00
Samuel Pitoiset
07aefe8065
radv: set DB_SHADER_CONTROL.CONSERVATIVE_Z_EXPORT correctly
...
Use the SPIR-V execution modes if set.
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5404 >
2020-06-12 08:22:47 +00:00
Mauro Rossi
7afdc549f4
android: aco: add aco_ir.cpp to Makefile.sources
...
Fixes the following building errors:
FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
ld.lld: error: undefined symbol: aco::can_use_SDWA(chip_class, std::__1::unique_ptr<aco::Instruction, aco::instr_deleter_functor> const&)
...
ld.lld: error: undefined symbol: aco::can_use_opsel(chip_class, aco_opcode, int, bool)
...
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: d9cfb8ad ("aco: validate instructions reading/writing upper halves/bytes")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5425 >
2020-06-11 20:00:16 +02:00
Marek Olšák
0b3e344212
ac/surface: don't free dcc_retile_map on failure
...
because the hash table now owns it.
Fixes: bd553f0546 - ac/surface: cache DCC retile maps (v2)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424 >
2020-06-11 10:01:57 +00:00
Marek Olšák
56f2a77a41
ac/surface: enable DCC for the first level in the mip tail on gfx10
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424 >
2020-06-11 10:01:57 +00:00
Marek Olšák
7406ea37e6
ac/surface: require that gfx8 doesn't have DCC in order to be displayable
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424 >
2020-06-11 10:01:57 +00:00
Marek Olšák
374f6d568f
ac/surface: don't set is_displayable if displayable DCC is missing
...
If flags.display isn't set, then displayable DCC will not be computed, so
is_displayable will always be false.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424 >
2020-06-11 10:01:57 +00:00
Marek Olšák
0fcf55329b
amd/addrlib: fix the C++ one definition rule violation
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1854
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5414 >
2020-06-11 05:11:50 -04:00
Marek Olšák
bd553f0546
ac/surface: cache DCC retile maps (v2)
...
This reduces overhead when resizing windows or when allocating
similar image sizes over and over again.
v2: optimize the memory footprint of the cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
4cf674c8f7
ac/surface: add a wrapper structure to hold ADDR_HANDLE
...
and more things in the future.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
e6996d6fbd
amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
a99f4d5382
amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call
...
This decreases the DCC retile map overhead from 23% to 18%.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Marek Olšák
a1b9eb62f6
ac/surface: don't recompute the DCC retile map for imported textures
...
The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.
This brings it down to 23%.
The dcc_retile_use_uint16 setting has to be derived from DCC sizes.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398 >
2020-06-10 15:35:46 +00:00
Rhys Perry
1b2e1163b2
aco: fix moving sub-dword values out of a register for a fixed definition
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
edf863d1d2
aco: use Info::definition_size instead of definition's regclass
...
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
207c35cbe8
aco: add Info::{operand_size,definition_size}
...
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
62ea429a99
aco: prefer 4-byte aligned definitions
...
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
56345b8c61
aco: allow reading/writing upper halves/bytes when possible
...
Use SDWA, opsel or a different opcode to achieve this.
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
98060ba0f0
aco: p_extract_vector in 64-bit u2f16/i2f16
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Rhys Perry
d9cfb8ad48
aco: validate instructions reading/writing upper halves/bytes
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040 >
2020-06-10 15:05:11 +00:00
Pierre-Eric Pelloux-Prayer
8275dc1ed5
ac/surface: fix epitch when modifying surf_pitch
...
This is needed otherwise it can cause bad rendering of UYVY files.
The align(..., 256 / surf->bpe) constraint comes from addrlib.
Fixes: 69aadc4933 ("radeonsi: fix surf_pitch for subsampled surface")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314 >
2020-06-10 09:11:23 +00:00
Pierre-Eric Pelloux-Prayer
e9826a1bb2
ac/surface: set SCANOUT if surf->is_displayable
...
Fixes: ba10fb3f7f ("radeonsi: preserve the scanout flag for shared resources on gfx9 and gfx10")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314 >
2020-06-10 09:11:23 +00:00
Samuel Pitoiset
9b58c4958b
ac/nir: fix integer comparisons with pointers
...
If we get a comparison between a pointer and an integer, LLVM
complains if the operands aren't of the same type.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3085
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5397 >
2020-06-10 08:18:22 +00:00
Samuel Pitoiset
64f2d45c3b
radv/aco: enable shaderInt8 and VK_KHR_shader_float16_int8 on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
be4dd6abd1
radv/aco: enable shaderInt16 on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
b3aee3aa23
radv/aco: enable 8-bit/16-bit storage on GFX6-GFX7
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
5cde4989d3
aco: remove unnecessary split- and create_vector instructions for subdword loads
...
This helps GFX6/7 by removing unnecessary shuffle code.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
5446e3cf2e
aco: fix alignment of vectors with 4 elements
...
I think this case was just missing.
This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Samuel Pitoiset
c7bd0f8cd5
aco: implement 8-bit/16-bit conversions on GFX6-GFX7
...
Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
db957f9135
aco: optimize packing of 16bit subdword registers on GFX6/7
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
2a51840c52
aco: skip partial copies on first iteration when lowering to hw
...
Helps some Detroit : Become Human shaders.
Totals from affected shaders: (VEGA)
Code Size: 47693912 -> 47670212 (-0.05 %) bytes
Instructions: 9183788 -> 9177863 (-0.06 %)
Copies: 910052 -> 904127 (-0.65 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
1d6f667193
aco: coalesce copies more aggressively when lowering to hw
...
Helps some Detroit : Become Human shaders.
Totals from affected shaders: (VEGA)
Code Size: 9880420 -> 9879088 (-0.01 %) bytes
Instructions: 1918553 -> 1918220 (-0.02 %)
Copies: 177783 -> 177450 (-0.19 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
b21d2d9a9f
aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7
...
This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
9e8e12ea6d
aco: adjust GFX6 subdword lowering workarounds for 8bit
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
b083581010
aco: Workarounds subdword lowering on GFX6/7
...
As there are no SDWA instructions, we need to take care not to overwrite
the upper bits of other copy_operation's operands.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00
Daniel Schürmann
942e3c40c3
aco: use full-register instructions to implement subdword packing on GFX6/7
...
On GFX6/7, there are no SDWA instructions.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226 >
2020-06-09 21:25:38 +00:00