Commit graph

4340 commits

Author SHA1 Message Date
Samuel Pitoiset
39760793b5 ac/llvm: fix ac_to_integer_type() for 32-bit const addr space pointers
This fixes some crashes with dEQP-VK.descriptor_indexing.* when
read_first_invocation has its source from a descriptor.

Most of these tests still fail because of an LLVM bug (they work
with ACO).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 22:32:01 +02:00
Rhys Perry
73184e51d1 aco: run opt_algebraic in a loop
Totals from affected shaders:
SGPRS: 13920 -> 13656 (-1.90 %)
VGPRS: 12972 -> 12960 (-0.09 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 1005680 -> 1000648 (-0.50 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Max Waves: 688 -> 688 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 19:18:30 +00:00
Rhys Perry
132ae89b19 aco: use nir_lower_idiv_precise
v7: rename _nv50/_llvm to _fast/_precise

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Daniel Schürmann
0e4bd261b1 aco: ensure that uniform booleans are computed in WQM if their uses happen in WQM
This fixes graphical corruption in SC2.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-21 17:39:46 +00:00
Timur Kristóf
7e5f87b533 aco/gfx10: Update constant addresses in fix_branches_gfx10.
Due to a bug in GFX10 hardware, s_nop instructions must be added
if a branch is at 0x3f. We already do this, but forgot to also update
the constant addresses that come after this instruction.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Timur Kristóf
f380398f8f aco/gfx10: Fix PS exports for SPI_SHADER_32_AR.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Timur Kristóf
1749953ea3 aco/gfx10: Wait for pending SMEM stores before loads
Currently if you have an SMEM store followed by an SMEM load that
loads the same location as was written, it won't work because the
store isn't finished before the load is executed. This is NOT
mitigated by an s_nop instruction on GFX10.

Since we currently don't have proper alias analysis, this commit adds
a workaround which will insert an s_waitcnt lgkmcnt(0) before each
SSBO load if they follow a store. We should further refine this in
the future when we can make sure to only add the wait when we load the
same thing as has been stored.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 14:33:54 +00:00
Samuel Pitoiset
b72205a4c1 radv: advertise VK_KHR_spirv_1_4
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 09:21:40 +02:00
Samuel Pitoiset
b139198b06 radv: do not dump descriptors twice in hang reports
If a pipeline has both graphics and compute, descriptors are same.
While we are at it, use queue->device for simplicity.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
cf5e55558e radv: dump trace files earlier if a GPU hang is detected
To make sure a trace file is generated in case the driver crashes
during the hang report generation (which happens sometimes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
bc2319deb2 radv: print which ring is dumped in hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
076f9dce7c radv: do not print useless descriptors info in hang reports
This information has never been useful. All descriptors are
already dumped with colors etc, and it's more useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
9da94e510c radv: enable VK_KHR_shader_float_controls on GFX6-GFX7
Disable 16-bit features because fp16 isn't exposed on these chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:47:28 +02:00
Samuel Pitoiset
7c50214aab radv: implement VK_KHR_shader_float_controls
This exposes what's required for DX and this is what we already
configure. The driver flushes denorms for FP32 and preserves them
for FP16/FP64. Note that we can't allow both preserving and
flushing denorms because this won't work for merged shaders. This
will require LLVM to update the float mode register to make it work.

Only enabled on GFX8+ with the LLVM path because it's untested on
previous chips and ACO doesn't support it.

This extension is required for SPIRV 1.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:58 +02:00
Samuel Pitoiset
2c2aaf275c ac/llvm: force fneg/fabs to flush denorms to zero if requested
LLVM optimizes these instructions with XOR/AND and it loses
the sign bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:55 +02:00
Samuel Pitoiset
7dfb15fff1 ac/llvm: add AC_FLOAT_MODE_ROUND_TO_ZERO
Because some instructions will be optimized by the backend compiler,
the driver has to manually flush to zero to keep the result exact.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:51 +02:00
Samuel Pitoiset
d94bd4e512 ac/llvm: add ac_build_canonicalize() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:48 +02:00
Bas Nieuwenhuizen
fd21ee8b52 radv: Fix single stage constant flush with merged shaders.
e.g. a VERTEX only flush with tess on Vega should look at the TCS
to see which bits are needed.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-18 10:49:29 +00:00
Daniel Schürmann
4b458b3e8f aco: don't combine minmax3 if there is a neg or abs modifier in between
This fixes a graphical corruption in HotS.
No pipelinedb changes other than that.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-17 16:21:19 +00:00
Samuel Pitoiset
c644644c65 radv: fix DCC fast clear code for intensity formats (correctly)
Previous fix was pretty bogus.

This fixes a rendering regression with Nier (minimap too large).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1943
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1952
Fixes: ea92273cea ("radv: fix DCC fast clear code for intensity formats")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-17 15:29:43 +02:00
Rhys Perry
88f1c0a360 aco: emit_split_vector() s_memtime results
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Rhys Perry
ded51b13da aco: don't CSE s_memtime
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Rhys Perry
d7838152f5 aco: fix scheduling with s_memtime/s_memrealtime
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-16 15:31:19 +01:00
Samuel Pitoiset
4a3bdc6d22 Revert "radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+"
This reverts commit 2ca8629fa9.

This was initially ported from RadeonSI, but in the meantime it has
been reverted because it might hang. Be conservative and re-introduce
this packet emission.

Unfortunately this doesn't fix anything known.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-15 15:58:34 +02:00
Samuel Pitoiset
50c8c4144b radv: rename VK_KHR_shader_float16_int8 structs/constants
Trivial change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-15 12:13:53 +02:00
Mauro Rossi
072c94f724 android: amd/common: export amd/llvm headers
Fixes the following building error:

external/mesa/src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c:42:10:
fatal error: 'ac_llvm_util.h' file not found
         ^~~~~~~~~~~~~~~~
1 error generated.

Fixes: 3a08110 ("amd: Move all amd/common code that depends on LLVM to amd/llvm.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-14 10:46:45 +02:00
Samuel Pitoiset
ea92273cea radv: fix DCC fast clear code for intensity formats
This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was
affected because intensity formats are different.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-14 08:36:14 +02:00
Eric Engestrom
48289d8853 radv: add exported symbols check
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-13 17:40:54 +01:00
Rhys Perry
f13ad839f1 aco: don't use p_as_uniform for vgpr sampler/image indices
p_as_uniform can get CSE'd, which can be incorrect and break some
dEQP-VK.descriptor_indexing.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
0c3fe323b6 aco: implement divergent vulkan_resource_index
Fixes the UBO/SSBO dEQP-VK.descriptor_indexing.* tests

v2: remove bld.copy() usage

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
5526a557ee aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit()
This can happen when bcsel is used between the results of two
vulkan_resource_index. It's also probably needed for non-uniform
descriptor indexing

Fixes dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.reads_opselect_two_buffers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
45d6c69b99 aco: use can_accept_constant in valu_can_accept_literal
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
b37857bcea aco: don't apply sgprs/constants to read/write lane instructions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-11 14:26:58 +00:00
Rhys Perry
2026ff5165 aco: update print_ir
Mostly adds GFX10 stuff.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
283eda71cf aco: rework scratch resource code
Fix compute, cleanup and add GFX10 support.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
f64b1a3454 aco/gfx10: disable GFX9 1D texture workarounds
Navi added back support for 1D textures.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
de0748c42e aco/gfx10: fix inline uniform blocks
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Rhys Perry
ba71be228f radv/aco: disable NGG when ACO is used
Note that radv_device.c still has to be modified to use ACO with Navi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 20:02:36 +00:00
Marek Olšák
b7fc082b28 ac/nir: add back nir_op_fmod
radeonsi doesn't lower it for doubles.

This partially reverts commit d861401554.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-10 15:57:50 -04:00
Bas Nieuwenhuizen
e6986bcb73 radv: Enable VK_ANDROID_external_memory_android_hardware_buffer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
e92b9c5f4f radv: Check the size of the imported buffer.
This is a security feature to disallow malicious apps from passing
a buffer that is too small.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
dad047a56a radv: Expose image handle compat types for Android handles.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
1b0ceba925 radv: Allow Android image binding.
Using delayed layout of images.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
83a012b603 radv/android: Add android hardware buffer import/export.
Support does not include images yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
adad61239c radv: Deal with Android external formats.
To abstract things a bit, this adds a helper function in radv_android.c.
However, this means we have to link in radv_android.c on non-android as
well, which means some scaffolding changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
041fc7beb8 radv: Derive android usage from create flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
53b1372571 radv: Disallow sparse shared images.
Since we really cannot share them ever.

Also remove an unused switch.

Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
f36b52740a radv/android: Add android hardware buffer queries.
Derived from the Intel code.

For the internal format we just use the internal Vulkan format,
as we have Vulkan formats for all android formats we care about.

For the ycbcr properties we just do something. I do not have a real
clue what would be recommended.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen
a34e4dd0d2 radv/android: Add android hardware buffer field to device memory.
You cannot go from BO to Android hardware buffer, so for export we
have to remember it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00