Commit graph

2985 commits

Author SHA1 Message Date
Samuel Pitoiset
320b058d32 radv: simplify allocating user SGPRS for descriptor sets
Unnecesary to check the current stages if desc_set_used_mask
is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:36 +01:00
Samuel Pitoiset
d1994ed229 radv: remove radv_userdata_info::indirect field
Always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:33 +01:00
Timothy Arceri
0907ae35ad radv/ac: fix some fp16 handling
Fixes: b722b29f10 ("radv: add support for 16bit input/output")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 10:41:48 +11:00
Bas Nieuwenhuizen
b4870a15ae radv: Remove unused variable.
Trivial.
2019-01-27 13:51:35 +01:00
Niklas Haas
804cc44d09 radv: add device->instance extension dependencies
From the vulkan spec 33.3 "Extension Dependencies":

"Any device extension that has an instance extension dependency that is
not enabled by vkCreateInstance is considered to be unsupported, hence
it must not be returned by vkEnumerateDeviceExtensionProperties for any
VkPhysicalDevice child of the instance."

Therefore we need to check whether the instance-level extensions are
actually enabled when deciding to support a device-level extension or
not.

Furthermore, we need to do this for all instance-level extensions of any
(transitive) device-level extension dependency, due to the following
paragraph:

"If an extension is supported (as queried by
vkEnumerateInstanceExtensionProperties or
vkEnumerateDeviceExtensionProperties), then required extensions of that
extension must also be supported for the same instance or physical
device."

Finally, because some of these vulkan extensions may be implicitly
promoted to future vulkan core API versions, we can also satisfy the
dependency if the vulkan API version is high enough.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 13:50:35 +01:00
Niklas Haas
d12dc39396 radv: correctly use vulkan 1.0 by default
From the vulkan spec 3.2 "Instances":

"Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an
apiVersion of 0 is equivalent to providing an apiVersion of
VK_MAKE_VERSION(1,0,0)."

Fixes: ffa15861ef "radv: UseEnumerateInstanceVersion for the default version."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 12:49:28 +01:00
Timothy Arceri
5d66f7103f ac/nir_to_llvm: fix clamp shadow reference for more hardware
Fixes the following piglit test on my VEGA and matches the behaviour in the
tgsi backend.

tests/spec/glsl-1.10/execution/samplers/glsl-fs-shadow2D-clamp-z.shader_test

Fixes: 625dcbbc45 ("amd/common: pass address components individually to ac_build_image_intrinsic")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-26 12:03:24 +11:00
Samuel Pitoiset
378e2d2414 radv: fix computing number of user SGPRs for streamout buffers
Streamout buffers are emitted like push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-25 15:36:16 +01:00
Samuel Pitoiset
963c044c55 radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()
Remove two useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:14 +01:00
Samuel Pitoiset
5f0b17d581 radv: compute the GFX9 fence VA at allocation time
Instead of doing every time we emit cache flushes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:12 +01:00
Samuel Pitoiset
e7ac792400 radv: only allocate the GFX9 fence and EOP BOs for the gfx queue
It's invalid to emit a ZPASS_DONE event on the compute queue, and
the fence BO is unused on the compute queue (ie. we don't flush
CB or DB caches).
This saves some space in the upload BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:09 +01:00
Samuel Pitoiset
bd098884f1 radv: remove old_fence parameter from si_cs_emit_write_event_eop()
This parameter is actually useless as the immediate value
can always be zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:07 +01:00
Samuel Pitoiset
698afa177e radv: improve gathering of load_push_constants with dynamic bindings
For example, if a pipeline has two stages VS and FS. And if only
the fragment stage needs dynamic bindings, we shouldn't allocate
an extra user SGPR for the vertex stage. Of course, if the vertex
stage loads constants, it needs an user SGPR.

This should reduce the number of SET_SH_REG packets that are emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 09:43:53 +01:00
Timothy Arceri
678ef2a4a5 ac/nir_to_llvm: fix interpolateAt* for structs
This fixes the arb_gpu_shader5 interpolateAt* tests that contain
structs.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Timothy Arceri
559e5b0408 ac/nir_to_llvm: add bindless support for uniform handles
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-23 10:41:37 +11:00
Marek Olšák
e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Karol Herbst
8bb46de08b mesa: add MESA_SHADER_KERNEL
used for CL kernels

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 20:36:41 +01:00
Rhys Perry
f0ba826054 radv: prevent dirtying of dynamic state when it does not change
DXVK often sets dynamic state without actually changing it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
e4c6423c5e radv: avoid context rolls when binding graphics pipelines
It's common in some applications to bind a new graphics pipeline without
ending up changing any context registers.

This has a pipline have two command buffers: one for setting context
registers and one for everything else. The context register command buffer
is only emitted if it differs from the previous pipeline's.

v2: ensure late scissor emission is done when radv_emit_rbplus_state() is
    called
v2: make use of cmd_buffer->state.workaround_scissor_bug
v3: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5564a797f2 radv: add missed situations for scissor bug workaround
v2: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5d1a29071a radv: pass radv_draw_info to radv_emit_draw_registers()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Karol Herbst
b9fec2b38c nir: replace more nir_load_system_value calls with builder functions
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
36a76b7192 nir: rename nir_var_shared to nir_var_mem_shared
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Karol Herbst
9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Timothy Arceri
9e669ed22b ac/nir_to_llvm: fix interpolateAt* for arrays
This builds on the recent interpolate fix by Rhys ee8488ea3b.

This fixes the arb_gpu_shader5 interpolateAt* tests that contain
arrays.

Fixes: ee8488ea3b ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 10:59:38 +11:00
Samuel Pitoiset
f682ed11c3 radv: initialize the per-queue descriptor BO only once
Totally useless to write the descriptors inside the loop.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:32 +01:00
Samuel Pitoiset
72d9745a40 radv: do not write unused descriptors to the per-queue BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:30 +01:00
Samuel Pitoiset
8c164ea8f5 radv: reduce size of the per-queue descriptor BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:28 +01:00
Samuel Pitoiset
83cc87ead4 radv: drop unused code related to 16 sample locations
The driver only supports up to 8 sample locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:24 +01:00
Timothy Arceri
cb527d2c4c ac/nir_to_llvm: add support for structs to get_sampler_desc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
b12316cc92 ac/nir_to_llvm: fix regression in bindless support
This wasn't ported over when deref support was implemented.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Timothy Arceri
292887ac0d ac/nir_to_llvm: fix type handling in image code
The current code only strips off arrays and cannot find the type
for images that are struct members.

Instead of trying to get the image type from the variable, we just
get it directly from the deref instruction.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-01-17 10:35:36 +11:00
Rhys Perry
8a52e4cc4f radv: use dithered alpha-to-coverage
This matches the behaviour of AMDVLK and hides banding.
It is also seems to be allowed by the Vulkan spec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 20:49:23 +00:00
Samuel Pitoiset
d5d7b5e950 ac/nir: don't trash L1 caches for store operations with writeonly memory
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 13:57:22 +01:00
Marek Olšák
11735d6c9c winsys/amdgpu: fix whitespace 2019-01-15 19:10:16 -05:00
Samuel Pitoiset
7bef192018 radv: add support for VK_EXT_memory_budget
A simple Vulkan extension that allows apps to query size and
usage of all exposed memory heaps.

The different usage values are not really accurate because
they are per drm-fd, but they should be close enough.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:37 +01:00
Samuel Pitoiset
9784400a6b radv: add two small helpers for getting VRAM and visible VRAM sizes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:35 +01:00
Samuel Pitoiset
a6e5ce5130 radv: remove unnecessary returns in GetPhysicalDevice*Properties()
These functions return nothing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:17 +01:00
Bas Nieuwenhuizen
568e7a2998 radv: Set partial_vs_wave for pipelines with just GS, not tess.
Looking at -pro we need to enable it for pipelines with just a
GS too.

This seems to reduce the hangs from
https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to
the point where I can't reproduce, after the false start with the
wd_switch_on_eop patch due to flakiness.

(but people are reporting it does not fix the issue completely for
 them on polaris 11)

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-15 10:22:30 +01:00
Samuel Pitoiset
ad6ceb2872 ac: add missing 16-bit types to glsl_base_to_llvm_type()
Fix crashes with
dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 21:18:23 +01:00
Bas Nieuwenhuizen
76b12fa564 radv: Only use 32 KiB per threadgroup on Stoney.
Causes hangs on some machines.

What works for dEQP-VK.tessellation.shader_input_output.barrier:

- running num_patches = 6 (which limits LDS to 32 KiB)
- running num_patches = 8, and artificially cutting LDS size at 32 KiB.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-14 19:58:27 +00:00
Samuel Pitoiset
929df7afaf ac/nir: set cache policy when loading/storing buffer images
This was missing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:51 +01:00
Samuel Pitoiset
af2a85df74 ac/nir: add get_cache_policy() helper and use it
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 17:59:49 +01:00
Michel Dänzer
1a20b56798 amd/common: Restore v4i32 suffix for llvm.SI.load.const intrinsic
It was accidentally dropped in commit e4803ab7d2 "amd/common: use
llvm.amdgcn.s.buffer.load for LLVM 8.0", breaking the universe with LLVM
7.

Trivial.
2019-01-14 12:52:52 +01:00
Nicolai Hähnle
7fbd48fdc0 amd/common/vi+: enable SMEM loads with GLC=1
Only on LLVM 8.0+, which supports the new intrinsic.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:15 +01:00
Nicolai Hähnle
e4803ab7d2 amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0
llvm.SI.load.const is deprecated.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-14 08:30:12 +01:00
Eric Engestrom
53fbde4df3 radv: remove a few more unnecessary KHR suffixes
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2019-01-10 16:53:44 +00:00
Rhys Perry
ee8488ea3b ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics
Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is
enabled with DXVK.
It also fixes artifacts with Fallout 4's god rays with DXVK.
Various piglit interpolateAt*() tests under NIR are also fixed.

v2: formatting fix
    update commit message to include Fallout 4 and the Fixes tag

Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-01-09 14:57:07 +00:00
Samuel Pitoiset
b8c4f523b4 radv: skip draws with instance_count == 0
Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 14:22:38 +01:00
Samuel Pitoiset
a2b5cc3c39 radv: enable variable pointers
The Vulkan spec 1.1.97 says:
   "variablePointers specifies whether the implementation supports
    the SPIR-V VariablePointers capability. When this feature is
    not enabled, shader modules must not declare the
    VariablePointers capability."

As the SPIR-V feature is enabled, we should turn on the
extension feature as well.

All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.*
pass with the khronos internal repo. Note that a bunch of them
fails with the public repo, but it's expected as they violate the
specification.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 12:32:18 +01:00