Commit graph

1953 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
ead54d4a42 radv/winsys: Set winsys bo priority on creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:41 +01:00
Samuel Pitoiset
3a8d6c0880 radv: re-enable fast depth clears for 16-bit surfaces on VI
This has been disabled some months ago because it introduced
rendering issues with Shadow Of Warrier II (DXVK). This game is
no longer affected, I wonder if 824cfc1ee5 ("radv: rework the
TC-compat HTILE hardware bug with COND_EXEC") fixed the problem.
I checked The Forest on my Polaris, and it renders fine too.

According to Phillip, this gives +5.5% with Rise Of The Tomb
Raider and DXVK. This is because DXVK  uses 16-bit depth surfaces
while the native port from Feral uses 32-bit depth surfaces.

Unfortunately, Shadow Of The Tomb Raider isn't affected because
it clears each layer of a D16 array texture individually. So it
doesn't hit the fast clear path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-29 15:20:55 +01:00
Samuel Pitoiset
afeef3cacf radv: set noalias/dereferenceable LLVM attributes based on param types
Instead of using this useless array_params_mask variable.
This should set these two attributes to streamout buffers too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:38 +01:00
Samuel Pitoiset
320b058d32 radv: simplify allocating user SGPRS for descriptor sets
Unnecesary to check the current stages if desc_set_used_mask
is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:36 +01:00
Samuel Pitoiset
d1994ed229 radv: remove radv_userdata_info::indirect field
Always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:33 +01:00
Timothy Arceri
0907ae35ad radv/ac: fix some fp16 handling
Fixes: b722b29f10 ("radv: add support for 16bit input/output")

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 10:41:48 +11:00
Bas Nieuwenhuizen
b4870a15ae radv: Remove unused variable.
Trivial.
2019-01-27 13:51:35 +01:00
Niklas Haas
804cc44d09 radv: add device->instance extension dependencies
From the vulkan spec 33.3 "Extension Dependencies":

"Any device extension that has an instance extension dependency that is
not enabled by vkCreateInstance is considered to be unsupported, hence
it must not be returned by vkEnumerateDeviceExtensionProperties for any
VkPhysicalDevice child of the instance."

Therefore we need to check whether the instance-level extensions are
actually enabled when deciding to support a device-level extension or
not.

Furthermore, we need to do this for all instance-level extensions of any
(transitive) device-level extension dependency, due to the following
paragraph:

"If an extension is supported (as queried by
vkEnumerateInstanceExtensionProperties or
vkEnumerateDeviceExtensionProperties), then required extensions of that
extension must also be supported for the same instance or physical
device."

Finally, because some of these vulkan extensions may be implicitly
promoted to future vulkan core API versions, we can also satisfy the
dependency if the vulkan API version is high enough.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 13:50:35 +01:00
Niklas Haas
d12dc39396 radv: correctly use vulkan 1.0 by default
From the vulkan spec 3.2 "Instances":

"Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an
apiVersion of 0 is equivalent to providing an apiVersion of
VK_MAKE_VERSION(1,0,0)."

Fixes: ffa15861ef "radv: UseEnumerateInstanceVersion for the default version."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-27 12:49:28 +01:00
Samuel Pitoiset
378e2d2414 radv: fix computing number of user SGPRs for streamout buffers
Streamout buffers are emitted like push constants.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-25 15:36:16 +01:00
Samuel Pitoiset
963c044c55 radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()
Remove two useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:14 +01:00
Samuel Pitoiset
5f0b17d581 radv: compute the GFX9 fence VA at allocation time
Instead of doing every time we emit cache flushes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:12 +01:00
Samuel Pitoiset
e7ac792400 radv: only allocate the GFX9 fence and EOP BOs for the gfx queue
It's invalid to emit a ZPASS_DONE event on the compute queue, and
the fence BO is unused on the compute queue (ie. we don't flush
CB or DB caches).
This saves some space in the upload BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:09 +01:00
Samuel Pitoiset
bd098884f1 radv: remove old_fence parameter from si_cs_emit_write_event_eop()
This parameter is actually useless as the immediate value
can always be zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:07 +01:00
Samuel Pitoiset
698afa177e radv: improve gathering of load_push_constants with dynamic bindings
For example, if a pipeline has two stages VS and FS. And if only
the fragment stage needs dynamic bindings, we shouldn't allocate
an extra user SGPR for the vertex stage. Of course, if the vertex
stage loads constants, it needs an user SGPR.

This should reduce the number of SET_SH_REG packets that are emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 09:43:53 +01:00
Marek Olšák
e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Rhys Perry
f0ba826054 radv: prevent dirtying of dynamic state when it does not change
DXVK often sets dynamic state without actually changing it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
e4c6423c5e radv: avoid context rolls when binding graphics pipelines
It's common in some applications to bind a new graphics pipeline without
ending up changing any context registers.

This has a pipline have two command buffers: one for setting context
registers and one for everything else. The context register command buffer
is only emitted if it differs from the previous pipeline's.

v2: ensure late scissor emission is done when radv_emit_rbplus_state() is
    called
v2: make use of cmd_buffer->state.workaround_scissor_bug
v3: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5564a797f2 radv: add missed situations for scissor bug workaround
v2: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry
5d1a29071a radv: pass radv_draw_info to radv_emit_draw_registers()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Karol Herbst
b9fec2b38c nir: replace more nir_load_system_value calls with builder functions
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Samuel Pitoiset
f682ed11c3 radv: initialize the per-queue descriptor BO only once
Totally useless to write the descriptors inside the loop.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:32 +01:00
Samuel Pitoiset
72d9745a40 radv: do not write unused descriptors to the per-queue BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:30 +01:00
Samuel Pitoiset
8c164ea8f5 radv: reduce size of the per-queue descriptor BO
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:28 +01:00
Samuel Pitoiset
83cc87ead4 radv: drop unused code related to 16 sample locations
The driver only supports up to 8 sample locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-18 13:26:24 +01:00
Rhys Perry
8a52e4cc4f radv: use dithered alpha-to-coverage
This matches the behaviour of AMDVLK and hides banding.
It is also seems to be allowed by the Vulkan spec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-16 20:49:23 +00:00
Samuel Pitoiset
7bef192018 radv: add support for VK_EXT_memory_budget
A simple Vulkan extension that allows apps to query size and
usage of all exposed memory heaps.

The different usage values are not really accurate because
they are per drm-fd, but they should be close enough.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:37 +01:00
Samuel Pitoiset
9784400a6b radv: add two small helpers for getting VRAM and visible VRAM sizes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:35 +01:00
Samuel Pitoiset
a6e5ce5130 radv: remove unnecessary returns in GetPhysicalDevice*Properties()
These functions return nothing.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-15 11:18:17 +01:00
Bas Nieuwenhuizen
568e7a2998 radv: Set partial_vs_wave for pipelines with just GS, not tess.
Looking at -pro we need to enable it for pipelines with just a
GS too.

This seems to reduce the hangs from
https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to
the point where I can't reproduce, after the false start with the
wd_switch_on_eop patch due to flakiness.

(but people are reporting it does not fix the issue completely for
 them on polaris 11)

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-15 10:22:30 +01:00
Bas Nieuwenhuizen
76b12fa564 radv: Only use 32 KiB per threadgroup on Stoney.
Causes hangs on some machines.

What works for dEQP-VK.tessellation.shader_input_output.barrier:

- running num_patches = 6 (which limits LDS to 32 KiB)
- running num_patches = 8, and artificially cutting LDS size at 32 KiB.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-14 19:58:27 +00:00
Eric Engestrom
53fbde4df3 radv: remove a few more unnecessary KHR suffixes
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1)
2019-01-10 16:53:44 +00:00
Rhys Perry
ee8488ea3b ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics
Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is
enabled with DXVK.
It also fixes artifacts with Fallout 4's god rays with DXVK.
Various piglit interpolateAt*() tests under NIR are also fixed.

v2: formatting fix
    update commit message to include Fallout 4 and the Fixes tag

Fixes: f4e499ec79 ('radv: add initial non-conformant radv vulkan driver')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-01-09 14:57:07 +00:00
Samuel Pitoiset
b8c4f523b4 radv: skip draws with instance_count == 0
Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 14:22:38 +01:00
Samuel Pitoiset
a2b5cc3c39 radv: enable variable pointers
The Vulkan spec 1.1.97 says:
   "variablePointers specifies whether the implementation supports
    the SPIR-V VariablePointers capability. When this feature is
    not enabled, shader modules must not declare the
    VariablePointers capability."

As the SPIR-V feature is enabled, we should turn on the
extension feature as well.

All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.*
pass with the khronos internal repo. Note that a bunch of them
fails with the public repo, but it's expected as they violate the
specification.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 12:32:18 +01:00
Samuel Pitoiset
d58b11e709 radv: get rid of bunch of KHR suffixes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-09 12:26:48 +01:00
Karol Herbst
d0c6ef2793 nir: rename global/local to private/function memory
the naming is a bit confusing no matter how you look at it. Within SPIR-V
"global" memory is memory accessible from all threads. glsl "global" memory
normally refers to shader thread private memory declared at global scope. As
we already use "shared" for memory shared across all thrads of a work group
the solution where everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
glsl "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.

v2: rename local to function as well
v3: rename vtn_variable_mode_local as well

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:51:46 +01:00
Jason Ekstrand
05d72d6d48 spirv: Sort supported capabilities
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Jason Ekstrand
63b9aa2e25 spirv: Add support for using derefs for UBO/SSBO access
For now, it's hidden behind a cap.  Hopefully, we can eventually drop
that along with all the manual offset code in spirv_to_nir.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
adc155a815 spirv: Add explicit pointer types
Instead of baking in uvec2 for UBO and SSBO pointers and uint for push
constant and shared memory pointers, make it configurable.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Jason Ekstrand
fc9c4f89b8 nir: Move propagation of cast derefs to a new nir_opt_deref pass
We're going to want to do more deref optimizations going forward and
this gives us a central place to do them.  Also, cast propagation will
get a bit more complicated with the addition of ptr_as_array derefs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-08 00:38:30 +00:00
Bas Nieuwenhuizen
3cc940277a radv: Fix rasterization precision bits.
Note that these limits are exact, not a "precision is at least x",
as texel coords also get snapped to a multiple of this step size
before filtering.

This fixes CTS tests

dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat
dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:27:30 +01:00
Bas Nieuwenhuizen
64c83efaee radv: Remove unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen
656c1c488c radv: Remove device path.
unused and gcc complains about strncpy. (from what I can see because
strncpy does not leave a 0 byte on truncate. That said we don't use
it so this does not fix a real bug).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:14 +01:00
Timothy Arceri
50de3f80a8 nir: rename nir_link_constant_varyings() nir_link_opt_varyings()
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Bas Nieuwenhuizen
8c93ef5de9 radv: Do a cache flush if needed before reading predicates.
This caused random failures for two conditional rendering tests:

dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard
dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard

These wrote the predicate with the vertex shader, did a barrier and then
started the conditional rendering. However the cache flushes for the barrier
only happen on first draw, so after the predicate has been read.

Fixes: e45ba51ea4 "radv: add support for VK_EXT_conditional_rendering"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-31 20:52:08 +01:00
Bas Nieuwenhuizen
bba5749484 radv: Fix wrongly positioned paren.
Trivial.

Fixes: 9f0bfbed11 "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
2018-12-21 21:06:55 +01:00
Samuel Pitoiset
9606310081 radv: enable shaderStorageImageMultisample feature on GFX8+
Untested on older chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:19 +01:00
Samuel Pitoiset
6b976024a8 radv: add support for FMASK expand
Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:17 +01:00