Commit graph

1894 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
f67dea5e19 radv: Fix multiview depth clears
We were not using the view mask for depth clears, causing only the
first view to be cleared.

Fixes: 2e86f6b259 "radv: Add multiview clears."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:16:26 +00:00
Bas Nieuwenhuizen
9add63a3a5 radv: Remove redundant format check.
The switch directly after the check has a default case that returns
NULL too, so the effective return value is not changed. Also this
check is wrong once we start dealing with formats introduced by an
extension (e.g. YUV formats).

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 20:09:38 +00:00
Samuel Pitoiset
445867c80d radv: report Vulkan version 1.1.90 for real
I thought the value was correctly propagated, but actually not.

Fixes: 2ac6d55f38 ("radv: bump reported version to 1.1.90")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-17 17:51:48 +01:00
Jason Ekstrand
cae373117c anv,radv: Re-enable VK_EXT_pci_bus_info
Now at version 2 with the fixed header.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 10:42:35 -06:00
Rhys Perry
ef198e8c6a radv: switch from nir_bcsel to nir_b32csel
Fixes: 191a1dce92 ('nir: Add 1-bit Boolean opcodes')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:39 +00:00
Rhys Perry
bba94a3d85 radv: don't set surf_index for stencil-only images
Fixes: f8d5b377c8 ('radv: set cb base tile swizzles for MRT speedups (v4)')
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108116
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-17 14:52:10 +00:00
Jason Ekstrand
47e1e0692c radv: Fix a stupid if in gather_intrinsic_info
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 15:06:07 -06:00
Jason Ekstrand
11dc130779 nir: Add a bool to int32 lowering pass
We also enable it in all of the NIR drivers.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Dave Airlie
b3f2b03ece radv/xfb: fix counter buffer bounds checks.
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-13 19:27:05 +00:00
Samuel Pitoiset
5088ba2aeb radv: don't check if format is depth in radv_image_can_enable_hile()
This is always TRUE if htile_size is not 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:21 +01:00
Samuel Pitoiset
eb0034fe28 radv: check if addrlib enabled HTILE in radv_image_can_enable_htile()
When hile_size is 0, we can't enable HTILE. This doesn't change
anything, except not calling radv_image_alloc_htile().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:19 +01:00
Samuel Pitoiset
d8325f1f07 radv: switch on EOP when primitive restart is enabled with triangle strips
Otherwise, Yakuza hangs the GPU with DXVK. We don't know if
linetrip and pointlist are affected, so my point is to do that
only for triangle strips.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:16 +01:00
Samuel Pitoiset
74cf3b627c radv: allow to skip DCC decompressions with the new predicate
Feral games aren't affected because they don't decompress DCC.
F1 2018 has one DCC decompression per frame, but I don't see
any performance improvements. This new predicate will be
probably more useful for DCC/MSAA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:14 +01:00
Samuel Pitoiset
3a5adc2879 radv: add a predicate for reflecting DCC decompression state
It's somehow similar to the FCE predicate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:10 +01:00
Samuel Pitoiset
2ac6d55f38 radv: bump reported version to 1.1.90
After going through the spec changelog, it looks like RADV
is up to date. Note that ANV also reports 1.1.90.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-12 13:51:16 +01:00
Jason Ekstrand
8f401b0ce6 anv,radv: Disable VK_EXT_pci_bus_info
The Vulkan working group recently discovered that we made a mistake in
assuming that PCI domains are 16-bit even though they can potentially be
32-bit values.  To fix this, the next spec update will change the types
in the VK_EXT_pci_bus_info struct to be 32 bits which will be a
backwards-incompatible change.  Normally, Khronos tries very hard to
never make backwards incompatible changes to specs.  Hopefully, the
extension is new enough (2 months) that there are no shipping apps which
use the extension so this should be safe.

This commit disables the extension for both anv and radv in mesa and
should be back-ported to 18.3 ASAP so we avoid any potential issues with
new apps running on old drivers.  I'll send out a commit (which we can
also back-port to 18.3 if we really care) to re-enable the extension in
both drivers once this week's spec update ships.  The one known use of
this extension is internal to mesa and will continue working with the
extension disabled and will naturally update when we get a new header.

Cc: "18.3" <mesa-stable@lists.freedesktop.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-11 11:30:05 -06:00
Samuel Pitoiset
3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Samuel Pitoiset
49ef890733 radv: expose VK_EXT_scalar_block_layout
Nothing to do, the compiler already handles that.

All new dEQP.VK.ubo.* and dEQP.VK.ssbo.* pass, except some
16-bit tests that are quite related to fdo bug #108114.

Only enable the extension on CIK+ because it might not work on SI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 17:38:20 +01:00
Samuel Pitoiset
c7ada4901a radv: wait on the high 32 bits of timestamp queries
In case we are unlucky if the low part is 0xffffffff.

Fixes: 5d6a560a29 ("radv: do not use the availability bit for timestamp queries")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 13:05:58 +01:00
Samuel Pitoiset
e899728769 radv: reset pending_reset_query when flushing caches
If the driver used a compute shader for resetting a query pool,
it should be completed when caches are flushed.

This might reduce the number of stalls if operations are done
between vkCmdResetQueryPool() and vkCmdBeginQuery()
(or vkCmdWriteTimestamp()).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
2018-12-05 13:05:55 +01:00
Alex Smith
c1b6cb068c radv: Flush before vkCmdWriteTimestamp() if needed
As done for vkCmdBeginQuery() already. Prevents timestamps from being
overwritten by previous vkCmdResetQueryPool() calls if the shader path
was used to do the reset.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108925
Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-05 10:52:48 +00:00
Samuel Pitoiset
824cfc1ee5 radv: rework the TC-compat HTILE hardware bug with COND_EXEC
After investigating on this, it appears that COND_WRITE doesn't
work correctly in some situations. I don't know exactly why does
it fail to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK
also uses COND_EXEC I think there is a reason.

Now the driver stores a new metadata value in order to reflect
the last fast depth clear state. If a TC-compat HTILE is fast cleared
with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to
work around that hardware bug.

This fixes rendering issues with The Forest and DXVK and doesn't
seem to introduce any regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914
Fixes: 68dead112e ("radv: update the ZRANGE_PRECISION value for the TC-compat bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 09:26:31 +01:00
Dave Airlie
1363a47c9c radv: use 3d shader for gfx9 copies if dst is 3d
This fixes some crucible 3d miptree tests I've been working on
when executed using the compute shader path.

Fixes: d08f267814 (radv/gfx9: fix 3d image to image transfers on compute queues.)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 10:42:31 +10:00
Bas Nieuwenhuizen
12e35a64c0 radv: Check for shareable images in central place.
One place to put the logic makes things easier to change.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
3bf48741e1 radv/android: Use buffer metadata to determine scanout compat.
These days we don't always allocate scanout compatible textures anymore.
That does mean we have to fix the radv android WSI though.

Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Bas Nieuwenhuizen
51091b3e1f radv/android: Mark android WSI image as shareable.
Fixes: b1444c9ccb "radv: Implement VK_ANDROID_native_buffer."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-04 01:21:38 +01:00
Tobias Klausmann
9401a2f2e6 amd/vulkan: meson build - use radv_deps for libvulkan_radeon
Without this the build breaks with:

FAILED: src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o
cc -Isrc/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha -Isrc/amd/vulkan
-I../src/amd/vulkan -Isrc/../include -I../src/../include -Isrc -I../src
-Isrc/mapi -I../src/mapi -Isrc/mesa -I../src/mesa -I../src/gallium/include
-Isrc/gallium/auxiliary -I../src/gallium/auxiliary -Isrc/amd -I../src/amd
-Isrc/amd/common -I../src/amd/common -Isrc/compiler -I../src/compiler
-Isrc/vulkan/util -I../src/vulkan/util -Isrc/vulkan/wsi -I../src/vulkan/wsi
-Isrc/compiler/nir -I../src/compiler/nir -I/usr/include -I/usr/include/libdrm
-fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
-std=c99 -O2 -g '-DVERSION="18.3.0-rc5"' -DPACKAGE_VERSION=VERSION
'-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"'
-DGLX_USE_TLS -DHAVE_ST_VDPAU -DENABLE_ST_OMX_BELLAGIO=0
-DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING
-DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DENABLE_SHADER_CACHE
-DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ
-DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT
-DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT
-DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE
-DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN
-DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE
-DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT
-DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT
-DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL
-DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS
-DHAVE_FUNC_ATTRIBUTE_NORETURN -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS
-DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H
-DHAVE_ENDIAN_H -DHAVE_DLFCN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP
-DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L
-DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD
-DHAVE_PTHREAD_SETAFFINITY -DHAVE_LIBDRM -DHAVE_LLVM=0x0600
-DMESA_LLVM_VERSION_PATCH=1 -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED
-DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS -Werror=implicit-function-declaration
-Werror=missing-prototypes -Werror=return-type -fno-math-errno
-fno-trapping-math -Wno-missing-field-initializers -Wno-format-truncation -O2
-Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -DNDEBUG -fPIC -pthread
-D__STDC_FORMAT_MACROS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS
-D__STDC_LIMIT_MACROS -fvisibility=hidden -Wno-override-init
-DVK_USE_PLATFORM_XCB_KHR -DVK_USE_PLATFORM_XLIB_KHR
-DVK_USE_PLATFORM_WAYLAND_KHR -DVK_USE_PLATFORM_DISPLAY_KHR
-DVK_USE_PLATFORM_XLIB_XRANDR_EXT  -MD -MQ
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -MF
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o.d' -o
'src/amd/vulkan/src@amd@vulkan@@vulkan_radeon@sha/radv_pipeline.c.o' -c
../src/amd/vulkan/radv_pipeline.c
In file included from ../src/vulkan/util/vk_alloc.h:29,
                 from ../src/amd/vulkan/radv_private.h:52,
                 from ../src/amd/vulkan/radv_debug.h:27,
                 from ../src/amd/vulkan/radv_pipeline.c:30:
../src/../include/vulkan/vulkan.h:54:10: fatal error: wayland-client.h: Datei
oder Verzeichnis nicht gefunden
 #include <wayland-client.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.

The above command misses the include directory for wayland:
    -I/usr/include/wayland

The missing include is contained in the (until now) unused radv_deps:

if with_platform_wayland
  radv_deps += dep_wayland_client
  radv_flags += '-DVK_USE_PLATFORM_WAYLAND_KHR'
  libradv_files += files('radv_wsi_wayland.c')
endif

Fixes: 673dda8330 "meson: build "radv" vulkan driver for radeon hardware"
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-12-03 09:18:48 -08:00
Nicolai Hähnle
776b911365 amd/addrlib: update Mesa's copy of addrlib
Update to the internal master as of 2018-11-15.

This has a lot of gratuitous whitespace change, but on the plus
side it's built using the same tooling that's used for AMDVLK,
which should help going forward.
2018-11-29 13:18:24 +01:00
Nicolai Hähnle
729ebdf07e radv: remove dependency on addrlib gfx9_enum.h
v2:
- use SI_CONTEXT_REG_OFFSET

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-29 13:18:23 +01:00
Samuel Pitoiset
cc7deb749c radv: drop few useless state changes when doing color/depth decompressions
Viewport/scissor don't need to be updated for array textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:55 +01:00
Samuel Pitoiset
6d4f65deea radv: remove unused pending_clears param in the transition path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:53 +01:00
Samuel Pitoiset
4b9df824f7 radv: optimize CmdClear{Color,DepthStencil}Image() for layered textures
If all layers are bound we can perform a fast color or depth clear
instead of iterating over all layers. This has the advantage
to avoid trashing the framebuffer for nothing if you we end up by
doing a fast clear when calling radv_clear_image_layer(), and
clearing all layers in one shot is obviously faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
7484bc894b radv: refactor the fast clear path for better re-use
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
f78ee19702 radv: simplify a check in emit_fast_color_clear()
Currently only true if RADV_PERFTEST=dccmsaa is set.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
eca931a726 radv: add radv_can_fast_clear_{color,depth}() helpers
For further optimisations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
93f5ce8fa7 radv: add radv_image_view_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
aeaf8dbd09 radv: add radv_image_can_fast_clear() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Samuel Pitoiset
3e718db1ff radv: remove useless check in emit_fast_color_clear()
The driver doesn't support DCC/CMASK for mipmapped textures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:42 +01:00
Bas Nieuwenhuizen
6569644bb6 radv: Align large buffers to the fragment size.
Improves performance in Talos by about 15% (and significant improvements
in RotR and possibly other but did not bench with final patch) on
kernel 4.19 and earlier.

On 4.20+ a similar effect comes from

433ca054949a "drm/amdgpu: try allocating VRAM as power of two"

v2: Do not impact the alignment of the physical memory.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
CC: <mesa-stable@lists.freedesktop.org>
2018-11-27 22:17:42 +01:00
Bas Nieuwenhuizen
08ea6b9d9b radv: Clamp gfx9 image view extents to the allocated image extents.
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-27 10:19:52 +01:00
Bas Nieuwenhuizen
3c96a1e3a9 radv: Fix opaque metadata descriptor last layer.
We used the layer count which results in an off by one error.

Not sure this really affects anything.

Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-26 09:29:39 +01:00
Samuel Pitoiset
9fc1ce258c radv: ignore subpass self-dependencies for CreateRenderPass() too
We really need to refactor this...

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:11 +01:00
Samuel Pitoiset
2951a766bd radv: remove useless sync before CmdClear{Color,DepthStencil}Image()
We don't need to flush anything before these two commands as well.
This is because they have to be externally synchronized, so the
app should have called CmdPipelineBarrier() prior to that and the
driver should have flushed the caches.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-23 11:59:08 +01:00
Samuel Pitoiset
4ff4af3d91 radv: remove useless sync after CmdClear{Color,DepthStencil}Image()
'post_flush' is only set to NULL for the normal clear path
(ie. only vkCmdClearColorImage() and vkCmdClearDepthStencilImage()
are affected commands).

Because these two operations have to be externally synchronized
with VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT,
it's useless to set those flags internallY.

VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. RADV_CMD_FLAG_WRITEBACK_GLOBAL_L2 will be superseded
by RADV_CMD_FLAG_INV_GLOBAL_L2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-22 08:56:36 +01:00
Samuel Pitoiset
4b9bc4791b radv: only sync CP DMA for transfer operations or bottom pipe
CP DMA can only be busy when the driver copies buffers. The
only affected Vulkan commands are vkCmdCopyBuffer() and
vkCmdUpdateBuffer() (because we fallback to a copy depending on
a threshold). Clear operations are currently not concerned
because the driver always syncs after the last DMA operation.

Per the spec, these two operations have to be externally
synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:03:01 +01:00
Samuel Pitoiset
457ac6ce1e radv: ignore subpass self-dependencies
Unnecessary as they allow the app to call vkCmdPipelineBarrier()
inside the render pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:02:59 +01:00
Bas Nieuwenhuizen
dd0172e865 radv: Use structured intrinsics instead of indexing workaround for GFX9.
These force the index to be used in the instruction so we don't need the
workaround.

Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-19 23:36:00 +01:00
Samuel Pitoiset
724107553c radv: implement fast HTILE clears for depth or stencil only on GFX9
This allows to fast clear the depth part (or the stencil part)
of a depth+stencil surface when HTILE is enabled. I didn't test
on GFX8, so it's disabled currently.

This gives a very nice boost, for example when clearing the depth
aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster).

BEFORE: 235 us
AFTER: 13 us

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:18 +01:00
Samuel Pitoiset
7dcddbe54d radv: rewrite the condition that checks allowed depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:16 +01:00
Samuel Pitoiset
9133bbf186 radv: check allowed fast HTILE clears a bit earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 16:32:14 +01:00