Commit graph

643 commits

Author SHA1 Message Date
David Airlie
674ecbfef2 radv: emit db_htile_surface reg on gfx9 as well
This is also a GFX9 register.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:09 +10:00
Dave Airlie
fc600eb98d radv/gfx9: remove some leftover gfx6 descriptor setup.
We set this later in the non-gfx9 path, just remove these
bits from here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:03 +10:00
Dave Airlie
5247b311e9 radv/gfx9: fix set predication packet.
The predication packet changed format on GFX9, update the driver.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:52:50 +10:00
Dave Airlie
82ba384c10 radv: force cs/ps/l2 flush at end of command stream. (v2)
This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one tests complete, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.

v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-09 23:19:15 +01:00
Bas Nieuwenhuizen
bfed189ee0 radv: remove semicolon in if(...);
Trivial.

Fixes: a6a6146aa9 "radv: Don't allow fmask swizzling for shareable images."
2017-08-08 00:01:47 +02:00
Alex Smith
2e9a13bf22 radv: Fix decompression on multisampled depth buffers
Need to take the sample count into account in the depth decompress and
resummarize pipelines and render pass.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
2017-08-07 23:47:49 +02:00
Bas Nieuwenhuizen
a6a6146aa9 radv: Don't allow fmask swizzling for shareable images.
Also adds an assert because you never know how the winsys changes, and
multiprocess format differences are annoying.

Fixes: 1e696b962b "radv: add separate fmask tile swizzle counter."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-07 23:44:59 +02:00
Dave Airlie
8bf3930751 radv: fix MSAA on SI gpus.
This ports the workaround from radeonsi, that was missing in radv.

This fixes Talos rendering when MSAA is enabled on my Tahiti card.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-07 08:38:14 +01:00
Dave Airlie
1e696b962b radv: add separate fmask tile swizzle counter.
This mirrors what Marek has done for radeonsi, and uses
a separate counter to handle the fmask surface for MSAA
MRTs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-07 00:08:43 +01:00
Bas Nieuwenhuizen
acba3a3151 radv: Use the correct channel for alpha in resolve srgb conversion.
The argument here is a bitmask, so the old code selected .xy, which
got silently truncated to .x when constructing the vec4 from components,
instead of using .w.

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:07:13 +02:00
Bas Nieuwenhuizen
15e5a7a683 radv: Only convert linear->srgb in compute resolves.
It justs works with the fragment shader resolve, so no need to do
a custom conversion. In fact with SRGB dest, it actually gives
wrong results.

Fixes: 69136f4e63 "radv/meta: add resolve pass using fragment/vertex shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:07:09 +02:00
Bas Nieuwenhuizen
8286c3a49f radv: Don't use SRGB format for image stores during resolve.
These seem to store very bogus results. Luckily there is some code
that converts srgb->linear already, so just making the descriptor
format UNORM should work.

Fixes: 588185eb6b "radv/meta: add srgb conversion to end of resolve shader."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-06 16:06:50 +02:00
Andres Rodriguez
14cad8786a radv: generate the same driver UUID as radeonsi
These need to match for interop compatibility queries.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-06 12:42:07 +10:00
Andres Rodriguez
f8ea71f047 radv: generate same device UUID as radeonsi
This is required for interop use cases. The same device must report
identical UUIDs through the GL and Vulkan APIs so that users can
identify when it is safe to perform a memory object import.

v2: use ac helpers to calculate the uuid

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-08-06 12:42:07 +10:00
Dave Airlie
36a1b61321 radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.

v2: get it right, reverse the polarity.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-05 03:52:44 +01:00
Dave Airlie
fc625ba072 radv: also fix texture image descriptors for mipmap tile swizzle
This fixes the image descriptors for mipmapped tile swizzle

Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-04 07:13:40 +01:00
Dave Airlie
a6b4f04d9b radv: fix tile swizzle regression on mipmaps.
When Marek enabled mipmapped swizzle, radv didn't
have the code in place to handle it. This fixes the
regression.

I'll look more into GFX9 once I have a vega card (soon).
Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-04 06:45:36 +01:00
Marek Olšák
59144d4bf5 ac/surface: increment surf_index only when tile swizzle is allowed
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
d311e837f4 ac/surface: remove RADEON_SURF_HAS_TILE_MODE_INDEX
it's useless

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Marek Olšák
4662e45350 ac/surface: move tile_swizzle to ac_surface and document it
Gfx9 will use it too.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-08-04 02:10:04 +02:00
Bas Nieuwenhuizen
c9d4b571ad radv: Add suballocation for shaders.
This reduces the number of BOs that we need for the BO lists during
a submission.

Currently uses a fairly simple linear search for finding free space,
that could eventually be improved to a binary tree, which with some
per-node info could make a check for space O(1) and finding it O(log n),
in the number of buffers in that slab.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-03 00:45:13 +02:00
Dave Airlie
df61a05019 radv: handle 10-bit format clamping workaround.
This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.*
for a2r10g10b10 formats as destination on SI/CIK hardware.

This adds support to the meta program for emitting 10-bit
outputs, and adds 10-bit support to the fragment shader key.

It also only does the int8/10 on SI/CIK.

Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-01 00:10:23 +01:00
Bas Nieuwenhuizen
8229706ad8 radv: Don't underflow non-visible VRAM size.
In some APU situations the reported visible size can be larger than
VRAM size. This properly clamps the value.

Surprisingly both CTS and spec seem to allow a heap type with size 0,
so this seemed like the easiest option to me.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 4ae84efbc5 "radv: Use enum for memory heaps."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2017-07-31 22:50:13 +02:00
Nicolai Hähnle
7de445377c ac/nir,radv: move force_persample to ac_shader_info::force_persample
Avoid accessing radv-specific structures during the meat of NIR-to-LLVM
translation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Dave Airlie
800d162209 radv: for stencil only set Z tile mode index to same value
On SI this was causing a hang in
dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint

This was due to not handling the tile mode index for depth like
I fixed previously for new GPUs.

Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-28 04:12:32 +01:00
Dave Airlie
d4b079e708 radv/winsys: fix padding command stream for SI
We were adding pad to size after creating the object, so we could
submit a CS bigger than the bo created for it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-26 23:38:23 +01:00
Andres Rodriguez
a973b9a9f8 radv: rename physical_device->uuid[] to cache_uuid[]
We have a few UUIDs, so lets be more specific.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-07-26 20:42:36 +10:00
Dave Airlie
6cbc8cf178 radv: only report external semaphore info for opaque fd.
Until we support sync fd, don't report the info.

Fixes CTS dEQP-VK.api.external.semaphore.sync_fd.* from crashing.

Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-25 15:38:56 +10:00
Dave Airlie
ca82ef5ac7 radv: fix buffer views on SI/CIK.
Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.

Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 21:54:04 +01:00
Dave Airlie
feef47bb59 radv: enable sample shading
This calculates ps_iter_samples from the minSampleShading input

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 17:45:03 +10:00
Dave Airlie
486472a98d radv: don't set dedicated bit for buffer external memory.
This is an alternate fix for the buffer export dedicated interaction.

Fixes CTS dEQP-VK.api.external.memory.opaque_fd.dedicated.buffer.info

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 08:30:15 +01:00
Dave Airlie
75392e76ad radv: fix non-0 based layer clears.
If the layer base was > 0, it wasn't getting passed as the start
instance or getting added in the shaders.

Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers

Fixes: 7e0382fb (radv: add support for layered clears (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 17:27:55 +10:00
Dave Airlie
22b59b99cb radv: check enabled device features.
The spec says we should return VK_ERROR_FEATURE_NOT_PRESENT.

Ported from anv.

Fixes CTS test dEQP-VK.api.device_init.create_device_unsupported_features

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 08:16:52 +01:00
Dave Airlie
b7cc07432a radv: for external memory imports close the fd on import success
If we get an fd, we need to close it before returning.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.device_only.import_multiple_times

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-24 04:41:36 +01:00
Bas Nieuwenhuizen
daaf7efb93 radv: Don't segfault when exporting an image which hasn't been bound yet.
The image is set on Memory allocation already, but the image doesn't
have to have the BindImageMemory called yet. Luckily, we know offset
within a BO has to be 0 for dedicated allocations, so we can just
use the dummy 0 in the address calaculations.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.image.export_bind_import_bind

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-24 01:50:52 +02:00
Bas Nieuwenhuizen
ea08a296fe radv: Handle VK_ATTACHMENT_UNUSED in color attachments.
This just sets them to INVALID COLOR,  instead of shifting the
attachments together.

This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-24 01:50:52 +02:00
Dave Airlie
22bca8ef19 radv: reset non-syncobj semaphore context after wait.
When I ported from libdrm, I forgot to add the line to reset
the sem, we just need to reset the context.

This fixes a regression in DOOM.

Fixes: 9ac1432a57 ("radv: port to new libdrm API.")
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-22 00:03:26 +01:00
Dylan Baker
59a141c95a radv: rebase radv_entrypoints_gen.py on anv_entrypoints_gen.py
The two generators forked from each other, and they remain basically the
same. This rebases the radv version on the anv version, but with the
radv changes ported over. The result is that we get rid of the "cat |"
madness and gain mako, correct "generated by" attributions, and write
files out directly.

The only differences between the output is whitespace and comments.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-21 14:27:02 -07:00
Alex Smith
af9d6a8a99 radv: Generate storage image descriptors unconditionally
We can also use storage images internally for resolves, which don't
require TRANSFER_DST usage on the image, so currently we may not create
the needed descriptors.

Just create these descriptors unconditionally.

Fixes: 0e1886efb9 ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT")
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-22 06:40:29 +10:00
Dave Airlie
eaa56eab6d radv: initial support for shared semaphores (v2)
This adds support for sharing semaphores using kernel syncobjects.

Syncobj backed semaphores are used for any semaphore which is
created with external flags, and when a semaphore is imported,
otherwise we use the current non-kernel semaphores.

Temporary imports from syncobj fd are also available, these
just override the current user until the next wait, when the
temp syncobj is dropped.

v2: allocate more chunks upfront, fix off by one after
previous refactor of syncobj setup, remove unnecessary null
check.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Dave Airlie
b5670beb31 radv/winsys: add syncobj hooks
This just adds syncobj create/destroy/export/import paths into
the winsys interface.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-21 21:31:54 +01:00
Bas Nieuwenhuizen
21d777a122 radv: Add support for VK_KHR_variable_pointers.
Just a trivial enable.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-20 09:13:01 +02:00
Bas Nieuwenhuizen
31469c0265 radv: Add VK_KHR_storage_buffer_storage_class support.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-07-20 09:13:01 +02:00
Dave Airlie
9ac1432a57 radv: port to new libdrm API.
This bumps the libdrm requirement for amdgpu to the 2.4.82.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-20 01:56:04 +01:00
Dave Airlie
aee382510e radv: introduce some wrapper in cs code to make porting off libdrm_amdgpu easier.
This just introduces a central semaphore info struct, and passes it around,
and introduces some wrappers that will make porting off libdrm_amdgpu easier.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-20 01:55:36 +01:00
Alex Smith
f25c7f9f3e radv: Set the RADEON_SURF_OPTIMIZE_FOR_SPACE flag for images
This looks like a regression from df30123794 ("radv: use
ac_compute_surface"). Before that, the opt4Space addrlib flag was set
to true unless the image has FMASK (ac_compute_surface will similarly
only set that flag for images without FMASK).

This saves multiple gigabytes of VRAM on one of our games, and brings
its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-18 16:18:35 +10:00
Dave Airlie
687d241559 radv: don't shadow meta_va.
Coverity warned about dead code below, as meta_va was being shadowed.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-18 16:17:28 +10:00
Emil Velikov
4168c162c5 radv: advertise v6 of the wayland surface extension
Jason updated the Khronos spec to explicitly state that Wayland surfaces
must support VK_PRESENT_MODE_MAILBOX_KHR.

ANV did so since day one (back in 2015)

Cc: mesa-stable@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-07-17 15:24:48 +01:00
Dave Airlie
9ee67467c9 radv: predicate cmask eliminate when using DCC.
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.

This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.

This increases the fps on Sascha Willems deferred.

Tonga: 580fps->670fps on a Tonga PRO card.
Polaris 730->850fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:43 +01:00
Dave Airlie
8eed291c2c radv/clear: add r32g32b32a32 fast clear support (v2)
We can only fast clear 128-bit images if the r/g/b channels
are the same, and we are using DCC.

For DCC we'll bail out on translate if this isn't true,
and we catch cmask clears explicitly.

v2: remove 64-bit block (Bas), add uint32 as well.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:25 +01:00