The hardware path doesn't support resolving layers, for both
source and destination images.
This fixes a reflection issue when MSAA is enabled which
affects GTA V and probably DIRT3.
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107786
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Gregor Münch <gr.muench_at_gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In environments where we cannot cache, e.g. Android (no homedir),
ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the
startup cost of creating a device in radv is rather high, due
to compiling all possible built-in pipelines up front. This meant
depending on the CPU a 1-4 sec cost of creating a Device.
For CTS this cost is unacceptable, and likely for starting random
apps too.
So if there is no cache, with this patch radv will compile shaders
on demand. Once there is a cache from the first run, even if
incomplete, the driver knows that it can likely write the cache
and precompiles everything.
Note that I did not switch the buffer and itob/btoi compute pipelines
to on-demand, since you cannot really do anything in Vulkan without
them and there are only a few.
This reduces the CTS runtime for the no caches scenario on my
threadripper from 32 minutes to 8 minutes.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Needed for VK_KHR_create_renderpass2.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
I don't think the hw resolve path can't handle multi-layer images.
This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
For a multi-layer subpass resolve we want to make sure we flush all
the layers.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Multisampled source images (ie. color attachments) can be now
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
To handle the source color image transitions in the same place.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Niuwenhuizen <bas@basnieuwenhuizen.nl>
This helper shares common code before resolving using either
a fragment or a compute shader.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The source and destination image parameters were swapped.
No CTS changes on Polaris10, but I suspect this might
fix something.
Fixes: 2a04f5481d ("radv/meta: select resolve paths")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
radeonsi has a workaround for this, but it uses a R16A16 format,
which vulkan doesn't have, we could probably come up with a work
around but for now just avoid hw resolves.
Fixes:
dEQP-VK.renderpass.suballocation.multisample.r16g16_*norm*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From reading AMDVLK it currently never uses hw resolve paths.
This patch takes from radeonsi which doesn't use hw resolve
for integer formats, and does the same for radv.
This fixes:
dEQP-VK.renderpass.suballocation.multisample*uint tests.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 2a04f5481d (radv/meta: select resolve paths)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Some of the hw resolve passes need the SPI color format setup
correctly.
This fixes lots of 16-bit and 32-bit format tests in
dEQP-VK.renderpass.suballocation.multisample*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Signed-off-by: Dave Airlie <airlied@redhat.com>
Apps can use this for render feedback loops, where things are
defined if they render each pixel only once. However, DCC fails
here, as the level of coherence is a block not a pixel, so disable it.
This is also going to help implementing other stuff.
Even if we optimize this later to only happen if there actually is
a loop (if possible at all ...), then the machinery is still useful
to exclude images accessible by the SDMA queue when that is implemented.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
They are dummy objects but the spec requires layout to not be
NULL, this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
And merge radv_meta_save_novertex() with
radv_meta_save_graphics_reset_vport_scissor_novertex().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This will allow us to save/restore the different states on-demand
based on the meta operation. For now, this saves/restores all
states. Compute will follow once the graphics part is done.
The main idea is to merge all save/restore helpers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Unnecessary to double check that handles are not NULL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is a bug in the app, but I'd rather avoid hanging the GPU,
esp if someone is running in validation and it takes out their
development environment.
v2: get it right, reverse the polarity.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just sets them to INVALID COLOR, instead of shifting the
attachments together.
This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Commit e1af20f18a changed the shader_info
from being embedded into being just a pointer. The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct. This, however, has
caused a few problems:
1) There are many things which generate NIR without GLSL. This means
we have to support both NIR shaders which come from GLSL and ones
that don't and need to have an info elsewhere.
2) The solution to (1) raises all sorts of ownership issues which have
to be resolved with ralloc_parent checks.
3) Ever since 00620782c9, we've been
using nir_gather_info to fill out the final shader_info. Thanks to
cloning and the above ownership issues, the nir_shader::info may not
point back to the gl_shader anymore and so we have to do a copy of
the shader_info from NIR back to GLSL anyway.
All of these issues go away if we just embed the shader_info in the
nir_shader. There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
There are 3 resolve paths, the fastest being the hw resolver
but it has restriction on tile modes and can't do subresolves,
the compute resolver is next speed wise, but can't handle DCC
destinations, the fragment resolver handles that case.
This will end up with a slow down as currently we hack the
hw resolver paths when they shouldn't work, but we shouldn't
keep doing that.
The next patch removes the hacks.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In order to resolve into DCC enabled dests we need to use
the fragment shader. This reuses the code from the compute
path and implements a resolve path in vertex/fragment shader.
This code isn't used until later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is to rework the surface code like radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The vs vertex generate and fs noop shaders are used in a few places,
so refactor them out.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I think we should only flush right before an action (draw/dispatch etc.),
as otherwise it is too easy to issue redundant flushes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This iterates the fast clear flush across the layers in the
specified range.
It also moves the compute resolve flush into the function
and builds the range in there.
This fixes:
dEQP-VK.geometry.layered.* regressions since fast clears.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We need to initialize dcc like we do in the subpass path.
v2: fix initial/final layouts
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
For the hw resolve there is no need to emit any sort
of texture coordinates, so drop them all in the meta path.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.
There are other advantages such as being able to reuse the
shader info populated by GLSL IR.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Before we can read the fmask using the compute shader, we need
to decompress the fmask in place.
This fixes a bunch of remaining failure and hopefully multisampling
in Talos.
This squashes all the radv development up until now into
one for merging.
History can be found:
https://github.com/airlied/mesa/tree/semi-interesting
This requires llvm 3.9 and is in no way considered
a conformant vulkan implementation. It can run a number
of vulkan applications, and supports all GPUs using
the amdgpu kernel driver.
Thanks to Intel for providing anv and spirv->nir,
and Emil Velikov for reviewing build integration.
Parts of this are:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>