These were being fed to the shader as floats via the vertex
path, so also push them as floats here.
This fixes missing overlay in Sascha Willems demos.
Signed-off-by: Dave Airlie <airlied@redhat.com>
After moving everything to using push constants,
these paths are no longer needed.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The color clear value is uniform and needs only to be emitted from
the frag shader, so just push it down via a push constant,
and remove the vertex buffer completely.
The depth clear value needs to be emitted from the vertex
shader, but is only a single value.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This isn't necessary yet but I'd like to use the range in
some future patches.
[airlied: add new resolve pass]
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This drops the resolve workarounds that change an image
tiling mode behinds it's back, this is horrible and breaks
the image_view->image relationship. Remove all this.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
There are 3 resolve paths, the fastest being the hw resolver
but it has restriction on tile modes and can't do subresolves,
the compute resolver is next speed wise, but can't handle DCC
destinations, the fragment resolver handles that case.
This will end up with a slow down as currently we hack the
hw resolver paths when they shouldn't work, but we shouldn't
keep doing that.
The next patch removes the hacks.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In order to resolve into DCC enabled dests we need to use
the fragment shader. This reuses the code from the compute
path and implements a resolve path in vertex/fragment shader.
This code isn't used until later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds a path to allow compute resolves to be used
for subpass resolves.
This isn't used yet, but will be later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This will allow to add a subpass compute resolve path.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
I want to reuse the same code for the fragment shader
version of the resolve shaders.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we are resolving into an srgb dest, we need to convert
to linear so the store does the conversion back.
This should fix some wierdness seen when we subresolves
hit the compute path.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
As I pointed out for radeonsi, and AMD confirmed, so fix this
in radv as well.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
radv_bind_descriptor_set cannot be used to bind a push descriptor set
since a push descriptor set does not have a buffer list. However,
there is no need to add the buffers again when restoring a set, so
this fix is also an optimization.
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This just adds the chip in the right places.
We don't set the partial_vs_wave workaround, as radeonsi
doesn't, but have to confirm it's not required.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
LLVM 3.8:
- had broken indirect resource indexing
- didn't have scratch coalescing
- was the last user of problematic v16i8
- only supported OpenGL 4.1
This leaves us with LLVM 3.9 and LLVM 4.0 support for Mesa 17.2.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
There is no reason to advertise transfer ability for formats we can't
use for anything else. This stops some CTS tests hitting internal
error for 64-bit types when they see the transfer flags.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Setting both offset to 0x20 and flat shade results in passthrough
mode instead of the constant.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: f205e19e4f "radv/ac: eliminate unused vertex shader outputs. (v2)"
After successful drmGetDevices2() call, drmFreeDevices() needs to be called.
Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.
Fixes: 743315f2 "radv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes:
dEQP-VK.glsl.builtin.precision.min.*
dEQP-VK.glsl.builtin.precision.max.*
dEQP-VK.glsl.builtin.precision.clamp.*
The problem is the hw doesn't compare denorms properly,
so we have to flush them, even though the spec says
flushing is optional, if you don't flush the results
should be correct.
The -pro driver changes the shader float mode,
it would be nice if llvm could grow that perhaps.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
SPIR-V defines the f32->f16 operation as flushing denormals to 0,
this compares the class using amd class opcode.
Thanks to Matt Arsenault for figuring it out.
This fix is VI+ only, add a TODO for SI/CIK.
This fixes:
dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Having it in the winsys didn't work when multiple devices use
the same winsys, as we then have multiple contexts per queue,
and each context counts separately.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 7b9963a28f "radv: Enable userspace fence checking."
Loop unroll asserts if it hits a sub, we don't really want
to lower subs as llvm handles these things, but do this for
now, until we can fix loop unroll to work with subs.
Fixes: 14ae0bfa5 (radv: Add NIR loop unrolling)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
All of the dynamic states apply to rasterization & fragment processing,
so we don't need to set them if we don't rasterize.
We don't clear the dirty flags for them though, so we don't miss any
updates for the next pipeline with rasterization.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
This still doesn't give us complete pWaitDstStageMask support,
but it should provide enough to be correct if not as efficent as
possible.
If we have wait semaphores we must flush between submits and
flush the shaders as well.
This fixes the remaining fails in:
dEQP-VK.synchronization.op.single_queue.semaphore.*ssbo*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If we are clearing stencil only, we still need to provide a
a valid Z output from the vertex shader, we can't rely
on the depth clear value having any meaning, as we use this
for the position output, and it could get clipped, so we
don't end up clearing anything.
Fixes:
dEQP-VK.renderpass.simple.stencil
since I added S8 support.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This ports
0fcb92c17d
anv: wsi: report presentation error per image request
This fixes:
dEQP-VK.wsi.xlib.incremental_present.scale_none.*
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We would be storing this info twice per image, no need to,
remove it from the surface struct.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is to rework the surface code like radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just makes it easier to do the follow in cleanups of the surface.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Set the bit in the same stage as the timestamp, instead always at top of pipe.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Not much effect on dota2/talos, but positive on deferred.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Timothy Arceri <timothy.arceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This assert wasn't in the original radeonsi code but I added
it without totally understanding the original code, it caused
some regressions in variable-indexing tessellation shaders.
Fixes: e2659176 radeonsi/ac: move vertex export remove to common code.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>