At one point in time, we may have used the mapping to ISL_FORMAT_RAW for
certain buffer surfaces but that time has long since passed. This fixes a
bug where doing format queries on VK_FORMAT_UNDEFINED would assert-fail.
In d8347f12ea, we added support for
skipping SIMD8 generation when the program local size is too large for
SIMD8 to be usable. This change was missed in that commit.
This bug would impact gen7 platforms when the compute shader local
size is greater than 512, and gen8 platforms when the local size is
greater than 448.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Also, we don't actually need it for clipping because meta always colors
inside the lines and, for all other operations, the user is required to set
a scissor. Since DRAWING_RECTANGLE stalls the GPU, we want to emit it as
little as possible.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This is in contrast to emitting it directly in vkCmdPipelineBarrier. This
has a couple of advantages. First, it means that no matter how many
vkCmdPipelineBarrier calls the application strings together it gets one or
two PIPE_CONTROLs. Second, it allow us to better track when we need to do
stalls because we can flag when a flush has happened and we need a stall.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This has been declared as a uint since SNB but it's only one bit.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Instead of blasting it out as part of the pipeline, we put it in the
command buffer and only blast it out when it's really needed. Since the
PUSH_CONSTANT_ALLOC commands aren't pipelined, they immediately cause a
stall which we would like to avoid.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The old code called this on the prelinked shader list,
but at this point we have the linked shader, so we should
call the interface on that alone.
This fixes a regression in:
dEQP-GLES31.functional.ssbo.layout.random.all_per_block_buffers.13
introduced in
5b2675093e
glsl: handle implicit sized arrays in ssbo
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96228
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reported-by: Mark James
Signed-off-by: Dave Airlie <airlied@redhat.com>
These all show up as unused warnings here, so drop them for now.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Now that we have the better nir_foreach_block macro, there's no reason to
use the archaic block version for everything.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Instead of doing a add and then mask out the upper bits, we can
simply do a add with a half wide type (this, of course, assumes
the hw can actually do it...), so we'll get the required zero
in the upper bits automatically.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
There were complaints from a mingw build:
u_draw.h:134:14: error: invalid conversion from ‘uint {aka unsigned int}’
to ‘pipe_prim_type’ [-fpermissive]
Reviewed-by: Brian Paul <brianp@vmware.com>
For a load locked, we might not use the first result but the second
result is the predicate result of the locking. In that case the load
splitting logic doesn't apply (which is designed for splitting 128-bit
loads). Instead we take the predicate and move it into the first
position (as having a dead result in first def's position upsets all
sorts of things including RA). Update the emitters to deal with this as
well.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
For user-supplied constbufs, fileIndex is 0. In that case, when we
subtract 1, we'll end up loading from constbuf offset -16. This is
illegal, and there are asserts to avoid it. Normally we'd just DCE it,
but no point in generating the instructions if they're not going to be
used.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
SPIR-V specifies that a bunch of stuff gets applied to types. This means
taht a local variable could get, for instance, an array stride. Just
because it's pointless doesn't mean you'll never see it.
Tested with new piglit gl-3.2-adj-prims test.
v2: re-order trisadj and tristripadj code, per Roland.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The unfilled index translator/generator functions should only be
called when the primitive mode is one of the triangle types.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
The original mode test was valid before we had GS support.
Regression tested with full piglit run. Though, I don't think we have
any piglit tests that exercise drawing unfilled adjacency primitives.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Only one dEQP io_blocks test fails. This test fails for the same reason
as the match_different_member_struct_names test in a previous commit.
dEQP-GLES31.functional.separate_shader.validation.io_blocks.match_different_member_struct_names
v2: Add to release notes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
v2: Also support GL_EXT_shader_io_blocks. It's pretty much identical to
the OES extension. Suggested by Ilia.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>