This lets us vectorize gl_TessLevel{Inner,Outer} writes, using a pass
developed for RADV. Not all backends are prepared to handle this, so
we make it optional.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>
VS, TCS, TES, and GS threads must end with a URB write message with the
EOT (end of thread) bit set. For VS and TES, we shadow output variables
with temporaries and perform all stores at the end of the shader, giving
us an existing message to do the EOT.
In tessellation control shaders, we don't defer output stores until the
end of the thread like we do for vertex or evaluation shaders. We just
process store_output and store_per_vertex_output intrinsics where they
occur, which may be in control flow. So we can't guarantee that there's
a URB write being at the end of the shader.
Traditionally, we've just emitted a separate URB write to finish TCS
threads, doing a writemasked write to an single patch header DWord.
On Broadwell, we need to set a "TR DS Cache Disable" bit, so this is
a convenient spot to do so. But on other platforms, there's no such
field, and this write is purely wasteful.
Insetad of emitting a separate write, we can just look for an existing
URB write at the end of the program and tag that with EOT, if possible.
We already had code to do this for geometry shaders, so just lift it
into a helper function and reuse it.
No changes in shader-db.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17944>
We're getting a number of UBSan error while running intel-clc in that
image. It seems that we're the first ones to run into a number of code
paths with intel-clc and it shows a number of undefined behavior
operations like signed extension stuff in NIR/IntelBackend, unaligned
pointer accesses in embedded list iterators, etc...
Preparing some patches in a different MR to fix this.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18788>
Debian (unlike Ubuntu) has a broken libclc package missing files we
would very much like to have in our image, so that intel_clc doesn't
fail. Namely :
/usr/lib/clc/spirv-mesa3d-.spv
/usr/lib/clc/spirv64-mesa3d-.spv
Dropping libclc from the distribution and build int the build & base
test image.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18788>
Debian stable and Fedora do not package the required version for
intel-clc.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18788>
This is needed by Anv to parse GRL (meta opencl kernels).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18788>
now that these aren't always lowered from indirects,
64bit function temp arrays may exist in shaders, and they need to use
all the same rewrite machinery
fixes#7310
SoroushIMG <soroush.kashani@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18728>
somehow 64bit lowering creates patterns like
vec1 64 ssa_1 = undefined
ssa_2 = unpack_64_2x32_split_x ssa_1
and then the 64bit value is never otherwise used. for this case, rewriting
the unpack to just be a 32bit undef allows the 64bit undef to be optimized
out, avoiding spec violations
fixes#6945
SoroushIMG <soroush.kashani@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18728>
some drivers (mostly desktop) are known to ignore this pipeline flag,
thus it can be set unconditionally in pipeline state to deduplicate
pipeline variants without affecting performance
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18787>
an implicit feedback loop occurs when an app happens to bind the same image
as both a framebuffer attachment and a sampler for the same draw
an explicit feedback loop occurs when an app uses fbfetch to read data back
from the framebuffer using input attachments
fbfetch is already handled, but implicit feedback loops require more work:
* detecting them happens on-the-fly
* pipeline variants are required
this handles implicit feedback loops by detecting them at draw time during
barrier updates and then flagging pipeline state change to trigger variant creation.
the bits are then unset when the framebuffer/sampler binds are removed
fixes#7309
fixes (tu):
KHR-GL46.texture_barrier.disjoint-texels
KHR-GL46.texture_barrier.overlapping-texels
KHR-GL46.texture_barrier.same-texel-rw-multipass
KHR-GL46.texture_barrier_ARB.disjoint-texels
KHR-GL46.texture_barrier_ARB.overlapping-texels
KHR-GL46.texture_barrier_ARB.same-texel-rw-multipass
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18787>
this was my (feeble) attempt at tracking validity for swapchain images,
but it doesn't actually work right, and zink_resource::valid is better
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18787>
The comment here is from the EGL_MESA_configless_context era, and maybe
that was true, but EGL_KHR_no_config_context has no such restriction.
The only restriction is that both surfaces be compatible with the
context, but a no-config context is defined to be compatible with any
surface.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18824>
It's a little difficult to see from the diff, but this is effectively
the same calling sequence as before, and more importantly it means the
backend only cleans up backend state rather than needing to call up to
the core.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18824>
... when we're only changing one of the two to zero. I could maybe
understand the old behavior here if we really did need both read/draw to
either be both from winsys or both user, but we don't. You can still
get to that half-and-half state even with the old code, you just need to
set whichever of read/draw to 0 that you want first, and then set the
other to the user fbo.
The first code to introduce a read fb to this path was dbfb375805,
which was trying to make sure we restored both values passed to
glXMakeCurrentRead when going back to fb 0. The commit message suggests
that "the API" doesn't allow split read/draw, but that seems to have
ignored the #ifdef EXT_framebuffer_object code immediately nearby which
did exactly that.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18824>
When multiview is used, the last pre-rasterization stage should export
the layer to the fragment shader. With GPL, the next stage is
implicitly FS.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18699>
When compiling the pre-rasterization stages with GPL, we can't know if
the fragment shader reads the primitive ID or the clip/cull distances
as inputs and we have to export them unconditionally.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18699>
When the fragment shader reads the viewport index as input and the last
pre-rasterization stage doesn't export it, it should be implicitly
zero (ie. first viewport). When the next stage is known, it's already
lowered in NIR, but with GPL we can't because we don't know if the FS
reads it. To fix this use AC_EXP_PARAM_DEFAULT_VAL_000.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18699>