The information about a shader using discard/kill is interesting
to other parts of the driver, as depth states need to programmed
differently depending on this. As we don't want to deal with
NIR/TGSI differences in other parts of the driver, track this
usage in the common etna_shader_variant.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
Some depth config states changes require the depth cache to be
flushed, leading to a GPU hang if not done. As the conditions that
require the flush are not toally clear, better be safe than sorry
and always flush the cache on depth state changes.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
The RA_EARLY_DEPTH is a depth state and so must be emitted on
dirty ZSA, instead of dirty SHADER.
Fixes: 785e2707b0 (etnaviv: Fix disabling early-z rejection on GC7000L)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7396>
Multiple threads may access st_update_* function at same time. Issues
happen when the threads modify lists managed by shader compiler.
Issues were found with script that runs multithread tests 1000 times in
a row with MESA_GLSL_CACHE_DISABLE=1 set. Problems start when 2
simultaneous st_create_[vp|fp]_variant calls start to compile a new
shader variant for the same program and various nir passes use and
modify same exec_lists.
Example failure:
deqp-egl: ../src/compiler/glsl/list.h:575: exec_list_validate: Assertion `node->next->prev == node' failed.
v2: instead of introducing new mutex, lock shared state
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7418>
ctz is a CL2.0 opcode but 3.0 requires it as well so just add support
for it.
Tested against CTS integer_ops integer_ctz test.
(long line broken up)
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7468>
Identical to GL_MESA_pack_invert in effect, just need to check for a
different enum value for GLES vs GL. The spec claims that "OpenGL 1.5 or
OpenGL ES 1.0 are required", but ReadPixels isn't a thing for ES1 so we
only enable it for ES2+.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3156>
This newly enables the extension for r100 and vieux. As far as I can
tell, that's safe to do. vieux's ReadPixels is just _mesa_readpixels,
which clearly already handles it correctly because the extension is
enabled for classic swrast. r100 has some custom acceleration paths
before falling down to _mesa_readpixels, which winds its way through to
r100_blit, which already has code to handle pack->Invert being set.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3156>
subpass_idx is of type uint32_t.
Fix defects reported by Coverity Scan.
Macro compares unsigned to 0 (NO_EFFECT)
unsigned_compare: This greater-than-or-equal-to-zero comparison of
an unsigned value is always true. subpass_idx >= 0U.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7325>
We need to consider shader calls as potential writes to their payloads.
For other ray-tracing intrinsics, we may not have a shader payload
pointer and have to treat them more like a barrier. We also need to
ensure that global and SSBO reads/writes aren't propagated across shader
call intrinsics.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
The SPV_KHR_ray_tracing extension adds 6 new storage classes which is a
bit on the ridiculous side. In order to avoid adding that many variable
modes to NIR, we make a few simplifying assumptions:
1. CallableData and RayPayload data actually lives on the stack
somewhere, presumably in the caller's stack. We assume that these
are no different from global variables and use nir_var_shader_temp
for them. We still need a separate storage class for the incoming
variants but only so we can figure out which one the incoming one
is and lower it to something useful.
2. There's no difference between incoming CallableData and RayPaolad
data. We can use a single storage class for both.
3. ShaderRecordBuffer data is just a global memory access. This lets
us avoid NIR variables entirely and just fetch the pointer via the
shader_record_ptr system value and it's accessed using a 64-bit
global memory pointer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
If we were desperate to reduce bits, we could probably also use
shader_in/out for hit attributes as they really are an output from
intersection shaders and read-only in any-hit and closest-hit shaders.
However, other passes such as nir_gether_info like to assume that
anything with nir_var_shader_in/out is indexed using vec4 locations for
interface matching. It's easier to just add a new variable mode.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
These are a bit more tricky than most because they're matrix system
values. We make the intentional choice here to not bother with allowing
indirect addressing of columns for these. Since they're system values,
they may be magically constructed somehow or come from weird hardware so
it's easier on back-ends to just handle any indirects with bcsel.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
This is an operation we have to do already for nir_vector_extract and
I'm about to do something very similar for matrix columns. Having a
more generic helper is useful.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
Missing in this commit are NIR intrinsics for the ObjectToWorld and
WorldToObject built-ins. Those are matrices and so they take a bit more
work and justify a separate commit. For now, we add the enums and leave
the SYSTEM_VALUE <-> nir_intrinsic conversion commented out.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
For now, we assume its a 64-bit global pointer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
We already fail in these same cases in vk_desc_type_for_mode. These
additional assertions are just extra code to update.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
This particular ordering makes them conveniently match
VkShaderStageFlagBits, which is a property we already take advantage
of in the previous shader stages.
Abbreviations are based on the ones used in glslangValidator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>
some drivers may have pre-lowered gl_ClipDistance to 2x vec4 to match hw
usage, so for those cases we'll be getting deref_var here and then components
will be stored to the deref at some point
fixesmesa/mesa#3480
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6563>
we can just check the bits using clip_distance_array_size here to simplify
everything and more easily determine if we need to be running this pass
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6563>
We are limited to 64 threads per dispatched group, regardless of what
num_cs_threads claims, so advertise that limit correctly.
Fixes (on TGL and up):
dEQP-VK.subgroups.size_control.compute.required_subgroup_size_min
and other *.required_subgroup_size_min tests.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7453>
Add missing GetPhysicalDeviceImageFormatProperties2 logic for the extension
and enable it.
Also stop exposing optimal tiling for formats which are linear only, to
simplify dealing with those.
Passes dEQP-VK.drm_format_modifiers.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6940>
Use VkImageFormatListCreateInfo, and enable VK_KHR_image_format_list to
expose it. (and reorganize linear fallback code a bit)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6940>
Validation errors mention the pretty-printed instruction including
operands with the reserved % character, which caused vasprintf to
expect more format arguments than aco provided.
Fixes: c2b1978aa4 ("aco: rework the way various compilation/validation errors are reported")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7442>
As it needs firmware to probe, and we cannot bundle it within the kernel
image because it is incompatible with the GPL.
Currently we rebind the driver after boot but that's slow and fragile,
as unloads of DRM drivers aren't generally tested.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7420>
Runs are much more stable now with the new kernel, and lots of tests
do pass now.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7420>
this coincidentally worked because radeonsi has a hardcoded value of 64, but
other drivers do not use this value and then things are subtly broken
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7452>