When binning with a GS, both VS and GS are active. This means that we
could have to use the safe-const variant for the GS. However we only
emitted VPC state for the binning case with the "normal" GS variant.
Emit the VPC state with the safe-const variant too, and select between
the state variants at link time.
This fixes a few tests like
dEQP-VK.spirv_assembly.instruction.graphics.8bit_storage.32struct_to_8struct.uniform_uint_geom
with TU_DEBUG=gmem,forcebin.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35294>
ANGLE requires VK_EXT_transform_feedback or vertexPipelineStoresAndAtomics
to enable OpenGL ES 3.1 support. As we currently don't support this extension,
we enable support for vertexPipelineStoresAndAtomics via DRICONF
to allow XFB emulation on hardware without speculative behaviors around vertices (v13+).
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34758>
Atomics and memory write on the vertex stage have implementation-defined
behaviors on how many invocations will happen. However for some reasons,
atomic counters on GL/GLES specs are quite ambigous here and even have tests
counting how many invocations have been made on VS.... This pass detects
atomics that result in a direct store output of one specific IDVS stage and
ensure it's only executed for said stage.
This allows "dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.*" to
pass under panvk+ANGLE.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34758>
We were unconditionally writing to vs anonymous union on other stages
than VS. this was not causing issues as pan_shader_compile
unconditionally overrite the value for fragment shaders and compute
shaders union is too small to be affecte.
Fixes: 1d21de788d ("pan/bi: Specialize shaders for IDVS")
Reported-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34758>
This works out of the box.
We could have more (up to 127) but this cause timeouts on
"dEQP-VK.api.device_init.create_device_various_queue_counts.basic" and
realistically we only need 2 for Android 14+ HWUI framework.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35310>
If the render area is restricted to a section of the framebuffer, there
is no need to consider all the framebuffer size when configuring the
supertiles, as only the supertiles coordinates of the affected area will
be submitted.
This allow to create supertiles smaller than the ones in case
considering the full screen, reducing the tiles that need to be
processed.
This also fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/13218.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35257>
So far the driver was configuring the supertiles to be less than 256.
But actually, there can be up to 256, not strictly less than 256.
There is one restriction though: the frame width or height in supertiles
must be less than 256.
It also moves this limit to the limits file, which is shared by v3d and
v3dv.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35257>
Looks like it's actually also affected by the memory explosion caused
by zerovram alloc by default in AMDGPU. Though it's very random,
sometimes the job will finish in 40 minutes, sometimes it needs more
than 1h15m. Let's bump the timeout because it's a post-merge job.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35157>
I think this was unlikely to cause issues, even if the qsort()
implementation is unstable.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
Fixes "left shift of negative value -128" with parallel_rdp/00f93a9497dfbb3b
and UBSan.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
Without this cast, the left shift is promoted to 'int'.
Fixes "left shift of 50432 by 16 places cannot be represented in type 'int'"
with horizon_zero_dawn/001064f580f8e3be and UBSan.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
"shift exponent 1020 is too large for 32-bit type 'unsigned int'" with
madmax/25b8180e05220b8c and UBSan
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
Not all LAVA jobs appear to require kernel modules, so only apply the
kernel-modules overlay when HWCI_KERNEL_MODULES is explicitly set.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35129>
Change how we package kernel modules: instead of storing them in
.tar.zst archives with uncompressed .ko files inside, we now compress
each .ko file individually with ZSTD and bundle them into a plain tar
archive.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35129>
this was a big nightmare because of how the pipe_surface object worked,
but now it's more possible to move the backing multisampled image
onto the base resource and reuse the 'valid' flag instead of the special
surface one for transients
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35240>
Fixes#13241, where iris_bufmgr occasionally deadlocks while allocating buffers.
The deadlock happens when iris_bufmgr.c calls intel_aux_map_unmap_range while
holding the bufmgr lock, with a range that includes pages that were never created,
which can happen because the iris_resource that adds aux mappings will sometimes
use a slightly larger buffer with an offset to ensure the resource is aligned
correctly.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35266>
This mirrors AMDVLK. 128-byte alignment is possible, but DOOM: The Dark
Ages screws up scratch allocation with alignments <256 bytes.
Fixes hangs in DOOM: The Dark Ages.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35152>
On Navi33, certain box sorting modes combined with infinity/-infinity in
the child AABBs cause image_bvh64_intersect_ray to return garbage node
pointers.
To avoid this, convert infinity to the maximum representable
floating-point value, which will still intersect with any non-inf ray.
Fixes consistent hangs in DOOM: The Dark Ages.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35254>
the intent of this option is to create a single capture for the lifetime
of the app, which is great for unit test debugging, and this instead
created a capture for every queue submission, which is a nightmare
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35292>