From the OpenCL specification:
`CL_MEM_KERNEL_READ_AND_WRITE`: This flag is only used by
clGetSupportedImageFormats to query image formats that may be both
read from and written to by the same kernel instance. To create a
memory object that may be read from and written to use
CL_MEM_READ_WRITE.
If an application follows the instructions above, i.e. query a list of
supported image formats, using `CL_MEM_KERNEL_READ_AND_WRITE` as
input, and then attempts to create an image using one of the supported
image formats, by calling `clCreateImage` and passing
`CL_MEM_READ_WRITE`, the call to the image creation entry point should
succeed. This instead fails on Mali devices with the error
`CL_IMAGE_FORMAT_NOT_SUPPORTED`.
Rusticl fails when validating the image format against its supported
flags. Formats that support `PIPE_BIND_SHADER_IMAGE` have their
supported flags set as `CL_MEM_WRITE_ONLY` and
`CL_MEM_KERNEL_READ_AND_WRITE`.
This changes the supported CL flags to be `CL_MEM_WRITE_ONLY` for
`PIPE_BIND_SHADER_IMAGE` and `CL_MEM_READ_WRTE |
CL_MEM_KERNEL_READ_AND_WRITE` for `PIPE_BIND_SAMPLER_VIEW |
PIPE_BIND_SHADER_IMAGE`.
Fixes: 3386e142 (rusticl: support read_write images)
Fixes OpenCL-CTS test: `test_image_streams` on Mali. Invocation:
```
test_image_streams write 1D CL_RGB CL_SIGNED_INT8
```
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
(cherry picked from commit e77c984cef)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
When converting the index buffer from 4-bytes to 2-bytes, we use the
uploader for the job. Since commit b3133e250e we do an uploader alloc
ref, which releases the uploader buffer if there is no enough space,
creating a new one.
The problem happens when we also need this buffer because it is the one
containing the index buffer to convert. This happens, for instance, if
we need to convert the primitives because they are not supported (e.g.,
converting quads to triangles), as this is done
also using the uploader.
The solution is to ensure the uploader's buffer has an extra reference
so when released, it is not destroyed. This can easily achieved by
calling first pipe_buffer_map_range(), which is required to access the
buffer, and it increases the references.
This fixes `spec@!opengl 1.1@longprim`.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 48c086cb42)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
R300_PACKET3_3D_CLEAR_HIZ encodes COUNT in 14 bits (COUNT[13:0]), so a
single packet can clear at most 0x3fff dwords.
Large depth surfaces on R5xx can require more HiZ dwords than that.
When we emitted a single packet, COUNT truncated and part of HiZ RAM
remained uncleared, which could show up as HyperZ corruption.
Emit CLEAR_HIZ in chunks of R300_CLEAR_HIZ_COUNT_MAX and reserve enough
atom space for the worst-case packet count derived.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/360
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
(cherry picked from commit fddc101070)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Introduce r300_hyperz_pipe_count and use it in\nr300_setup_hyperz_properties.\n\nRV530 selects pipe topology from NUM_Z_PIPES, while other families use\nNUM_GB_PIPES. Keeping this in one helper avoids duplicated family checks\nand prepares follow-up HiZ clear sizing changes to reuse the same rule.
(cherry picked from commit e97ac38ff3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Ian Romanick reported some "undefined behaviour" warnings during some
not specified tests, relating to introduction of RGB[A}16_UNORM formats
in merge request
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588
This due to overflowing the 32-bits masks[], and then during assignment
the red/green/blue/alphaMask fields in struct gl_config when using a 16
bpc format. Iow. the red/green/blue/alphaMask would not be usable.
Suppress this warning by setting masks[] to zero for unorm16 formats,
just as was previously done for is_float16, ie. fp16 formats.
16 bpc formats are only exposed for display on non-X11 WSI target
platforms like GBM+DRM, Wayland, surfaceless, and these platforms do
not use the info in red/green/blue/alphaMask at all, so the "undefined
behaviour" is meaningless.
Fixes: f2aaa9ce00 ("dri,gallium: Add support for RGB[A]16_UNORM display formats.")
Reported-by: Ian Romanick @idr
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit ab94515b0a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Fix a crash in image descriptor emission caused by stale image_mask bits.
Root cause:
- set_shader_images used a shift expression with count==64 when clearing
image_mask, which is undefined behavior in C.
- This could leave image_mask inconsistent with actual image bindings,
so panfrost_emit_images() might dereferences NULL image resources.
Fixes:
- Use 64-bit-safe bit helpers for mask updates to avoid invalid shifts.
Crash observed when running: OpenCL-CTS api/test_api
Backtrace:
#0 util_image_to_sampler_view (v->resource is NULL)
#1 panfrost_emit_images
#2 panfrost_update_shader_state
#3 panfrost_launch_grid_on_batch
#4 panfrost_launch_grid
Backport-to: *
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit c1770565f3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
I have been running into crashes in this function when using blender.
Some of the entries in ice->state.framebuffer.base.cbufs[0] can
apparently have the texture field be null, which was causing a segfault
in this loop.
In my case, nr_cbufs was 3, and the first two cbufs entries had a null
texture and format set to PIPE_FORMAT_NONE. The last entry had format of
PIPE_FORMAT_R16G16_FLOAT and a non-null texture.
Adding this null check before attempting to dereference the texture
fixes the crash for me and allows blender to work normally.
Fixes: ca96f8517c ("iris: remove uses of pipe_surface as a pointer")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit e16c8cc579)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
It didn't save states properly. The only correct place to save them is
si_blitter_begin. Unfortunately, we can't skip saving and restoring
those states because we don't know in advance whether the rectangle path
will be used.
Cc: mesa-stable
Reviewed-by: Pierre-Eric
(cherry picked from commit 556ceb1b75)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
The referenced commit switched from a passthrough shader
to fs_clear_color[write_all_cbufs=0]. It shouldn't matter since
the shader isn't supposed to be executed - it's only setup to get
the first color output active.
On some chips (gfx8) it seems to cause issues (hangs or page fault)
for some piglit tests, eg:
framebuffer-blit-levels draw stencil
To fix this, introduce a 3rd variant, where a constant buffer isn't
required and instead the color is hardcoded in the shader.
Fixes: ca09c173f6 ("gallium/u_blitter: remove UTIL_BLITTER_ATTRIB_COLOR, use a constant buffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2ff9fa8b72)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Variants can modify which outputs get written so we must update
these fields otherwise spi_shader_col_format will be incorrect.
This can happen for instance with uniforms inlining:
uniform bool depth_only;
void main() {
if (depth_only) return;
...
}
When depth_only is true, this shader becomes empty after uniforms
inlining but spi_shader_col_format wasn't updated properly,
causing a hang.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14737
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 88986dcc9c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
with the unlowering pass, there is no longer a separate gl_LastFragData variable,
so this workaround just breaks color outputs
fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data
cc: mesa-stable
(cherry picked from commit 4b2022a8f5)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
When destroying H264/5 decode context we check the profile from decoder to
free the H264/5 PPS/SPS objects, but decoder is only created when decoding
first frame so these objects will never get freed in case decoder is NULL.
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 5134d37e7d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
This code path is usually used by lavapipe when importing dmabufs, not
for output.
The resulting size_required is then used to calculate the size
requirements for VkMemoryRequirements2 etc. Requiring a multiple of
LP_RASTER_BLOCK_SIZE - 4 - can eventually result in lavapipe rejecting
dmabuf imports.
An example is YUV420 at a resolution of 1680x1050 produced by Gstreamer
1.28 - e.g. from a screencasts. In this case we currently compute a size
of 3235840, while other drivers like radv compute 3225600. The actual
size is 3227648, fitting into the later but not the former.
Removing the alignment brings lavapipe in line with other drivers.
Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
(cherry picked from commit 0bbc26d2c4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
The preprocessor symbol we want is `PAN_ARCH`, not `MALI_ARCH`.
Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 3945421c17)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
We were computing some positions using `void*` rather than pointers to
the appropriate structures. This caused bad pointers, the effect of
which depended on the current memory environment -- tests related to
texel buffers could pass or not depending on what other tests had run
previously.
Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 0142e2e5e3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 87ae5cab94)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Vertex shaders shorter than four instructions can hard-lock R3xx GPUs.
This seems to happen in combination with a small vertex count. This was
seen before, most notably with dummy shaders, but the earlier fix only
removed those dummy shaders, so some occurrences could still slip
through the cracks. Pad all vertex shaders to four instructions on R3xx.
Reviewed-by: Filip Gawin <filip@gawin.net>
Fixes: c6aa639ba9 ("r300: skip draws instead of using a dummy vertex shader")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/337
(cherry picked from commit 9b12664b72)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
It calls both for some reason but never handles any other booleans than
32-bit. This was probably a mistake.
Fixes: e63a7882a0 ("etnaviv: call nir_lower_bool_to_bitsize")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit 6fb3995659)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Zink uses the output primitive of the last vertex stage when deciding
the raster primitive. When we generate the gs the output primitive
depends on the raster primitive.
Not only does the generated gs output primitive have no value in chosing
the raster primitive, it can also get us stuck with the last raster
primitve which is of course incorrect.
Ignore it for generated shaders.
Cc: mesa-stable
(cherry picked from commit d526bbc29b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Commit 0d4aa5f55f introduced the watermark to optimize the guardband
state changes and always computed new_distance as MAX2(distance,
watermark).
That is correct for point/line paths where distance > 0, but it keeps a
non-zero discard distance alive when the next draw sets distance = 0
(triangles). This leaks wide point/line clip-discard state into later
triangle draws and can clip away large parts of geometry (as observed in
Sauerbraten). Only apply the watermark when distance > 0 and reset it to
zero otherwise so triangle draws disable clip-discard as intended.
Fixes: 0d4aa5f55f ("r300: pop-free clipping")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14959
(cherry picked from commit ce33f82f83)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
Lavapipe relies on true udmabuf support for dmabuf export allocation.
This changes aligns the behavior with both llvmpipe_allocate_memory_fd
and llvmpipe_import_memory_fd.
Fixes: 7d0a631f20 ("llvmpipe: export dmabuf caps for kms_swrast")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
(cherry picked from commit 5ab8c8a439)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 5454221cfb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
allowed precision mismatches on uniforms, however if you lower precision on
16-bit consts, then this error triggers instead.
So here we relax the type matching and just make sure we match int vs
float.
Fixes: 0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5337
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 73bc604128)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
AMD docs support this:
R5xx Acceleration v1.5 says safest handling for ZFUNC changes is to disable
HiZ except specific LESS/LEQUAL and GREATER/GEQUAL transitions.
ATI OpenGL Programming and Optimization Guide advises avoiding ALWAYS when
trying to benefit from HiZ so that would imply fglrx also disables HiZ
there.
On RV530 this fixes the following dEQPs:
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.74
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8093
(cherry picked from commit b0f019f8cf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.
This is the same bug that was fixed in v3d in the previous commit.
Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit ecb6c5d555)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.
This is adressed by flushing all pending write jobs affecting the
resource before replacing its BO.
This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previos GPU clear. Test are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 1eaa46da09)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we no longer
were able to use VA-API from meson devenv, like we could before.
Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
(cherry picked from commit 7e4744909b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().
This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier working.
This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit dad942b468)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>