When a separate stencil is in use, the resource invalidation
flag was not being cleared for the depth buffer, because rsc had
been assigned to the separate stencil.
Fixes: 6ff509593c ("v3d: Only apply TLB load invalidation on first job after FB state update")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36030>
(cherry picked from commit 7d51a10cda)
The border color of the rv770 gpu behaves the same way as
the evergreen border color. This change updates the software
accordingly.
This change is enabled for all the pre-evergreen gpus.
This change fixes 120 piglit tests. The rv770 ci is updated
as well.
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34502>
(cherry picked from commit 5bee7c0b12)
This is the backport of 0c0b978938 "radeonsi: set NEVER as
the depth compare func if depth compare is disabled".
The function r600_tex_compare arguments are updated with the "const"
keyword.
This change fixes the test below which was broken after 0c6e56c391:
khr-gl4[5-6]/incomplete_texture_access/sampler: fail pass
Fixes: 0c6e56c391 ("mesa: (more) correctly handle incomplete depth textures")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35968>
(cherry picked from commit d9baadcfb5)
this addresses an ancient race condition: unmapping memory in one
thread while memory is being mapped in a different thread could
proceed without synchronization, so the second thread could end up
writing to unmapped memory
this was the actual cause of #12533
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36076>
(cherry picked from commit 841080ed42)
When we need to perform format conversion, we use temporary surface
allocated with vlVaHandleSurfaceAllocate. If the driver requires
clearing the surface on allocation, it will create a fence that
must be destroyed later.
Fixes: 0f20a3a4f1 ("frontends/va: Add surface pipe_fence for vl_compositor rendering")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13198
Reported-by: Mariusz Białończyk <manio@skyboo.net>
Tested-by: Mariusz Białończyk <manio@skyboo.net>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36040>
(cherry picked from commit 2d6560611f)
First, we handle the case where GetMemoryFdKHR fails. This is unlikely
and, if it's a Mesa driver, it probably won't stomp the FD, but we should
be extra careful. Then, we can close the dma-buf file immediately after
we call drmIoctl() on it, ensuring we don't leak the dma-buf file
descriptor if drmIoctl() fails. If ImportSemaphoreFdKHR() fails, then
we need to clean up the sync file.
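
The cleanup order described above can be sketched in isolation. This is
an illustrative model only, not the zink code: every fake_* function and
fail_* flag is a hypothetical stand-in (for GetMemoryFdKHR, the drmIoctl()
sync-file export and ImportSemaphoreFdKHR respectively), and open_fds
simply counts descriptors the caller still owns. It assumes, as the Vulkan
external semaphore fd rules state, that a successful import takes
ownership of the sync file.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not real driver entry points. */
static int open_fds;                 /* descriptors currently owned by us */
static bool fail_export, fail_ioctl, fail_import;

static int fake_export_dmabuf(int *fd)
{
    if (fail_export)
        return -1;                   /* *fd is left untouched on failure */
    *fd = 100;
    open_fds++;
    return 0;
}

static int fake_export_sync_file(int dmabuf_fd, int *sync_fd)
{
    (void)dmabuf_fd;
    if (fail_ioctl)
        return -1;
    *sync_fd = 101;
    open_fds++;
    return 0;
}

static int fake_import_semaphore(int sync_fd)
{
    (void)sync_fd;
    if (fail_import)
        return -1;
    open_fds--;                      /* success: ownership moves to the importer */
    return 0;
}

static void fake_close(int fd)
{
    assert(fd >= 0);
    open_fds--;
}

static bool import_implicit_sync(void)
{
    int dmabuf_fd = -1, sync_fd = -1;

    if (fake_export_dmabuf(&dmabuf_fd) != 0)
        return false;                /* nothing was opened, nothing to close */

    int ret = fake_export_sync_file(dmabuf_fd, &sync_fd);
    fake_close(dmabuf_fd);           /* closed right after the ioctl, pass or fail */
    if (ret != 0)
        return false;

    if (fake_import_semaphore(sync_fd) != 0) {
        fake_close(sync_fd);         /* import failed: we still own the sync file */
        return false;
    }
    return true;
}
```

Whichever step fails, open_fds returns to zero, which is the property the
three fixes are after.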
Fixes: d4f8ad27f2 ("zink: handle implicit sync for dmabufs")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36048>
(cherry picked from commit de4224a57c)
This issue was generating unwanted write accesses that
could overwrite previous operations.
Note: This functionality could also be tested with
nir_lower_wrmasks. This problem seems to only affect
the ssbos.
This change was tested on cypress, barts and cayman. Here are the tests fixed:
khr-gl4[3-6]/compute_shader/pipeline-pre-vs: fail pass
khr-gl4[5-6]/direct_state_access/queries_functional: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/advanced-cast-cs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/advanced-cast-fs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/advanced-switchbuffers-cs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/advanced-switchprograms-cs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_storage_buffer_object/basic-operations-case1-cs: fail pass
khr-gl4[3-6]/shader_storage_buffer_object/advanced-switchbuffers-cs: fail pass
khr-gl4[3-6]/shader_storage_buffer_object/advanced-switchprograms-cs: fail pass
khr-gl4[3-6]/shader_storage_buffer_object/basic-operations-case1-cs: fail pass
khr-gl4[4-6]/texture_buffer/texture_buffer_max_size: fail pass
khr-gles31/core/compute_shader/pipeline-pre-vs: fail pass
khr-gles31/core/shader_image_load_store/advanced-cast-cs: fail pass
khr-gles31/core/shader_image_load_store/advanced-cast-fs: fail pass
khr-gles31/core/shader_storage_buffer_object/advanced-switchbuffers-cs: fail pass
khr-gles31/core/shader_storage_buffer_object/advanced-switchprograms-cs: fail pass
khr-gles31/core/shader_storage_buffer_object/basic-operations-case1-cs: fail pass
khr-gles31/core/texture_buffer/texture_buffer_max_size: fail pass
khr-glesext/texture_buffer/texture_buffer_max_size: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35830>
(cherry picked from commit 9e5d11bff3)
the semaphore stage is VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
so the src access barrier must also use this in order to ensure it happens
after the acquire
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35940>
(cherry picked from commit 69b5abee14)
We haven't wired this up in the Midgard compiler, so we can't expose
sample shading on Midgard GPUs. This all seems fixable, because the KILL
instruction can update the coverage without the kill-flag (yeah, the
naming is a bit confusing), but until someone puts in the time to wire
that up, let's just disable the functionality to avoid crashes.
Fixes: 6bba718027 ("panfrost: Advertise SAMPLE_SHADING")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35881>
(cherry picked from commit 504e511c44)
We're now re-emitting push constants at the
start of compute batches, so we can avoid the
overhead of restoring them.
CC: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35873>
(cherry picked from commit 6f38d58db3)
Per Ken Graunke, corruption issues with push
constants for render batches on Gen12 graphics
have been observed and worked around by re-emitting
push constants at the start of the batch buffer.
We're seeing similar issues with compute batches,
so we'll apply the same work-around.
Fixes corruption reported in Blender on ADL/RPL
CC: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35873>
(cherry picked from commit 8fd008a45f)
For each dimension, we do `threads *= lws`, which is still zero if threads
is initialized to zero.
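
The failure mode is easy to reproduce outside the driver. The sketch below
is plain C rather than the Rust used by rusticl, and total_threads() and
exceeds_limit() are hypothetical helpers, not the actual rusticl code; it
only illustrates why a running product has to start at 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply the local work size of every dimension
 * into a single thread count. `init` is the accumulator's start value. */
static uint64_t total_threads(const uint32_t *lws, size_t dims, uint64_t init)
{
    assert(lws != NULL);
    uint64_t threads = init;
    for (size_t i = 0; i < dims; i++)
        threads *= lws[i];           /* stays 0 forever when init == 0 */
    return threads;
}

/* The limit check only means something with the multiplicative identity. */
static int exceeds_limit(const uint32_t *lws, size_t dims, uint64_t max_threads)
{
    return total_threads(lws, dims, 1) > max_threads;
}
```

With a zero start value the product is always 0, so a "local size exceeds
the limit" check can never fire.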
Fixes: eca4f0f632 ("rusticl/kernel: check that local size on dispatch doesn't exceed limits")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35864>
(cherry picked from commit 6bc47e65d7)
Piglit arb_texture_buffer_object-render-no-bo was generating
GPU resets because the uniform stream was missing the last
fragment shader uniform. Instead of that uniform, the hardware
read the first uniform of the vertex shader and used that
unrelated VS uniform as the sampler address from which the
texture should be read.
So now, if no buffer object is bound for a texture buffer object,
we write the texture state base address as 0 (NULL) so the default
texture state is used.
So we only need to set the 4 lower bits of tmu_p0 to
the bit-mask of word enables.
Fixes: bb8285c258 ("v3d: add support for no buffer object bound")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35847>
(cherry picked from commit 0f8c681c5c)
this is required by the spec. fixes
gles-3.0-transform-feedback-uniform-buffer-object.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Backport-to: 25.1
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
(cherry picked from commit 03a5b7f25c)
nvidia hardware can't render to linear surfaces except under some
very limited circumstances, one of those is if Z is enabled.
However there appears to be some combination of gnome-shell, and
prime (with 2 nouveau cards) where we end up getting through the
GL API to the situation where we try this. This in a production
build causes the kernel to crash with a GR error.
However, there existed a period of time where the hw/kernel, due to
some other random hw misconfiguration, didn't crash when this happened
and doing this was perfectly fine (linear + tiled Z).
This restores the userspace code to do this and just ignores the
Z buffers if we are asked for linear rendering, and seems sufficient
to fix the problem.
I do understand this is a workaround, but I think it's reasonable to
add to the nouveau GL driver at this time since we don't want to
maintain it forever and it probably should fix a bunch of weird
user problems with multi-GPU and nouveau.
Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35221>
(cherry picked from commit 06e8db646a)
The new layout affects the whole buffer so it needs to be done
on a full clear.
This fixes this piglit test on a RX 6800 XT:
ext_framebuffer_multisample-accuracy 6 depth_resolve small depthstencil
Fixes: 75a03d733a ("radeonsi: simplify and fix enable_tc_compatible_htile_next_clear logic")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35582>
(cherry picked from commit 04d283c628)
The register footprint could limit occupancy. We need to take this into
account to avoid deadlocks when a kernel is using barriers.
Fixes: 6d85cd6a3b ("freedreno: Implement get_compute_state_info for Adreno 6xx/7xx")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35745>
(cherry picked from commit 2e00925c81)
It was absent when initialising a panfrost_resource from a winsys handle.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 7da251fc72 ("panfrost: Check in sources for command stream")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34224>
(cherry picked from commit cf4a137459)
When panfrost_resource_init_afbc_headers() fails, freeing the newly
created resource is not enough, because we need to unreference its BOs.
This will also take care of freeing its resource label.
Also replace instances of FREE() in error-handling paths with
panfrost_resource_destroy(), as it is capable of handling partially
initialised resources.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: e3f2bc7963 ("panfrost: handle mmap failures")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34224>
(cherry picked from commit 32b128be01)
The snorm formats are not compatible with the srf flag
which was set by the emit_image_load_or_atomic() function.
In this specific case, "use_const_fields" is not set which
implies that the format definition is local. The other
supported formats do not require the srf flag as well.
This change was tested on cypress, barts and cayman. Here are the tests fixed:
khr-gl4[2-6]/shader_image_load_store/basic-allformats-load: fail pass
khr-gl4[2-6]/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/basic-allformats-loadstorecomputestage: fail pass
khr-gl4[5-6]/es_31_compatibility/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
khr-gles31/core/shader_image_load_store/basic-allformats-loadstorecomputestage: fail pass
khr-gles31/core/shader_image_load_store/basic-alltargets-loadstorecs: fail pass
deqp-gles31/functional/image_load_store/2d/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d_array/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/2d_array/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/3d/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/3d/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/buffer/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/buffer/format_reinterpret/rgba8_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/cube/format_reinterpret/r32f_rgba8_snorm: fail pass
deqp-gles31/functional/image_load_store/cube/format_reinterpret/rgba8_rgba8_snorm: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35548>
(cherry picked from commit d27ed38d1a)
The mode r10g10b10a2_snorm processed as vertex on palm at the
hardware level doesn't follow the current standard. Indeed, the .w
component (2-bits) is not calculated as expected. The table below
describes the situation.
This change fixes this issue by adding three gpu instructions at
the vertex fetch shader stage. An equivalent C representation and
a gpu asm dump of the generated sequence are available below.
.w (2 bits)   expected   palm
0              0.0       0.000000
1              1.0       0.333333
2             -1.0       0.666667
3             -1.0       1.000000
w_out = (4.*w_in > 1. ? 1. : 4.*w_in) - (w_in > 0.5 ? 2. : 0.);
0002 00000008 A0080000 ALU 3 @16
0016 00000C02 A0000CC0 1 y: MOV*4_sat __.y, R2.w
0018 801F8C02 600004A0 w: SETGT*2 __.w, R2.w, 0.5
0020 839FC4FE 60400010 2 w: ADD R2.w, PV.y, -PV.w
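
As a quick sanity check, the w_out expression can be wrapped in a small C
function and compared against the table. fix_w() is a hypothetical name
used only for this sketch, and the instruction names in the comments are
my reading of how the three ALU operations map onto the expression, not
something taken from the driver source.

```c
#include <assert.h>

/* Remap the .w value as fetched on palm (0, 1/3, 2/3, 1) to the
 * expected 2-bit snorm value, following the w_out expression. */
static float fix_w(float w_in)
{
    assert(w_in >= 0.0f && w_in <= 1.0f);
    float scaled = 4.0f * w_in;
    float sat = scaled > 1.0f ? 1.0f : scaled;   /* presumably MOV*4_sat */
    float bias = w_in > 0.5f ? 2.0f : 0.0f;      /* presumably SETGT*2 */
    return sat - bias;                           /* presumably ADD */
}
```

Feeding in the four palm values gives 0, 1, -1 and -1, which matches the
"expected" column.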
Note: The rv770 and cypress don't need this correction. This is
definitely a hardware change between these gpus.
This change was tested on palm, barts and cayman. Here are the tests fixed:
spec/arb_vertex_type_2_10_10_10_rev/arb_vertex_type_2_10_10_10_rev-array_types: fail pass
deqp-gles3/functional/draw/random/124: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/normalize/int2_10_10_10/components4_quads1: fail pass
deqp-gles3/functional/vertex_arrays/single_attribute/normalize/int2_10_10_10/components4_quads256: fail pass
khr-gl43/vertex_attrib_binding/basic-input-case5: fail pass
khr-gl44/vertex_attrib_binding/basic-input-case5: fail pass
khr-gl45/vertex_attrib_binding/basic-input-case5: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32427>
(cherry picked from commit e8fa3b4950)
u_blitter sets a viewport transform with depth range [-1,1], which is
outside the [0,1] range that is allowed by OpenGL.
The Mali hardware docs state that setting the LOW_DEPTH_CLAMP register
outside of [0,1] is undefined behavior. We haven't observed any problems
with this so far, but better to fix it.
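
One way to keep the clamp values inside the documented range is to clamp
both ends of the viewport depth range before they are written out. The
sketch below assumes that approach; struct depth_clamp and
depth_clamp_from_range() are hypothetical names for illustration and are
not the panfrost code.

```c
#include <assert.h>

/* Hypothetical pair of values destined for the low/high depth clamp. */
struct depth_clamp {
    float low;
    float high;
};

static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Order the two ends of the depth range, then force both into [0,1]. */
static struct depth_clamp depth_clamp_from_range(float z0, float z1)
{
    float lo = z0 < z1 ? z0 : z1;
    float hi = z0 < z1 ? z1 : z0;
    struct depth_clamp c = { clamp01(lo), clamp01(hi) };
    assert(c.low <= c.high);
    return c;
}
```

For the [-1,1] range set by u_blitter this yields a low clamp of 0 and a
high clamp of 1.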
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 810135fb42 ("gallium/u_blitter: Fix depth.")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35225>
(cherry picked from commit b8c7fcda27)
The HW specifications require the size of shader resource tables to be a
multiple of 4, otherwise correct behaviour is not guaranteed.
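
Rounding a size up to the next multiple of 4 is the usual power-of-two
alignment idiom. A minimal sketch, assuming the size is counted in table
entries; align_pot() and res_table_size() are hypothetical helpers, not
the names used in panvk.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of a power-of-two `alignment`. */
static uint32_t align_pot(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical: size a resource table so it is always a multiple of 4. */
static uint32_t res_table_size(uint32_t used_entries)
{
    return align_pot(used_entries, 4);
}
```

A table with 5 used entries would be sized as 8, while one with exactly 4
stays at 4.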
Fixes: 713f5c3600 ("panvk: Prepare the cmd_desc_state logic for Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35553>
(cherry picked from commit 48e8d6d207)