In the WAIT_ALL case in spin_wait_for_sync_file(), we were returning the
moment we saw the first success. However, this isn't a wait-all, it's a
bad wait-any. We should instead just continue on to check the next sync
until we've ensured that every sync in the array has a sync file. The
only reason this wasn't blowing up in our face is that it only
affects non-timeline drivers (pretty rare these days) and that most
of the places where we use WAIT_PENDING on non-timeline drivers are there
to guard a sync file export, and those typically have only a single sync
in the array.
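As a minimal sketch of the difference (the `fake_sync` struct and both function names are hypothetical, not the actual vk_sync code), a wait-all pass has to keep checking after the first success instead of returning:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sync object; not the real vk_sync type. */
struct fake_sync {
   bool has_sync_file;
};

/* Buggy shape: returns on the first success, which is really a wait-any. */
static bool
any_has_sync_file(const struct fake_sync *syncs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (syncs[i].has_sync_file)
         return true;
   }
   return false;
}

/* Fixed shape: only succeed once every sync in the array has a sync file. */
static bool
all_have_sync_files(const struct fake_sync *syncs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (!syncs[i].has_sync_file)
         return false; /* caller keeps spinning and retries */
   }
   return true;
}
```

With a single sync in the array the two functions agree, which matches why the bug rarely showed up.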
Cc: mesa-stable
Reviewed-by: Gurchetan Singh <gurchetansingh@google.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38635>
(cherry picked from commit e4e619d685)
The shader-db functionality was interfering with the error
filters.
Two new options are added: R600_DEBUG=shaderdb and
R600_DEBUG=precompile. The precompile option is added
to maintain compatibility with the shader-db repository.
This change fixes 22 of these tests:
deqp-gles31/functional/debug/error_filters/case_.*: warn pass
deqp-gles31/functional/debug/error_groups/case_.*: warn pass
Fixes: 28d6a5af25 ("r600: Add shader precompile and shader-db support.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38485>
(cherry picked from commit f005c0b5ad)
Fixes this test on Xe2+:
INTEL_DEBUG=no32 ./deqp-vk -n dEQP-VK.spirv_assembly.instruction.maint9_vectorization.bit_field_u_extract.result_v16i-base_v16i-offset_s64u-count_s16i
We generate invalid code for that platform:
and(16) g37<1>UW g65<16,4,4>UW 0x000fUW { align1 1H I@5 };
ERROR: Invalid register region for source 0. See special restrictions section.
Several helpers like has_subdword_integer_region_restriction() do not
see the final type of the source, so compute it early.
Maybe new_src could be used in more cases. Being conservative for now.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38548>
(cherry picked from commit 8f9acc0150)
When probing on generic Linux platforms, loading d3d12 and the
first init can fail, but the error returned causes a
loader warning to be printed.
Use the correct error return to stop this.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38611>
(cherry picked from commit c00b66fa71)
The option's description is:
> Whether to use LLVM for the Gallium draw module, if LLVM is included.
Let's disable it right away if LLVM is disabled, to keep some
configurations from failing.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38558>
(cherry picked from commit 37c7d19e46)
Swizzle can include PIPE_SWIZZLE_0/_1 (4 and 5), which results in indexing
beyond the channel array.
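A minimal sketch of the problem, assuming a 4-entry channel array. The `swz` enum mirrors the swizzle values named above, while `swizzle_channel` and its `one` parameter (the encoding of 1 in the target format) are hypothetical and not the freedreno code:

```c
#include <stdint.h>

/* Values mirror the swizzle numbering described above:
 * X..W are 0..3, and the constant swizzles 0 and 1 are 4 and 5. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Hypothetical helper: resolve one swizzled channel without ever
 * indexing the 4-entry channel array with 4 or 5. */
static uint32_t
swizzle_channel(const uint32_t chan[4], enum swz s, uint32_t one)
{
   switch (s) {
   case SWZ_0: return 0;
   case SWZ_1: return one;
   default:    return chan[s];
   }
}
```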
Reported-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Fixes: 76e350671f ("freedreno/a6xx: Sysmem clear fixes")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38593>
(cherry picked from commit f0465ced7f)
Previously legacy_gs_info was calculated based on
gs_info->legacy_gs_info.esgs_itemsize, which is calculated based on gs
input varyings.
However, when using ESO, vs/tes can have outputs not read by gs, which
leads to underestimating LDS usage.
Cc: mesa-stable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38514>
(cherry picked from commit 5e8885a339)
The versioning scheme changed in v45.0 (the previous version was
3.48.0). As such, this version check would wrongly accept e.g. 48.0.
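To illustrate why 48.0 passes, here is a sketch that assumes a plain component-wise numeric comparison. That rule is an assumption for illustration, not a copy of meson's implementation, and `version_cmp` is a hypothetical helper:

```c
#include <stdlib.h>

/* Compare dotted version strings component by component, treating
 * missing components as 0. Returns <0, 0 or >0 like strcmp(). */
static int
version_cmp(const char *a, const char *b)
{
   while (*a || *b) {
      char *end_a, *end_b;
      long va = strtol(a, &end_a, 10);
      long vb = strtol(b, &end_b, 10);
      if (va != vb)
         return va < vb ? -1 : 1;
      if (end_a == a && end_b == b)
         break; /* no digits left to parse on either side */
      a = (*end_a == '.') ? end_a + 1 : end_a;
      b = (*end_b == '.') ? end_b + 1 : end_b;
   }
   return 0;
}
```

Under that assumption, version_cmp("48.0", "4.49.0") is positive because 48 > 4 in the first component, while the older "3.48.0" compares as lower.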
Fixes: e9341568fa ("meson: require sysprof-capture-4 >= 4.49.0")
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38557>
(cherry picked from commit ad14942300)
Sadly this probably won't change anything in terms of perf as the CCS
engine has a bunch of other restrictions.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 243c01c703 ("anv/iris: implement Wa_18040903259")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38484>
(cherry picked from commit 07b7de35cc)
Previously, we updated the sfb dst slot upon vn_SignalSemaphore so that
vn_GetSemaphoreCounterValue could poll just the feedback slot itself.
However, that can race with pending sfb cmds that are going to update
the slot value, leaving sync progression stuck.
This change fixes it by disallowing vn_SignalSemaphore from touching the sfb
dst slot. To keep the counter query monotonic, vn_GetSemaphoreCounterValue
now takes the greater of the signaled counter and the sfb counter read.
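A simplified sketch of the new scheme. The struct and function names are hypothetical, and the atomics and locking that real code would need are left out on purpose:

```c
#include <stdint.h>

/* Hypothetical state: names are illustrative, not the venus structs. */
struct sem_state {
   uint64_t signaled_counter; /* last value passed to a host-side signal */
   uint64_t feedback_slot;    /* written only by pending feedback cmds */
};

/* Signal no longer touches the feedback slot, so it cannot race with
 * feedback commands that are still going to write it. */
static void
sem_signal(struct sem_state *s, uint64_t value)
{
   s->signaled_counter = value;
}

/* Query returns the greater of the two so it never goes backwards. */
static uint64_t
sem_get_counter(const struct sem_state *s)
{
   return s->signaled_counter > s->feedback_slot ? s->signaled_counter
                                                 : s->feedback_slot;
}
```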
Tested with the dEQP-VK.synchronization* group:
- w/o this: a hang shows up within 2 min with 8 parallel deqp runs
- with this: no hang over multiple full runs of the same
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14304
Fixes: 5c7e60362c ("venus: enable timeline semaphore feedback")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38516>
(cherry picked from commit 829bd406c0)
vkQueueBeginDebugUtilsLabelEXT and vkQueueEndDebugUtilsLabelEXT
require the queue to be externally synchronized, which means these
functions require the lock. Unfortunately, there's no guarantee that the
debug markers will be matched in the multithreaded case, but I suppose
this is better than crashing.
Fixes: 015eda4a41 ("zink: deduplicate VkDevice and VkInstance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38414>
(cherry picked from commit 80db8171de)
We currently only create one queue per queue family on the device. The
device can be shared between multiple zink_screens, so having one lock
per screen can still lead to multiple locks per queue. Fix this by
allocating queue_lock along with the device.
This fixes an issue that was causing crashes with nvk+zink and
QtWebEngine with QTWEBENGINE_FORCE_USE_GBM=1. This can be reproduced by
resizing the window in either:
* anki - https://apps.ankiweb.net/ or
* Qt's simplebrowser example
https://doc.qt.io/qt-6/qtwebengine-webenginewidgets-simplebrowser-example.html
which would then cause this dmesg error:
nouveau 0000:01:00.0: anki[92007]: Failed to find syncobj (-> in): handle=40
along with a context loss.
With VK_LOADER_LAYERS_ENABLE=VK_LAYER_KHRONOS_validation we would additionally
get warnings like:
Validation Error: [ UNASSIGNED-Threading-MultipleThreads-Write ] | MessageID = 0xa05b236e
vkQueueSubmit(): THREADING ERROR : object of type VkQueue is simultaneously used in current thread 139824449189568 and thread 139823901816512
Objects: 1
[0] VkQueue 0x557a666783e0
Fixes: 015eda4a41 ("zink: deduplicate VkDevice and VkInstance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38414>
(cherry picked from commit 9acce36652)
We were overflowing an array during bifrost disassembly. This was
only a problem if the user explicitly set an environment variable,
so it was unlikely to occur in casual use, and it could only be triggered
by very specific, dense code. But we still should get this right!
The specific CTS test that caused the assert is:
'dEQP-VK.graphicsfuzz.stable-quicksort-for-loop-with-injection'
with environment variable `BIFROST_MESA_DEBUG=shaders`. One of the
shaders has a clause with 6 constants (the maximum) and this overflowed
the array because we assume we always have an extra slot (used for
modifier processing).
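One way to picture the overflow, as a sketch only: all names here are hypothetical, and sizing the array as maximum plus one is just one possible shape of a fix, not necessarily what the disassembler does.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CLAUSE_CONSTANTS 6 /* maximum per clause, per the text above */

/* Hypothetical container: one slot beyond the maximum so that code which
 * always writes an extra entry stays in bounds for a full clause. */
struct clause_consts {
   uint64_t values[MAX_CLAUSE_CONSTANTS + 1];
   size_t count;
};

/* Append a constant; returns 0 on success, -1 if the clause is full. */
static int
clause_add_const(struct clause_consts *c, uint64_t v)
{
   if (c->count >= MAX_CLAUSE_CONSTANTS)
      return -1;
   c->values[c->count++] = v;
   return 0;
}

/* Writes the extra slot right after the last constant. With an array of
 * only MAX_CLAUSE_CONSTANTS entries this would overflow at count == 6. */
static void
clause_set_extra(struct clause_consts *c, uint64_t v)
{
   c->values[c->count] = v;
}
```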
Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38501>
(cherry picked from commit 65ba14519e)
GL_EXT_texture_buffer_object requires support for alpha, luminance,
luminance-alpha and intensity formats. If we can't support those, we
can't enable the extension.
Fixes: 45ca7798dc ("glsl: handle interactions between EXT_gpu_shader4 and texture extensions")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
(cherry picked from commit 6f2b8c3f61)
Most of the time, we remember to check for both extensions. But in one
case, it seems we forgot the GLES extension. Whoops.
Let's switch to a helper here, so we don't have to repeat the logic over
and over again.
Fixes: b4c0c514b1 ("mesa: add OES_texture_buffer and EXT_texture_buffer support")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38162>
(cherry picked from commit 9d5e0c1ad2)
Some GPU hangs witnessed in the wild on RDNA4 in Control and Arc Raiders
seem to point towards closest-hit shaders reading a stale value for the
SGPR pair containing the currently-executing shader's address.
This SGPR pair was read by VALU in the preceding traversal shader,
making it susceptible to VALUReadSGPRHazard. Inserting
VALUReadSGPRHazard mitigations before accessing the s_setpc target seems
to fix the hang. We don't have conclusive proof that this is hazardous,
but given that all signs point towards it and we have a reasonably
simple workaround, let's roll with this for now to mitigate the hangs.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38290>
(cherry picked from commit 1243d575a5)
When no workgroup size is specified, we try to run with the best one
possible. However, we didn't take into account that we shouldn't run a
workgroup of higher dimensionality than requested by the application.
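A toy sketch of the constraint, written in C for illustration. The `pick_local_size` helper and its divisor heuristic are hypothetical and much simpler than what rusticl does:

```c
#include <stddef.h>

/* Hypothetical helper: pick a local size for each of the work_dim
 * dimensions the application asked for and force the rest to 1. */
static void
pick_local_size(unsigned work_dim, const size_t global[3],
                size_t max_per_dim, size_t local[3])
{
   for (unsigned i = 0; i < 3; i++) {
      local[i] = 1;
      if (i >= work_dim)
         continue; /* never use a dimension that was not requested */

      /* Largest divisor of the global size that fits the limit. */
      for (size_t d = max_per_dim; d > 1; d--) {
         if (global[i] % d == 0) {
            local[i] = d;
            break;
         }
      }
   }
}
```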
Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38375>
(cherry picked from commit d46be8fbf2)
There were two issues:
1. The global_work_offset parameter is optional but we errored on NULL
2. We didn't return the reqd_work_group_size when set on the kernel.
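Both points can be sketched as follows, in C for illustration. The helper names are hypothetical, and an all-zero required size is assumed to mean "not set on the kernel":

```c
#include <stddef.h>

/* Hypothetical helper: treat a NULL offset array as all zeros instead of
 * reporting an error, since the parameter is optional. */
static size_t
offset_or_zero(const size_t *global_work_offset, unsigned dim)
{
   return global_work_offset ? global_work_offset[dim] : 0;
}

/* Hypothetical helper: if the kernel declares a required work-group
 * size (non-zero), report it as is; otherwise use the driver's pick. */
static void
suggested_local_size(const size_t reqd[3], const size_t picked[3],
                     size_t out[3])
{
   int has_reqd = reqd[0] != 0;
   for (unsigned i = 0; i < 3; i++)
      out[i] = has_reqd ? reqd[i] : picked[i];
}
```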
Fixes: 376d1e6667 ("rusticl: implement cl_khr_suggested_local_work_size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38375>
(cherry picked from commit 810dca450c)
We set sparseImageInt64Atomics to false on these formats, so there's
no need for the software detiling. Thus, we can skip setting the flag,
which will make ISL pick Tile64 for these formats, and things will
work.
Thanks to Lionel for pointing out the fix here.
Testcase: dEQP-VK.api.info.image_format_properties.*d.optimal.r64_*int
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit a1628aba1f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38526>
_math_matrix_is_dirty() should only be used to decide if we need to
run _math_matrix_analyse(). We already decided that we had a new
texture matrix when we called _mesa_update_texture_matrices() so
we need to set _TexMatEnabled correctly otherwise we might
incorrectly return _NEW_FF_VERT_PROGRAM | _NEW_FF_FRAG_PROGRAM in
the following if-statement.
Fixes: ec978e002f ("mesa: only update fixed-func programs on texture matrix enablement changes")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14286
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38473>
(cherry picked from commit b0047be0c2)
The VMA of VkDeviceMemory has to accommodate all the resources that can
be bound to it. For sparse images it's 64KiB alignment, for other
tiled images it's 4KiB. But we also have a workaround that requires a
64KiB alignment for Tile4 images.
The initial version of the slab allocator missed the 4KiB alignment.
This fix adds the workaround handling too.
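The rule above amounts to taking the largest alignment any bindable resource needs. A minimal sketch, where `mem_use` and `vma_alignment` are hypothetical names rather than the anv code:

```c
#include <stdbool.h>
#include <stdint.h>

#define KIB(x) ((uint64_t)(x) * 1024)

/* Hypothetical inputs describing what may be bound to the memory. */
struct mem_use {
   bool sparse_images;
   bool tiled_images;
   bool tile4_workaround; /* workaround that wants 64KiB for Tile4 */
};

/* The VMA alignment is the largest alignment any bindable resource needs. */
static uint64_t
vma_alignment(const struct mem_use *u, uint64_t base_align)
{
   uint64_t align = base_align;
   if (u->tiled_images && align < KIB(4))
      align = KIB(4);
   if ((u->sparse_images || u->tile4_workaround) && align < KIB(64))
      align = KIB(64);
   return align;
}
```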
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: dabb012423 ("anv: Implement anv_slab_bo and enable memory pool")
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38480>
(cherry picked from commit 401b2066b0)
We need to make sure the data part returned by sampler messages is
always aligned to a physical register, just like the residency data,
which lives in a single physical register after the data.
Lowering a vec3 with 16 bits per component led to an allocation
involving half a physical register, which then confused the descriptor
lowering (which expects physical register units).
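As a worked example of the padding, here is a sketch with hypothetical helpers. The register size and lane count are parameters, and the numbers in the usage note (64-byte registers, 16 lanes) are assumptions for illustration:

```c
#include <stdint.h>

/* Round a byte size up to a whole number of physical registers.
 * reg_bytes must be a power of two. */
static uint32_t
align_to_reg(uint32_t bytes, uint32_t reg_bytes)
{
   return (bytes + reg_bytes - 1) & ~(reg_bytes - 1);
}

/* Size of the data part of a sampler return: components x lanes x
 * bytes per component, padded so residency data starts on a register. */
static uint32_t
sampler_data_bytes(uint32_t components, uint32_t lanes,
                   uint32_t bytes_per_comp, uint32_t reg_bytes)
{
   return align_to_reg(components * lanes * bytes_per_comp, reg_bytes);
}
```

With those assumed numbers, a 16-bit vec3 needs 3 * 16 * 2 = 96 bytes, i.e. one and a half registers, and gets padded to 128 bytes so the residency register starts on a register boundary.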
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 295734bf88 ("intel/fs: fix residency handling on Xe2")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12794
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34008>
(cherry picked from commit 61d6aea401)
SB_ID(LS) is currently equal to zero, so this is not a behavior change,
but worth setting it explicitly for clarity and in case the sb
assignments change.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 885805560f ("panvk/csf: fix case where vk_meta is used before PROVOKING_VERTEX_MODE_LAST")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38458>
(cherry picked from commit ebbf05f9d2)
We check fn_set_fbds_provoking_vertex_stride == 0 to determine whether a
previous function variant has already been allocated, so this value must
be initialized to zero before we start the loop. We could fix this by
explicitly initializing just that field, but I figure it's simpler and
safer to just zero-initialize the whole struct.
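A small sketch of why the zero-initialization matters. Only the field name fn_set_fbds_provoking_vertex_stride comes from the text above; the struct, the other field, and both functions are hypothetical:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-variant state; only the stride
 * field name is taken from the commit message. */
struct variant_state {
   uint64_t fn_address;
   uint32_t fn_set_fbds_provoking_vertex_stride;
};

/* Allocate the shared function once: a stride of 0 means "not yet". */
static int
ensure_fn_allocated(struct variant_state *st, uint32_t stride)
{
   if (st->fn_set_fbds_provoking_vertex_stride != 0)
      return 0; /* a previous variant already allocated it */
   st->fn_set_fbds_provoking_vertex_stride = stride;
   return 1;
}

/* Returns how many allocations happen across all variants. */
static int
count_allocations(unsigned variants)
{
   struct variant_state st = { 0 }; /* zero-initialize the whole struct */
   int allocations = 0;
   for (unsigned i = 0; i < variants; i++)
      allocations += ensure_fn_allocated(&st, 64);
   return allocations;
}
```

Without the `= { 0 }`, the first check would read an indeterminate value and could skip the allocation entirely.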
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 885805560f ("panvk/csf: fix case where vk_meta is used before PROVOKING_VERTEX_MODE_LAST")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38458>
(cherry picked from commit e899bc8be8)
send.ugm (1|M0) r125 r0 null:0 0x0 0x0200651F {$9} // wr:1+0, rd:0; fence invalid flush type scoped to tile
When destination of Send(s) is not null, the response length must not be 0.
Should only affect DG2 products.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38478>
(cherry picked from commit 4816318887)
The flag mega_fetch should be set on rv770 for a
read scratch operation (as written in the r700
documentation p357). Without this flag, read scratch
does not work and a gpu hang could be triggered.
Here are the tests fixed:
shaders/glsl-predication-on-large-array: fail pass
spec/glsl-1.10/execution/temp-array-indexing/glsl-fs-giant-temp-array: fail pass
spec/glsl-1.10/execution/temp-array-indexing/glsl-vs-giant-temp-array: fail pass
spec/glsl-1.30/execution/fs-large-local-array: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec2: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec3: fail pass
spec/glsl-1.30/execution/fs-large-local-array-vec4: fail pass
spec/glsl-1.30/execution/fs-multiple-large-local-arrays: fail pass
Fixes: 9c48a139b0 ("r600g: Support emitting scratch ops")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38353>
(cherry picked from commit f8de09a811)
This was copied from radeonsi which expected seq_force_screen_content_tools = 2
and seq_force_integer_mv = 2.
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38371>
(cherry picked from commit 3858a6a696)
Disable sparse mappings on GFX7-8 due to GPU hangs in the VK CTS,
except Polaris where it happens to work "well enough" to pass
the VK CTS and run some games already.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38304>
(cherry picked from commit 567e1b56ef)
Also disable the sparse binding queue and other related features.
Using sparse on GFX6-8 can cause GPU hangs at the moment.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38304>
(cherry picked from commit 1c8881fc60)
To avoid incompatibility between the compiler implementations used by
the driver and the renderer, seq_cst ordering is picked here, which
requires a full mfence instruction. The renderer side acquire is then
ensured to be ordered after the cache flush of ring cs updates.
Perf-wise, there's no regression in headless vkmark runs. In theory,
the overhead introduced here is trivial compared to the ring
cs encode/decode part, so we should go for better robustness.
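In C11 terms the idea looks roughly like this. It is a sketch under assumptions: `ring_ctrl` and both functions are hypothetical, and the real ring lives in memory shared between guest and host rather than in one process:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring control word shared between driver and renderer. */
struct ring_ctrl {
   _Atomic uint32_t tail;
};

/* Driver side: publish the new tail with seq_cst ordering rather than
 * a weaker release store. */
static void
ring_publish_tail(struct ring_ctrl *r, uint32_t new_tail)
{
   atomic_store_explicit(&r->tail, new_tail, memory_order_seq_cst);
}

/* Renderer side: acquire load that pairs with the store above. */
static uint32_t
ring_read_tail(struct ring_ctrl *r)
{
   return atomic_load_explicit(&r->tail, memory_order_acquire);
}
```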
Test: venus on a Windows guest works with the renderer on Linux
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14277
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38435>
(cherry picked from commit 07d059f3e2)