The BSpec page "Flush Types" (46213) says the following about the Tex
Invalidate bit:
"Requires stall bit ([20] of DW) set for all GPGPU Workloads."
For newer platforms, this is documented in the description of the
texture invalidation bit in the PIPE_CONTROL page (56551):
"CS Stall bit in PIPE_CONTROL command must be always set for GPGPU
workloads when Texture Cache Invalidation Enable bit is set"
Iris had it only for GFX_VER 9 and 11, while Anv had it missing for
everything.
Please notice that this patch includes a revert of 397e728ef4.
Fixes: 397e728ef4 ("iris: Drop GPGPU Tex Invalidate restriction for TGL+")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28608>
(cherry picked from commit cf7e1f3817)
The new GTK+ GL renderer is extensively using instanced rendering. SVGA
driver was incorrectly detecting the instanced draws by only checking
whether the instance count was greater than 1. Base instance has to
be also checked to make sure that the draw correctly offsets the vertex
buffer.
Fix instanced draw detection by checking both the instance count and
the base instance. Fixes the new GTK+ 4 GL renderer.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: ccb4ea5a43 ("svga: Add GL4.1(compatibility profile) support in svga driver")
Reviewed-by: Neha Bhende <neha.bhende@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28616>
(cherry picked from commit 955444e068)
When trying to increase the height alignment to unlock multi-pipe resolve for
better performance we need to be careful to not overstep the source dimensions
as this would cause the blit to be rejected.
Do so and also rearrange the code a bit to make it more obvious what is being
done.
Fixes: 797454edfc ("etnaviv: rs: fix blits with insufficient alignment for dual pipe operation")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28598>
(cherry picked from commit 2964812aac)
Not setting this bits, it seems we get incorrect depth values (i.e
not zero) for null depth/stencil tiles.
Fixes vkd3d-proton's test_sparse_depth_stencil_rendering
CTS doesn´t seem to exercise any depth/stencil format.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28611>
(cherry picked from commit 2dd321963f)
copy_image calls blit now for multisampled images, including
depth. But blit explicitly uses only PIPE_MASK_RGBA, so it is
incapable of copying depth buffers.
This patch checks the destination format and uses PIPE_MASK_ZS if
it is a depth or stencil. Ideally we would simply use PIPE_MASK_RGBAZS
always, but not all drivers actually handle getting this mask
(they probably should, but that's another story).
The change to copy_image was in 5027b5aa2, so in some sense this
patch "fixes" that. In fact though the issue wasn't in the copy_image
change, it was always latent in blit().
Fixes: 5027b5aa28 ("gallium: stop calling resource_copy_region for multisampled copy_image")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28585>
(cherry picked from commit 0cb852050d)
This is a perpetual bug that hits Windows. In the MSVC CRT, qsort
is unstable, where the glibc qsort is stable. So apps run fine on
Windows IHV drivers, and on Linux Mesa drivers, and only break down
when running on Windows Mesa drivers.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10922
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28586>
(cherry picked from commit acbf3ad1fb)
This got probably accidentally in, as Eric MR changing this was just
before this change got in.
Fixes: 16af090908 ("ci/lava: separate HW definitions from SW")
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28600>
(cherry picked from commit 5b69cbb80a)
When has_vm_control is supported it takes a different code path and
creates one context per engine and in this code path we were not
setting the protected context flag.
The lack of this is not causing any test to fail in our CI but it is
better do what we are supposed to do.
Fixes: fd40134487 ("anv: allow protected GEM context creation")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28299>
(cherry picked from commit 77c004f7ca)
some gnome tests are seeing this:
==4579== Invalid read of size 4
==4579== at 0x161B28FB: UnknownInlinedFun (simple_mtx.h:106)
==4579== by 0x161B28FB: st_save_zombie_sampler_view (st_context.c:213)
==4579== by 0x161D762A: st_texture_release_all_sampler_views.part.0 (st_sampler_view.c:272)
==4579== by 0x161D7CB6: st_texture_release_all_sampler_views (st_sampler_view.c:258)
==4579== by 0x161D7CB6: st_delete_texture_sampler_views (st_sampler_view.c:292)
==4579== by 0x16191B9E: _mesa_delete_texture_object (texobj.c:523)
==4579== by 0x16191CDC: _mesa_reference_texobj_ (texobj.c:637)
==4579== by 0x1619632F: _mesa_reference_texobj (texobj.h:92)
==4579== by 0x1619632F: _mesa_free_texture_data (texstate.c:1114)
==4579== by 0x162EEE1D: _mesa_free_context_data (context.c:1155)
==4579== by 0x161B3C0F: st_destroy_context (st_context.c:999)
==4579== by 0x160F155A: dri_destroy_context (dri_context.c:277)
==4579== by 0x1603C371: dri2_destroy_context (egl_dri2.c:1592)
==4579== by 0x1602EBF5: eglDestroyContext (eglapi.c:918)
==4579== by 0x502DF3C: gdk_gl_context_dispose (gdkglcontext.c:211)
==4579== Address 0x34dc9b08 is 7,016 bytes inside an unallocated block of size 10,336 in arena "client"
==4579==
It appears we destroy the mutex and zombie objects, but freeing
context data seems to add them back.
This might not be the complete answer.
Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28565>
(cherry picked from commit f8e48b561e)
this pointer is only valid if length is valid
fixes dEQP-GL45-ES3.functional.negative_api.shader.shader_binary with
glthread enabled
fixes#10915
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28501>
(cherry picked from commit 37be4bf1b7)
When we dispatch an indirect compute job, the buffer containing
the indirect parameters should be marked as read (since the GPU
will read the parameters from there). Without this there's a
race condition if the CPU later updates the buffer.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28512>
(cherry picked from commit ad7457fe20)
We need invalidate CCHE when we optimize out an invalidation of UCHE,
for example a storage image write to texture read. We missed this
earlier because of the blob's tendency to always over-flush, but the
blob does use this when building acceleration structures.
Fixes: 95104707f1 ("tu: Basic a7xx support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28445>
(cherry picked from commit fb1c3f7f5d)
this breaks interfaces where the consumer reads its input indirectly
and only some of the components are written by clobbering all
the components, even if the unwritten components are never accessed
Fixes: 459b49a174 ("zink: add a new linker pass to handle mismatched i/o components")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28466>
(cherry picked from commit 460cd99ea5)
A last memory leak related to constants_remap_table is happening.
This memory leak is triggered by two deqp-gles2 tests.
For instance, this issue is triggered with
"deqp-gles2 --deqp-case=dEQP-GLES2.functional.uniform_api.random.13":
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x7f1b4a5de7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1b401a2cdf in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f1b40185386 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f1b40185625 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f1b401c14d2 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f1b401b669a in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f1b401baf73 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:592
#7 0x7f1b40128db7 in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1071
#8 0x7f1b3e67799d in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1073
#9 0x7f1b3e680285 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1119
#10 0x7f1b3e6812fa in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1284
#11 0x7f1b3e6812fa in st_finalize_program ../src/mesa/state_tracker/st_program.c:1363
#12 0x7f1b3f13d501 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:754
#13 0x7f1b3f13d501 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:990
#14 0x7f1b3efeef75 in link_program ../src/mesa/main/shaderapi.c:1336
#15 0x7f1b3efeef75 in link_program_error ../src/mesa/main/shaderapi.c:1445
Fixes: 29df85788a ("r300: fix constants_remap_table memory leak")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28522>
(cherry picked from commit 24a5165cdf)
The timeout compute is invalid
Fixes: 095dfc6caa ("util: Move the implementation of futex_wake and futex_wait from futex.h to futex.c")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28473>
(cherry picked from commit 54e3fde5ca)
Found by inspection, I think this can happen with pack_32_4x8(f2u8(a@16)),
which will use v_cvt_u16_f16 (a 16bit instruction) with a v1b definition.
No Foz-DB changes on Navi21.
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28443>
(cherry picked from commit 80652de67b)
Earlier strategy was to enable always on DG2 but there has been bunch of
issues that indicate this feature is not working correctly. Disable
until we figure out issues with it.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28184>
(cherry picked from commit a87d888546)
For BROADCAST and MOV_INDIRECT, required_exec_type was returning
brw_int_type(type_sz(t), false), which is an unsigned type. However,
get_exec_type(inst) returns the original type for either Q or UQ.
This meant that has_invalid_exec_type would detect a mismatch and
trigger lowering.
That lowering would insert new 64-bit MOVs, which would need to be
lowered on platforms which don't support Q/UQ. Except, we already
ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would
reach the software scoreboarding pass, and trigger failures in the
inferred_exec_pipe() function, as no pipe is available to handle
64-bit integer operations.
It turns out that we don't need the region lowering pass to do
anything for these opcodes. The generator code for both BROADCAST
and MOV_INDIRECT already handle decomposing Q/UQ operations into
32-bit MOVs when they're not supported. And, it also implicitly
converts to integer types, even for floating point sources. The
inferred_exec_pipe function already special cases them to note
that they'll always be handled on the integer pipe, so that matches.
Just drop the region lowering code for these opcodes.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit 7a24f29fbb)
We are overriding the type to Q/UQ, so we need to split to two MOVs
if 64-bit integer math is not supported. For reference, Meteorlake
does support 64-bit floats but would still not work correctly here.
See also brw_broadcast(), which does similar indirects but correctly
checks has_64bit_int instead of has_64bit_float.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit a90edad9f7)
==134077== 96 bytes in 1 blocks are definitely lost in loss record 1 of 3
==134077== at 0x4840808: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==134077== by 0x6D6F690: vk_default_alloc (vk_alloc.c:26)
==134077== by 0x52EEEBE: vk_alloc (vk_alloc.h:48)
==134077== by 0x52EEEEE: vk_zalloc (vk_alloc.h:56)
==134077== by 0x52EF47E: xe_exec_process_syncs (anv_batch_chain.c:132)
==134077== by 0x52EF8F6: xe_execute_trtt_batch (anv_batch_chain.c:215)
==134077== by 0x5301670: anv_queue_submit_trtt_batch (anv_batch_chain.c:1697)
==134077== by 0x603D135: gfx125_write_trtt_entries (genX_cmd_buffer.c:6091)
==134077== by 0x5370B44: anv_sparse_bind_trtt (anv_sparse.c:595)
==134077== by 0x5370CFC: anv_sparse_bind (anv_sparse.c:629)
==134077== by 0x5370E6E: anv_init_sparse_bindings (anv_sparse.c:670)
==134077== by 0x5328037: anv_CreateBuffer (anv_device.c:5071)
Note to backporters: this is only for when xe.ko is being used and
ANV_SPARSE_USE_TRTT=1 is exported. This is not the regular code path.
Fixes: 18bd00c024 ("anv/trtt: don't wait/signal syncobjs using the CPU anymore")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28455>
(cherry picked from commit 38af7254e2)
Xe KMD don't refcount anything, so resources could be freed while they
are still in use if we don't wait for exec_queue to be idle.
This issue was found with Xe KMD error capture, VM was already
destroyed when it attemped to capture error state but it can also
happen in applications that did not hang.
This fixed the '*ERROR* GT0: TLB invalidation' errors when running
piglit all test list.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
(cherry picked from commit 665d30b544)
num_syncs was being incremented by one if 'utrace_submit != NULL' but
a sync was only being set if also
'util_dynarray_num_elements(&utrace_submit->batch_bos) == 0'.
This mismatch could cause application to abort due to
'assert(count == num_syncs)'.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27244>
(cherry picked from commit d5ec2fa52f)
When LLVM 18 is used, pass the no-verify-fixpoint option when
running the instcombine pass. Otherwise LLVM may abort with an
error.
The background here is that this option is enabled by default for
testing purposes, because instcombine is normally only explicitly
invoked like this inside tests. If it is used in an actual
production pipeline, the no-verify-fixpoint option needs to be
enabled.
This should fix the issue reported at
https://bugzilla.redhat.com/show_bug.cgi?id=2268800.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28101>
(cherry picked from commit 99f0449987)