this breaks interfaces where the consumer reads its input indirectly
and only some of the components are written by clobbering all
the components, even if the unwritten components are never accessed
Fixes: 459b49a174 ("zink: add a new linker pass to handle mismatched i/o components")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28466>
(cherry picked from commit 460cd99ea5)
A last memory leak related to constants_remap_table is happening.
This memory leak is triggered by two deqp-gles2 tests.
For instance, this issue is triggered with
"deqp-gles2 --deqp-case=dEQP-GLES2.functional.uniform_api.random.13":
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x7f1b4a5de7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f1b401a2cdf in rc_remove_unused_constants ../src/gallium/drivers/r300/compiler/radeon_remove_constants.c:101
#2 0x7f1b40185386 in rc_run_compiler_passes ../src/gallium/drivers/r300/compiler/radeon_compiler.c:476
#3 0x7f1b40185625 in rc_run_compiler ../src/gallium/drivers/r300/compiler/radeon_compiler.c:498
#4 0x7f1b401c14d2 in r3xx_compile_fragment_program ../src/gallium/drivers/r300/compiler/r3xx_fragprog.c:172
#5 0x7f1b401b669a in r300_translate_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:516
#6 0x7f1b401baf73 in r300_pick_fragment_shader ../src/gallium/drivers/r300/r300_fs.c:592
#7 0x7f1b40128db7 in r300_create_fs_state ../src/gallium/drivers/r300/r300_state.c:1071
#8 0x7f1b3e67799d in st_create_fp_variant ../src/mesa/state_tracker/st_program.c:1073
#9 0x7f1b3e680285 in st_get_fp_variant ../src/mesa/state_tracker/st_program.c:1119
#10 0x7f1b3e6812fa in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1284
#11 0x7f1b3e6812fa in st_finalize_program ../src/mesa/state_tracker/st_program.c:1363
#12 0x7f1b3f13d501 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:754
#13 0x7f1b3f13d501 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:990
#14 0x7f1b3efeef75 in link_program ../src/mesa/main/shaderapi.c:1336
#15 0x7f1b3efeef75 in link_program_error ../src/mesa/main/shaderapi.c:1445
Fixes: 29df85788a ("r300: fix constants_remap_table memory leak")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28522>
(cherry picked from commit 24a5165cdf)
The timeout compute is invalid
Fixes: 095dfc6caa ("util: Move the implementation of futex_wake and futex_wait from futex.h to futex.c")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28473>
(cherry picked from commit 54e3fde5ca)
Found by inspection, I think this can happen with pack_32_4x8(f2u8(a@16)),
which will use v_cvt_u16_f16 (a 16bit instruction) with a v1b definition.
No Foz-DB changes on Navi21.
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28443>
(cherry picked from commit 80652de67b)
Earlier strategy was to enable always on DG2 but there has been bunch of
issues that indicate this feature is not working correctly. Disable
until we figure out issues with it.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28184>
(cherry picked from commit a87d888546)
For BROADCAST and MOV_INDIRECT, required_exec_type was returning
brw_int_type(type_sz(t), false), which is an unsigned type. However,
get_exec_type(inst) returns the original type for either Q or UQ.
This meant that has_invalid_exec_type would detect a mismatch and
trigger lowering.
That lowering would insert new 64-bit MOVs, which would need to be
lowered on platforms which don't support Q/UQ. Except, we already
ran that lowering pass earlier. So, the unlowered Q/UQ MOVs would
reach the software scoreboarding pass, and trigger failures in the
inferred_exec_pipe() function, as no pipe is available to handle
64-bit integer operations.
It turns out that we don't need the region lowering pass to do
anything for these opcodes. The generator code for both BROADCAST
and MOV_INDIRECT already handle decomposing Q/UQ operations into
32-bit MOVs when they're not supported. And, it also implicitly
converts to integer types, even for floating point sources. The
inferred_exec_pipe function already special cases them to note
that they'll always be handled on the integer pipe, so that matches.
Just drop the region lowering code for these opcodes.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit 7a24f29fbb)
We are overriding the type to Q/UQ, so we need to split to two MOVs
if 64-bit integer math is not supported. For reference, Meteorlake
does support 64-bit floats but would still not work correctly here.
See also brw_broadcast(), which does similar indirects but correctly
checks has_64bit_int instead of has_64bit_float.
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28458>
(cherry picked from commit a90edad9f7)
==134077== 96 bytes in 1 blocks are definitely lost in loss record 1 of 3
==134077== at 0x4840808: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==134077== by 0x6D6F690: vk_default_alloc (vk_alloc.c:26)
==134077== by 0x52EEEBE: vk_alloc (vk_alloc.h:48)
==134077== by 0x52EEEEE: vk_zalloc (vk_alloc.h:56)
==134077== by 0x52EF47E: xe_exec_process_syncs (anv_batch_chain.c:132)
==134077== by 0x52EF8F6: xe_execute_trtt_batch (anv_batch_chain.c:215)
==134077== by 0x5301670: anv_queue_submit_trtt_batch (anv_batch_chain.c:1697)
==134077== by 0x603D135: gfx125_write_trtt_entries (genX_cmd_buffer.c:6091)
==134077== by 0x5370B44: anv_sparse_bind_trtt (anv_sparse.c:595)
==134077== by 0x5370CFC: anv_sparse_bind (anv_sparse.c:629)
==134077== by 0x5370E6E: anv_init_sparse_bindings (anv_sparse.c:670)
==134077== by 0x5328037: anv_CreateBuffer (anv_device.c:5071)
Note to backporters: this is only for when xe.ko is being used and
ANV_SPARSE_USE_TRTT=1 is exported. This is not the regular code path.
Fixes: 18bd00c024 ("anv/trtt: don't wait/signal syncobjs using the CPU anymore")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28455>
(cherry picked from commit 38af7254e2)
Xe KMD don't refcount anything, so resources could be freed while they
are still in use if we don't wait for exec_queue to be idle.
This issue was found with Xe KMD error capture, VM was already
destroyed when it attemped to capture error state but it can also
happen in applications that did not hang.
This fixed the '*ERROR* GT0: TLB invalidation' errors when running
piglit all test list.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27500>
(cherry picked from commit 665d30b544)
num_syncs was being incremented by one if 'utrace_submit != NULL' but
a sync was only being set if also
'util_dynarray_num_elements(&utrace_submit->batch_bos) == 0'.
This mismatch could cause application to abort due to
'assert(count == num_syncs)'.
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27244>
(cherry picked from commit d5ec2fa52f)
When LLVM 18 is used, pass the no-verify-fixpoint option when
running the instcombine pass. Otherwise LLVM may abort with an
error.
The background here is that this option is enabled by default for
testing purposes, because instcombine is normally only explicitly
invoked like this inside tests. If it is used in an actual
production pipeline, the no-verify-fixpoint option needs to be
enabled.
This should fix the issue reported at
https://bugzilla.redhat.com/show_bug.cgi?id=2268800.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28101>
(cherry picked from commit 99f0449987)
Hardly a copyrightable file, though it should inherit the MIT
license as the code originate from MIT licensed r300_winsys.h.
Fixes: 6e3fc2de2a ("r300g: Move bootstrap code to targets")
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Signed-off-by: David Heidelberg <david@ixit.cz>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28384>
(cherry picked from commit b2ae73b27e)
When tu_shader object is destroyed through vk_pipeline_cache, the relevant
destroy callback should relay to the general tu_shader_destroy function
that will also clean up owned resources.
During shader creation, the ir3_shader object should be destroyed once the
shader variants are retrieved. Since those variants are owned by tu_shader
they should be freed up in tu_shader_destroy.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: a03525d8db ("tu: Split program draw state into per-shader states")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27847>
(cherry picked from commit 3ee81ffe14)
It was by pure luck that all sources (and the result) of nir_dpas_intel
had the same number of components. It is possible to support matrix
sizes where the accumlator matrix and the result matrix are larger
(e.g., 16x8 * 8x16 = 16x16).
This breaks all of the assumptions of NIR's infrastructure for code
generating intrinsics. Fix the by making the accumulator matrix be the
first source. The accumulator and the result will always have the same
dimensions (due to rules of matrix multiplication) and the same type
(due to restructions of the cooperative matrix extension). This forces
them to have the same number of components.
This doesn't fix all the potential problems. NIR expects that all
0-sized sources will have the same number of components. This just
ensures that the result has the correct number of components.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit a8115221e5)
Was previously passing 1, 1, 0 as the regioning. This generated
incorrect disassembly because the encoding for a width of 1 is 0. Use
the enums to ensure the correct values are used.
Fixes: 1c92dad5cb ("intel/disasm: Disassembly support for DPAS")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit c6bd6f2a41)
If the destination was the accumulator but is no longer, having the flag
set is not correct. On Xe2 this also causes a validation error.
v2: Reword the comment to be more clear. Suggested by Jordan.
Fixes: efa4e4bc5f ("intel/fs: Introduce regioning lowering pass.")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
(cherry picked from commit be4fa59a72)
This reverts commit be8b7980e6.
Store the cache entry so that the fast path picks up the optimized
pipeline when its available from a background optimized_compile_job().
Observed traces where it would take the fast path back and forth using
an unoptimized pipeline and never pick up the optimized pipeline leading
to >50% fps drop.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28440>
(cherry picked from commit d6978b1af2)
The result is not the same if the multiplication overflows, mad_clamp
does not truncate between the mul and the add.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28421>
(cherry picked from commit 51a5ebbd01)
ViewIndex isn't allowed with task shaders and the whole thing was just
wrong.
Fixes GPU hangs with
dEQP-VK.mesh_shader.ext.conditional_rendering.*_multiview.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28157>
(cherry picked from commit bcf793306f)
Errors produced by the call to `xcb_present_pixmap` should be explicitly
discarded to ensure they don't leak into the application event loop.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: 2b885b23 ("vk/wsi/x11: stop roundtripping on presentation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28402>
(cherry picked from commit 82ed8aadea)
Depth writes are only gated by the depth writemask. The state object
member depth_enabled must only affect depth testing.
Fixes: b29fe26d43 ("etnaviv: rework ZSA into a derived state")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28403>
(cherry picked from commit 3a10d1be0e)
opt_copy_propagation() can sometimes propagate FIXED_GRF sources into
SHADER_OPCODE_SENDs as the message payload. For example, GS input
reads, which simply take a URB handle and have the offset in the
descriptor. For non-VGRFs, there isn't a payload to split, so just
skip past such send messages.
Fixes: 589b03d02f ("intel/fs: Opportunistically split SEND message payloads")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28067>
(cherry picked from commit ba11127944)
Originally was trying to copy a pps's scaling list when an sps's was
signaled.
Fixes: 8daa32963 ("vulkan/video: add helper to derive H264 scaling lists")
Signed-off-by: Charlie Turner <cturner@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28352>
(cherry picked from commit 9e3932e990)
External images translate to 2D images in ntv, so we will have to emit
OpImageQuerySizeLod instead of OpImageQuerySize (thanks Faith for
pointing that out). This quells
VUID-VkShaderModuleCreateInfo-pCode-08737
Image must have either 'MS'=1 or 'Sampled'=0 or 'Sampled'=2
%32 = OpImageQuerySize %v2int %31
triggred by piglit
spec@oes_egl_image_external_essl3@oes_egl_image_external_essl3
on Zink.
Fixes: 3f783a3c50
zink: omit Lod image operand in ntv when not using an image texture dim
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28389>
(cherry picked from commit b6c1390354)
We are not supposed to apply the vertex index offset to our varying or
non-VS attribute (AKA image) descriptors. While at it, explicitly set
offset_enable to true when emitting vertex attribute descriptors, to
clarify our intentions.
Fixes: c0d6539827 ("panvk: Drop support for Midgard")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28182>
(cherry picked from commit d9d6514fbc)
This fixes nir_print for unstructured control-flow. It's safe to
backport just this patch because the worst case is that we don't set as
many types and not as much gets printed.
Fixes: 260a9167db ("nir/print: Improve NIR_PRINT=print_consts by using nir_gather_ssa_types()")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
(cherry picked from commit 2be97717e6)
These are both handled by inserting them directly at the top of the
nir_function_impl. However, if the cursor is already at the top, it
never gets updated so we end up inserting other stuff after the newly
inserted undef or decl_reg. It's an odd edge case to be sure but I hit
it with my new NIR CF pass for NAK.
Fixes: 1be4c61c95 ("nir/builder: Add a helper for creating undefs")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
(cherry picked from commit a782809f81)
It is only taking effect in hevc encoding so far.
Cc: mesa-stable
Reviewed-By: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28280>
(cherry picked from commit b24748a93a)