For instance, this issue is triggered with
"piglit/bin/glslparsertest tests/spec/arb_bindless_texture/compiler/images/arith-bound-image.frag pass 3.30 GL_ARB_bindless_texture GL_ARB_shader_image_load_store":
Direct leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe9a7 in calloc (/usr/lib64/libasan.so.6+0xb19a7)
#1 0x7f84ba7e0801 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4391
#2 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#3 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#4 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#5 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#6 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#7 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#8 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Direct leak of 136 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbff57 in operator new(unsigned long) (/usr/lib64/libasan.so.6+0xb2f57)
#1 0x7f84b1a5f749 in LLVMCreateBuilderInContext (/usr/local/lib64/libLLVM-17.so+0xc84749)
#2 0x7f84ba7817b0 in ac_llvm_context_init ../src/amd/llvm/ac_llvm_build.c:54
#3 0x7f84ba542b7a in si_llvm_context_init ../src/gallium/drivers/radeonsi/si_shader_llvm.c:120
#4 0x7f84ba542b7a in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:832
#5 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#6 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#7 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#8 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#9 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 176 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b9fee in rzalloc_size ../src/util/ralloc.c:152
#3 0x7f84b81b9fee in rzalloc_array_size ../src/util/ralloc.c:232
#4 0x7f84b81b05c7 in _mesa_hash_table_init ../src/util/hash_table.c:163
#5 0x7f84b81b05c7 in _mesa_hash_table_create ../src/util/hash_table.c:186
#6 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#7 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#8 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#9 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#10 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#11 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#12 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#13 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06e4 in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4382
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
Indirect leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f84c3fbe7ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f84b81b9b3f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f84b81b046c in _mesa_hash_table_create ../src/util/hash_table.c:182
#3 0x7f84ba7e06ae in ac_nir_translate ../src/amd/llvm/ac_nir_to_llvm.c:4381
#4 0x7f84ba53fdf4 in si_llvm_translate_nir ../src/gallium/drivers/radeonsi/si_shader_llvm.c:759
#5 0x7f84ba542bb7 in si_llvm_compile_shader ../src/gallium/drivers/radeonsi/si_shader_llvm.c:836
#6 0x7f84ba337b8e in si_compile_shader ../src/gallium/drivers/radeonsi/si_shader.c:2874
#7 0x7f84ba43a7c1 in si_init_shader_selector_async ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:3176
#8 0x7f84b81c3448 in util_queue_thread_func ../src/util/u_queue.c:309
#9 0x7f84b821ea6a in impl_thrd_routine ../src/c11/impl/threads_posix.c:67
#10 0x7f84c2fea38a (/lib64/libc.so.6+0x8438a)
SUMMARY: AddressSanitizer: 920 byte(s) leaked in 6 allocation(s).
Fixes: d92d35c9db ("ac/llvm: add a return value to ac_nir_translate")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28099>
The vma_samplers vma heap is initialized unconditionally. Don't use
device->physical->indirect_descriptors as a condition on whether to
free it or not.
From my TGL machine:
==373617== 32 bytes in 1 blocks are definitely lost in loss record 1 of 1
==373617== at 0x48459F3: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==373617== by 0x6926DC0: util_vma_heap_free (vma.c:339)
==373617== by 0x6925ED3: util_vma_heap_init (vma.c:53)
==373617== by 0x5334EDA: anv_CreateDevice (anv_device.c:3404)
==373617== by 0x685593A: vk_tramp_CreateDevice (vk_dispatch_trampolines.c:78)
==373617== by 0x48A6D56: terminator_CreateDevice (loader.c:5833)
==373617== by 0x9C2293F: vulkan_layer_chassis::CreateDevice(VkPhysicalDevice_T*, VkDeviceCreateInfo const*, VkAllocationCallbacks const*, VkDevice_T**) (chassis.cpp:497)
==373617== by 0x48B0690: loader_create_device_chain (loader.c:4937)
==373617== by 0x48B1327: loader_layer_create_device (loader.c:4317)
==373617== by 0x48B8D79: vkCreateDevice (trampoline.c:1004)
==373617== by 0x10CC7A: MyApp::MyApp(int, bool) (sparse.cpp:608)
==373617== by 0x1201E8: main (sparse.cpp:6025)
Fixes: 7c76125db2 ("anv: use 2 different buffers for surfaces/samplers in descriptor sets")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28303>
Looks like `OCL_ICD_VENDORS` is meant to be a directory containing
`.icd` files, but we were giving it a path to a driver. This meant that
rusticl wouldn't get picked up when using devenv.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28231>
Extend the vkGetPhysicalDeviceFormatProperties2 cache to include
VkFormatProperties3 from the pNext chain. VkFormatProperties3 was
observed being always attached for DXVK and thus skipping the cache
if not handled.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28194>
This was missing, this is implemented in common code.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
This was missing, this is implemented in common code.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28275>
try_merge_merge_set() expects equal_anc_out to be Temp() in the beginning,
so we need to reset it in case it's used again.
Fixes compilation of metro_exodus/163b3b895730d37b with
VK_PIPELINE_CREATE_2_DISABLE_OPTIMIZATION_BIT_KHR.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 18ba93e673 ("aco/cssa: rewrite lower_to_cssa pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28248>
A seperate meson project with mesa as a subproject was required to
compile the generate_rd executable for replaying rddecompiler
generated source. That approach was overly complicated, now the
generate_rd executable is now compiled as a part of building the
freedreno tools with a blank generate-rd.cc file that can be replaced
with the source generated by rddecompiler.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
It's useful to clear buffers at the start of sequences to view the
delta, this adds that functionality to wrbufs with a fixed clear
value of 0xDEADBEEF.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
It's possible for large allocations to hit the maximum address
space size especially with a fake AS, these failures are silent
and can cause a hard to debug segfault later down the line.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
We should be careful to not read past the end of any buffers when
dumping wrbufs, this clamps the size to the size of the buffer with
a warning.
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28253>
When computing layout for large extents and array size, the computations
can overflow the 32-bit variables holding these values. This can end up
underreporting memory requirements for a given resource, making the
application think that such a resource is allocatable.
The size-holding variables in the fdl_layout struct have their type
upgraded to uint64_t, as does the total size variable in tu_image. This
avoids problems in corner-case tests in the Vulkan CTS, but this code
should be improved further to avoid overflows and narrowing conversions.
Fixes a quartet of tests under
dEQP-VK.pipeline.monolithic.render_to_image.core.2d_array.huge.width_height_layers.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28050>
brw_fs_opt_eliminate_find_live_channel eliminates FIND_LIVE_CHANNEL
outside of control flow. None of our optimization passes generate
additional cases of that instruction, so once it's gone, we shouldn't
ever have to run the pass again. Moving it out of the loop should
save a bit of CPU time.
While we're at it, also clean adjacent BROADCAST instructions that
consume the result of our FIND_LIVE_CHANNEL. Without this, we have
to perform copy propagation to get the MOV 0 immediate into the
BROADCAST, then algebraic to turn it into a MOV, which enables more
copy propagation...not to mention CSE gets involved. Since this
FIND_LIVE_CHANNEL + BROADCAST pattern from emit_uniformize() is
really common, and it's trivial to clean up, we can do that. This
lets the initial copy prop in the loop see MOV instead of BROADCAST.
Zero impact on fossil-db, but less work in the optimization loop.
Together with the previous patches, this cuts compile time in
Borderlands 3 on Alchemist by -1.38539% +/- 0.1632% (n = 24).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>
It's a logical opcode which is lowered to a send-from-GRF later. That
lowering code is responsible for ensuring the sources are set up in a
proper SEND payload.
This was preventing copy propagation of surface handles which started
out as scalars, were splatted out to full-SIMD values with NoMask, then
actually consumed as only component 0 (scalar again), because we thought
that scalar values were not allowed.
fossil-db on Alchemist shows improvements in q2rtx but no other titles:
Totals:
Instrs: 161310436 -> 161310152 (-0.00%)
Cycles: 14370605159 -> 14370601066 (-0.00%)
Totals from 17 (0.00% of 652298) affected shaders:
Instrs: 16097 -> 15813 (-1.76%)
Cycles: 185508 -> 181415 (-2.21%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28286>