Use the AMD_IMAGE_OPCODES=0 environment variable to test the lower image
opcode code path used by AMD CDNA/compute devices without graphics.
Runs a very limited subset of GL tests.
Signed-off-by: Thong Thai <thong.thai@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31180>
We don't have a generic v4i8 on Valhall, we have to lower it to two
v2i8. Fortunately, bi_make_vec_to() hides the Bifrost/Valhall
differences, so use that for nir_op_pack_uvec4_to_uint.
Fixes: 934b0f1add ("pan/bi: Respect swizzles for more vector ops")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31280>
We found out that some HW changes on Xe2 make the HW avoid reading the
blend state if we're using the null_rt bit in the extended descriptor.
Since the alpha_to_coverage bit resides in the blend state, that state
is ignored and writes are going through to the depth/stencil buffers.
Disable null_rt in the color outputs if the color outputs can affect
other outputs (through alpha_to_coverage & omask).
Fixes tests in this pattern on Xe2 :
dEQP-VK.pipeline.*.multisample.alpha_to_coverage_no_color_attachment.*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Backport-to: 24.2
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>
We'll want to tune this setting based on other parameters.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Backport-to: 24.2
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>
We setup an empty render target when there are no color attachments,
which effectively makes it a different surface state. In most cases
the compiler will insert a null-rt bit in the extended descriptor
which means the RT isn't even accessed. But in some cases like
alpha-to-coverage output + depth/stencil write, we will access the
render target because using the null-rt will prevent alpha-to-coverage
from happening.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2bd304bc8f ("anv: Skip the RT flush when doing depth-only rendering.")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31196>
Device memory can be allocated from different threads and thus requires
locking when allocating/freeing virtual addresses.
Fixes: 53fb1d99ca ("panvk: Transition to explicit VA assignment on v10+")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31282>
In wave64 for hw with native wave32, ds_append seems to be split in a load for
the low half and an atomic for the high half, and other LDS instructions can
be scheduled between the two.
Which means the result of the low half is unusable because it might be out of date.
I was only able to reproduce this issue in WGP mode, but be conservative and
apply the workaround in CU mode too.
Foz-DB Navi31:
Totals from 13 (0.02% of 79395) affected shaders:
Instrs: 7599 -> 7656 (+0.75%)
CodeSize: 39708 -> 39972 (+0.66%)
Latency: 83174 -> 83572 (+0.48%)
InvThroughput: 8271 -> 8357 (+1.04%)
Copies: 718 -> 717 (-0.14%)
VALU: 3689 -> 3703 (+0.38%)
SALU: 935 -> 965 (+3.21%)
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11921
Fixes: 45e935800a ("aco: implement nir_shared_append/consume_amd")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31301>
This change ensures that all these allocations are using
the same memory context.
For instance, this issue is triggered with:
"piglit/bin/arb_shader_image_load_store-host-mem-barrier -auto -fbo":
Indirect leak of 32816 byte(s) in 1 object(s) allocated from:
#0 0x7f49a35447ef in __interceptor_malloc (/usr/lib64/libasan.so.6+0xb17ef)
#1 0x7f49998e4b4f in ralloc_size ../src/util/ralloc.c:118
#2 0x7f49998e7521 in create_slab ../src/util/ralloc.c:801
#3 0x7f49998e7521 in gc_alloc_size ../src/util/ralloc.c:840
#4 0x7f49998e7d11 in gc_zalloc_size ../src/util/ralloc.c:868
#5 0x7f49999a6126 in nir_alu_instr_create ../src/compiler/nir/nir.c:682
#6 0x7f49999cba48 in clone_alu ../src/compiler/nir/nir_clone.c:217
#7 0x7f49999cc85a in clone_instr ../src/compiler/nir/nir_clone.c:456
#8 0x7f49999cee3a in clone_block ../src/compiler/nir/nir_clone.c:529
#9 0x7f49999cee3a in clone_cf_list ../src/compiler/nir/nir_clone.c:583
#10 0x7f49999d03be in clone_function_impl ../src/compiler/nir/nir_clone.c:660
#11 0x7f49999d13f7 in nir_function_impl_clone ../src/compiler/nir/nir_clone.c:678
#12 0x7f4999a0e2c5 in lower_call_function_impl ../src/compiler/nir/nir_functions.c:397
#13 0x7f4999a0e2c5 in function_link_pass ../src/compiler/nir/nir_functions.c:430
#14 0x7f4999a0e2c5 in function_link_pass ../src/compiler/nir/nir_functions.c:408
#15 0x7f4999a0e2c5 in nir_function_instructions_pass ../src/compiler/nir/nir_builder.h:108
#16 0x7f4999a0e2c5 in nir_link_shader_functions ../src/compiler/nir/nir_functions.c:452
#17 0x7f499ca30b8f in link_libintel_shaders ../src/gallium/drivers/iris/iris_program_cache.c:329
#18 0x7f499ca30b8f in iris_ensure_indirect_generation_shader ../src/gallium/drivers/iris/iris_program_cache.c:374
#19 0x7f499d185267 in gfx9_emit_indirect_generate ../src/gallium/drivers/iris/iris_indirect_gen.c:593
#20 0x7f499d119c79 in iris_upload_indirect_shader_render_state ../src/gallium/drivers/iris/iris_state.c:8744
#21 0x7f499fe86b01 in iris_indirect_draw_vbo ../src/gallium/drivers/iris/iris_draw.c:233
#22 0x7f499fe86b01 in iris_draw_vbo ../src/gallium/drivers/iris/iris_draw.c:343
#23 0x7f499a174e43 in tc_call_draw_indirect ../src/gallium/auxiliary/util/u_threaded_context.c:3828
#24 0x7f499a1557fe in batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:453
#25 0x7f499a1557fe in tc_batch_execute ../src/gallium/auxiliary/util/u_threaded_context.c:504
#26 0x7f499a167f26 in _tc_sync ../src/gallium/auxiliary/util/u_threaded_context.c:761
#27 0x7f499a168888 in tc_texture_map ../src/gallium/auxiliary/util/u_threaded_context.c:2783
#28 0x7f49986f2631 in pipe_texture_map ../src/gallium/auxiliary/util/u_inlines.h:556
#29 0x7f49986f2631 in _mesa_map_renderbuffer ../src/mesa/main/renderbuffer.c:494
#30 0x7f49991af7ca in readpixels_memcpy ../src/mesa/main/readpix.c:260
#31 0x7f49991af7ca in _mesa_readpixels ../src/mesa/main/readpix.c:898
#32 0x7f499931ee23 in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:575
#33 0x7f49991b40b5 in read_pixels ../src/mesa/main/readpix.c:1199
#34 0x7f49991b40b5 in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
#35 0x7f49991b4a20 in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231
...
SUMMARY: AddressSanitizer: 323648 byte(s) leaked in 201 allocation(s).
Fixes: 5438b19104 ("iris: enable generated indirect draws")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31313>
This is a regression happening with the commit 87b99d5797 ("nir: use copysign for atan").
Indeed, the opcode "copysign" was generating an incompatible i915 sequence.
For instance, this issue is triggered with
"deqp-gles2 --deqp-case=dEQP-GLES2.functional.shaders.operator.angle_and_trigonometry.atan2.highp_float_vertex":
deqp-gles2: ../src/compiler/nir/nir_lower_int_to_float.c:239: lower_alu_instr: Assertion `nir_alu_type_get_base_type(info->output_type) != nir_type_int && nir_alu_type_get_base_type(info->output_type) != nir_type_uint' failed.
Fixes: c4cec84231 ("nir/i915g/r300/nv30: skip marking varyings as flat in some drivers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31315>
Starting with a750, the ALWAYS_ON counter is initialized from a loadable
counter in CX power domain, which is never turned off except during a
GPU reset. This means that timestamps should always be monotonic except
if the GPU resets, in which case subsequent submits should return
DEVICE_LOST anyway. Thus it should be good enough to satisfy the Vulkan
requirement that vkCmdWriteTimestamp is monotonic.
kgsl tries to synchronize the CX counter to the CPU counter, and
additionally adds a synchronization ioctl to improve the accuracy. I'm
not sure whether the former is really useful for us, but the latter
should eventually be implemented in drm/msm. However for now we can
expose the extension without any kernel support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31100>
It appears this was missed due to fractional runs, but the fix for this issue has already been merged.
While the test hasn't run in the full runs yet, it should now be considered fixed, just like on other GPUs.
Fixes: 812c8f6abe ("tu: Treat partially-bound depth/stencil attachments as passthrough")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31286>
The expectations for the manual runs were missed during the uprev, update them now.
Fixes: 213f5e9152 ("Uprev Piglit to e9ab30aeaed97b69868cf4d6d6a3f70f3b53c362")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31286>
KHR-GL46.texture_swizzle.functional usually takes 50+ seconds and can
time out on occasion. This isn't caused by the new kernel, it always
took this long. Skip it for pre-merge jobs on a630.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31286>
In traces produced with PAN_MESA_DEBUG, print swizzles in human readable
form (like BGRA) as well as the raw decimal format we were printing
before. This is purely a convenience feature for developers.
Reviewed-by: Boris Brezilllon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31242>
This adds an additional aux_context, so that the gfx queue isn't stalled
due to clearing buffers or initializing DCC.
This aux context will only be used by resource_create, which will allow
us to remove all barriers around the clears because there are no others
users of those buffers on that context.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31291>
end_query writes the timestamp only when everything is finished,
so the extra barrier only adds unnecessary overhead.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31291>
this extension specifies error checking (which was not implemented)
and also sample count clamping needs to be done since the samples
specified are just min samples and not an exact param
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31235>