The index buffer and index buffer size don't affect whether or not we're
actually doing indexed rendering so we should just emit them whenever
they change. Otherwise, if someone sets an index buffer and then does a
non-indexed draw and then an indexed draw, the first draw will clear the
dirty bits without setting the index buffer registers and the second
draw won't know to re-emit them.
Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Marc Alcala Prieto <marc.alcalaprieto@arm.com>
(cherry picked from commit 9c8e8ed655)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The Vulkan spec says:
"VUID-vkCmdExecuteGeneratedCommandsEXT-None-11062
If a rendering pass is currently active, the view mask must be 0."
So, it's invalid with VK_EXT_device_generated_commands but it's allowed
in DX12, it seems we missed this during the spec review.
Crimson Desert uses this and emulating in vkd3d-proton would be complex,
so let's re-introduce this support only for vkd3d-proton.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 782254b820)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Casting a negative float to uint64_t is undefined behavior. GCC 15 with
-O2 produces 0xFFFFFFFFFFFFFFFF for (uint64_t)(-32767.5f), causing snorm
clear values to be packed incorrectly (e.g. 0xFFFF instead of 0x8001 for
snorm16 -1.0). This results in wrong DCC comp-to-single clear colors and
~966 CTS snorm multisample_resolve test failures.
Fix by casting through int64_t first, which is well-defined (truncation
toward zero) and preserves the two's complement bit pattern.
Fixes: 585c25be1e ("radv: fix color conversions for normalized uint/sint formats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2595940b0d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Because of a previous refactor, pco_last_igrp was incorrectly changed to return
the first entry in a linked list instead of the last. Update pco_last_igrp to
return the last entry in a linked list.
The following CTS tests now pass:
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_2_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_uniform_fragment
Fixes: 719ece42c0 ("pco: Switch back to util/list")
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 7428af29f6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Just like the fix for nvk, just drop this in the GL driver as well.
Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 3f5d54ab8c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
ret was read after the timeout check, so breaking on timeout returned 0
instead of the actual fence status, potentially reporting a signaled
fence when it was still pending.
Fixes: 441f01e778 ("freedreno/drm/virtio: Drop blocking in host")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 97baa27dad)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Fixes two bugs in the WAIT_FENCE polling loop:
1. Break on timeout returned VK_SUCCESS because ret was read too late.
2. UINT64_MAX timeout_ns overflowed end_time, causing immediate exit.
Fix by reading rsp->ret before the timeout check and using
OS_TIMEOUT_INFINITE (like virtio_pipe_wait in freedreno) to avoid
overflow.
This prevents premature BO teardown during host-side fault recovery.
Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit dad72b414b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
By default crocus precompiles shaders, to avoid stuttering at screens,
caused by compiling shaders at the drawing phase.
Unfortunately at intel Gen 6 and higher the precompiled version of the
fragment shaders is not used and every fragment shader is compiled twice.
These double fragment shaders also are added to the memory cache
and disk cache.
This is caused by setting wrong values to variables at the key during
precompiling at routine crocus_create_fs_state() at src/gallium/drivers/crocus/crocus_program.c,
which differ from values at crocus_populate_fs_key() at src/gallium/drivers/crocus/crocus_state.c.
This commit solves 3 problems:
it adjusts the predicted value 'input_slots_valid' at Gen 6
it adjusts the predicted value 'ignore_sample_mask_out' at Gen 6 and higher
it predicts the value 'multisample_fbo' , which helps if samplemask is used
Cc: mesa-stable
Signed-off-by: GKraats <vd.kraats@hccnet.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 686266d2f1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The previous fix was incomplete because if the same graphics pipeline
and the same PS epilog are rebind after vkCmdExecuteCommands(), the PS
epilog state wouldn't be re-emitted, and it will use a wrong VA (in case
both fragment shader user SGPRs aren't similar either).
Resetting the PS epilog to NULL in the primary should prevent any
issues, but this tracking still need to be improved because it caused
two issues recently.
Fixes: 1a00587c44 ("radv: fix a GPU hang with PS epilogs and secondary command buffers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15176
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit a73fc90bcd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This can be used to workaround problem cases with application
controlled subgroup size.
v2: removed intel_use_jay for stable branch
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c105366165)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
I don't know why but these seem to fail only intermittantly now.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit e166027e52)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
StackSizePerRay is the RTDispatchGlobals::AsyncStackSize and
DisableRTGlobalsKnownValues is to interpret how many Max BVH levels we
need to use. It's not relevant to Vulkan, since we have just 2 fixed BVH
levels.
Fixes: cb423ee6 ("anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921")
Fixes: c1a44e8d ("anv: force StackIDControl value for Wa_14021821874")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7a627fa8f3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
I ran into this case where genX(cmd_buffer_emit_bt_pool_base_address)
was returning immediately because it considered an RCS engine
emulating a compute queue as neither a render nor a compute queue.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d581b7282b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
For some reason the android builders started noticing...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit f6306198d0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The Vulkan spec states that if VK_KHR_shader_clock is supported,
shaderSubgroupClock is a required feature. So let's not enable that
extension unless we can...
Fixes: e9c2c32409 ("panvk: enable VK_KHR_shader_clock")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit c8ae72f51d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The Vulkan spec states that if VK_ARM_shader_core_builtins is supported,
shaderCoreBuiltins is a required feature. So let's not enable that
extension unless we can...
Fixes: dff1d91c64 ("panvk: Enable VK_ARM_shader_core_builtins")
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 8cb89853b8)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This hasn't been supported since the TGSI envvar was ripped out. When
converted to NIR, we don't see these instructions at all.
Fixes: c3cbe610df ("nouveau: Delete the NV50_PROG_USE_TGSI env var.")
Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit b062062430)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
We can't expose large texel-buffers if we don't emit the high bits.
Whoopsie!
Fixes: 4db7958edc ("pan/bi: Change texel buffer limits")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 57a80ff78c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
A few cases of UINT32_MAX were missed, whoops.
Fixes: c2c91e78fd ("pan/layout: Allow bigger size/surface stride on v12+")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 69b8372fbf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This change was tested on plam and cayman. Here are the tests fixed:
spec/arb_gl_spirv/execution/uniform/atomic-uint-aoa-cs: fail pass
spec/arb_gl_spirv/execution/uniform/atomic-uint-aoa-fs: fail pass
spec/arb_gl_spirv/execution/uniform/atomic-uint-array-cs: fail pass
spec/arb_gl_spirv/execution/uniform/atomic-uint-array-fs: fail pass
spec/arb_gl_spirv/execution/uniform/atomic-uint-cs: fail pass
spec/arb_gl_spirv/execution/uniform/atomic-uint-fs: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit 0deac18581)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This change adds a minimal support for gl_PointSize to
be used alongside gl_ClipDistance/gl_CullDistance.
This change was tested on palm and cayman. Here is the test fixed:
khr-gl4[5-6]/gl_spirv/spirv_validation_builtin_variable_decorations_test: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit 032a2bdc1e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The atomic offset implementation was incomplete.
This change was tested on cayman, it fixes all the
variants of this test:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-multi-stage: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-multi-stage: fail pass
Fixes: 06993e4ee3 ("r600: add support for hw atomic counters. (v3)")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit 48902771ad)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This change is inspired by b56f47611a ("radeonsi: fix
alpha-to-coverage + alpha-to-one used together for
gfx6-10.3") and implements the same algorithm.
This change was tested on rv770, palm and cayman. Here are the tests fixed:
spec/arb_framebuffer_object/execution/msaa-alpha-to-coverage_alpha-to-one: fail pass
spec/arb_framebuffer_object/execution/msaa-alpha-to-coverage_alpha-to-one_write-z: fail pass
Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit 7513f48edf)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Previously the output length of pvr_get_driver_build_sha() is changed to
BUILD_ID_EXPECTED_HASH_LENGTH, but the array defined to receive its
output, the driver_build_sha array inside struct pvr_instance, is
declared with BLAKE3_KEY_LEN, which is longer than
BUILD_ID_EXPECTED_HASH_LENGTH.
This leads to uninitialized memory being accessed when creating
pipelineCacheUUID value, and the pipelineCacheUUID value would become
random in each run, defecting the purpose of it.
Refactor the code copying the build ID to follow other drivers: changing
the parameter from the buffer to the instance pointer, inserting a
static assert thanks to being able to retrieve the destination buffer
length (as an array inside the instance structure) and using
copy_build_id_to_sha1() to do the final copy.
Fixes: 6a42493c94 ("pvr: Use BUILD_ID_EXPECTED_HASH_LENGTH")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit 9870c8d8c4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Add the missing layout which do not need implemented anything in
mali gpu.
Fixed: dEQP-VK.image.host_image_copy.properties.properties
unifiedImageLayouts feature is supported, but layout
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL was not included in
VkPhysicalDeviceHostImageCopyProperties::pCopySrcLayouts.
Fixes: 1cd61ee ("panvk: implement VK_EXT_host_image_copy for linear color images")
Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 62e7120384)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Some Max Payne 3 shaders are impacted by this and probably will fix some
issue there. The VK CTS isn't testing this, but it was verified to fix a
real problem by inserting 0 offsets into the instruction and having CTS
tests fail with the old ordering.
Totals from 3 (0.00% of 1163204) affected shaders:
CodeSize: 2496 -> 2736 (+9.62%)
Static cycle count: 732 -> 741 (+1.23%)
Fixes: ad01fbdda0 ("nak: Add a NIR texture lowering pass")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit e09045e26c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
When recording secondary command buffers with occlusion queries, the
get_vis_results flag could be set for some graphics sub_cmd's job.
Propagate this flag from secondary command buffer graphics sub_cmds to
primary command buffer sub_cmds to ensure occlusion queries in secondary
command buffers being correctly executed.
Fixes: 5c34be4340 ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b8c5e47949)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
There's a dynarray field inside gfx sub_cmd called sub_query_indices,
which will contain pending query indices for gfx sub_cmds inside a
secondary command buffer. It's expected that when finishing such gfx
sub_cmds, the content of query_indices is going to be moved there.
However the `util_dynarray_append_dynarray()` call is called with wrong
parameter order, thus it's copying sub_query_indices to query_indices
and then immediately wiping query_indices, forgetting all query indices
in such case.
Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries
in secondary command buffers.
Fixes: 8c506c4b03 ("pvr: Use util_dynarray_append_dynarray()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 87f4122e11)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The last sub_cmd in the command buffer could be a graphics one, and when
ending a graphics sub_cmd, the query_indices array will be checked to
know whether a occlusion query starts during this graphics sub_cmd.
Finalize the query_indices array after ending the last sub_cmd,
otherwise the check for query initiation may have a false negative
result.
Fixes the `dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.
random.seed6` test case.
Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 36f34a72c1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
src/gallium/drivers/zink/zink_instance.c:34:9: warning: variable 'have_moltenvk_layer' set but not used [-Wunused-but-set-variable]
Fixes: 2b4fcf0a06 ("zink: generate instance creation code with a python script")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit 5982deb48b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
This appears to cause some sort of prefetching which is causing
page faults for linear textures on the following page after the
texture allocation.
This might be okay for tiled, but for now just disable it.
The test crashing this was to allocate an 800x409 linear 2D texture
which gnome-initial-setup was doing.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15277
Cc: mesa-stable
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 7067b66846)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The validation of branch instructions happening in branch and thrsw
delay slots has been dead code since it was introduced as the check
was after:
if (inst->type != V3D_QPU_INSTR_TYPE_ALU)
return;
Now last_branch_ip is updated and checks in_branch_delay_slots()
are active.
Fixes in_branch_delay_slots, as for branch there are always 3 delay slots.
As scheduler enforces this restrictions shader-db does not show any
regression.
Assisted-by: Claude Opus 4.6
Fixes: 90269ba353 ("broadcom/vc5: Use THRSW to enable multi-threaded shaders.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit dd6e7c8ef0)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>