Commit graph

186439 commits

Author SHA1 Message Date
Patrick Lerda
4ec5c2fb59 r600: fix emit_image_size() range base compatibility
This change fixes a regression introduced with 8b5d41cacb.
Indeed, lookup_resid was not updated.

This change was tested on palm and cayman. Here are the tests fixed:
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/advanced-nonms-fs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-cs-uint: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-float: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-int: fail pass
khr-gl4[3-5]/shader_image_size/basic-nonms-fs-uint: fail pass

Fixes: 8b5d41cacb ("r600/sfn: Use range_base for atomics and images")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33352>
(cherry picked from commit fd874bdd0c)
2025-03-04 20:26:19 +01:00
Lars-Ivar Hesselberg Simonsen
28d34f30e6 panvk: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.

While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.

Fixes: 2ffc05d8d2 ("panvk: Add support for CmdDispatchIndirect")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
(cherry picked from commit fe31e7843d)
2025-03-04 20:26:18 +01:00
Lars-Ivar Hesselberg Simonsen
af767e1e3e panfrost: Use RUN_COMPUTE over RUN_COMPUTE_INDIRECT
RUN_COMPUTE_INDIRECT has been found to cause intermittent hangs, so
this change replaces it with RUN_COMPUTE and a set TASK_AXIS_X.

While this task axis might be suboptimal, the performance cost is
somewhat offset by RUN_COMPUTE not being an emulated command.

Fixes: 447075eeee ("panfrost: Add support for the CSF job frontend")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33841>
(cherry picked from commit 6bf9ad2610)
2025-03-04 20:26:15 +01:00
Tapani Pälli
915075bf66 iris: remove dead code that cannot get hit anymore
As of recent changes, MESA_SHADER_GEOMETRY is handled by the if ladder.

CID: 1643918
Fixes: c33ebf09f5 ("iris: fix handling of GL_*_VERTEX_CONVENTION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33842>
(cherry picked from commit d0b8d7d46c)
2025-03-04 20:24:44 +01:00
Patrick Lerda
56d066e062 r600: fix the indirect draw 8-bits path
This change fixes the indirect draw 8-bits path which does
a conversion to 16-bits. This change is implemented to process
the parameters the same way as the other indirect draw paths.

This change was tested on palm and cayman. Here are the tests fixed:
deqp-gles31/functional/draw_indirect/draw_elements_indirect/indices/index_byte: fail pass
deqp-gles31/functional/draw_indirect/random/35: fail pass
deqp-gles31/functional/draw_indirect/random/45: fail pass
khr-gl40/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl41/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl42/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl43/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl44/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass
khr-gl45/draw_indirect/basic-indicesdatatype-unsigned_byte: fail pass

Fixes: d80701df8a ("r600g: Implement GL_ARB_draw_indirect for EG/CM")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32802>
(cherry picked from commit 9aea08e1db)
2025-03-04 20:24:40 +01:00
Faith Ekstrand
3f7abae2fc zink: Don't present to Wayland surfaces asynchronously
Wayland EGL has a driver invariant which requires that any `wl_surface`
(or wp_linux_drm_syncobj_surface_v1) calls happen inside the client's
call to eglSwapBuffers().  Submitting surface messages after
eglSwapBuffers() returns causes serialization issues with the Wayland
surface protocol and can lead to the compositor booting the app.

Fixes: 8ade5588e3 ("zink: add kopper api")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12736
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33859>
(cherry picked from commit b92117d9bb)
2025-03-04 20:24:39 +01:00
Marek Olšák
d8b47159b7 mesa: allocate GLmatrix aligned to 16 bytes
The declaration has:

typedef struct {
   alignas(16) GLfloat m[16];   /**< 16 matrix elements (16-byte aligned) */
   alignas(16) GLfloat inv[16]; /**< 16-element inverse (16-byte aligned) */
...
} GLmatrix;

We should honor that.

Fixes: 3175b63a0d - mesa: don't allocate matrices with malloc
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10237

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33856>
(cherry picked from commit 7655826243)
2025-03-04 20:24:08 +01:00
Caio Oliveira
390317a99e brw: Fix size in assembler when compacting
Calculation was wrongly walking uncompacted instructions, even if we had
some compacted in the middle, generating invalid size.  Since we are
here just drop the instruction count, since in practice the caller will
have to walk the instruction stream anyway.

Fixes: 6267585778 ("intel/brw: Also return the size of the assembled shader")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33532>
(cherry picked from commit dd1ca1588d)
2025-03-04 20:24:05 +01:00
Samuel Pitoiset
5200d13a0f radv: fix re-emitting fragment output state when resetting gfx pipeline state
When switching from pipeline to shader objects.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33840>
(cherry picked from commit 7f6e28db26)
2025-03-04 20:24:03 +01:00
Gert Wollny
9842f90fcc r600/sfn: gather info and set lowering 64 bit after nir_lower_io
After nir_lower_io we need to gather the info about 64 bit usage
to be up-to-date when deciding whether the remaining 64 bit IO ops
be lowered.

Before 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
the info was eventually updated to include the use of 64 bit values
also if only some IO was using this so that SFN was handling the code
correctly. As it seems with above patch this is not always the case
anymore, and we have to take care of it.

Fixes: 89dad5618d ("gallium: add PIPE_CAP_CALL_FINALIZE_NIR_IN_LINKER")
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32774>
(cherry picked from commit 6da19eafd5)
2025-03-04 20:24:03 +01:00
Mary Guillemard
41f982ddac pan/bi: Disallow FAU special page 3 and WARP_ID on message instructions
This is a constraint that apply on Valhall and later, instructions
should not use FAU special page 3 or WARP_ID if running
on the message unit.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: fd1906afea ("pan/va: Add FAU validation")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33843>
(cherry picked from commit ef0c7382c7)
2025-03-04 20:24:02 +01:00
Konstantin Seurer
08ae198bda llvmpipe: Skip draw_mesh if the ms did not write gl_Position
There is nothing to be done and the code will hit "assert(pos != -1);"
otherwise.

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12684
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33812>
(cherry picked from commit 4348253db5)
2025-03-03 17:25:25 +01:00
Patrick Lerda
ebca2fafa8 r600: fix evergreen_emit_vertex_buffers() related cl regression
For instance, this issue is triggered with "piglit/bin/cl-custom-buffer-flags":
Segmentation fault

Fixes: 81889f4d5c ("r600: ensure that the last vertex is always processed on evergreen")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33351>
(cherry picked from commit ee1cb894d6)
2025-03-03 17:25:24 +01:00
Emmanuel Gil Peyrot
4607eb7eae panvk: Initialize out array with the correct length
This avoids reading past the buffer’s end in the client afterward, because the
drmFormatModifierCount hasn’t been changed from what the client passed, if it
wasn’t zero at first.

GTK triggers that bug by setting it to the length of the static array (see this
bug[0] though), but other Vulkan programs might have the same issue if they
don’t first query the count before allocating the array.

This has been tested on a Radxa ROCK 5B board running a Mali-G610 GPU.

[0] https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/8222

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 252ddaf51b ("panvk: fix VkDrmFormatModifierPropertiesListEXT query")
Fixes: https://gitlab.freedesktop.org/mstoeckl/waypipe/-/issues/127
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33657>
(cherry picked from commit b4a82110ce)
2025-03-03 17:25:23 +01:00
Hyunjun Ko
0ea91330c3 anv: Do not support the tiling of DRM modifier if DECODE_DST
Fixes: 04709e4f ("anv: fix video profile lists");

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33784>
(cherry picked from commit f7ff9b240d)
2025-03-03 17:25:22 +01:00
Mike Blumenkrantz
eff71795d0 zink: clamp UBO sizes instead of asserting
this is a nice idea, but there are apps/games that do not respect
hardware capabilities and yolo-bind fixed size buffers

fixes Ballionaire (2667120) launch on non-desktop drivers

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33819>
(cherry picked from commit b04eaa8589)
2025-03-03 17:25:18 +01:00
Job Noorman
6090162961 ir3/ra: prevent reusing parent interval of reloaded sources
We would set the `src` flag on the interval of reloaded sources.
However, the interval might be merged with its parent when inserted and
the parent wouldn't have this flag set. This caused the parent interval
to potentially be reused to reload later sources. Fix this by setting
the `src` flag on the top-level interval after insertion.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33810>
(cherry picked from commit 2d540b8074)
2025-03-03 17:25:17 +01:00
Kevin Chuang
f912436dc9 anv/bvh: Fix copy shader handling sparse buffer
Fixes: 692b5fa9f2 ("anv: Add shader to copy acceleration structures")

This commit fixes the future test "sparse_binding_structures" for
"header_bottom_address" for ray tracing pipeline.

Even on 48-bit ray tracing (Xe1/2), the software-defined part
instance_leaf_part1.bvh_ptr has to be in canonical form for copy.comp
to deference a bvh, which means we have to preserve the upper 16bits.
This is especially relevant in cases where the acceleration structure buffer
is located high, such as sparse buffer.

Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
(cherry picked from commit 87ff7b061f)
2025-03-03 17:25:16 +01:00
Kevin Chuang
614dd4999c anv/bvh: Fix encoder handling sparse buffer
Fixes: 2fe57947e3 ("anv: Implement encode shader to fit in ANV BVH")

This commit resolves the failures in the future tests
"sparse_binding_structures" for rayquery. Sparse buffers' heaps are
located high, and since it's in canonical form, the higher 16bits are
all set to 1. However, the existing encoder did not expect any non-zero
values at the higher 16bits. As a result, the instance flags got
corrupted, causing most triangle tests to fail.

Thanks for Paulo providing insights about sparse buffer properties.

Co-developed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33745>
(cherry picked from commit b9a980ea73)
2025-03-03 17:25:14 +01:00
Benjamin Lee
6248bc98c2 panfrost/va: remove swizzle mod from LDEXP
This instruction does not support swizzles. This information is not used
for anything, but will be if we use the instruction tables for
bi_lower_swizzle.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 316486dd9f ("pan/va: Add initial ISA.xml for Valhall")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit 2a70665df7)
2025-02-28 22:17:35 +01:00
Benjamin Lee
f3ee6ed43c panfrost: fix condition in bi_nir_is_replicated
The original implementation of this returned false when the src was
replicated, and true when it was not.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 21bdee7bcc ("pan/bi: Switch to lower_bool_to_bitsize")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit 810351ad03)
2025-02-28 22:17:35 +01:00
Benjamin Lee
91c473e49a panfrost: fix large int32->float16 conversions
On vulkan, truncating to S/U16 before converting is not valid, because
out-of-range conversions are specified to be correctly rounded. IEEE 754
requires that out-of-range values round to ±inf with RTNE and ±F16_MAX
with RTZ.

On gl, truncating is valid for U16->F16, because out-of-range int->float
conversions are undefined behavior. For S16->F16, it is not valid
because S16_MAX < F16_MAX, so some in-range values will be truncated as
well.

Instead, just handle S/U16->F16 as S/U16->F32->F16.

Fixes dEQP-VK.spirv_assembly.instruction.compute.convertstof.int32_to_float16_*
when shaderFloat16 is enabled in panvk.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: be74b84e6f ("pan/bi: Fill in some more conversions")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
(cherry picked from commit a33cd3def2)
2025-02-28 22:17:35 +01:00
Daniel Schürmann
553ab18656 aco/assembler: Fix short jumps over chained branches
If we insert

   <code>
   s_branch 1
   s_branch Target

at the end of some block, and later hide an additional chained branch
after the existing one, then we have to update the 's_branch 1' to
also jump over the newly added branch.

Fixes: cab5639a09 ('aco/assembler: chain branches instead of emitting long jumps')
Closes: #12673
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33762>
(cherry picked from commit 6659db285a)
2025-02-28 22:17:35 +01:00
Lionel Landwerlin
4a08708ca2 vulkan/runtime: ensure robustness state is fully initialized
This is part of the hashing key :

==25753== Uninitialised byte(s) found during client check request
==25753==    at 0x93D29AE: blob_write_bytes (blob.c:164)
==25753==    by 0x93A62C6: vk_pipeline_precomp_shader_serialize (vk_pipeline.c:722)
==25753==    by 0x93AC55E: vk_pipeline_cache_add_object (vk_pipeline_cache.c:433)
==25753==    by 0x93A691B: vk_pipeline_precompile_shader (vk_pipeline.c:875)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)
==25753==  Address 0xf1adf82 is 82 bytes inside a block of size 152 alloc'd
==25753==    at 0x64FA858: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==25753==    by 0x99AAC38: vk_default_alloc (vk_alloc.c:26)
==25753==    by 0x93A403B: vk_alloc (vk_alloc.h:48)
==25753==    by 0x93A406B: vk_zalloc (vk_alloc.h:56)
==25753==    by 0x93A60A0: vk_pipeline_precomp_shader_create (vk_pipeline.c:680)
==25753==    by 0x93A689D: vk_pipeline_precompile_shader (vk_pipeline.c:866)
==25753==    by 0x93A8FB9: vk_create_graphics_pipeline (vk_pipeline.c:1715)
==25753==    by 0x93A9799: vk_common_CreateGraphicsPipelines (vk_pipeline.c:1860)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33792>
(cherry picked from commit 4dba1ad93f)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
c795725649 nvk: Only support compute shader derivatives on Turing+
Fixes: e0e7d8d910 ("nvk: Advertise VK_NV/KHR_compute_shader_derivatives")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit 8de37b142e)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
eff601577a nvk: Only support deviceGeneratedCommandsMultiDrawIndirectCount on Turing+
Indirect draws on Maxwell involve patching pushbufs together and doing
that isn't possible with device generated commands.

Fixes: 83b220f833 ("nvk: Advertise VK_EXT_device_generated_commands")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit bd04fdcb2b)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
29ae40e1aa nvk: Handle pre-Turing dispatch indirect commands
The QMD layout is a bit different.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit 7e12ba8709)
2025-02-28 22:17:35 +01:00
Faith Ekstrand
95d0ecd6e5 nak/qmd: Add a nak_get_qmd_cbuf_desc_layout() helper
Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33771>
(cherry picked from commit c540e5e2cc)
2025-02-28 22:17:35 +01:00
Paulo Zanoni
bac3b56d51 brw: extend the NOP+WHILE workaround
It turns out that we need to add a NOP not only in between two
consecutive WHILE instructions, but also after every control flow
instruction that immediately precedes a WHILE.

v2: Rebase after the renames.

Fixes: 5ca883505e ("brw: add a NOP in between WHILE instructions on LNL")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33021>
(cherry picked from commit fd10764cff)
2025-02-28 22:17:35 +01:00
Karol Herbst
62747d6bdd intel/brw, lp: enable lower_pack_64_4x16
The compiler won't be able to emit pack_64_4x16, so we should prevent
nir_opt_algebraic to optimize to it. This fixes an infinite optimization
loop inside brw_nir_optimize:

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    64    %61999 = pack_64_2x32_split %61995, %61998
    64       %76 = iadd %100, %79
                   @store_global (%61999, %76)

nir_opt_algebraic
    16x4     %77 = @load_global (%80)
    32    %61995 = pack_32_2x16_split %77.x, %77.y
    32    %61998 = pack_32_2x16_split %77.z, %77.w
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    64    %62001 = pack_64_4x16 %62000
    64       %76 = iadd %100, %79
                   @store_global (%62001, %76)

nir_lower_pack
    16x4     %77 = @load_global (%80)
    16x4  %62000 = vec4 %77.x, %77.y, %77.z, %77.w
    16    %62002 = mov %62000.y
    16    %62003 = mov %62000.x
    32    %62004 = pack_32_2x16_split %62003, %62002
    16    %62005 = mov %62000.w
    16    %62006 = mov %62000.z
    32    %62007 = pack_32_2x16_split %62006, %62005
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

// brw_nir_optimize loops here

nir_copy_prop
    16x4     %77 = @load_global (%80)
    32    %62004 = pack_32_2x16_split %77.x, %77.y
    32    %62007 = pack_32_2x16_split %77.z, %77.w
    64    %62008 = pack_64_2x32_split %62004, %62007
    64       %76 = iadd %100, %79
                   @store_global (%62008, %76)

llvmpipe has a similar issue inside lp_build_opt_nir

Fixes: b1bc691b0f ("nir/algebraic: add and improve pack/unpack patterns")
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33347>
(cherry picked from commit dad5ee1039)
2025-02-28 22:17:35 +01:00
Yiwei Zhang
3370a327d7 venus: fix image format cache miss with AHB usage query
should skip updating cache key instead of marking as a miss

Fixes: e48645250c ("venus: image format properties cache")
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33757>
(cherry picked from commit fde5cebec5)
2025-02-28 22:17:35 +01:00
Mike Blumenkrantz
ce3806b8ee zink: always fully unwrap contexts
threaded_context_unwrap_sync() can be called safely on non-threaded
contexts

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33742>
(cherry picked from commit f9fe08740a)
2025-02-28 22:17:35 +01:00
Yogesh Mohan Marimuthu
13b2f1e72d winsys/amdgpu: same_queue variable should be set if there is only one queue
Fixes: 45fa34284f ("winsys/amdgpu: don't add fence dependency of other queues for userq")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33661>
(cherry picked from commit 659a41293b)
2025-02-28 22:17:35 +01:00
Tapani Pälli
f8e7fecd7e iris: wait for imported fences to be available in iris_fence_await
This ensures shared fence is available before we submit (and fail)
a batch with it, this fixes following issue on iris driver:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/12650

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33662>
(cherry picked from commit 41a7b58214)
2025-02-28 22:17:35 +01:00
Lionel Landwerlin
3630721dc8 anv: fix missing 3DSTATE_PS:Kernel0MaximumPolysperThread programming
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 815d2e3e8b ("anv: move 3DSTATE_PS to partial packing")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33712>
(cherry picked from commit 91f36ba5b6)
2025-02-28 22:17:35 +01:00
Benjamin Lee
16dfadd3e0 panfrost: remove NIR_PASS_V usage for noperspective lowering
The rest of the NIR_PASS_V usage in panfrost was dropped in
34beb93635, but this one was added in an
MR that was merged after.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Fixes: 081438ad39 ("panfrost: add nir pass to lower noperspective varyings")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33728>
(cherry picked from commit 3b5d5c072a)
2025-02-28 22:17:35 +01:00
Dylan Baker
db51d8f8ac iris: fix handling of GL_*_VERTEX_CONVENTION
By actually setting the state packets according to the program data.
Also ensure that we correctly flag that the program may be dirty when
the geometry shader state changes

Fixes piglit tests: `spec@!opengl 3.2@gl-3.2-adj-prims * pv-first`

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33658>
(cherry picked from commit c33ebf09f5)
2025-02-28 22:17:35 +01:00
Dylan Baker
11faa02ec4 iris: Correctly set NOS for geometry shader state changes
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Backport-to: 25.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33658>
(cherry picked from commit 0477ee660f)
2025-02-28 22:17:34 +01:00
Hans-Kristian Arntzen
1b6da4ed52 radv: Always set 0 dispatch offset for indirect CS.
Fixes severe glitching in Avowed.

Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33732>
(cherry picked from commit 13a3f9a972)
2025-02-28 22:17:34 +01:00
Samuel Pitoiset
20bb982788 radv: fix missing SQTT barriers for fbfetch color/depth decompressions
SQTT layout transitions need to be inside SQTT barrier. Otherwise, this
throws an assertion in RADV and might also crash when the capture is
opened with RGP.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12664
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33719>
(cherry picked from commit 67c150bf9e)
2025-02-28 22:17:34 +01:00
Peyton Lee
539f0d88be radeonsi/vpe: check reduction ratio
Check the reduction ratio is within the hardware capablity.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Reviewed-by: David Rosca <david.rosca@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33528>
(cherry picked from commit e85a6b6a63)
2025-02-28 22:17:34 +01:00
Faith Ekstrand
1f2143eea6 nvk: Do not set INVALIDATE_SKED_CACHES pre-MaxwellB
The other two uses of this are behind guards but we forgot this one.

Fixes: 976f22a5da ("nvk: Implement CmdProcess/ExecuteGeneratedCommandsEXT")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33716>
(cherry picked from commit 58218c7349)
2025-02-27 18:37:33 +01:00
Faith Ekstrand
7013ebec5d nvk: Don't bind a fragment shading rate image pre-Turing
Fixes: 75bcb656d9 ("nvk: Add support for binding fragment shading rate images")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33716>
(cherry picked from commit c145147871)
2025-02-27 18:37:32 +01:00
Natalie Vock
ea47f98811 radv/rt: Don't allocate the traversal shader in a capture/replay range
We never write the traversal shader address out to shader group handles,
so this is not necessary. On the flipside, it can cause conflicts if the
traversal shader is allocated in a range occupied by a replayed shader.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33711>
(cherry picked from commit 14b902c825)
2025-02-27 18:37:32 +01:00
Georg Lehmann
cb09b3f624 aco/insert_exec: fix continue_or_break on gfx6-7
s_cmp_lg_u64 is gfx8+

Fixes: 115ff5f95b ("aco/insert_exec_mask: don't restore exec in continue_or_break blocks")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33715>
(cherry picked from commit c249556bf4)
2025-02-27 18:37:31 +01:00
Rhys Perry
36e1923284 ac/nir: fix tess factor optimization when workgroup barriers are reduced
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: b49eab68a8 ("ac/nir: use s_sendmsg(HS_TESSFACTOR) to optimize writing tess factors for gfx11")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12632
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33645>
(cherry picked from commit 2a3dce1b59)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
f9c3499918 aco/ssa_elimination: insert parallelcopies for p_phi immediately before branch
Totals from 2499 (3.15% of 79377) affected shaders: (Navi31)
Instrs: 6011729 -> 6011761 (+0.00%); split: -0.00%, +0.00%
CodeSize: 31573216 -> 31574236 (+0.00%); split: -0.00%, +0.00%
Latency: 83364734 -> 83365781 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 13545643 -> 13545783 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 302678df91)
2025-02-27 18:37:30 +01:00
Daniel Schürmann
4118fef567 aco/insert_exec_mask: don't restore exec in continue_or_break blocks
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 115ff5f95b)
2025-02-27 18:37:29 +01:00
Daniel Schürmann
1bb39be75e aco/insert_exec_mask: Don't immediately set exec to zero in break/continue blocks
Instead, only indicate that exec should be zero and do
so in the successive helper block. This allows to insert
the parallelcopies from logical phis directly before the
branch in break and continue blocks.

Totals from 56 (0.07% of 79377) affected shaders: (Navi31)
Latency: 2472367 -> 2472422 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 253053 -> 253055 (+0.00%); split: -0.00%, +0.00%

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33527>
(cherry picked from commit 7f7c1d463a)
2025-02-27 18:37:28 +01:00
Karol Herbst
33a7ae1f0a rusticl/platform: advertise all extensions supported by all devices
There is a spec issue about this to clarify this behavior, but the current
wording can be interpreted that the platform always lists all extensions
supported by all drivers.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33667>
(cherry picked from commit 0fd70ee9de)
2025-02-27 18:37:27 +01:00