Commit graph

221343 commits

Author SHA1 Message Date
Paulo Zanoni
376130716f intel/isl: fix assert when surf->size_B is > UINT_MAX
I have some local tests for Sparse Resources that I wrote when I was
working on that for Anv. One of them tries to create a sparse buffer
with size 4294967296 (which doesn't fit in an uint32_t). Without this
patch, the right side of the assertion overflows and we get:

sparse: ../../src/intel/isl/isl.c:3787: isl_surf_from_mem: Assertion `surf->size_B == surf->row_pitch_B * extent.h * extent.a' failed

Fixes: fcdae4d4c0 ("intel: Add and use isl_surf_from_mem()")
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
(cherry picked from commit c4b6df29bf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Adrián Larumbe
c1fb012279 pan/kmod: fix double syncop count sum when populating vm_bind syncs
In order to assign bind_ops[i].syncs a slice of the sync_ops array,
op_sync_cnt must record the exact number sync operations for that vm_bind
operation, so that &sync_ops[syncop_ptr - op_sync_cnt] will give us the
right start of its slice.

Fixes: 97f6a62f7e ("pan/kmod: Add a backend for panthor")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
(cherry picked from commit 293b264b7d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Adrián Larumbe
38ed6879bf pan/kmod: Fix minor version number check for USER_MMIO_OFFSET ioctl
It has been available in the Panthor KMD since 1.5

Fixes: 590ad83b98 ("panfrost: Use pan_image_test_modifier_with_format() to do our modifier check")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
(cherry picked from commit 9145ce0bb2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Lorenzo Rossi
d24a306681 panvk/jm: Fix tls_size overwrite in indirect draws
Only caused problems when the VS/FS has more TLS than our internal shaders
that doesn't usually happen but will cause bugs when we start to
compress local memory.

Fixes: 005703e5b5 ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit f0d2ad9840)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
David Rosca
9353a223aa frontends/va: Add missing NULL check for additional output surface
Fixes: efc6d27fd4 ("frontends/va: Add support for decode/encode processing")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 3d16845e9a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
David Rosca
408b08da2a frontends/va: Fix setting output color properties from color standard
Fixes: 6e8a8d8ee7 ("frontends/va: Stop using vpp colors standard")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 69db546936)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
2faf689ee9 nak: use fmul_rtz for NAK_INTERP_MODE_PERSPECTIVE
Fixes rendering artifacts in The Surge 2 and Shadow of the Tomb Raider.
And it's what nvidia's driver is doing.

Totals from 170446 (14.05% of 1212873) affected shaders:
CodeSize: 2019019440 -> 2026071952 (+0.35%); split: -0.07%, +0.41%
Number of GPRs: 8158110 -> 8098382 (-0.73%); split: -0.80%, +0.07%
SLM Size: 106448 -> 106440 (-0.01%)
Static cycle count: 1398452243 -> 1400038117 (+0.11%); split: -0.17%, +0.28%
Spills to memory: 546 -> 520 (-4.76%)
Fills from memory: 546 -> 520 (-4.76%)
Spills to reg: 22585 -> 22670 (+0.38%); split: -0.31%, +0.68%
Fills from reg: 18243 -> 18331 (+0.48%); split: -0.34%, +0.82%
Max warps/SM: 6797472 -> 6822196 (+0.36%); split: +0.38%, -0.02%

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11447
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11706
Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit f5e92e5493)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
0f4b8df197 nak: handle nir_op_fmul_rtz
Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 5d9225388c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
f40ef3b0d9 nir: add fmul_rtz
It's needed in NVK for correctness with interpolation.

Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 4e520f671c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Raviraj Uppal
d3ec8ffb9c driconf: disable allow_rgb16_configs for SPECviewperf
Commit f2aaa9ce00 added 16 bpc unorm display formats gated behind the flag
allow_rgb16_configs driconf option, defaulting to true.

This causes SPECviewperf's maya_06 viewset to fail. Disabling
allow_rgb16_configs for SPECviewperf alongside the existing
allow_rgb10_configs workaround.

Fixes: f2aaa9ce00 ("dri,gallium: Add support for RGB[A]16_UNORM display formats.")

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 3359de8247)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Jose Maria Casanova Crespo
d968629409 broadcom/compiler: move nir_lower_undef_to_zero out of optimization loop
The combination of nir_opt_if and nir_lower_undef_to_zero running inside
the optimization loop could make it to not converge.

This was exercised by ollama running gemma3 compute shaders.

Removing the pass from the optimization loop results in No changes in
shader-db.

Assisted-by: Claude Opus 4.6
Fixes: cbe24a0e9c ("broadcom/compiler: use nir_lower_undef_to_zero")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 2cd51a6efc)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Lionel Landwerlin
1ad28db0f1 anv: fix null pointer access
Reproduces with dEQP-VK.pipeline.no_queues.pipeline_binary.compute

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 595889018a ("anv: implement VK_KHR_maintenance9")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit dad8f65611)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
5e04ef4306 pvr: record deferred RTA clears for secondary cmdbuf subcmds
Currently the code handling deferred RTA clears cannot handle them for
secondary command buffers within render passes, because the code
immediately configures the transfer command for the deferred clear
operation, but the specific attachment image view isn't known when
recording secondary command buffers to be executed inside render passes.

Add code to record parameters for deferred RTA clears in secondary
command buffers when the attachment is unknown, and bind the recorded
clears to the attachment's image view when executing the secondary
command buffer inside a render pass.

Fixes many dynamic rendering random tests.

Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 9960637b26)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
32111204fe pvr: add deferred RTA clear command to list after checking it's not NULL
The code that adds deferred RTA clear transfer commands checks whether
the newly allocated transfer command is NULL. However the list_addtail
call is before the check, which means that the check does not prevent
NULL dereference.

Reorder the code to ensure no NULL transfer commands would ever be added
to the deferred clear list.

In addition, pvr_transfer_cmd_alloc() has already set the command
buffer's error status when it returns NULL, so it's not needed to set it
again.

Fixes: 2eabbbe57d ("pvr: use linked list to back deferred clears")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit ecd4e93456)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
d8e22b980d pvr: properly handle deferred RTA clears for 2D array view of 3D image
For 2D array views of 3D images, the layer of the view corresponds to
the depth (instead of the layer, which should be always 0) of the image.

Fix the code emitting deferred RTA clears to set the depth instead of
the layer of the image to clear.

Fixes the flakiness of `dEQP-VK.renderpasses.renderpass*.
remaining_array_layers.multi_layer_fb.*`.

Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 40e0f0f933)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
47081ca2d5 pvr: do not setup deferred RTA clear for active render targets
Deferred RTA clear will happen after the current graphics subcommand is
executed, which may override rendered image in the graphics subcommand.
In addition, the active render targets do not need "emulated" clear --
they can be really cleared by drawing rectangles.

Skip set up deferred RTA clear for active render target layers, and
continue to do immediate clear for these layers.

Fixes a few dynamic rendering random CTS tests, but the issue should
also exist in legacy renderpasses RTA clears.

Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 43b22a477c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Nick Hamilton
688d051236 pvr: Revert don't csb emit multi-layer clear attachments without rta support
While testing HW without gs_rta_support it was raised that this
change had been made in error. After retesting with the change
reverted the listed tests still pass.

This reverts commit d68344bffe.

Backport-to: 26.0
Reported-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 08c13564d6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
b263998997 pvr: increase maxPerStageResources for new maxPerStageDescriptorStorageBuffers
When maxPerStageResources is less than 128, it must be at least the sum
of maxPerStageDescriptorUniformBuffers,
maxPerStageDescriptorStorageBuffers, maxPerStageDescriptorSampledImages,
maxPerStageDescriptorStorageImages,
maxPerStageDescriptorInputAttachments and maxColorAttachments.

As maxPerStageDescriptorStorageBuffers is previously increased, the
value of maxPerStageResources should be increased too.

This fixes regression on two limit validation tests in the Vulkan CTS --
dEQP-VK.info.device_properties and dEQP-VK.api.info.
vulkan1p2_limits_validation.general .

Fixes: 35f57a2739 ("pvr: increase value of maxPerStageDescriptorStorageBuffers")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 1027059baa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
7116d2fed0 softfloat: make sign bit an unsigned int
According to ubsan shifting an int32_t by 31 bits to the left is undefined
behavior. So just declare it as uint32_t.

Backport-to: 26.1
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
(cherry picked from commit b2aa92b523)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Danylo Piliaiev
5d40635bc8 tu/u_trace: Fix explicit toggle_name not being used
Fixes: 889f71f71a ("tu: Add tracepoints for clear/copy/blit/lrz ops")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit 59f626ac81)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Rohit Athavale
c391568ef6 mediafoundation: Test compile steps v/s step , and set build flag
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15244
Backport-to: *

Reviewed-by: Pohsiang (John) Hsu <pohhsu@microsoft.com>
(cherry picked from commit c5b184a02a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Icenowy Zheng
823b7f0b15 pvr: wait for graphics jobs in CopyQueryPoolResults
The last graphics job, which might write to the occlusion query result,
could still be running when vkCmdCopyQueryPoolResults is called.

Additionally wait for graphics jobs before copying the results.

Fixes: 24b1e3946c ("pvr: Add support to submit occlusion query sub cmds.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 82925935d4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
05a2dc747a rusticl: link the C++ runtime statically
Apparently some applications don't have their C++ situation under control.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14090
(cherry picked from commit 528ceeb49b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:27 +02:00
Karol Herbst
ac6ba297e1 ci: install libstdc++-static on fedora
(cherry picked from commit 5512680581)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:26 +02:00
Danylo Piliaiev
6f24da74e1 tu: Fix CP_CCHE_INVALIDATE not being applied at the right point
Apparently CP_CCHE_INVALIDATE is just a plain register write underneath,
so it needs WFI before it, in order to invalidate at the right point.

```
CP_CCHE_INVALIDATE:
mov $addr, 0x9881
mov $data, 0x1
waitin
mov $01, $data
```

Fixes misrendering in Doom Eternal on A750.

Fixes: fb1c3f7f5d ("tu: Implement CCHE invalidation")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit c2e78f1b22)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:26 +02:00
Caio Oliveira
54e1e88970 brw: Fix max_dispatch_width collection for CS with variable size
The intention of the original commit was to make all the shaders report
the same max_dispatch_width.  When CS has multiple variants, this was
not happening as expected.

Fixes: 2acc2f18ea ("intel/compiler: report max dispatch width statistic")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit e1745e0bd9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:26 +02:00
Eric Engestrom
c0c4f09cbc .pick_status.json: Update to 2b9e491b67
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41402>
2026-05-06 17:00:26 +02:00
Eric Engestrom
5112dfea22 VERSION: bump for 26.1.0-rc3
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
2026-04-29 21:01:34 +02:00
Job Noorman
19ee4f8d1a ir3/shared_ra: fix live-out reload after src reload
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When reloading live-out values along loop back-edges, we make sure to
reuse the original register. However, we failed to detect cases where
the spilled value got reloaded earlier for a src in a different
register. Fix this by reloading the value again in the original
register.

Fixes a RA validation failure in Windrose.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
(cherry picked from commit aaf4d77f43)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Benjamin Cheng
23633c2afe radv/wsi: Re-use transfer queue if it exists
This avoids writing past the end of pdev->vk_queue_to_radv if all the
queue families are available.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14834
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 656b3814c2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Luigi Santivetti
54650350cc pvr: add missing multi-arch support for pipeline exec and stats
Entry points must be wrapped in the PVR_PER_ARCH macro else there
will be multiple definitions of the same symbol.

Fixes: dfddb3fe ("pvr: Add support for VK_KHR_pipeline_executable_properties")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 120bd20e49)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Icenowy Zheng
8dbc8528be pvr: skip emitting query program when copy result / reset with 0 queries
When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, currently a query compute program with workgroup size
0*1*1 will be emited, which is ridiculous and will be rejected by some
assertion in pvr_compute_generate_control_stream() .

As the operation should be noop when queryCount is 0, the functions can
and should just return in such cases.

Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
(cherry picked from commit 01ba4867fa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Olivia Lee
160d5c8636 pan/bi: fix memory access alignment
Memory accesses need to be aligned up to the next power of two of the
full access size. Component count and bit-size don't matter to the
hardware, only the total size.

shader-db results are pretty much what you would expect, there are a few
shaders that have increased LS instructions as a result of splitting
accesses to satisfy alignment requirements that were previously ignored.
The one surprising thing is that there are several shaders that have
reduced uniform usage. Looking at some of these individually, what
happened is that splitting UBO loads early allowed the compiler to
eliminate loads from unused ranges of the access.

total instrs in shared programs: 719166 -> 719174 (<.01%)
instrs in affected programs: 2355 -> 2363 (0.34%)
helped: 4
HURT: 6
helped stats (abs) min: 1.0 max: 9.0 x̄: 3.00 x̃: 1
helped stats (rel) min: 0.36% max: 6.52% x̄: 1.99% x̃: 0.54%
HURT stats (abs)   min: 1.0 max: 4.0 x̄: 3.33 x̃: 4
HURT stats (rel)   min: 0.65% max: 2.13% x̄: 1.38% x̃: 1.48%
95% mean confidence interval for instrs value: -2.14 3.74
95% mean confidence interval for instrs %-change: -1.76% 1.82%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 30210.83 -> 30218.81 (0.03%)
cycles in affected programs: 50 -> 57.99 (15.97%)
helped: 2
HURT: 6
helped stats (abs) min: 0.0078129999999999589 max: 0.070312000000000041 x̄: 0.04 x̃: 0
helped stats (rel) min: 1.10% max: 10.23% x̄: 5.66% x̃: 5.66%
HURT stats (abs)   min: 0.03125 max: 5.0 x̄: 1.34 x̃: 1
HURT stats (rel)   min: 2.38% max: 25.00% x̄: 13.05% x̃: 14.26%
95% mean confidence interval for cycles value: -0.42 2.41
95% mean confidence interval for cycles %-change: -1.74% 18.49%
Inconclusive result (value mean confidence interval includes 0).

total cvt in shared programs: 2385.91 -> 2385.91 (<.01%)
cvt in affected programs: 11.14 -> 11.14 (<.01%)
helped: 5
HURT: 4
helped stats (abs) min: 0.0078119999999999301 max: 0.070312000000000041 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.27% max: 10.23% x̄: 2.61% x̃: 0.82%
HURT stats (abs)   min: 0.01562600000000014 max: 0.03125 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 1.31% max: 2.75% x̄: 2.21% x̃: 2.40%
95% mean confidence interval for cvt value: -0.02 0.02
95% mean confidence interval for cvt %-change: -3.51% 2.58%
Inconclusive result (value mean confidence interval includes 0).

total ls in shared programs: 25871 -> 25879 (0.03%)
ls in affected programs: 46 -> 54 (17.39%)
helped: 0
HURT: 4
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 2.00 x̃: 1
HURT stats (rel)   min: 10.00% max: 25.00% x̄: 18.38% x̃: 19.26%
95% mean confidence interval for ls value: -1.18 5.18
95% mean confidence interval for ls %-change: 8.46% 28.30%
Inconclusive result (value mean confidence interval includes 0).

total code size in shared programs: 6302848 -> 6302976 (<.01%)
code size in affected programs: 1536 -> 1664 (8.33%)
helped: 0
HURT: 1

total registers used in shared programs: 117324 -> 117329 (<.01%)
registers used in affected programs: 45 -> 50 (11.11%)
helped: 1
HURT: 2
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 6.25% max: 6.25% x̄: 6.25% x̃: 6.25%
HURT stats (abs)   min: 2.0 max: 4.0 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 12.50% max: 30.77% x̄: 21.63% x̃: 21.63%

total uniforms used in shared programs: 78538 -> 78274 (-0.34%)
uniforms used in affected programs: 2688 -> 2424 (-9.82%)
helped: 104
HURT: 4
helped stats (abs) min: 1.0 max: 18.0 x̄: 2.65 x̃: 2
helped stats (rel) min: 1.96% max: 54.55% x̄: 12.78% x̃: 11.11%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 3.70% max: 16.13% x̄: 9.92% x̃: 9.92%
95% mean confidence interval for uniforms used value: -3.01 -1.88
95% mean confidence interval for uniforms used %-change: -14.15% -9.74%
Uniforms used are helped.

Total CPU time (seconds): 73.26 -> 74.48 (1.67%)

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 2f2738dc90 (pan/bi: Use nir_lower_mem_access_bit_sizes)
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 72e0eda260)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Alyssa Rosenzweig
52b8d7fede nir/opt_reassociate: fix exactness bug
For an inexact-associative operation (fadd or fmul), can_reassociate ensures the
root of the chain is inexact to allow reassociating. However, build_chain just
checks for opcodes to match up after, although we do sum up exactness across the
chain. Although an Effort Was Made, it still seems incorrect to reassociate

   %3 = fadd! %0, %1
   %4 = fadd %3, %2

to instead be (ex.)

   %3 = fadd! %0, %2
   %4 = fadd! %3, %1

Closes: #14418
Fixes: e0b0f7e73c ("nir: add ALU reassocation pass")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 0c49738211)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Georg Lehmann
46a5ef37e1 intel/nir_opt_peephole_ffma: fix fp_math_ctlr for modifiers
If abs/neg don't preserve nan/inf/sz, the whole expressions won't.

Fixes: 1b0808adf3 ("intel/nir: Make ffma peephole optimization preserve fp_fast_math flags")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 26ec32dada)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Tapani Pälli
7be90ef02b anv: do not use resource barrier with split barriers
Fixes failing CTS tests using asymmetric and non-asymmetric (regular)
split barriers.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15310
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit bdaf8b6b39)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Samuel Pitoiset
94c81f2f60 radv: add missing VkMemoryRangeBarriersInfoKHR from DAC
This is used to declare barrier dependencies for an addr range
(because no VkBuffer with DAC).

This fixes new dEQP-VK.api.device_address.misc.memory_range_barrier.

Fixes: a97c889a7b ("radv: implement VK_KHR_device_address_commands")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d1a428606e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Samuel Pitoiset
ab91484fb0 vulkan: add missing VkMemoryRangeBarriersInfoKHR support
This has been introduced by VK_KHR_device_address_commands and we
missed it completely.

Backport-to: 26.1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ce60e18bce)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Georg Lehmann
4799e79a5f nir: disable fp class analysis for 64bit transcendentals
Some backends have terrible precision for these fp64 opcodes, so don't try to
do anything clever.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15334
Fixes: 5a298f3560 ("nir: rewrite fp range analysis as a fp class analysis")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 599a52174b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Nick Hamilton
a3e7443e0c pco: fix clamping the array index when shaderImageGatherExtended is enabled
The array index value is a signed integer but the compiler was using
the unsigned version of the clamp helper function meaning the value
was not been clamped to 0 when its value was < 0.

Fix the following deqp test cases when shaderImageGatherExtended is enabled
dEQP-VK.glsl.texture_gather.basic.2d_array.*
dEQP-VK.glsl.texture_gather.offset.*.2d_array.*
dEQP-VK.glsl.texture_gather.offset_dynamic.*.2d_array.*
dEQP-VK.glsl.texture_gather.offsets.*.2d_array.*

Fixes: 854563f0f8 ("pco: fully switch over to common smp emission code")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b80a5f9b7d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Simon Perretta
cf1752d2bd pco: amend tg4 lowering
Use lower_tg4_offsets to take care of explicit offsets, and just swizzle
the texels in the order defined by textureGather*

Fixes: 46c9239c11 ("pvr, pco: initial texture gather support with gather sampler")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 56b8dc92a9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Arjob Mukherjee
13a2d150de pvr: increase value of maxPerStageDescriptorStorageBuffers
Increase past the minimum required by the Vulkan Spec to fix tests. This
was needed due to Zink requirements which splits
`maxPerStageDescriptorStorageBuffers` between atomic buffers and
`MaxShaderStorageBlocks`.

Fixes the following GLES conformance tests:
  KHR-GLES31.core.compute_shader.resources-max
  KHR-GLES31.core.draw_indirect.advanced-twoPass-Compute-arrays
  KHR-GLES31.core.shader_image_load_store.advanced-sync-vertexArray
  KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-cs
  KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-fs
  KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-int
  KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case1-cs
  KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
  dEQP-GLES31.functional.draw_indirect.compute_interop.combined.drawelements_compute_cmd_and_data_and_indices
  dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_overwrite
  dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_write
  dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_overwrite
  dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_write
  dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_atomic_multiple_write_read
  dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_multiple_write_read
  dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_overwrite
  dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_write
  dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_overwrite
  dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_write

Backport-to: 26.0
Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 35f57a2739)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Timothy Arceri
0324fc928a amd/radeonsi: dont clamp packed user varyings
ac_nir_optimize_outputs() might pack user varyings into the color
built-ins. If this happens we skip adding clamping to the
components that contain the user varying.

This change also fixes a second bug where a color built-in can be
packed into a non-color slot and was no longer being clamped.

Fixes: 3777a5d7 ("radeonsi: assign param export indices before compilation")
Closes: #14443

Reviewed-by: Marek Olšák <maraeo@gmail.com>
(cherry picked from commit a42c55da46)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
David Rosca
3d8f8d6f65 frontends/va: Fix finding LTRs from POCs in HEVC decode
This should only consider valid entries, not loop over the entire array.
In addition the array size was wrong before.

Fixes: 779edc0759 ("frontends/va: Correctly derive HEVC StCurrBefore, StCurrAfter and LtCurr")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit c2a4fa33b8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Pavel Ondračka
2730f53840 r300: dirty VS state when switching variants
When r300_pick_vertex_shader switches to a WPOS variant, it only dirtied
rs_block_state, leaving vs_state with a stale code size. This caused
cs_count warnings (offset of -4 for one extra VS instruction) but was
mostly harmless since the emitted packet stream still used the current
shader.

Factor the VS code dirtying out of r300_bind_vs_state into a helper and
call it when selecting a new variant too.

Fixes: 806dcf9db7 ("r300: only output wpos in vertex shaders when needed")
(cherry picked from commit cc7be8433a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Jesse Natalie
3ed36f93c8 wgl: Use an hwnd xor hdc for framebuffers
It seems maybe hdcs can get recycled?

Fixes: 28058221 ("wgl: Support contexts created from non-window DCs")
(cherry picked from commit 3f35e65253)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Pavel Ondračka
b771d69b0a r300: fix MSAA resolve COLORPITCH tiling after pipe_surface de-pointerization
r300_simple_msaa_resolve used to patch srcsurf->pitch with the resolve
destination's tiling bits before passing the surface to the blitter.
That worked when set_framebuffer_state kept the same pipe_surface
pointer, so r300_get_nonnull_cb returned the patched object.

After the de-pointerization, r300_framebuffer_init creates a fresh
r300_surface from the pipe_surface template, discarding the pitch
modification. The hardware then uses the MSAA source tiling for
R300_RB3D_COLORPITCH0, leading to corruption.

Move the tiling override into r300_emit_fb_state and override the tiling
bits of COLORPITCH from the destination surface at emit time.

Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15303
(cherry picked from commit 416da54cce)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Simon Perretta
05c54411a3 pco: reserve additional outputs for trilinear sampled coeffs
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.

Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
(cherry picked from commit af1669d9e2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Faith Ekstrand
0dd2d4368b panvk/csf: Emit INDEX_BUFFER[_SIZE] even for non-indexed draws
The index buffer and index buffer size don't affect whether or not we're
actually doing indexed rendering so we should just emit them whenever
they change.  Otherwise, if someone sets an index buffer and then does a
non-indexed draw and then an indexed draw, the first draw will clear the
dirty bits without setting the index buffer registers and the second
draw won't know to re-emit them.

Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Marc Alcala Prieto <marc.alcalaprieto@arm.com>
(cherry picked from commit 9c8e8ed655)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00
Samuel Pitoiset
7553e83098 radv: re-introduce DGC+multiview support and enable it for vkd3d-proton only
The Vulkan spec says:
    "VUID-vkCmdExecuteGeneratedCommandsEXT-None-11062
     If a rendering pass is currently active, the view mask must be 0."

So, it's invalid with VK_EXT_device_generated_commands but it's allowed
in DX12, it seems we missed this during the spec review.

Crimson Desert uses this and emulating in vkd3d-proton would be complex,
so let's re-introduce this support only for vkd3d-proton.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 782254b820)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41268>
2026-04-29 17:53:11 +02:00