Compare commits

...

524 commits

Author SHA1 Message Date
Eric Engestrom
e4bd78e80a VERSION: bump for 26.0.5
2026-04-15 16:20:41 +02:00
Eric Engestrom
9c4e5bbee5 docs: add release notes for 26.0.5 2026-04-15 16:20:41 +02:00
Rhys Perry
fd9ffc0620 ir3/ra: fix copy-paste error
I don't entirely understand what this is all doing, but this looks like a
copy-paste error.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
(cherry picked from commit a6b86d43d3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Rhys Perry
9b19409ac8 ir3/array_to_ssa: skip remove_trivial_phi for non-array phis
remove_trivial_phi() mostly does nothing for non-array phis, but it
rewrites sources whose defining instructions are trivial phis.

In the case of trivial phis in the loop continue block (for loops with
divergent non-trivial continues), we might need to keep those if they
write a shared register, because the source of the trivial phi will not be
reachable from the loop header phi.

In this example, the predecessors of the continue block should be block2,
but the physical predecessors are block2 and block3, requiring a phi in
the continue block which will then be lowered by ir3_lower_shared_phis.
loop {
   block1:
   a = phi 0, b
   if (divergent) {
      block2:
      b = a + 1
      continue;
   }
   block3:
   break;
}

Fixes RA validation error when compiling blackmythwukong/5645a84e669a6179
from radv_fossils.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
(cherry picked from commit 4f0fb5784f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Samuel Pitoiset
6a2704a520 vulkan: mark RP attachments as invalid when no rendering create info
VkPipelineRenderingCreateInfo is only required in the fragment output
interface lib. For pre-rasterization shaders and fragment shader state
libs, only the view mask is used but it's optional.

If the attachments info isn't marked invalid, merging renderpass info
during lib imports wouldn't work because it would assume that the first
lib has attachment info (e.g. the pre-rasterization lib).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15241
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1950b6c1a7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Lionel Landwerlin
0e1922550d elk: don't support frontfacing ternary optimization on != 32bit
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 4dfedcca45)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Lionel Landwerlin
eb2cffbde4 brw: don't support frontfacing ternary optimization on != 32bit
Fix shader compilation on Crimson Desert :

  16    %1995 = b32csel %1992, %1993, %1994

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit a84c12414c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Mary Guillemard
d37fccbc4a hk: Add HK_MAX_RTS to maxFragmentCombinedOutputResources
The spec also mentions "output Location decorated color attachments".

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 564b061981 ("hk: Increase maxFragmentCombinedOutputResources to HK_MAX_DESCRIPTORS")
Reviewed-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 59d9bc7bee)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Mary Guillemard
dc1f5880e7 nvk: Adjust maxFragmentCombinedOutputResources to match max descriptors limit
This was set to the lowest allowed value by spec but it should really be
matching the max descriptors limit.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15249 for NVK
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 13f98d8658)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Wujian Sun
ecdb0c153c mesa: Fix inconsistent multisampled CopyTexImage checks
According to the GL_EXT_multisampled_render_to_texture specification,
copy operations should be allowed when the extension is supported.

Previously, glCopyTexImage* would unconditionally fail with
GL_INVALID_OPERATION when copying from any multisampled framebuffer
(samples > 0), even when using render-to-texture attachments.

Fixes: d7b9da2673 ("mesa/main: fix artifacts with GL_EXT_multisampled_render_to_texture")

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Wujian Sun <wujian.sun_1@nxp.com>
(cherry picked from commit 2e340d63d2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Eric Guo
d6ef5d4882 panfrost: disable round_to_nearest_even for NEAREST samplers
When round_to_nearest_even is enabled with NEAREST filtering, texture
coordinates near texel boundaries (e.g. 0.9999999404) can be incorrectly
rounded up to the next texel instead of being floor()'d.

According to OpenCL spec section 8.2, for CLK_FILTER_NEAREST:
  i = address_mode((int)floor(u))

Backport-to: *
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit c415134454)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
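The rounding difference described in d6ef5d4882 can be reproduced on the CPU. The following is a minimal sketch, assuming the quoted value 0.9999999404 is treated as an unnormalized texel coordinate; `floor_to_int` and `round_nearest_even_to_int` are hypothetical helpers written for illustration (without libm) and are not part of panfrost.

```c
#include <assert.h>

/* floor() for float inputs that fit in an int, written without libm. */
static int
floor_to_int(float u)
{
   int i = (int)u;                     /* truncates toward zero */
   return ((float)i > u) ? i - 1 : i;  /* step down for negative fractions */
}

/* Round-to-nearest, ties to even: the behaviour that is wrong for NEAREST. */
static int
round_nearest_even_to_int(float u)
{
   int lo = floor_to_int(u);
   float frac = u - (float)lo;

   assert(frac >= 0.0f && frac < 1.0f);
   if (frac > 0.5f)
      return lo + 1;
   if (frac < 0.5f)
      return lo;
   return (lo & 1) ? lo + 1 : lo;      /* exact tie: pick the even neighbour */
}
```

With `u = 0.9999999404f`, `floor_to_int(u)` selects texel 0, while `round_nearest_even_to_int(u)` selects texel 1, which is the off-by-one texel the commit message describes.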
Karol Herbst
1ed1dcb9db rusticl/device: Fix reporting of global memory on mixed memory devices
AMD APUs are hitting this case where they have very small discrete VRAM,
but a lot of staging memory, which can be used additionally.

Fixes: 7487ac2046 ("rusticl/device: support query_memory_info to retrieve available memory")
(cherry picked from commit 58d45725c7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Karol Herbst
b2b54a0194 rusticl/kernel: implement CL_KERNEL_GLOBAL_WORK_SIZE for custom devices
Apparently we are supposed to support this on custom devices.

Cc: mesa-stable
(cherry picked from commit 97a137ac88)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Karol Herbst
6fa9e9b757 radeonsi: properly report unified memory on APUs
This helps rusticl to properly advertise available global memory on APUs.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 97ca375f51)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Konstantin Seurer
573d326bc2 radv/bvh: Prefer selecting quads as the first pair of a HW node
If a single triangle is selected, it can be the case that the next iteration
can't merge any pair with the triangle. In that case, the HW node with a
single triangle will not have the highest hw_node_index, triggering an
assert.

Fixes: c18a7d0 ("radv: Emit compressed primitive nodes on GFX12")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
(cherry picked from commit db38d1a98c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Vinson Lee
392b4ab6b2 d3d12: Fix MinGW cross-build error in resource_state_if_promoted
When cross-compiling with MinGW, d3d12_resource_state.cpp fails to
compile with:

  d3d12_resource_state.cpp:161:83: error: call to non-'constexpr'
  function 'D3D12_RESOURCE_STATES operator|(D3D12_RESOURCE_STATES,
  D3D12_RESOURCE_STATES)'
    161 |       D3D12_RESOURCE_STATE_ALL_SHADER_RESOURCE |
        |       D3D12_RESOURCE_STATE_COPY_SOURCE | D3D12_RESOURCE_STATE_COPY_DEST;
        |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In file included from /usr/share/mingw-w64/include/minwindef.h:163,
                   from /usr/share/mingw-w64/include/windef.h:9,
                   from /usr/share/mingw-w64/include/windows.h:69,
                   from /usr/share/mingw-w64/include/rpc.h:16,
                   from /usr/share/mingw-w64/include/unknwn.h:7,
                   from ../subprojects/DirectX-Headers-1.0/include/wsl/winadapter.h:6,
                   from ../src/gallium/drivers/d3d12/d3d12_common.h:29,
                   from ../src/gallium/drivers/d3d12/d3d12_bufmgr.h:31,
                   from ../src/gallium/drivers/d3d12/d3d12_resource_state.cpp:24:
  ../subprojects/DirectX-Headers-1.0/include/directx/d3d12.h:3540:1:
  note: 'D3D12_RESOURCE_STATES operator|(D3D12_RESOURCE_STATES,
  D3D12_RESOURCE_STATES)' declared here
   3540 | DEFINE_ENUM_FLAG_OPERATORS( D3D12_RESOURCE_STATES )
        | ^~~~~~~~~~~~~~~~~~~~~~~~~~

The DEFINE_ENUM_FLAG_OPERATORS macro in the MinGW winnt.h header
defines operator| for D3D12_RESOURCE_STATES as inline but not
constexpr.  (The DirectX-Headers WSL stubs do define it as constexpr,
but when building with MinGW, windows.h is pulled in via winadapter.h
and its non-constexpr definition wins.)  Calling a non-constexpr
function to initialize a constexpr variable is ill-formed in C++.

Fix by changing static constexpr to static const, which avoids the
constexpr context while still giving the variable static storage
duration.

Fixes: fe48cd7c5a ("d3d12: Allow state promotion for non-simultaneous access textures")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
(cherry picked from commit 2443f3608a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Yuxuan Shui
035be8e042 wsi/display: initialize Xlib display connector property IDs in all cases
Usually connector property IDs are acquired in
wsi_display_get_connector, which is called by wsi_get_connectors, and in
turn by vkGetPhysicalDeviceDisplayProperties2KHR and
vkGetPhysicalDeviceDisplayPlanePropertiesKHR. The exception is when the
drm fd is not available at the time these functions are called, which is
the case if vkAcquireXlibDisplayEXT is not called first.

So it goes like this. First, the display is created in
vkGetRandROutputDisplayEXT. Then it's used in
vkGetPhysicalDeviceDisplayPlanePropertiesKHR, but since the drm fd is
not available at this point, connector property IDs are not initialized.
Later, this display is used in vkAcquireXlibDisplayEXT, which also
doesn't touch the property IDs. Finally, in drm_atomic_commit, the
atomic commit fails with EINVAL, specifically because of the
uninitialized ID of the "CRTC_ID" property, which is one of the
properties drm_atomic_commit tries to set.

This commit makes sure that find_connector_properties is called in
vkAcquireXlibDisplayEXT to initialize the property IDs.

Fixes: 513ffea1d3 ("wsi/display: use atomic mode setting")
Signed-off-by: Yuxuan Shui <yshui@codeweavers.com>
(cherry picked from commit 37a1986691)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:47 +02:00
Pavel Ondračka
18115a5df5 gallium/u_blitter: remove unused CONST declaration when using IMM
The immediate fs_clear_color shader uses IMM[0] but still declares
CONST[0][0]. That can make drivers try to read a fragment constant
buffer even though one is never uploaded on this path. Only declare
CONST[0][0] when the shader actually uses a constant buffer.

Fixes: 2ff9fa8b72 ("gallium/u_blitter: add a new fs_color_clear variant")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 79e3196320)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Daniel Schürmann
b4cfec4973 aco/lower_branches: Don't remove branches which jump over loops
Entering a loop with an empty exec mask might make it impossible to
execute the break condition, leading to an infinite loop.

Totals from 81 (0.04% of 202440) affected shaders: (Navi48)
Instrs: 3040566 -> 3040716 (+0.00%)
CodeSize: 17506768 -> 17507188 (+0.00%)
Latency: 16342966 -> 16345166 (+0.01%)
InvThroughput: 3112932 -> 3113286 (+0.01%)
Branches: 82229 -> 82365 (+0.17%)

Cc: mesa-stable
(cherry picked from commit 60b3e5b3f0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Faith Ekstrand
7809c26b56 pan/bi: Support all the swizzles in the packer
Add asserts this time that we don't miss any and that the buckets
actually match the enum in bifrost/compiler.h.

Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit fd5c6d1223)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Faith Ekstrand
1067d2772e pan/bi: Add BI_SWIZZLE_NONE
Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit ab285efd1b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Olivia Lee
665974c934 panfrost: don't try to emit varying shader stats on v12+
On v12+, IDVS no longer has separate position and varying variants, so
we only need to emit stats for one binary. Attempting to emit stats for
the nonexistent varying shader breaks shader-db.

Fixes: 7819b103fa ("pan/bi: Add support for IDVS2 on Avalon")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 31ddfe26eb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Janne Grunau
6c5ec9424d hk: Increase maxFragmentCombinedOutputResources to HK_MAX_DESCRIPTORS
Backport-to: 26.0
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15249 for hk
Signed-off-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 564b061981)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Icenowy Zheng
e0eb9bc602 pvr: set has_usc_alu_roundingmode_rne for all B-series Rogue cores
All B-series Rogue cores seem to have USC rounding mode as RTE instead
of RTZ.

Set the has_usc_alu_roundingmode_rne feature flag for them (currently
only BXS-4-64 has it set).

Verified via testing on BXM-4-64 (36.52.104.182) by fixing CTS tests
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp32.input_args.* ,
and via proprietary driver vulkaninfo result on BXE-2-32 (36.29.52.182),
BXE-4-32 (36.50.54.182) and BXM-4-64 (36.56.104.183) (checking
shaderRoundingModeRT?Float32 properties).

Fixes: 1db1038a61 ("pvr: add device info for BXM-4-64 (36.56.104.183)")
Fixes: e60e0c96ba ("pvr: add device info for BXE-2-32 (36.29.52.182)")
Fixes: 2743363a57 ("pvr: add device info for BXM-4-64 (36.52.104.182)")
Fixes: ea28791d40 ("pvr: add device info for BXE-4-32 (36.50.54.182)")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 9b44def4e9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Samuel Pitoiset
55916383fc radv/meta: fix computing extent for image->image with both compressed formats
If both src and dst are compressed formats, adjusting the extent isn't
necessary because it's required that texel block extent matches. The
previous division was also wrong because it was truncating partial
blocks causing issues in some tests.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 4e00e1c3d0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
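The "truncating partial blocks" problem mentioned in 55916383fc is ordinary integer division. The sketch below only illustrates that arithmetic; `blocks_truncating` and `blocks_rounded_up` are hypothetical names, the 4-texel block width used in the note is an assumption, and the actual radv fix skips the adjustment entirely when both formats are compressed.

```c
#include <assert.h>
#include <stdint.h>

/* Plain division: a trailing partial block is silently dropped. */
static uint32_t
blocks_truncating(uint32_t texels, uint32_t block_dim)
{
   assert(block_dim > 0);
   return texels / block_dim;
}

/* Round-up division: a trailing partial block still counts as one block. */
static uint32_t
blocks_rounded_up(uint32_t texels, uint32_t block_dim)
{
   assert(block_dim > 0);
   return (texels + block_dim - 1) / block_dim;
}
```

For a 10-texel-wide region with 4-texel-wide blocks, the truncating form yields 2 blocks and loses the last two texels, while the rounded-up form yields 3.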
Valentine Burley
2723eec312 ci/freedreno: Move remaining lazor a618 jobs, retire device type
The sc7180-trogdor-lazor-limozeen devices have been dying off over the
past few weeks, so move the last two jobs to sc7180-trogdor-kingoftown
and retire the device type.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit bbed00ac81)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Valentine Burley
a9a9bc41c0 zink/ci: Move zink-tu-a618 to sc7180-trogdor-kingoftown
The sc7180-trogdor-lazor-limozeen devices are having issues, so move the
job to a different device with available capacity.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 17d38c9668)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Lionel Landwerlin
52b549ab9d anv: don't relocate memory from blob
This could override data allocated by the application when shader code
is loaded from binary in vkCreateShaderObjectEXT().

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d39e443ef8 ("anv: add infrastructure for common vk_pipeline")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 21952ffb07)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Marc Alcala Prieto
6f6be1368a pan/cs: Fix cs_run_fragment() calls with swapped arguments
Fix non-functional issue where calls to cs_run_fragment() had swapped
tile_order and enable_tem arguments. Both arguments evaluate to 0.
Hence, no functional change.

Fixes: 53f780ec91 ("panfrost: Remove progress_increment from all CS builders")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit 0d08b197f2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
David Rosca
5447f81cc5 radeonsi: Set multi plane format also for imported textures
multi_plane_format is used to correctly copy all planes for staging texture
copies, otherwise only the first plane gets copied.
It's now also used in si_video_dec, which doesn't work when decoding into
imported surfaces if multi_plane_format is not set.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15232
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3dbbd94ffd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Georg Lehmann
599bb79ff4 aco/optimizer: do not try to create 3 byte constant operands
Operand::get_const will assert.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15239
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
(cherry picked from commit d1ed4e1774)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Icenowy Zheng
fa0abe9510 pvr: fix pvr_clear_vdm_state_get_size_in_dw() inverted feature condition
pvr_clear_vdm_state_get_size_in_dw() wrongly thinks instance count
inputs are needed when doing RTA clear for cores without the
gs_rta_support feature. However, the instance ID is exploited to output
the target layer ID, which isn't supported at all for cores w/o that
feature, so it looks like the condition is inverted. In addition, the
pvr_pack_clear_vdm_state() function seems to have similar logic deciding
whether to emit instance_count, and the logic is opposite to the logic
in pvr_clear_vdm_state_get_size_in_dw() for the part checking the
gs_rta_support feature.

Invert the condition to take instance ID inputs for cores with the
gs_rta_support feature instead of those without this feature.

Fixes: b59eb30e88 ("pvr: Fix cs corruption in pvr_pack_clear_vdm_state()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 3db93bbf34)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
Rhys Perry
eb4618b1b5 util: fix UBSan error with _mesa_bfloat16_bits_to_float
runtime error: left shift of 65535 by 16 places cannot be represented in type 'int'

This fixes nir_opt_algebraic_pattern_test.bf2f.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: ecd2d2cf46 ("util: Add functions to convert float to/from bfloat16")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 72f2b8a034)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:46 +02:00
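The UBSan report in eb4618b1b5 comes from integer promotion: a 16-bit operand is promoted to `int` before the shift, so shifting 65535 left by 16 overflows a 32-bit signed `int`. The following is a minimal sketch of the usual remedy, assuming a 32-bit `int`; `bf16_bits_to_float` is a hypothetical stand-in and not the actual `_mesa_bfloat16_bits_to_float` code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bfloat16 -> float conversion. A bfloat16 value is the upper
 * 16 bits of an IEEE-754 binary32 value. */
static float
bf16_bits_to_float(uint16_t bits)
{
   /* Widen to uint32_t before shifting. Without the cast, `bits` is
    * promoted to int and `65535 << 16` is undefined for a 32-bit int. */
   uint32_t u = (uint32_t)bits << 16;
   float f;

   assert(sizeof(f) == sizeof(u));
   memcpy(&f, &u, sizeof(f));   /* bit copy that respects aliasing rules */
   return f;
}
```

The cast keeps the whole computation in unsigned arithmetic, where the shift is well defined for every 16-bit input.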
Ian Romanick
92bfcb208f brw: brw_reg::nr for an accumulator is not part of the offset
Without this, reg_offset will return 1024 for acc0. This causes
has_invalid_dst_region to decide that the destination region is invalid
(because 1024 != 0), and the lowering code tries to treat the floating
point accumulators as integers. It's a mess.

v2: Add and use set_gfx_platform. Suggested by Caio.

Fixes: 937373eb25 ("i965/fs: Handle fixed HW GRF subnr in reg_offset().")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit cfdb3ddb93)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Ian Romanick
a46e96fbdb brw/const: Don't allow type changes when accumulators are involved
Integer accumulators and float accumulators do not occupy the same bits,
so the types cannot be arbitrarily changed.

No shader-db or fossil-db changes on any Intel platform.

v2: Use is_accumulator() instead of brw_reg_is_arf(). Add an extra test
to show the desired behavior when an accumulator is not
involved. Suggested by Caio.

Fixes: 64c251bb3a ("intel/fs: Combine constants for SEL instructions too")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit ffdc310bf1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Mixie
c72a0457e1 xlib: clear currentDpy when releasing the current context
After abe6d750e5, glXDestroyContext() can defer destruction by marking
the context with xid == None while it is still current.

However, the release-current path did not clear current->currentDpy,
so a context that had already been marked for deletion could remain
associated with a display after unbinding.

Fixes: abe6d750e5 ("xlib: fix glXDestroyContext in Gallium frontends")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14947
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 447a1d2e8d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Xianzhong Li
871a933cac panfrost: Fix GEM handle refcount leak in panfrost_bo_import
panfrost_bo_import() calls drmPrimeFDToHandle() then pan_kmod_bo_import(),
which also calls drmPrimeFDToHandle() internally. This double import causes
GEM handle refcount leaks because each drmPrimeFDToHandle() increments the
kernel's GEM handle refcount, but only one drmCloseBufferHandle() is called
during cleanup by panfrost_kmod_bo_free (or panthor_kmod_bo_free).

Fix by removing the redundant drmPrimeFDToHandle() and using
pan_kmod_bo_import() directly. On re-import of existing buffers, properly
release the extra pan_kmod_bo reference with pan_kmod_bo_put().

This ensures GEM handle refcount, pan_kmod_bo refcount, and panfrost_bo
refcount are all properly balanced.

Fixes: 5089a758df ("panfrost: Back panfrost_bo with pan_kmod_bo object")

Signed-off-by: Xianzhong Li <xianzhong.li@nxp.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 248b0b47b7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
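The imbalance described in 871a933cac can be shown with a toy counter. This is a sketch under stated assumptions: `toy_handle`, `toy_prime_fd_to_handle` and `toy_close_handle` are hypothetical stand-ins that only model "each import adds one reference, each close drops one"; they are not the libdrm or pan_kmod APIs.

```c
#include <assert.h>

/* Toy model of a kernel-side GEM handle refcount. */
struct toy_handle {
   int refcount;
};

static void
toy_prime_fd_to_handle(struct toy_handle *h)
{
   h->refcount++;               /* every import takes a reference */
}

static void
toy_close_handle(struct toy_handle *h)
{
   assert(h->refcount > 0);
   h->refcount--;               /* cleanup drops exactly one reference */
}

/* Old flow: the caller imports, the helper imports again, one close. */
static int
leaked_refs_double_import(void)
{
   struct toy_handle h = { 0 };
   toy_prime_fd_to_handle(&h);
   toy_prime_fd_to_handle(&h);
   toy_close_handle(&h);
   return h.refcount;
}

/* Fixed flow: only the helper imports, one close. */
static int
leaked_refs_single_import(void)
{
   struct toy_handle h = { 0 };
   toy_prime_fd_to_handle(&h);
   toy_close_handle(&h);
   return h.refcount;
}
```

In this model the double-import flow ends with one reference outstanding, and the single-import flow ends balanced at zero.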
Faith Ekstrand
e8955066a7 pan/bi: Use bi_half() for texture MS indices
It feeds into a v2i16 so it needs to be 16-bit.

Fixes: ae79f6765a ("pan/bi: Emit Valhall texture instructions")
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit e637130794)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Faith Ekstrand
cb93fe85b9 pan/bi/ra: Allow offsets on tied sources
The only real requirement here is that the destination offset is zero
and that the destination is big enough to hold the source.  The source
offset doesn't matter.

Fixes: bc17288697 ("pan/bi: Lower split/collect before RA")
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 05c5e52054)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Faith Ekstrand
ddf6c31b8e pan/bi: Delete a few instruction encodings
The non-trivial non-replicate swizzles on IADD.v4x8 and ISUB.v4x8 are
either documented wrong or broken in hardware.  Instead of swizzling
b0101 and b2323, they swizzle b0011 and b2233 on G52.  This is either a
hardware bug or an issue with documentation.  In either case, it's
probably best not to trust it.  Those swizzles aren't all that useful
anyway.  We also weren't using any of them before (or they'd have
broken) so this isn't a performance regression.

Cc: mesa-stable
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 538b5c411e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Faith Ekstrand
7cf57ccd9c pan/bi: Support more swizzle aliases in the bifrost pack code
Fixes: 82328a5245 ("pan/bi: Generate instruction packer for new IR")
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 3fffcf4338)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Natalie Vock
190c65ebae radv/rt: Don't enable midpoint sorting
Midpoint sorting is incompatible with how our traversal works.
Specifically, we change tMax when a hit is committed so we can skip over
BVH nodes that are guaranteed not to produce a closer hit. However,
changing tMax also changes the intersection interval of box nodes with
the ray, and thus, the midpoints of that interval. Stackless traversal
relies on getting nodes back in the exact same order as before, and if
that requirement is not met, traversal may incorrectly skip over nodes.

The likely benefit of midpoint sorting does not make up for the loss of
ability to skip over BVH nodes exceeding tMax, so simply disable
midpoint sorting.

This fixes geometry being visible behind other geometry when it
shouldn't be in various applications, including Half-Life 2 RTX.

Cc: mesa-stable
(cherry picked from commit c1a7680d93)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
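The ordering hazard in 190c65ebae can be sketched with two ray/box intervals. This is a simplified model, not the radv traversal code: `interval`, `clipped_midpoint` and `visits_a_first` are hypothetical names, and the model assumes each box interval is simply clipped against tMax before its midpoint is taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Parametric entry/exit distances of a ray through a box node. */
struct interval {
   float t_near;
   float t_far;
};

/* Midpoint of the box interval after clipping its far end against tMax. */
static float
clipped_midpoint(struct interval box, float t_max)
{
   float t_end = box.t_far < t_max ? box.t_far : t_max;
   return 0.5f * (box.t_near + t_end);
}

/* True if box a sorts before box b under midpoint sorting. */
static bool
visits_a_first(struct interval a, struct interval b, float t_max)
{
   assert(a.t_near <= a.t_far && b.t_near <= b.t_far);
   return clipped_midpoint(a, t_max) < clipped_midpoint(b, t_max);
}
```

With `a = {1, 9}` and `b = {2, 4}`, box b sorts first while tMax is 10 (midpoints 5 and 3). Once a committed hit lowers tMax to 3, box a sorts first (midpoints 2 and 2.5), so a stackless traversal that expects the original order would see the nodes in a different sequence.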
Job Noorman
28a32fe54b nir/opt_uniform_subgroup: fix ballot_bit_count components
ballot_bit_count_reduce expects the ballot to have 4 components causing
validation failures on targets where 1 < ballot_components < 4. Fix this
by padding the ballot to 4 components.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: ae66bd1c00 ("nir/opt_uniform_subgroup: use ballot_bit_count")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit cc6eec79c2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
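The padding idea in 28a32fe54b can be illustrated on plain arrays. This is a minimal sketch, assuming 32-bit ballot components; `ballot_bit_count_padded` is a hypothetical helper operating on C arrays and does not use the NIR builder API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Count set bits in a ballot of 1..4 components by first padding it to
 * four components with zeros, so the consumer always sees a vec4. */
static unsigned
ballot_bit_count_padded(const uint32_t *ballot, unsigned num_components)
{
   uint32_t padded[4] = { 0, 0, 0, 0 };
   unsigned count = 0;

   assert(num_components >= 1 && num_components <= 4);
   memcpy(padded, ballot, num_components * sizeof(uint32_t));

   for (unsigned i = 0; i < 4; i++) {
      /* Clear the lowest set bit until the word is empty. */
      for (uint32_t w = padded[i]; w != 0; w &= w - 1)
         count++;
   }
   return count;
}
```

Zero padding leaves the bit count unchanged while giving a consumer that expects four components a well-formed input.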
Karol Herbst
fc45812f74 radeonsi: set valid_buffer_range for CL buffers
Seems like we never set the range for CL buffers which caused spurious
test fails in the CL CTS.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 02679a51fd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Georg Lehmann
7757ddb8a9 nir/opt_load_skip_helpers: don't skip helpers for store_scratch data
Scratch stores store data for helper lanes that might be used later by an
instruction that cares about helpers, or even by control flow.

Fixes: a65009e808 ("nir: Add a nir_opt_tex_skip_helpers optimization")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14965
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit fc19ce6c17)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Job Noorman
fa14f8e6d5 ir3: fix handle_partial_const with vectorized src
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 50a91fbf87 ("freedreno/ir3: cleanup "partially const" ubo srcs")
(cherry picked from commit c27f0406b0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Timothy Arceri
7b5ed90bdc radeonsi: add Gun Godz workaround
This is another game based on the old YoYo engine

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15209

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 27b56314ee)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Job Noorman
7745b46956 nir/gather_info: clear interpolation qualifiers before gathering
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: 66740d9c91 ("nir: gather interpolation qualifiers")
(cherry picked from commit a72704d0fb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Job Noorman
97213d180e nir/opt_varyings: fix alu def cloning
nir_builder_alu_instr_finish_and_insert initializes the def's bit_size
and num_components, so we should set them afterwards.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: c66967b5cb ("nir: add nir_opt_varyings, new pass optimizing and compacting varyings")
(cherry picked from commit 273fd18b89)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Pavel Ondračka
0347549074 st/bitmap: release the temporary bitmap sampler view
st_cb_bitmap appends a temporary bitmap sampler view to the sampler
view array passed to set_sampler_views().

1a5c660ef5 changed this path to only release the extra YUV views
returned by st_get_sampler_views(), but the temporary bitmap view is
created locally and is not part of extra_sampler_views. It therefore
stopped being released. Release the temporary bitmap sampler view
explicitly after drawing the bitmap quad.

Fixes: 1a5c660ef5 ("st/bitmap: only release YUV samplerviews")
(cherry picked from commit 33864e569e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:45 +02:00
Ahmed Hesham
e145579956 rusticl: fix flag validation when creating an image
From the OpenCL specification:
    `CL_MEM_KERNEL_READ_AND_WRITE`: This flag is only used by
    clGetSupportedImageFormats to query image formats that may be both
    read from and written to by the same kernel instance. To create a
    memory object that may be read from and written to use
    CL_MEM_READ_WRITE.

If an application follows the instructions above, i.e. query a list of
supported image formats, using `CL_MEM_KERNEL_READ_AND_WRITE` as
input, and then attempts to create an image using one of the supported
image formats, by calling `clCreateImage` and passing
`CL_MEM_READ_WRITE`, the call to the image creation entry point should
succeed. This instead fails on Mali devices with the error
`CL_IMAGE_FORMAT_NOT_SUPPORTED`.

Rusticl fails when validating the image format against its supported
flags. Formats that support `PIPE_BIND_SHADER_IMAGE` have their
supported flags set as `CL_MEM_WRITE_ONLY` and
`CL_MEM_KERNEL_READ_AND_WRITE`.

This changes the supported CL flags to be `CL_MEM_WRITE_ONLY` for
`PIPE_BIND_SHADER_IMAGE` and `CL_MEM_READ_WRITE |
CL_MEM_KERNEL_READ_AND_WRITE` for `PIPE_BIND_SAMPLER_VIEW |
PIPE_BIND_SHADER_IMAGE`.

Fixes: 3386e142 ("rusticl: support read_write images")

Fixes OpenCL-CTS test: `test_image_streams` on Mali. Invocation:
```
test_image_streams write 1D CL_RGB CL_SIGNED_INT8
```

Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
(cherry picked from commit e77c984cef)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:44 +02:00
Samuel Pitoiset
46aea87a79 spirv: fix OpUntypedVariableKHR with optional data type parameter
This would read OOB and crash because the data type is optional per the
SPIR-V spec.

Original patch by Faith Ekstrand <faith.ekstrand@collabora.com>.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1f8be7bfad)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:44 +02:00
Eric Engestrom
b3b0c9002e .pick_status.json: Mark 4b3bd6b0b5 as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:44 +02:00
Eric Engestrom
eb6df9c83d .pick_status.json: Mark 9ff879441f as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:44 +02:00
Eric Engestrom
41ca0dd7c7 .pick_status.json: Update to 7e163fb793
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
2026-04-14 15:27:44 +02:00
Eric Engestrom
bbc23bed47 docs: add sha sum for 26.0.4
2026-04-03 11:31:41 +02:00
Eric Engestrom
d9e5e36b19 VERSION: bump for 26.0.4
2026-04-01 19:37:49 +02:00
Eric Engestrom
97c3564810 docs: add release notes for 26.0.4 2026-04-01 19:37:49 +02:00
Juan A. Suarez Romero
ec101645ff vc4: fix unwanted buffer release on uploader
When converting the index buffer from 4-bytes to 2-bytes, we use the
uploader for the job. Since commit b3133e250e we do an uploader alloc
ref, which releases the uploader buffer if there is no enough space,
creating a new one.

The problem happens when we also need this buffer because it is the one
containing the index buffer to convert. This happens, for instance, if
we need to convert the primitives because they are not supported (e.g.,
converting quads to triangles), as this is also done using the uploader.

The solution is to ensure the uploader's buffer has an extra reference
so when released, it is not destroyed. This can easily be achieved by
first calling pipe_buffer_map_range(), which is required to access the
buffer anyway and increases the reference count.

This fixes `spec@!opengl 1.1@longprim`.

Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 48c086cb42)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Samuel Pitoiset
0a9270779f radv: emit BOP events after every draw to workaround a VRS bug on GFX12
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14812
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit bf7e29617d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Pavel Ondračka
dfd0e55b5a r300: split large HiZ clears into multiple packets
R300_PACKET3_3D_CLEAR_HIZ encodes COUNT in 14 bits (COUNT[13:0]), so a
single packet can clear at most 0x3fff dwords.

Large depth surfaces on R5xx can require more HiZ dwords than that.
When we emitted a single packet, COUNT truncated and part of HiZ RAM
remained uncleared, which could show up as HyperZ corruption.

Emit CLEAR_HIZ in chunks of R300_CLEAR_HIZ_COUNT_MAX and reserve enough
atom space for the derived worst-case packet count.
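
The chunking described above can be sketched as follows. This is an
illustration only, not the driver code: `CLEAR_HIZ_COUNT_MAX` stands in
for R300_CLEAR_HIZ_COUNT_MAX with the 0x3fff limit quoted above, and
`struct hiz_chunk`, `hiz_clear_packet_count` and `hiz_clear_split` are
hypothetical names. It also assumes the start offset is counted in dwords.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the driver's limit: COUNT is a 14-bit field. */
#define CLEAR_HIZ_COUNT_MAX 0x3fffu

/* One clear packet: start offset and length, both in dwords (assumed). */
struct hiz_chunk {
    uint32_t start;
    uint32_t count;
};

/* Number of packets needed to clear `dwords` HiZ dwords. */
static uint32_t hiz_clear_packet_count(uint32_t dwords)
{
    return dwords / CLEAR_HIZ_COUNT_MAX + (dwords % CLEAR_HIZ_COUNT_MAX != 0);
}

/* Split one large clear into chunks that each fit in the COUNT field.
 * Returns the number of chunks written to `out`. */
static uint32_t hiz_clear_split(uint32_t start, uint32_t dwords,
                                struct hiz_chunk *out, uint32_t max_out)
{
    uint32_t n = 0;

    while (dwords > 0) {
        uint32_t count = dwords < CLEAR_HIZ_COUNT_MAX ? dwords
                                                      : CLEAR_HIZ_COUNT_MAX;
        assert(n < max_out);
        out[n].start = start;
        out[n].count = count;
        n++;
        start += count;
        dwords -= count;
    }
    return n;
}
```

For example, a 0x8000-dword clear splits into two full chunks of 0x3fff
dwords plus a final chunk of 2 dwords, so space for three packets has to
be reserved.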

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/360
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
(cherry picked from commit fddc101070)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Pavel Ondračka
2b87cb01b2 r300: add shared HyperZ pipe-count helper
Introduce r300_hyperz_pipe_count and use it in
r300_setup_hyperz_properties.

RV530 selects pipe topology from NUM_Z_PIPES, while other families use
NUM_GB_PIPES. Keeping this in one helper avoids duplicated family checks
and prepares follow-up HiZ clear sizing changes to reuse the same rule.

(cherry picked from commit e97ac38ff3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Pavel Ondračka
8292cf5563 r300: disable zmask clears for large surfaces
(cherry picked from commit 3fc2627897)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Pavel Ondračka
27fec03660 r300: don't apply odd macroblock rounding to 3D textures
This is intended only for NPOT 2D textures.

Fixes: 0763fb947 ("r300: align macro-tiled stride-addressed textures in X")
(cherry picked from commit 648dfe88f4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Pavel Ondračka
e06648d86a r300: fix bias presubtract algebraic transformation
One fneg too many.

Fixes: 0508db915 ("r300: implement bias presubtract")
(cherry picked from commit e68e519b91)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Mario Kleiner
af53a9bfbc dri: Fix "cosmetic" undefined behaviour warning for RGB[A]16_UNORM formats.
Ian Romanick reported some "undefined behaviour" warnings during some
unspecified tests, relating to the introduction of RGB[A]16_UNORM formats
in merge request
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588

This is due to overflowing the 32-bit masks[], and then during assignment
the red/green/blue/alphaMask fields in struct gl_config when using a 16
bpc format. In other words, the red/green/blue/alphaMask would not be
usable.

Suppress this warning by setting masks[] to zero for unorm16 formats,
just as was previously done for is_float16, i.e. fp16 formats.

16 bpc formats are only exposed for display on non-X11 WSI target
platforms like GBM+DRM, Wayland, surfaceless, and these platforms do
not use the info in red/green/blue/alphaMask at all, so the "undefined
behaviour" is meaningless.

Fixes: f2aaa9ce00 ("dri,gallium: Add support for RGB[A]16_UNORM display formats.")
Reported-by: Ian Romanick @idr
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit ab94515b0a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Eric Guo
58ab5760c1 panfrost: Fix NULL pointer dereference in panfrost_emit_images
Fix a crash in image descriptor emission caused by stale image_mask bits.

Root cause:
- set_shader_images used a shift expression with count==64 when clearing
  image_mask, which is undefined behavior in C.
- This could leave image_mask inconsistent with actual image bindings,
  so panfrost_emit_images() might dereference NULL image resources.

Fixes:
- Use 64-bit-safe bit helpers for mask updates to avoid invalid shifts.
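
As an illustration of the problem above: in C, shifting a 64-bit value by
64 is undefined, so an expression like `(1ull << count) - 1` breaks when
count is 64. The sketch below shows one way to build such masks safely.
The helper names (`low_bits64`, `bit_range64`, `clear_image_bits`) are
hypothetical and are not the helpers the driver actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `count` bits set; well-defined for count in [0, 64]. */
static uint64_t low_bits64(unsigned count)
{
    assert(count <= 64);
    /* Special-case 64 so we never shift by the full width of the type. */
    return count == 64 ? ~UINT64_C(0) : (UINT64_C(1) << count) - 1;
}

/* Mask covering bits [start, start + count), with start + count <= 64. */
static uint64_t bit_range64(unsigned start, unsigned count)
{
    assert(start <= 64 && count <= 64 - start);
    /* count == 0 returns early, so start == 64 is never used as a shift. */
    return count == 0 ? 0 : low_bits64(count) << start;
}

/* Clear the bits of `image_mask` that belong to the unbound slots. */
static uint64_t clear_image_bits(uint64_t image_mask,
                                 unsigned start, unsigned count)
{
    return image_mask & ~bit_range64(start, count);
}
```

With this shape, clearing all 64 slots yields an empty mask instead of
leaving stale bits behind.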

Crash observed when running: OpenCL-CTS api/test_api
Backtrace:
  #0 util_image_to_sampler_view (v->resource is NULL)
  #1 panfrost_emit_images
  #2 panfrost_update_shader_state
  #3 panfrost_launch_grid_on_batch
  #4 panfrost_launch_grid

Backport-to: *
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit c1770565f3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
utzcoz
88cfa8ff29 gfxstream: Fix vkSetDebugUtilsObjectNameEXT crash for unwrapped objects
Mesa's vk_common_SetDebugUtilsObjectNameEXT assumes every Vulkan object
handle is a pointer to a vk_object_base struct. In gfxstream, only a
subset of objects (instance, device, queue, command buffer, command pool,
buffer, fence, semaphore) carry a Mesa wrapper. All other non-dispatchable
handles (shader modules, pipelines, render passes, etc.) are opaque host
handles that are not valid pointers.

Passing such an unwrapped handle to the common path causes it to be cast
to a vk_object_base pointer and dereferenced, resulting in a SIGSEGV
(null-pointer dereference at offset 0x40).

Override the function in the gfxstream driver to store debug names on
vk_object_base for wrapped objects and return VK_SUCCESS for unwrapped
objects.

Fixes: 7b50e62179 ("gfxstream: mega-change to support guest Linux WSI with gfxstream")
Test: Verified with hellovk (with validation layers) on Android Emulator - no crashes.
(cherry picked from commit bf8862b49f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Yiwei Zhang
4feda37353 vulkan/wsi/win32: respect acquire timeout for sw wsi
When DXGI is not supported, win32 falls back to sw wsi with the acquire
timeout ignored.

This change:
1. adds the needed acquire mutex and cond
   - the fail path is intentionally left untouched so that mutex and
     cond are both valid when wsi_win32_swapchain_destroy is called
2. adds wsi_win32_acquire_idle_cpu_image helper to respect timeout
3. adds wsi_win32_set_image_idle helper to properly signal acquire_cond
   for sw wsi case

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15122
Cc: mesa-stable
(cherry picked from commit af42f0c80f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Yiwei Zhang
0287336eea vulkan/wsi/win32: add wsi_win32_find_idle_image helper
Prepare to handle timeout for sw wsi (no DXGI).

Cc: mesa-stable
(cherry picked from commit 8ff24c7db3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Hyunjun Ko
b62216f82d anv: Add dummy workload for AV1 decode on affected platforms (Wa_1508208842)
Implement software workaround for AVP decoder corruption on Gen12
platforms. These platforms require a warmup workload before
the actual AV1 decode to prevent output corruption.

- Gen12: Tiger Lake, DG1, Rocket Lake, Alder Lake

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 260908cecb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Rhys Perry
ad17b864cb radv: fix memory leak in radv_rt_nir_to_asm
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 574f577657)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Adam Simpkins
99129b13ce iris: fix a crash in disable_rb_aux_buffer
I have been running into crashes in this function when using blender.
Some of the entries in ice->state.framebuffer.base.cbufs[0] can
apparently have the texture field be null, which was causing a segfault
in this loop.

In my case, nr_cbufs was 3, and the first two cbufs entries had a null
texture and format set to PIPE_FORMAT_NONE. The last entry had format of
PIPE_FORMAT_R16G16_FLOAT and a non-null texture.

Adding this null check before attempting to dereference the texture
fixes the crash for me and allows blender to work normally.

Fixes: ca96f8517c ("iris: remove uses of pipe_surface as a pointer")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit e16c8cc579)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:35 +02:00
Faith Ekstrand
c74811897f pan/buffer: Add the offset to the size for buffer textures
In the attribute model, the size is for the attribute binding and the
offset is an offset into that range.  If we're going to use that to
offset the buffer itself, we need to increase the size accordingly.

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit ce56f49561)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Faith Ekstrand
60df6998b4 pan/bi: v2x16 conversions don't replicate
They swizzle just like anything else.  Technically, we could maybe do a
little better than the generic case for these since they only read 8
bits per 16 bits in the destination but the generic case is correct,
even if it isn't optimal.

Fixes: f7d44a46cd ("pan/bi: Optimize replication")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 8dc458225b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Georg Lehmann
2527900bbc nir/lower_non_uniform_access: fix fusing loops for same index but different array variable
struct nu_handle is hashed and deduplicated using struct nu_handle_key, which ignored
parent_deref. That means all instructions will use the first parent_deref when rewriting
the sources.

Avoid this by not including the parent deref in the struct, and instead querying it
when needed.

Fixes: 4d09cd7fa5 ("nir/lower_non_uniform_access: Group accesses using the same resource")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15173
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
(cherry picked from commit e7077e8f5c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Marek Olšák
b084df3ed6 radeonsi: disable streamout queries for u_blitter
Cc: mesa-stable
Reviewed-by: Pierre-Eric
(cherry picked from commit 918e5764f4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Marek Olšák
a20678b267 radeonsi: fix blits via util_blitter_draw_rectangle
It didn't save states properly. The only correct place to save them is
si_blitter_begin. Unfortunately, we can't skip saving and restoring
those states because we don't know in advance whether the rectangle path
will be used.

Cc: mesa-stable
Reviewed-by: Pierre-Eric
(cherry picked from commit 556ceb1b75)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Natalie Vock
5e538b5efd vulkan: Bump MAX_ENCODE_PASSES
RADV needs one more encode pass for a bugfix in the next commit.

Cc: mesa-stable
(cherry picked from commit e713527aa9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Icenowy Zheng
ee98ce1142 pvr: fix dirty tracking for stencil ops
The dirty state of stencil ops is not checked when deciding whether to
rebuild the ISP state, although the values are part of the ISP state
(the 27:16 bits of ISPB word).

Add MESA_VK_DYNAMIC_DS_STENCIL_OP to the condition for rebuilding ISP
control registers.

Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.fragment_ops.stencil.zero_stencil_fail

Fixes: 88f1fad3f7 ("pvr: Use common pipeline & dynamic state frameworks")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit ee031d67b4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Lionel Landwerlin
e84168bdac brw: fence SLM writes between workgroups
On LSC platforms the SLM writes are unfenced between workgroups. This
means a workgroup W1 finishing might have uncompleted SLM writes.
Another workgroup W2 dispatched after W1 which gets allocated an
overlapping SLM location might have writes that race with the previous
W1 operations.

The solution to this is to fence all write operations (stores & atomics)
of a workgroup before ending the threads. We do this by emitting a
single SLM fence either at the end of the shader or, if there is only a
single unfenced write, at the end of that block.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13924
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit fa523aedd0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
emre
b5c8c1dcb2 nvk: fix barrier cache invalidation
Fixes: e1c1cdbd5f ("nvk: Implement vkCmdPipelineBarrier2 for real")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit fe558d8328)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Icenowy Zheng
dbdbef673f pvr: consider the size of DMA request when setting msize of DDMADT
The DDMADT instruction of PDS has out-of-bound test capability, which is
used for implementation of robust vertex input fetch.

According to the pseudocode in the comment block before the "LAST DDMAD"
mark in pvr_pipeline_pds.c, the check is between
`calculated_source_address + (burst_size << 2)` and `base_address +
buffer_size`, in which the `burst_size` seems to correspond to the BSIZE
field set in the low 32-bit of DDMAD(T) src3 and the `buffer_size`
corresponds to the MSIZE field set in the DDMADT-specific high 32-bit of
src3. As the calculated source address is just the base address plus the
multiplication result (the offset), the base address can be eliminated
from the check, resulting in a check between `offset + (BSIZE * 4)` and
`MSIZE`.

Naturally it's expected to just set the MSIZE field to the buffer size.
In addition, as the Vulkan spec says "Reads from a vertex input MAY
instead be bounds checked against a range rounded down to the nearest
multiple of the stride of its binding", the driver rounds down the
accessible buffer size before setting MSIZE to it.

However, when running the OpenGL ES 2.0 CTS, two problems show up in how
the size to check is set:

- dEQP-GLES2.functional.buffer.write.basic.array_stream_draw sets up a
  VBO with 3 bytes per vertex (RGB colors and 1B per color) and 340
  vertices (results in a buffer size of 1020 = 0x3fc). However as the
  DMA request size, which is specified by BSIZE, is counted by dwords,
  3 bytes are rounded up to 1 dword (which is 4 bytes). When the bound
  check of the last vertex happens, the vertex's DMA start offset is
  0x3f9, so the DDMADT check happens between 0x3fd (0x3f9 + 1 * 4) and
  0x3fc, and indicates a check failure. This prevents the last vertex,
  which is perfectly in-bound, from being properly fetched; this is
  against the Vulkan specification, and needs to be fixed.
- dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.
  buffer_0_32_float2_vec4_dynamic_draw_quads_1 sets up a VBO with a size
  of 168 bytes, and tries to draw 6 vertices (each vertex consumes 2
  floats (thus 8 bytes) of attribute) with a stride of 32 bytes using
  this VBO. Zink then translates the VBO to a Vulkan vertex buffer bound
  with size = 168B, stride = 32B. Here the optional rule about rounding
  down buffer size happens in the current PowerVR driver, and the
  checked bound is rounded down to 160B, which prevents the last
  vertex's 8B attributes from being fetched. It looks like this kind of
  situation is considered in the codepath without DDMADT, but omitted
  for the codepath utilizing DDMADT for bound check.

So this patch tries to mimic the behavior of DDMADT when setting the
MSIZE field of it to prevent false out-of-bounds. It first calculates
the offset of the last valid vertex DMA, then adds the DMA request size
to it to form the final MSIZE value. With the code calculating the last
valid DMA offset considering the situation of fetching the attribute
from the space after the last whole multiple of stride, both problems
mentioned above are solved by this rework.
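
The arithmetic above can be sketched as follows, using the two CTS cases
as worked examples. This is a reading of the description, not the driver
code: the function names are hypothetical, and the check is assumed to
pass when `offset + request size <= MSIZE`, which matches the failing
0x3fd versus 0x3fc comparison described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DMA request size in bytes: the attribute size rounded up to dwords. */
static uint32_t dma_request_bytes(uint32_t attrib_size)
{
    return ((attrib_size + 3) / 4) * 4;
}

/* Bound chosen so that the last vertex that fits in the buffer passes:
 * offset of the last valid vertex DMA plus the DMA request size.
 * Returns 0 when not even one attribute fits. */
static uint32_t ddmadt_msize(uint32_t buffer_size, uint32_t stride,
                             uint32_t attrib_size)
{
    uint32_t last_offset;

    if (attrib_size == 0 || buffer_size < attrib_size)
        return 0;

    /* This also covers an attribute stored after the last whole stride. */
    last_offset = stride ? ((buffer_size - attrib_size) / stride) * stride : 0;
    return last_offset + dma_request_bytes(attrib_size);
}

/* Assumed form of the DDMADT bound check. */
static bool ddmadt_in_bounds(uint32_t offset, uint32_t attrib_size,
                             uint32_t msize)
{
    return offset + dma_request_bytes(attrib_size) <= msize;
}
```

For the first case (1020-byte buffer, 3-byte attributes, 3-byte stride)
this gives a bound of 0x3fd, so the vertex at offset 0x3f9 passes. For
the second case (168-byte buffer, 8-byte attributes, 32-byte stride) the
bound is 168 rather than 160, so the vertex at offset 160 passes. In the
first case the bound exceeds the buffer size by one byte, so a scheme
like this presumably relies on the buffer being padded.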

99 GLES CTS test cases are fixed by this change, and Vulkan CTS
shows no regression on `dEQP-VK.robustness.robustness1_vertex_access.*`
tests.

Fixes: 4873903b56 ("pvr: Enable PDS_DDMADT")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit 252904f3d1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Icenowy Zheng
e62ef26e01 pvr: move PVR_BUFFER_MEMORY_PADDING_SIZE definition to pvr_buffer.h
This memory padding is enforced by GetBufferMemoryRequirements2 and
may then be checked against to decide whether it is sufficient.

Move it to pvr_buffer.h for further assertions.

Backport-to: 25.3
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit d992474be9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Icenowy Zheng
778c3580e7 pvr: save vertex attribute size for DMA checking
Currently the size of single components inside one attribute is saved
and checked against when checking DMA capability. However, the vertex
attribute DMA happens for a whole attribute instead of individually for
its components, so checking against the component size is useless -- the
size of the whole attribute is what needs to be saved and checked.

Rename all component_size_in_bytes fields to attrib_size_in_bytes, and
save the size of the whole attribute inside them.

Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit aa8dad141c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Icenowy Zheng
874cba5743 pvr: fix "obb" typo in oob_buffer_size when building vertex pds data
The ddmadt_oob_buffer_size structure to be filled is named
`obb_buffer_size`, which is obviously a typo.

Change to `oob_buffer_size` to fix the typo.

Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit caea72cffc)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Lionel Landwerlin
eb382e0cef anv: add drirc option to workaround missing application barriers on typed/untyped data
Enable it for Horizon Forbidden West (only seems to have untyped data
issue).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14889
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit db964068bf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Lionel Landwerlin
014f4ce985 anv: add an analysis pass to detect compute shaders clearing data
Applications often miss emitting barriers between a shader
initializing data & another shader writing data in the same location
afterward. This is very common for UAVs (see vkd3d-proton).

Vkd3d-proton does a pretty good job at inserting missing barriers
between UAV clears & writes. But some applications also have similar
issues with custom shaders. Here we introduce an analysis pass that
recognizes shaders doing clear/initialization. We'll use that
information in the following commit to insert barriers after those
shaders.

Since Gfx12.5 our HW has become a lot more sensitive to those issues
due to the introduction of an L1 untyped data cache that is not
coherent across the shader units. On Gfx20+, typed data is also L1
cacheable exposing even more issues.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit 13bf1a4008)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:34 +02:00
Alyssa Rosenzweig
ef136c2687 nir: add nir_get_io_data_src
This complements our existing nir_get_io_index_src helper. Most, but annoyingly
not all, stores put their data source in source 0. Having a helper for this lets
us reduce special casing in a bunch of random places.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Job Noorman <jnoorman@igalia.com>
(cherry picked from commit 8fb1d65426)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Icenowy Zheng
ca4f356d37 pvr: Align width for PBE write when creating linear image
Even if a linear image isn't created with usages declaring PBE writes,
the image might be exported and then re-imported with a usage that
allows rendering to it.

Always align linear images' width for being written by PBE.

This fixes WSI creating surfaces with odd width, exporting them and
re-importing for rendering.

Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 765a9f4fd9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Eric Engestrom
b5547c33bb [26.0 only] venus/ci: mark a test as fixed
Couldn't easily figure out which commit fixed it, but since it's a fix
I didn't try very hard ^^

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Yiwei Zhang
f54c73b080 venus: fix to relax the KHR_external_memory_fd requirement
This reverts commit 1895de16a6. The proper
way to filter out venus incapable physical devices is to do the platform
specific check during renderer side instance creation time.

Fixes: 1895de16a6 ("venus: filter out venus incapable physical devices")
(cherry picked from commit c2fe95a364)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Alyssa Milburn
7b5e16fb05 nv50,nvc0: Avoid uninitialized cbuf reads in blits
Overwrite the whole framebuffer cbuf rather than copying it from the
stack; fixes util_framebuffer_get_num_samples getting uninitialized
stack contents during validation.

Suggested-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Alyssa Milburn <amilburn@zall.org>
Fixes: 2eb45daa9c ("gallium: de-pointerize pipe_surface")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14082
(cherry picked from commit a6992c7bbe)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Icenowy Zheng
b26748a918 pco: fix encoding of fred's s0abs bit
The s0abs bit in the encoding of the fred instruction is wrongly set to
the status of the .neg modifier instead of the .abs modifier.

Fix this copy-n-paste error.

Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.4
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.45
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.4
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.45

Fixes: 8ec174b3f9 ("pco: add support for various selection, complex, trig ops")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 54860bb4c7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Pierre-Eric Pelloux-Prayer
dfa0f6706d gallium/u_blitter: add a new fs_color_clear variant
The referenced commit switched from a passthrough shader
to fs_clear_color[write_all_cbufs=0]. It shouldn't matter since
the shader isn't supposed to be executed - it's only set up to get
the first color output active.

On some chips (gfx8) it seems to cause issues (hangs or page fault)
for some piglit tests, eg:
  framebuffer-blit-levels draw stencil

To fix this, introduce a 3rd variant, where a constant buffer isn't
required and instead the color is hardcoded in the shader.

Fixes: ca09c173f6 ("gallium/u_blitter: remove UTIL_BLITTER_ATTRIB_COLOR, use a constant buffer")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2ff9fa8b72)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Erik Faye-Lund
e406a5bbd7 panvk: remove unused flag
This flag isn't used any more, so let's remove all references to it.

Fixes: e25064c026 ("panvk: Use indirect path for indexed draw on JM")
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 5b8ebb8553)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
juntak0916
f8d492ccc3 nvk: fix BindImageMemory2 per-bind status result
The per-bind status was always being set to VK_SUCCESS instead of the
actual result from nvk_bind_image_memory.
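
The pattern behind this fix can be sketched as follows. Everything here
is a simplified stand-in: `result_t`, `struct bind_info`, `bind_one` and
`bind_all` are hypothetical names, not the driver's or Vulkan's types,
and the way the overall return value is chosen (first error wins) is an
assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int result_t; /* stand-in for VkResult */
enum { RESULT_SUCCESS = 0, RESULT_ERROR = -1 };

struct bind_info {
    int should_fail;   /* test knob: makes this bind fail */
    result_t *status;  /* optional per-bind status output, may be NULL */
};

/* Stand-in for the real per-image bind. */
static result_t bind_one(const struct bind_info *info)
{
    return info->should_fail ? RESULT_ERROR : RESULT_SUCCESS;
}

static result_t bind_all(const struct bind_info *infos, size_t count)
{
    result_t first_error = RESULT_SUCCESS;

    for (size_t i = 0; i < count; i++) {
        result_t r = bind_one(&infos[i]);

        /* Report the actual result, not an unconditional success. */
        if (infos[i].status)
            *infos[i].status = r;

        if (r != RESULT_SUCCESS && first_error == RESULT_SUCCESS)
            first_error = r;
    }
    return first_error;
}
```

A caller that binds several images at once can then tell which of the
individual binds failed.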

Fixes: 93792b5ef2 ("nvk: Add static wrappers for image/buffer binding")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit dd3e153a10)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Job Noorman
e2e2dc6bf7 ir3/legalize: don't drop sync flags on removed predt/predf
When a predt/predf branch can be removed, any sync flags set on the
terminator were removed as well. Fix this by copying these flags to the
prede that replaces the terminator.

Fixes frame instability in "Devil May Cry 5" and "Resident Evil 3".

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 39088571f0 ("ir3: add support for predication")
(cherry picked from commit b2a44da9e9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Jose Maria Casanova Crespo
b640985aab broadcom/common: fix V3D 7.1 TFU ICFG IFORMAT values
The V3D 7.1 TFU ICFG register restructured the IFORMAT field to 3 bits
(25:23) vs 4 bits on V3D 4.2. The defines were still using the V3D 4.2
encoding (11-15) which overflows the 3-bit field. Fix values to the
correct 3-7 range.

This was working by accident because the overflow bits land in the
SVTWID field, which is not used for the affected tiling formats.
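
The "working by accident" part can be checked with a little bit
arithmetic. The sketch below assumes the value is OR-ed into the register
at bit 23 without masking. The macro and function names are hypothetical;
only the field position (25:23) and the value ranges come from the
description above.

```c
#include <assert.h>
#include <stdint.h>

/* IFORMAT occupies bits 25:23 on V3D 7.1, per the description above. */
#define IFORMAT_SHIFT 23
#define IFORMAT_MASK  (UINT32_C(0x7) << IFORMAT_SHIFT)

/* Place a raw value at the IFORMAT position without masking it. */
static uint32_t iformat_unmasked(uint32_t value)
{
    return value << IFORMAT_SHIFT;
}

/* Bits that spill out above the 3-bit field. */
static uint32_t iformat_overflow(uint32_t value)
{
    return iformat_unmasked(value) & ~IFORMAT_MASK;
}
```

The old values 11-15 have the same low three bits as 3-7, so the field
itself received the intended value while bit 26 was set as a side effect.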

Also rename SAND_128 to SAND since V3D 7.1 has a single SAND input
format; the tile width is now controlled by SVTWID.

Fixes: 146ceadcf4 ("v3dv: add support for TFU jobs in v71")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 89229f08bb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Valentine Burley
8c2283df41 ci: Drop duplicate Intel shader-db run
Skylake is the default device for the Intel shim, and it's already
included in the four Intel families listed below.

Fixes: 183d57aa9e ("ci: Run intel shader-db on Haswell, Broadwell, and Meteorlake")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 9dd0f19198)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Ian Romanick
efc9ad3986 brw: Handle scalars and swizzles correctly in is_const_zero
v2: Massive simplification based on feedback from Ken.

Fixes: 96cde9cc01 ("intel/fs: Emit better code for bfi(..., 0)")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit dff1e8ae28)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Ian Romanick
29a5d22049 brw/algebraic: Allow mixed types in saturate constant folding
Prevents assertion failures in func.shader-ballot.basic.q0 and other
tests starting with "nir/algebraic: Optimize some b2f of integer
comparison".

Vector immediates, bfloat, and 8-bit floats are still not supported.

v2: Almost complete re-write based on suggestions from Ken.

v3: Don't retype() on a brw_imm_f value.

Fixes: f8e54d02f7 ("intel/compiler: Relax mixed type restriction for saturating immediates")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 985ace332b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Marek Olšák
6416837667 radeonsi: recompute IO bases after optimizations
to fix an assertion added by the commit, reproduced by viewperf13/catia

Fixes: d06616063c - radeonsi: assert that IO bases don't have holes & the same base isn't used twice

Reviewed-by: Pierre-Eric
(cherry picked from commit 8ea3d794fb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
Radu Costas
9257997e3e pco: Amend errant nir_move_option
Move options were bit or-ing from the wrong enum, causing undefined
behaviour when the number of intrinsics changed.
Replaced it with the values from the right nir_move_options enum that
were previously working. (Further refinement needed on these after
extensive testing.)

Fixes: f1b24267d2 ("pco: rework nir processing and passes")
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 721c1b8f65)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:33 +02:00
pal1000
9c45a18af7 util: Fix use of undeclared identifier 'NULL' in src/util/os_misc.h when compiling with clang
Fixes: 2771eb39fd ("util: Add function os_unset_option/os_set_option for latter use")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14805

```
FAILED: [code=1] src/util/libmesa_util.a.p/u_process.c.obj
"cc" "-Isrc/util/libmesa_util.a.p" "-Isrc/util" "-I../../src/util" "-Iinclude" "-I../../include" "-Isrc" "-I../../src" "-Isrc/util/format" "-I../../src/util/format" "-IC:/msys64/clang64/bin/../include" "-fvisibility=hidden" "-fdiagnostics-color=always" "-D_FILE_OFFSET_BITS=64" "-Wall" "-Winvalid-pch" "-std=c11" "-O2" "-g" "-D__STDC_CONSTANT_MACROS" "-D__STDC_FORMAT_MACROS" "-D__STDC_LIMIT_MACROS" "-DPACKAGE_VERSION=\"26.0.0-rc3\"" "-DPACKAGE_BUGREPORT=\"https://gitlab.freedesktop.org/mesa/mesa/-/issues\"" "-DHAVE_OPENGL=1" "-DHAVE_OPENGL_ES_1=1" "-DHAVE_OPENGL_ES_2=1" "-DHAVE_SOFTPIPE" "-DHAVE_LLVMPIPE" "-DHAVE_ZINK" "-DHAVE_D3D12" "-DHAVE_VIRGL" "-DHAVE_SWRAST" "-DMESA_SYSTEM_HAS_KMS_DRM=0" "-DVIDEO_CODEC_VC1DEC=1" "-DVIDEO_CODEC_H264DEC=1" "-DVIDEO_CODEC_H264ENC=1" "-DVIDEO_CODEC_H265DEC=1" "-DVIDEO_CODEC_H265ENC=1" "-DVIDEO_CODEC_AV1DEC=1" "-DVIDEO_CODEC_AV1ENC=1" "-DVIDEO_CODEC_VP9DEC=1" "-DVIDEO_CODEC_MPEG12DEC=1" "-DVIDEO_CODEC_JPEGDEC=1" "-DHAVE_WINDOWS_PLATFORM" "-DHAVE_SURFACELESS_PLATFORM" "-DUSE_LIBGLVND=0" "-DUSE_D3D12_PREVIEW_HEADERS=0" "-DHAVE_GALLIUM_D3D12_VIDEO" "-DHAVE_VA_SURFACE_ATTRIB_DRM_FORMAT_MODIFIERS" "-DGLAPI_EXPORT_PROTO_ENTRY_POINTS=1" "-DALLOW_KCMP" "-DMESA_DEBUG=0" "-DHAVE___BUILTIN_BSWAP32" "-DHAVE___BUILTIN_BSWAP64" "-DHAVE___BUILTIN_CLZ" "-DHAVE___BUILTIN_CLZLL" "-DHAVE___BUILTIN_CTZ" "-DHAVE___BUILTIN_EXPECT" "-DHAVE___BUILTIN_FFS" "-DHAVE___BUILTIN_FFSLL" "-DHAVE___BUILTIN_POPCOUNT" "-DHAVE___BUILTIN_POPCOUNTLL" "-DHAVE___BUILTIN_UNREACHABLE" "-DHAVE___BUILTIN_TYPES_COMPATIBLE_P" "-DHAVE___BUILTIN_ADD_OVERFLOW" "-DHAVE_FUNC_ATTRIBUTE_CONST" "-DHAVE_FUNC_ATTRIBUTE_FLATTEN" "-DHAVE_FUNC_ATTRIBUTE_MALLOC" "-DHAVE_FUNC_ATTRIBUTE_PURE" "-DHAVE_FUNC_ATTRIBUTE_UNUSED" "-DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT" "-DHAVE_FUNC_ATTRIBUTE_WEAK" "-DHAVE_FUNC_ATTRIBUTE_FORMAT" "-DHAVE_FUNC_ATTRIBUTE_PACKED" "-DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL" "-DHAVE_FUNC_ATTRIBUTE_ALIAS" "-DHAVE_FUNC_ATTRIBUTE_NORETURN" "-DHAVE_FUNC_ATTRIBUTE_COLD" 
"-DHAVE_FUNC_ATTRIBUTE_VISIBILITY" "-DHAVE_UINT128" "-D_WIN32_WINNT=0x0A00" "-DWINVER=0x0A00" "-D_GNU_SOURCE" "-DUSE_SSE41" "-DHAVE___BUILTIN_IA32_CLFLUSHOPT" "-DUSE_GCC_ATOMIC_BUILTINS" "-DHAS_SCHED_H" "-DHAVE_DLFCN_H" "-DHAVE_CET_H" "-DHAVE_STRTOF" "-DHAVE_STRTOK_R" "-DHAVE_QSORT_S" "-DHAVE_STRUCT_TIMESPEC" "-DHAVE_ZLIB" "-DHAVE_ZSTD" "-DHAVE_COMPRESSION" "-DWIN32_LEAN_AND_MEAN" "-DWINDOWS_NO_FUTEX" "-DMESA_LLVM_VERSION_STRING=\"21.1.8\"" "-DLLVM_IS_SHARED=0" "-DDRAW_LLVM_AVAILABLE=1" "-DAMD_LLVM_AVAILABLE=1" "-DGALLIVM_USE_ORCJIT=0" "-DHAVE_SPIRV_TOOLS" "-DUSE_LIBELF" "-DTHREAD_SANITIZER=0" "-DHAVE_RENDERDOC_INTEGRATION=false" "-Werror=implicit-function-declaration" "-Werror=missing-prototypes" "-Werror=return-type" "-Werror=empty-body" "-Werror=incompatible-pointer-types" "-Werror=int-conversion" "-Wimplicit-fallthrough" "-Wmisleading-indentation" "-Wno-missing-field-initializers" "-Wno-format-truncation" "-fno-math-errno" "-fno-trapping-math" "-Qunused-arguments" "-fno-common" "-Wno-unknown-pragmas" "-Wno-microsoft-enum-value" "-Wno-unused-function" "-Werror=thread-safety" "-ffunction-sections" "-fdata-sections" "-pipe" "-Wp,-D_FORTIFY_SOURCE=2" "-fstack-protector-strong" "-Wp,-D__USE_MINGW_ANSI_STDIO=1" "-march=core2" "-Werror=pointer-arith" "-Werror=vla" "-Werror=gnu-empty-initializer" "-Wgnu-pointer-arith" -MD -MQ src/util/libmesa_util.a.p/u_process.c.obj -MF "src/util/libmesa_util.a.p/u_process.c.obj.d" -o src/util/libmesa_util.a.p/u_process.c.obj "-c" ../../src/util/u_process.c
In file included from ../../src/util/u_process.c:28:
../../src/util/os_misc.h:151:24: error: use of undeclared identifier 'NULL'
  151 |    os_set_option(name, NULL, true);
      |                        ^~~~
1 error generated.

```
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
(cherry picked from commit 128dc57436)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Faith Ekstrand
97105a1526 nir: Consider if uses in nir_def_all_uses_*
These helpers check for if uses and want to return false, but iterating
with nir_foreach_use() means the if uses are never seen.

Cc: mesa-stable
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 3f870d62b0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Georg Lehmann
6462dac37d gallivm: don't optimize fadd(a, 0.0) with signed zero preserve
Fixes: 540e84bedb ("gallivm: Preserve -0 and nan")
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
(cherry picked from commit 284b4143f7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Pierre-Eric Pelloux-Prayer
d5f9f12123 radeonsi: account for outputs_written when updating spi_shader_col_format
Variants can modify which outputs get written so we must update
these fields otherwise spi_shader_col_format will be incorrect.

This can happen for instance with uniforms inlining:

   uniform bool depth_only;
   void main() {
      if (depth_only) return;
      ...
   }

When depth_only is true, this shader becomes empty after uniforms
inlining but spi_shader_col_format wasn't updated properly,
causing a hang.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14737
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 88986dcc9c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Pierre-Eric Pelloux-Prayer
a21f8ddf3d radeonsi: move spi_shader_*_format to si_shader_variant_info
Variants can affect these values so it's best to store them
in this struct.

No functional changes.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit da7c515783)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Ryan Zhang
57ef28bbde panvk: trivial fix to remove repeated assignment
Fixes: c0d9827 ("panvk: Use WB mappings for the global RW and executable memory pools")

Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit 760ac320be)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
kingstom.chen
5d8b53beea radv/rt: only run move_rt_instructions() for CPS shaders
move_rt_instructions() only makes sense for CPS recursive shaders, where
later rt_trace_ray calls can overwrite the current shader's RT system
values.

Running it on the function-call path can hoist load_hit_attrib_amd
above merged intersection writes, which corrupts any-hit
hitAttributeEXT. Move the pass into the existing CPS-only
non-intersection branch before nir_lower_shader_calls().

Fixes: c5d796c902 ("radv/rt: Use function call structure in NIR lowering")
Closes: #15074

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
(cherry picked from commit 5a7f4c62d8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
bb0c3b12b9 tu/drm/virtio: Fix GEM handle leak on failed dmabuf res_id lookup
When vdrm_handle_to_res_id fails in virtio_bo_init_dmabuf, the handle
obtained from vdrm_dmabuf_to_handle was leaked.
Closing the handle is safe despite the lack of vdrm refcounting
because dma_bo_lock is held and already-imported BOs return early.
At this point, we are the sole holder of the handle.

While here, use the local vdrm variable consistently.

Fixes: 6ca192f586 ("turnip: virtio: fix iova leak upon found already imported dmabuf")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit f2c89f0188)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
e984faec26 tu/drm/virtio: Fix GEM handle leak in tu_bo_init error path
In tu_bo_init, if growing the submit BO list fails, the GEM handle
must be closed. However, bo->gem_handle is only populated later
via compound assignment. Use the gem_handle parameter directly
to ensure the correct handle is closed and not leaked.

Fixes: d67d501af4 ("tu/drm/virtio: Switch to vdrm helper")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 316d9b0209)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
9ca275f29a tu/drm/virtio: Do not free iova from heap for lazy BOs
When initializing a BO using a lazy VMA, the iova is provided by
the sparse VMA and was not allocated from the device's VMA heap.
Avoid calling util_vma_heap_free in the error path for such BOs
to prevent heap corruption and potential double-frees.

Fixes: 88d001383a ("tu: Add support for a "lazy" sparse VMA")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit eb7897f57b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
b6640c1609 tu/drm/virtio: Avoid freeing zombified tu_sparse_vma
This is d3cedd2fa5 ("tu/drm: msm's has_set_iova codepath should avoid
freeing zombified tu_sparse_vma") but for virtio.

Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit f1366ca144)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
2aa33f41e6 tu/drm/virtio: Move set_iova into success path of virtio_bo_init_dmabuf
set_iova() was called unconditionally after tu_bo_init(), even on the
failure path where the BO has been zeroed. This would call set_iova()
with res_id 0 and a stale iova, corrupting the iova mapping.

Move set_iova() into the success branch so it is only called when
tu_bo_init() succeeds.

Fixes: db88a490b8 ("tu: Avoid extraneous set_iova")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 7a96bc3187)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Valentine Burley
28f24b21ee tu/drm/virtio: Add missing lock to virtio_bo_init_dmabuf
Lock vma mutex when freeing iova in virtio_bo_init_dmabuf.

Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
(cherry picked from commit 28e3fb7052)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
pal1000
0b54363202 clc: Fix static link with clang>=22
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15090

Backport-to: 26.0

Reviewed-by: Karol Herbst <kherbst@redhat.com>

Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
(cherry picked from commit 718afd787c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Mary Guillemard
79c1ff9077 nak: Do not allow load_helper_invocation reordering
load_helper_invocation cannot be reordered past a demote.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 7ece220f96 ("nak/nir: Lower systm values before lowering I/O")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit cba5841d61)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Mary Guillemard
d4c73521c5 nir/dead_cf: Add missing load_global_bounded handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: caa0854da8 ("nir: plumb load_global_bounded")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit bb6fc8cc20)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Mary Guillemard
7908d4e89f nir/dead_cf: Add missing load_ssbo_ir3 handling
Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 0092edfec0 ("nir/dead_cf: Do not remove loops with loads that can't be reordered")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 6013667d61)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Mike Blumenkrantz
d769e67c52 llvmpipe: fix color fbfetch
with the unlowering pass, there is no longer a separate gl_LastFragData variable,
so this workaround just breaks color outputs

fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data

cc: mesa-stable

(cherry picked from commit 4b2022a8f5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:32 +02:00
Mike Blumenkrantz
ad2384db21 mesa/renderbuffer: always add PIPE_BIND_SAMPLER_VIEW to rendering textures
this fixes expectations around e.g., using u_blitter to copy textures

cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 929eb9a021)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Nick Hamilton
b3197d6821 pvr: Fix for multiple attachments being assigned to the same tile buffer.
When the first attachment is assigned to a tile buffer, the buffer
alloc mask was not being updated. This means when a second attachment
is added to the same tile buffer it will be assigned the same offset
as the first which will lead to incorrect behaviour.

Fixes for dEQP-VK:
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.dedicated_allocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.dedicated_allocation.attachment.4.568

Fixes: a7de9dae6b ("pvr: Add routine for filling out usc_mrt_setup from dynamic rendering state")

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 96cfb1cb7f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Luigi Santivetti
5594cb90ac pvr: keep compiler resources in sync with attachments
Do not assume that the application always provides images for backing
attachments. The app can provide a superset of attachments of which
only some are actually backed with images.

We want to filter-out attachments that aren't meaningful for rendering
or sampling, and create compiler resources only for relevant ones.

Fix assert in CTS:
  pvr_arch_mrt.c:215: pvr_rogue_init_usc_mrt_setup: Assertion `att_format != VK_FORMAT_UNDEFINED' failed.

Seen in pipeline monolithic, for instance:
  dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_sint_d32_sfloat_s8_uint.random_127

Fixes: d549c1d045 ("pvr: add pipeline handling to use dynamic rendering info")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 5473ca3be3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Luigi Santivetti
ff902ec44b pvr: expose partial usc mrt init routine
Expose the routine in preparation for a later commit.

Backport-to: 26.0

Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 6b0fea938b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Simon Perretta
ee797f7159 pco: use vm/icm for tile buffer store coverage mask
Use the valid/input coverage masks for tile buffer store coverage masks
when running single/multi-sampled fragment shaders respectively.

Fixes: 297a0c269a ("pvr, pco: tile buffer support")

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reported-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 8eee60fa78)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
David Rosca
f1b6e15252 frontends/va: Fix leaks when create_video_codec fails
Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 089cd9d88e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
David Rosca
9c88f72f8c frontends/va: Fix leaking H264/5 PPS/SPS objects when decoder wasn't created
When destroying H264/5 decode context we check the profile from decoder to
free the H264/5 PPS/SPS objects, but decoder is only created when decoding
first frame so these objects will never get freed in case decoder is NULL.

Cc: mesa-stable
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 5134d37e7d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Rhys Perry
123322b8ae aco/tests: fix assembler/isel tests with LLVM 23
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit e2ebcba11b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Rhys Perry
4f859cf584 aco/tests: fix assembler tests with LLVM 22
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 0826685f1b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Iván Briano
b26c4014c6 brw: do not omit RT writes if dual_src_blend is on
Dual source blending when one of the sources is not written to leaves
those values undefined, but the other should still be valid.
By omitting unwritten outputs, we ended up not writing anything at all
for the case that OUT1 is written to but OUT0 is undefined.

Fixes new CTS tests: dEQP-VK.pipeline.*.blend.dual_source.undefined_output.first*

Cc: mesa-stable
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit fd556e54f6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Iván Briano
3794d34ad4 anv: fix anv_is_dual_src_blend_equation
Fixes new tests: dEQP-VK.pipeline.*.blend.dual_source.undefined_output.second*

Cc: mesa-stable
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 2ce8a9e1be)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Erik Faye-Lund
2953656a39 pan/lib: divide extent by tile-extend, not itself
Dividing this by itself is nonsensical, and just always gives us one.
That's obviously not what we want here.

But in this case we also know that the extent is divisible by the tile
extent, so there's no need for DIV_ROUND_UP, we can just divide.
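
A minimal sketch of the arithmetic behind this fix. The helper names are hypothetical and do not correspond to the actual pan/lib code; `DIV_ROUND_UP` is defined locally in the usual way.

```c
#include <assert.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical stand-in for the old computation: dividing a non-zero
 * extent by itself always yields 1, whatever the tile extent is. */
static inline unsigned tiles_by_self(unsigned extent, unsigned tile_extent)
{
   (void)tile_extent;
   return DIV_ROUND_UP(extent, extent);
}

/* Hypothetical stand-in for the fixed computation: the extent is known to
 * be a multiple of the tile extent, so a plain division is exact. */
static inline unsigned tiles_by_tile_extent(unsigned extent, unsigned tile_extent)
{
   assert(extent % tile_extent == 0);
   return extent / tile_extent;
}
```

For an extent of 64 and a tile extent of 16, the first helper returns 1 while the second returns 4, which is also what `DIV_ROUND_UP(64, 16)` gives.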

Fixes: e6f8cab698 ("pan/layout: Split the logic per modifier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 5280b80281)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Erik Faye-Lund
ad057adfd4 pan/lib: set srgb-flag for afrc render-targets
Without this, sRGB rendering to AFRC is broken.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit b0c32fcc66)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Erik Faye-Lund
0a5c858f28 pan/lib: do not try to use stencil-aspect of color attachment
We can't use the stencil-aspect of a color-attachment. That's going to
fail, so let's use the color-aspect instead. We already have it around
anyway.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 322aaa88c6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Erik Faye-Lund
4539139804 pan/genxml: remove non-existent YUV Enable for AFRC
This is controlled by the writeback-mode when using AFRC, not by a YUV
Enable field. This field doesn't exist in these, and according to the
spec should be zero.

Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 15e0ac0731)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Eric Engestrom
9e5e58aa38 .pick_status.json: Mark 538c3ee6c7 as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Samuel Pitoiset
111c2dacaa radv/amdgpu: free the VA range in case the BO allocation failed
Found by inspection.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 02628a5eb7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:31 +02:00
Robert Mader
eccd9df749 llvmpipe: Stop aligning height to raster block size for unbacked handles
This code path is usually used by lavapipe when importing dmabufs, not
for output.
The resulting size_required is then used to calculate the size
requirements for VkMemoryRequirements2 etc. Requiring a multiple of
LP_RASTER_BLOCK_SIZE (4) can eventually result in lavapipe rejecting
dmabuf imports.

An example is YUV420 at a resolution of 1680x1050 produced by GStreamer
1.28, e.g. from a screencast. In this case we currently compute a size
of 3235840, while other drivers like radv compute 3225600. The actual
size is 3227648, which satisfies the latter but not the former.

Removing the alignment brings lavapipe in line with other drivers.
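
The sizes quoted above can be reproduced with a small sketch. The 2048-byte luma stride and the three-plane 4:2:0 layout are assumptions chosen because they match the quoted numbers; the helper names are hypothetical and the real llvmpipe computation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a (a must be non-zero). */
static inline uint32_t align_up(uint32_t v, uint32_t a)
{
   return (v + a - 1) / a * a;
}

/* Size of a three-plane 4:2:0 image whose plane heights are each rounded
 * up to height_align rows. luma_stride is in bytes; each chroma plane is
 * assumed to use half of it. */
static inline uint32_t yuv420_size(uint32_t luma_stride, uint32_t height,
                                   uint32_t height_align)
{
   uint32_t luma_rows = align_up(height, height_align);
   uint32_t chroma_rows = align_up((height + 1) / 2, height_align);

   return luma_stride * luma_rows + 2 * (luma_stride / 2) * chroma_rows;
}
```

With these assumptions, `yuv420_size(2048, 1050, 4)` gives 3235840 and `yuv420_size(2048, 1050, 1)` gives 3225600, so a 3227648-byte buffer is large enough only when the height is not aligned.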

Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
(cherry picked from commit 0bbc26d2c4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
676c9c7f0e ci: changing .gitlab-ci.yml itself also means the container jobs must exist
Fixes: 4b2a4dce78 ("ci: Skip check-only container jobs for pre-merge")
(cherry picked from commit 4466914680)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric R. Smith
0629cbe042 panfrost: fix typos in architecture detection
The preprocessor symbol we want is `PAN_ARCH`, not `MALI_ARCH`.

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 3945421c17)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric R. Smith
675c1885b6 panfrost: fix texel buffer calculations
We were computing some positions using `void*` rather than pointers to
the appropriate structures. This caused bad pointers, the effect of
which depended on the current memory environment -- tests related to
texel buffers could pass or not depending on what other tests had run
previously.
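
A minimal sketch of this class of bug, not the actual panfrost code. The descriptor type and helper names are hypothetical. Arithmetic on `void*` is a GNU extension that steps one byte at a time, which is modelled here with `uint8_t*` so the sketch stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-byte descriptor, for illustration only. */
struct desc {
   uint32_t words[8];
};

/* Byte-granular step, equivalent to "base + index" on a void* under the
 * GNU extension: advances by index bytes, not index descriptors. */
static inline void *slot_bytewise(void *base, size_t index)
{
   return (uint8_t *)base + index;
}

/* Typed step: advances by index * sizeof(struct desc) bytes. */
static inline struct desc *slot_typed(void *base, size_t index)
{
   return (struct desc *)base + index;
}
```

For index 2, the byte-wise helper lands 2 bytes into the first descriptor, while the typed helper lands on the third descriptor, 64 bytes in.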

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
(cherry picked from commit 0142e2e5e3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Mary Guillemard
cb9c75e2ca nvk: Broacast viewport0 and scissor0 in case of FSR on Turing
On Turing, the hardware relies on the viewport index for FSR.
If not all viewports are defined, we will end up not rendering
anything when selecting the primitive shading rate.

This patch makes us broadcast viewport 0 and scissor 0, like the
proprietary driver does.

This fixes "dEQP-VK.mesh_shader.ext.builtin.primitive_shading_rate_*" on
Turing.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 2fb4aed9 ("nvk: Advertise VK_KHR_fragment_shading_rate")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit d00965651a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Mary Guillemard
fbebd62932 nvk: Move viewport and scissor emit to their own function
We are going to need to reuse those functions to fix FSR support on
Turing.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Cc: mesa-stable
(cherry picked from commit 56e31d8145)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Timothy Arceri
d55ff99ad1 util/driconf: add workarounds for Lethis - Path Of Progress
The game uses glGetUniformLocation() but specifies the wrong program id
for one of the uniforms. The shader programs both contain shaders with
a uniform of the same name but because they have a different number of
uniforms the returned uniform location does not match the expected uniform.

Here we add a workaround to force the uniform with the wrong get location
params to always have the location 0 so that it doesn't matter which
shader the application checks for the location.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14864
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 09393b33b2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Timothy Arceri
1f9129a359 mesa: add force_explicit_uniform_loc_zero workaround
Allows a uniform name to be passed to force_explicit_uniform_loc_zero
allowing us to set that uniform to an explicit location of zero.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 87ae5cab94)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Dave Airlie
1cfb84c060 st/mesh: handle mesh shader point size
This sets the per-vertex point size state correctly in the presence of mesh shaders.

(fixes line is just an educated pick)

Fixes: 51d6e4404a ("mesa: allow NULL for vertex shader when mesh pipeline")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit 5bfaf7536a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Icenowy Zheng
18606773fe vulkan/wsi/headless: properly use CPU images for CPU devices
Currently the headless WSI unconditionally uses DRM images as WSI
images, which isn't proper behavior for working with the lavapipe driver,
and leads to either an error or a crash (depending on whether udmabuf is
available).

Properly set up CPU images instead of DRM images for software-rendering
WSI devices.

This fixes (at least) `dEQP-VK.wsi.headless.swapchain.render.*` on
lavapipe.

Fixes: 90caf9bdbd ("vulkan/wsi/headless: drop the wsi_create_null_image_mem override")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
(cherry picked from commit 38cf1b3829)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Mike Blumenkrantz
59f2c9502d ntv: always emit const coord components for fbfetch loads
VUID-StandaloneSpirv-SubpassData-04660
  The (u,v) coordinates used for a SubpassData must be the <id> of a constant vector (0,0)

cc: mesa-stable

(cherry picked from commit 95b7a5b82b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Faith Ekstrand
4dc0c3ce94 nak: Report progress from nak_nir_rematerialize_load_const()
Fixes: 8fffcdb18b ("nak/nir: Re-materialize load_const instructions in use blocks")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit 381bc06c4a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
73d8c998fe .pick_status.json: Mark 384d128164 as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
cc617b7b58 .pick_status.json: Mark d38916d673 as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
8e1a3577ed .pick_status.json: Mark 32a818d11d as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
9c86791bde .pick_status.json: Mark 26b19e355f as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:30 +02:00
Eric Engestrom
2c25692309 .pick_status.json: Update to 48c086cb42
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
2026-04-01 11:45:29 +02:00
Eric Engestrom
1897b18965 docs: add sha sum for 26.0.3
2026-03-18 17:25:12 +01:00
Eric Engestrom
3f173c02d1 VERSION: bump for 26.0.3 2026-03-18 16:54:10 +01:00
Eric Engestrom
a04cff0266 docs: add release notes for 26.0.3 2026-03-18 16:54:10 +01:00
Mike Blumenkrantz
cbc172ecb2 mesa/st/samplerview: explicitly block releasing in-use samplerviews
st_texture_set_sampler_view() currently allows only one samplerview for
a given texobj per context. in a scenario where the same texobj is
bound multiple times with different samplerviews (e.g., SRGB) for the
same draw like

samplerviews[] = {view0, view1}

then st_texture_set_sampler_view() will release view0 while creating view1
before either view is actually set to the driver, and then the driver will explode

this is gross, but the best solution which avoids infinite memory ballooning
from bufferview offsets is to pass through the array of views during creation
to ensure that the cache doesn't try to prune a view it just created

caught by Left 4 Dead 2

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15045

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 3264adf863)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-18 16:04:34 +01:00
Mike Blumenkrantz
d123fcf112 mesa/st/sampler_view: eliminate st_sampler_view::srgb_skip_decode
this prevents matching existing samplerviews when instead the existing
samplerviews can just match formats

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit c186023e51)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-18 16:04:34 +01:00
Mike Blumenkrantz
9aeca4c8b1 mesa/st/sampler_view: use a local variable for texture sv format
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 64dd6bf8aa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-18 16:04:34 +01:00
Mike Blumenkrantz
6026a01827 mesa/st/sampler_view: use a local variable for buffer sv format
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 8fce32191e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-18 16:04:34 +01:00
Mike Blumenkrantz
56b5152152 mesa/st: make st_texture_get_current_sampler_view static
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit 22ed7c8230)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-18 16:04:33 +01:00
Connor Abbott
f7d31e8681 vtn: Fix vtn_mediump_upconvert_value() with transposed matrices
We can produce a transposed value sometimes, and we have to make sure
that val->transposed is also updated when that happens.

Noticed by inspection after the previous commit.

Cc: mesa-stable
(cherry picked from commit c13bdaaa40)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:23 +01:00
Connor Abbott
e234dcf62c vtn: Fix vtn_mediump_downconvert_value() for transposed matrices
We forgot to set the actual value. This meant that whenever we actually
needed to use the transposed matrix we would immediately segfault.

Cc: mesa-stable
(cherry picked from commit 048d2a0c68)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:22 +01:00
Mike Blumenkrantz
8af93595b9 lavapipe: fix mesh property exports
this should match how the spec actually functions

cc: mesa-stable

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
(cherry picked from commit 73feb138b6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:21 +01:00
Mike Blumenkrantz
62508b76b6 llvmpipe: save mesh shader when calling u_blitter
this otherwise causes the draw module to use mesh shaders when blitting

cc: mesa-stable

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
(cherry picked from commit 58dd7afa0e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:19 +01:00
Danylo Piliaiev
e5f61543fe tu/kgsl: Better detection of sparse support
Apparently a device can support KGSL_MEMFLAGS_VBO but not
IOCTL_KGSL_GPUMEM_BIND_RANGES or IOCTL_KGSL_GPU_AUX_COMMAND.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15006
Fixes: 71ef46717c ("tu/kgsl: Add support for sparse binding")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit f23e88108d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:18 +01:00
Natalie Vock
976efbe12a radv/rt: Fix shared ray query stack on top of application LDS
Since the stack pointer may wrap around the stack size in overflow
cases, traversal logic calculates the real stack pointer with
nir_umod_imm(b, stack, args->stack_entries * args->stack_stride).

For ray queries, "stack" was initialized to
"stack_base + local_invocation_idx * 4". This was completely broken, as
the umod would later delete the stack base completely and overwrite the
start of LDS, which belongs to the apps' shared memory.

Instead, add the stack base as a constant offset in the load/store_stack
callback. (This should also save 1 VALU per ray query)
Also, delete radv_ray_traversal_args::stack_base since it's unused now.
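
The failure mode described above can be reproduced with plain integer arithmetic. The sketch below is only an illustration: the sizes, the base offset, and the helper names `broken_address` and `fixed_address` are assumptions chosen for the example, not values or functions from radv.

```python
# Toy model of ray query stack addressing. All numbers are made up.
STACK_ENTRIES = 8
STACK_STRIDE = 64                          # bytes between entries (assumed)
STACK_SIZE = STACK_ENTRIES * STACK_STRIDE  # wrap-around size used by the modulo
STACK_BASE = 4096                          # LDS below this offset belongs to the app (assumed)


def broken_address(local_invocation_idx):
    # The base is folded into the stack pointer before the wrap-around modulo,
    # so the modulo strips it again.
    stack = STACK_BASE + local_invocation_idx * 4
    return stack % STACK_SIZE


def fixed_address(local_invocation_idx):
    # The pointer wraps on its own; the base is added afterwards as a
    # constant offset, mirroring the load/store_stack callback change.
    stack = local_invocation_idx * 4
    return STACK_BASE + stack % STACK_SIZE


print(broken_address(3))  # lands below STACK_BASE, inside application LDS
print(fixed_address(3))   # stays at or above STACK_BASE
```

With these assumed numbers the broken variant yields an address inside the application's region, while the fixed variant never drops below the base.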

Cc: mesa-stable
(cherry picked from commit b046eaf36d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:17 +01:00
David Rosca
842ab24923 radv/video: Fix coding pic_parameter_set_id in H264 slice header
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 25095cc393)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:03 +01:00
David Rosca
9794a3970a radv/video: Fix AV1 encode min tile size
Fixes: 37e71a5cb2 ("radv/video: add support for AV1 encoding")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 0450e4ff65)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:01 +01:00
Pavel Ondračka
0bf2abc5d5 r300: pad short vertex shaders to avoid R3xx hangs
Vertex shaders shorter than four instructions can hard-lock R3xx GPUs.
This seems to happen in combination with a small vertex count. This was
seen before, most notably with dummy shaders, but the earlier fix only
removed those dummy shaders, so some occurrences could still slip
through the cracks. Pad all vertex shaders to four instructions on R3xx.
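
A minimal sketch of the padding rule, assuming the shader is modelled as a list of instruction names and that padding is done with no-op instructions (the commit message does not say what the filler is; `pad_vertex_shader` and `"NOP"` are hypothetical):

```python
MIN_R3XX_VS_INSTRUCTIONS = 4  # minimum length named in the commit message


def pad_vertex_shader(instructions, is_r3xx, filler="NOP"):
    """Return a copy of the shader padded to the minimum length on R3xx."""
    if not is_r3xx:
        # Other chips are left untouched in this sketch.
        return list(instructions)
    missing = max(0, MIN_R3XX_VS_INSTRUCTIONS - len(instructions))
    return list(instructions) + [filler] * missing


print(pad_vertex_shader(["MOV"], is_r3xx=True))
print(pad_vertex_shader(["MOV"], is_r3xx=False))
```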

Reviewed-by: Filip Gawin <filip@gawin.net>
Fixes: c6aa639ba9 ("r300: skip draws instead of using a dummy vertex shader")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/337
(cherry picked from commit 9b12664b72)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:59:00 +01:00
Natalie Vock
0b4e497c7b radv/rt: Bump ray query stack base limit for GFX12
GFX12 encoding added one bit to the stack offset, doubling the limit on
the stack base offset that is possible to encode. In practice, this
always allows using bvh_stack_push* instructions on GFX12 since LDS is
still 64kB.

Cc: mesa-stable
Fixes: 59a39779 (radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode)
(cherry picked from commit 867d0b33b3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:59 +01:00
Mike Blumenkrantz
e7b12e5009 zink: work around drivers with broken mesh shader properties
some properties require setting MAX+1, but there are drivers which mistakenly
set 0

cc: mesa-stable

(cherry picked from commit c09d0018a3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:57 +01:00
Mary Guillemard
4a90af4a3b nvk/mme: Add missing nullcheck in nvk_mme_test_state_state
Needed for some FSR macro changes I want to test.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 7d6cc15ab8 ("nvk/mme: Add a unit test framework for driver macros")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 32895657b4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:56 +01:00
Rob Clark
3acc53d7a4 freedreno/drm: Fix bo_flush race
Once we've dropped the lock, we need to be referring to our own
temporary reference.

Fixes: 7b02bc6139 ("freedreno/drm: Drop fd_bo_fence")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
(cherry picked from commit f5d40636cd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:41 +01:00
Ryan Zhang
ae58d184b0 panvk/csf: use DEFERRED_FLUSH for fragment job cache flush
The correct dependence is cs_flush_caches.cs_defer.signal to
signal cs_sync32_set.cs_defer.wait in the occlusion query path.

Fixes: 443ddac ("panvk/csf: merge v10 and v11 paths in
issue_fragment_jobs")

Fixed: many random failures in VK-GL-CTS 1.4.4.2, e.g.
dEQP-VK.query_pool.occlusion_query.get_results_conservative_size_64_wait_query_without_availability_draw_points_clear_color

Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
(cherry picked from commit 93b58064f7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:40 +01:00
Eric Engestrom
5bbdfb0da5 .pick_status.json: Mark f2f792996d as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 18:58:38 +01:00
Faith Ekstrand
4b9aa545ea pan/compiler: Handle store_per_view_output in collect_varyings()
No idea how this wasn't blowing anything up before.

Fixes: 448b5e0225 ("panvk: implement multiview support")
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 425458c598)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 13:11:20 +01:00
Mike Blumenkrantz
c0a931e338 mesa/st: fix unlower_io_to_vars to work with mesh shaders
cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15034
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15040

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 3dbb7e896d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 13:11:18 +01:00
Mike Blumenkrantz
3dea1bd33d nir: fix nir_is_io_compact for mesh shaders
cc: mesa-stable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e604a8f617)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 13:11:15 +01:00
Yiwei Zhang
ac750a1e3c venus: force prime blit on Nvidia GPU
Normally Venus on Nvidia GPUs takes the prime blit path. The exception
is when KWin or any wlroots based compositors are used:
1. KWin and wlroots based compositors always add LINEAR to dmabuf
   feedback tranches assuming LINEAR can be handled by GPU drivers.
2. Venus + Virgl only sees the compositor injected LINEAR mod since
   Virgl doesn't support explicit modifiers on the driver side.
3. Nvidia GPUs don't support LINEAR color attachment, and it's too
   late to reject LINEAR mod when the native image path has already
   been taken instead of the prime image path.

Gamescope requires VK_EXT_physical_device_drm and its runtime doesn't
use standard WSI extensions, so venus can spoof without impacting it.

Cc: mesa-stable
(cherry picked from commit 1a302155ee)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:39:24 +01:00
Ian Douglas Scott
216cb0fca5 wsi/wayland: Use wl_fixes to destroy wl_registry
cc: mesa-stable

(cherry picked from commit 6641c891fd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:38:44 +01:00
Rob Clark
04d09c9524 freedreno/fdl: Use 4k alignment for tiled
Tiled-but-not-UBWC images should also have 4k alignment.

Cc: mesa-stable
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
(cherry picked from commit 3d4792d577)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:38:41 +01:00
Mike Blumenkrantz
2401bd8e0a zink: run opt_combine_stores when optimizing
this ensures stores to mesh builtins are vectorized, as required by
spec

cc: mesa-stable

(cherry picked from commit 20c65db45d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:38:39 +01:00
Mike Blumenkrantz
d4465aad0b zink: allow renderpass termination for clears with ZINK_DEBUG=rp and GENERAL layouts
this doesn't require a layout change

cc: mesa-stable

(cherry picked from commit eed3007588)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:38:38 +01:00
Mike Blumenkrantz
58950e2d06 zink: reapply zsbuf state after unordered blits
this otherwise creates desync if a renderpass continues after blit reordering

cc: mesa-stable

(cherry picked from commit 43a6928d62)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 12:38:36 +01:00
Eric Engestrom
641b7710e9 .pick_status.json: Update to 70a487adfb
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40488>
2026-03-17 11:26:05 +01:00
Eric Engestrom
804e4154a3 docs: add sha sum for 26.0.2
2026-03-12 13:12:05 +01:00
Eric Engestrom
efd5b779da VERSION: bump for 26.0.2 2026-03-12 12:56:33 +01:00
Eric Engestrom
3646899ffd docs: add release notes for 26.0.2 2026-03-12 12:56:33 +01:00
Mike Blumenkrantz
5cf88188bd egl/device: fix the fix for explicit sw rejection in non-sw EGL_PLATFORM=device
"explicit sw" means llvmpipe, which cannot be a real drm device. this requires also
returning only a single device so as to avoid leaking non-sw drivers

should fix LIBGL_ALWAYS_SOFTWARE=1 eglinfo

Fixes: 8a339cdebc ("egl: fix sw fallback rejection in non-sw EGL_PLATFORM=device")
(cherry picked from commit c9b2986607)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Job Noorman
1f89a0fb96 ir3: don't predicate vote_all/vote_any
These get lowered to control flow which isn't allowed inside predicated
blocks.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 39088571f0 ("ir3: add support for predication")
(cherry picked from commit 5e4a7d01fe)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Job Noorman
d78e309e4d ir3: update context builder after ir3_get_predicate
If we are currently inserting instructions after the src of the
predicate conversion, uses of the predicate will be inserted before its
def (the conversion). Fix this by updating the context builder to point
to after the conversion.
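
The ordering problem can be shown with a toy model in which a block is a list of strings and the builder cursor means "insert right after this instruction". This is a sketch of the idea only; `insert_after`, `build`, and the instruction strings are hypothetical and do not mirror the ir3_builder API.

```python
def insert_after(block, anchor, instr):
    """Insert instr immediately after anchor in the block."""
    block.insert(block.index(anchor) + 1, instr)


def build(update_cursor):
    block = ["src"]
    cursor = "src"  # the builder is currently inserting right after src

    # The predicate conversion is placed right after its source.
    conversion = "pred = to_predicate(src)"
    insert_after(block, "src", conversion)

    if update_cursor:
        # The fix: move the cursor so later instructions follow the conversion.
        cursor = conversion

    insert_after(block, cursor, "use(pred)")
    return block


print(build(update_cursor=False))  # use(pred) ends up before its definition
print(build(update_cursor=True))   # use(pred) follows the conversion
```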

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fda91b49d7 ("ir3: refactor builders to use ir3_builder API")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15043
(cherry picked from commit f88e8b778d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Samuel Pitoiset
0e31cb83ce radv: fix missing L2 cache invalidation with streamout on GFX12
COPY_DATA emitted from the CP isn't coherent with L2, in case the
buffer filled size needs to be copied.

This fixes rare and random flickering with Mafia 3 Definitive Edition
on RDNA4.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14697
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit d9420eed9e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Sagar Ghuge
ada713b32f anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921
The WA states that we need to set the maximum number of stackIDs per
DSS allocated from RT_DISPATCH_GLOBALS to 2048.

We can still throttle/control the CFE_STATE::StackID to be in the range
specified by the field.

Capping CFE_STATE::stackIDs to 2K by default does impact performance:
the more outstanding ray queries, the larger the working set and the
greater the impact on cache hit rate.

This affects performance on Xe2 and later:
* Boundary Benchmark:            36.2%
* Solar Bay extreme:             9.8%
* Hitman world of assassination: 3.9%

Fixes: c1a44e8d43 ("anv: force StackIDControl value for Wa_14021821874")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit cb423ee636)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Tapani Pälli
446fab4a4a anv: add handling for Wa_14026600921
This is the Xe3 version of the earlier workaround.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 840e6e855b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Tapani Pälli
77add2d8f2 intel/dev: update mesa_defs.json from workaround database
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit c75309c8f1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Faith Ekstrand
7054ea6d45 pan/bi: Be more careful about bit sizes in b2f lowering
Fixes: 21bdee7bcc ("pan/bi: Switch to lower_bool_to_bitsize")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 08c437f644)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Faith Ekstrand
9c2b19219a nir/lower_bool_to_bitsize: Make all bN_csel sources match
Previously, we assumed that the selector for bcsel could be whatever,
regardless of the bit sizes of the data and we'd just fix it in the
back-end.  This works okay for scalars but falls over the moment we
vectorize because all our vector handling assumes bit sizes match.
Since matching bit sizes is what the hardware wants anyway, it's better
to do the right thing in NIR and hope copy-propagation can fold in
conversions if needed.

Unfortunately, copy prop isn't that smart yet so this does hurt a bit:

    Instrs: 1193679 -> 1198086 (+0.37%); split: -0.06%, +0.43%
    CodeSize: 11915136 -> 11950592 (+0.30%); split: -0.05%, +0.34%
    Full: 160985 -> 160941 (-0.03%); split: -0.04%, +0.01%
    Estimated normalized CVT cycles: 4456.938557000181 -> 4480.876069000186 (+0.54%); split: -0.13%, +0.67%
    Estimated normalized SFU cycles: 6350.9375 -> 6392.21875 (+0.65%)
    Estimated normalized Load/Store cycles: 205773.0 -> 205795.0 (+0.01%)
    Maximum number of threads: 12864 -> 12863 (-0.01%)
    Number of spill instructions: 22487 -> 22489 (+0.01%)
    Number of fill instructions: 52179 -> 52219 (+0.08%)

Hurt shaders:

    google-meet-clvk/BgBlur
    google-meet-clvk/Relight
    parallel-rdp/small_subgroup
    parallel-rdp/small_uber_subgroup

The proper solution here is to teach copy-prop about this stuff so that
it can propagate swizzles into ALU ops when they're supported:
https://gitlab.freedesktop.org/panfrost/mesa/-/issues/265

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14945
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 3fd471dca5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Faith Ekstrand
740734ac72 etnaviv: Call lower_bool_to_int32 not to_bitsize
It calls both for some reason but never handles any other booleans than
32-bit.  This was probably a mistake.

Fixes: e63a7882a0 ("etnaviv: call nir_lower_bool_to_bitsize")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
(cherry picked from commit 6fb3995659)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:12 +01:00
Mary Guillemard
f550eb1903 vulkan: Do not override the shader_flags in case of no task shader
This should be doing an OR, not an assignment.
This fixes issues on NVK with mesh stages on DGC.
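
The difference between assigning and OR-ing a flag can be sketched as follows. The flag names and values are hypothetical and exist only for this illustration; they are not the real shader flag bits.

```python
# Hypothetical flag bits, for illustration only.
FLAG_ALREADY_SET = 1 << 0
FLAG_NO_TASK_SHADER = 1 << 1


def add_flag_by_assign(shader_flags):
    # Assignment discards whatever was set before.
    shader_flags = FLAG_NO_TASK_SHADER
    return shader_flags


def add_flag_by_or(shader_flags):
    # OR keeps the existing bits and adds the new one.
    shader_flags |= FLAG_NO_TASK_SHADER
    return shader_flags


print(bin(add_flag_by_assign(FLAG_ALREADY_SET)))
print(bin(add_flag_by_or(FLAG_ALREADY_SET)))
```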

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: 9308e8d90d ("vulkan: Add generic graphics and compute VkPipeline implementations")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 8f2eeee7ba)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Antonino Maniscalco
7b2af9e15a zink: don't care about generated gs output primitive
Zink uses the output primitive of the last vertex stage when deciding
the raster primitive. When we generate the gs the output primitive
depends on the raster primitive.

Not only does the generated gs output primitive have no value in choosing
the raster primitive, it can also get us stuck with the last raster
primitive, which is of course incorrect.

Ignore it for generated shaders.

Cc: mesa-stable
(cherry picked from commit d526bbc29b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Timothy Arceri
b5304ffef7 glx: guard glx_screen frontend_screen member
Guards workaround code with the same conditions as glx_screen's
frontend_screen member.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Fixes: 67eeee43e0 ("driconf: add a way to override GLX_CONTEXT_RESET_ISOLATION_BIT_ARB")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15021
(cherry picked from commit bd42f62b0f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Iván Briano
836a22d1a2 anv: don't try to fast clear D/S with multiview
If multiview is enabled on the render pass, baseLayer and layerCount
will be 0 and 1 respectively and throw us off.
We can still fast clear if view_mask == 1, but anything else hits the
BLORP_BATCH_NO_EMIT_DEPTH_STENCIL restriction.

Fixes: e488773b29 ("anv: Fast clear depth/stencil surface in vkCmdClearAttachments")

Signed-off-by: Iván Briano <ivan.briano@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
(cherry picked from commit 5d22f307d5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Ian Romanick
2a2dba1bc7 elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
shader-db:

Broadwell
total instructions in shared programs: 18607516 -> 18607530 (<.01%)
instructions in affected programs: 2095 -> 2109 (0.67%)
helped: 0 / HURT: 8

total cycles in shared programs: 955704436 -> 955702925 (<.01%)
cycles in affected programs: 34299 -> 32788 (-4.41%)
helped: 2 / HURT: 6

All Haswell and older platforms had similar results. (Haswell shown)
total instructions in shared programs: 16989200 -> 16989201 (<.01%)
instructions in affected programs: 461 -> 462 (0.22%)
helped: 0 / HURT: 1

total cycles in shared programs: 946537070 -> 946537035 (<.01%)
cycles in affected programs: 16378 -> 16343 (-0.21%)
helped: 1 / HURT: 0

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 64c60582b5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Ian Romanick
829e5ccc84 brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
This optimization was added in October 2013, and the error was only just
now discovered. Removing the SEL.G.SAT optimization affected zero
shader-db shaders, and it affected 9 fossil-db shaders for instruction
size only.

I haven't checked to see if any of the hurt shaders are helped by
!39987.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 17093041 -> 17093055 (<.01%)
instructions in affected programs: 2072 -> 2086 (0.68%)
helped: 0 / HURT: 8

total cycles in shared programs: 876739578 -> 876739154 (<.01%)
cycles in affected programs: 18946 -> 18522 (-2.24%)
helped: 2 / HURT: 6

fossil-db:

Lunar Lake
Totals:
Instrs: 906230557 -> 906240487 (+0.00%); split: -0.00%, +0.00%
CodeSize: 14498856128 -> 14499003168 (+0.00%); split: -0.00%, +0.00%
Send messages: 40667184 -> 40667205 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104068494103 -> 104068561943 (+0.00%); split: -0.00%, +0.00%
Max live registers: 189570192 -> 189570204 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 48157648 -> 48157552 (-0.00%)
Non SSA regs after NIR: 139823587 -> 139823016 (-0.00%); split: -0.00%, +0.00%

Totals from 9172 (0.46% of 1985212) affected shaders:
Instrs: 10774709 -> 10784639 (+0.09%); split: -0.00%, +0.09%
CodeSize: 177868384 -> 178015424 (+0.08%); split: -0.08%, +0.17%
Send messages: 311154 -> 311175 (+0.01%); split: -0.00%, +0.01%
Cycle count: 232471392 -> 232539232 (+0.03%); split: -0.15%, +0.18%
Max live registers: 1243549 -> 1243561 (+0.00%); split: -0.00%, +0.01%
Max dispatch width: 196672 -> 196576 (-0.05%)
Non SSA regs after NIR: 509663 -> 509092 (-0.11%); split: -0.19%, +0.08%

Test: piglit!1100
Reported-by: Georg Lehmann
Fixes: ca675b73d3 ("i965/fs: Optimize saturating SEL.L(E) with imm val >= 1.0.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
(cherry picked from commit 6c6c6ce054)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Eric R. Smith
63a6e0ffc9 pco: fix a typo in the check for optimization looping
The count isn't incremented anywhere else.

Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Fixes: f1b24267d2 ("pco: rework nir processing and passes")
(cherry picked from commit 8521051cfa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Pavel Ondračka
eea697b179 r300: disable clip-discard watermark for triangles
Commit 0d4aa5f55f introduced the watermark to optimize the guardband
state changes and always computed new_distance as MAX2(distance,
watermark).

That is correct for point/line paths where distance > 0, but it keeps a
non-zero discard distance alive when the next draw sets distance = 0
(triangles). This leaks wide point/line clip-discard state into later
triangle draws and can clip away large parts of geometry (as observed in
Sauerbraten). Only apply the watermark when distance > 0 and reset it to
zero otherwise so triangle draws disable clip-discard as intended.
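
The before/after behaviour can be sketched as below. Both functions return `(new_distance, new_watermark)`; how the driver actually stores and updates the watermark is an assumption of this sketch, and the function names are hypothetical.

```python
def old_discard_distance(distance, watermark):
    # Old behaviour: always take the maximum, even when distance is 0.
    new_distance = max(distance, watermark)
    return new_distance, new_distance


def new_discard_distance(distance, watermark):
    # New behaviour: only apply the watermark for point/line draws
    # (distance > 0); otherwise reset both so triangles disable clip-discard.
    if distance > 0:
        new_distance = max(distance, watermark)
        return new_distance, new_distance
    return 0, 0


# A wide point/line draw left a watermark of 6; the next draw is triangles.
print(old_discard_distance(0, 6))  # a non-zero distance leaks into the triangle draw
print(new_discard_distance(0, 6))  # distance and watermark are both reset
```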

Fixes: 0d4aa5f55f ("r300: pop-free clipping")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14959
(cherry picked from commit ce33f82f83)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Samuel Pitoiset
ecb7bf7b68 radv: fix local invocation index for mesh/task and quad derivatives on GFX12
It must be lowered.

This fixes
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.{mesh,task}.*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3c4cb16159)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Samuel Pitoiset
f858d2238e radv: fix a GPU hang with PS epilogs and secondary command buffers
If the secondary changes the fragment output state and if the same
PS epilog used before ExecuteCommands() is re-bound immediately after
that call, the PS epilog state wouldn't be re-emitted.

Apply the same change for VS prologs, although the logic is slightly
different and the bug shouldn't occur. The whole logic of secondaries
should be completely rewritten because it's definitely not robust.

This fixes a GPU hang in Where Winds Meet, see
https://github.com/doitsujin/dxvk/issues/5436.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1a00587c44)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Yiwei Zhang
2b6e7f0be2 lvp: avoid advertising dmabuf support for kms_swrast
Lavapipe relies on true udmabuf support for dmabuf export allocation.
This change aligns the behavior with both llvmpipe_allocate_memory_fd
and llvmpipe_import_memory_fd.

Fixes: 7d0a631f20 ("llvmpipe: export dmabuf caps for kms_swrast")
Reviewed-by: Lucas Fryzek <lfryzek@igalia.com>
(cherry picked from commit 5ab8c8a439)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Mel Henning
60e29a07c0 driconf: force_vk_vendor on No Man's Sky + NVK
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary@mary.zone>
(cherry picked from commit bfde63e4d8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Georg Lehmann
8f6c3dcc90 nir/opt_algebraic: fix frsq clamp pattern
The pattern is not NaN-correct.
Also make the pattern 32-bit only, because the constant is hard-coded
to FLT_MAX.

Fixes: 780b5c1037 ("nir/algebraic: Simplify some Inf and NaN avoidance code")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit ab773fc5d4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Danylo Piliaiev
4a4a86390b tu: Don't read .patch_input_gmem of unused attachment
There was duplicated code to set unscaled_input_fragcoord and a read
from VK_ATTACHMENT_UNUSED attachment, which incorrectly updated
builder->unscaled_input_fragcoord.

ubsan:
 tu_pipeline.cc:4734:44: runtime error: load of value 127, which is not a valid value for type 'bool'

Seen in:
 dEQP-VK.renderpasses.renderpass1.custom_resolve.monolithic.stencil_only_s8

Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit 81a76be861)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Danylo Piliaiev
ace5f6c88d tu: Store gmem attachments after custom resolve in dyn RP
For dynamic renderpass we created a fake second subpass,
which is used by CmdBeginCustomResolveEXT. However,
CmdBeginCustomResolveEXT doesn't trigger tile stores, and
attachments didn't know they should be stored after the fake
custom resolve subpass.

Fixes: 520e3f3a47 ("tu: Implement VK_EXT_custom_resolve")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit 67c54c4465)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Caio Oliveira
8355670805 nir: Fix constant folding for iadd_sat
Use INT_MIN instead of INT_MAX for underflow.
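
For reference, the expected semantics of a signed saturating add can be sketched as below. This is not the NIR constant-folding code; `iadd_sat` here is a standalone Python helper that relies on Python integers not overflowing.

```python
def iadd_sat(a, b, bits=32):
    """Signed saturating add for two's-complement integers of width bits."""
    int_min = -(1 << (bits - 1))
    int_max = (1 << (bits - 1)) - 1
    total = a + b  # exact, since Python integers are unbounded
    if total > int_max:
        return int_max  # overflow clamps to INT_MAX
    if total < int_min:
        return int_min  # underflow clamps to INT_MIN, not INT_MAX
    return total


print(iadd_sat(2**31 - 1, 1))
print(iadd_sat(-2**31, -1))
```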

Fixes: cc4b50b023 ("nir/opcodes: use u_overflow to fix incorrect checks")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pelloux@gmail.com>
(cherry picked from commit da57fbfb07)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:11 +01:00
Connor Abbott
725626858d tu: Fix setting will_be_resolved with MSRTSS
We were setting it on the user's attachments, which become
resolve/unresolve attachments, but it should be set on the color
and depth/stencil attachments.

Cc: mesa-stable
(cherry picked from commit d0be4ab2ab)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Connor Abbott
9a361c3801 tu: Set polygon mode when blitting
Noticed by inspection.

Cc: mesa-stable
(cherry picked from commit 1d167ffe77)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Yiwei Zhang
b88c8f37e4 pan: fix to not clear out of bitset range
Fixes: 617f0562bb ("pan: Use bitset instead of bool array in bi_find_loop_blocks")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit ec24d1afb6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Lucas Fryzek
d7ee1e68df vulkan/wsi: Check that xshm can be attached
Cc: mesa-stable
Co-authored-by: Carlos Lopez <clopez@igalia.com>
(cherry picked from commit 4933e60bc2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Lucas Fryzek
5f4eccf1fb glx: Check that xshm can be attached
Cc: mesa-stable
Co-authored-by: Carlos Lopez <clopez@igalia.com>
(cherry picked from commit a67af81944)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Lucas Fryzek
2c4c7fbfa9 egl/dri: Check that xshm can be attached
Cc: mesa-stable
Co-authored-by: Carlos Lopez <clopez@igalia.com>
(cherry picked from commit 5f481dd89d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Lucas Fryzek
23b88ba221 x11: Add helper util to check for xshm support
Cc: mesa-stable
(cherry picked from commit 9e1671dea9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Lucas Fryzek
8d313e5d1c drisw: Properly mark shmid as -1 when alloc fails
Cc: mesa-stable
(cherry picked from commit b93bf19d94)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Timothy Arceri
681de5a641 st/glsl_to_nir: update state var locations earlier
We need to update the state var locations before the
st_serialize_base_nir() calls otherwise
_mesa_optimize_state_parameters() can alter params such that
variants won't be able to find the correct match when calling
_mesa_lookup_state_param_idx().

Prior to 891d46f5 this worked because after failing to match
we would end up adding additional params back in that we had
just attempted to optimise.

Fixes: a6fcc2835e ("st/glsl_to_nir: make sure the variant has the correct locations set")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14837

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit 6c60f423b3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Timothy Arceri
0edb7039cb mesa/st: use same path for setting state ref locations
After the fix in a6fcc2835e we can now take the same path whether
allow_st_finalize_nir_twice is set or not.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit b59c3ac82a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Caio Oliveira
b2a34da82f spirv: Fix spec constant to handle Select for non-native floats
There was an assumption that if the instruction had non-native float
as a source, the first source would have such type.  This doesn't
hold for Select, and the code failed in two ways:

- The boolean source of Select was being converted to the non-native
  float type.

- The loop that resolves the bit-size for unsized operands would
  trip at `assert(i == 0)` because Select has more than one source.

Re-organize the code to track the types of the sources independently,
and fix both issues above.

Fixes: 90e1b12890 ("spirv: Add bfloat16 support to SpecConstantOp")
Fixes: 51d3c4c889 ("spirv: support float8 spec constant op")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 6affcb43a7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Caio Oliveira
4588b025c8 spirv: Pull constant source fixup to the existing loop
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit b0c3b20bff)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Caio Oliveira
0775d0f1b5 spirv: Refactor ALU opcode translation to take bit sizes
Only used by Convert operations, so just pass 0 from callers that
are not Convert and clarify that in the code.

Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 1c3c987d5c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Timothy Arceri
a66a9280fb glsl: add workaround for MDK2 HD
Allows a shader to compile that uses an embedded struct declaration,
which is not allowed in GLSL 1.20+.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14986
(cherry picked from commit f109bfc3f1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Rhys Perry
1d66a995ce nir/range_analysis: set deleted key
If (uintptr_t)&deleted_key is small enough, inserting entries into the
hash table might not work correctly.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit c0079e09ca)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Ian Romanick
0d52c7941e brw: Also check for ADDRESS file in update_for_reads
Like accumulators and ARF address registers, the virtual address
registers are not tracked in a way the defs analysis can know
about. This could actually be fixed, but that is future work.

Fixes: b110b06447 ("brw: introduce a new register type for the address register")
Suggested-by: Lionel
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 8624da56ee)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:10 +01:00
Ian Romanick
815691378b brw: Use brw_reg_is_arf in update_for_reads
brw_reg::nr encodes both which ARF it is and which instance of that
ARF. In other words, nr for acc0 and acc2 have some bits that say
BRW_ARF_ACCUMULATOR and some bits that say 0 vs 2. The previous test
would only detect acc0.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 366410e913)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
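The brw_reg_is_arf change above (815691378b) can be illustrated with a small sketch. The encoding below (kind in the high nibble, instance in the low nibble) and all `ex_` names and values are assumptions chosen for illustration, not the brw definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding: high nibble selects the ARF kind, low nibble the instance. */
enum {
    EX_ARF_NULL        = 0x00,
    EX_ARF_ADDRESS     = 0x10,
    EX_ARF_ACCUMULATOR = 0x20,
};

/* Exact comparison: only matches instance 0 of the given kind. */
static bool ex_is_arf_exact(uint8_t nr, uint8_t arf)
{
    return nr == arf;
}

/* Masked comparison: ignores the instance bits, so acc0, acc2, ... all match. */
static bool ex_is_arf_kind(uint8_t nr, uint8_t arf)
{
    return (nr & 0xF0) == arf;
}
```

Under these assumptions, a register number of `EX_ARF_ACCUMULATOR + 2` is missed by the exact comparison and caught by the masked one, which is the difference the commit message describes.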
Ian Romanick
f21bc439a1 brw: Don't mark_invalid in update_for_reads for non-VGRF destination
This can occur if NULL or an accumulator is an explicit destination.
update_for_reads still needs to process the sources.

v2: Pass a brw_reg to ::mark_invalid, and do the VGRF check in that one
place.

Fixes: 0d144821f0 ("intel/brw: Add a new def analysis pass")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a548466186)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Jose Maria Casanova Crespo
31ea1923de v3d: reject fast TLB blit when RT formats don't match
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.

So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.

Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.

Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.

Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 5454221cfb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
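The corruption described in the v3d fast TLB blit fix above (31ea1923de) can be reproduced on the CPU. This is a sketch only: the bit layout (R in the low 10 bits) and the helper names are assumptions, and nothing here models the actual TLB hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 10-10-10-2 texel; R is assumed to occupy the low bits. */
static uint32_t pack_rgb10a2(uint32_t r, uint32_t g, uint32_t b, uint32_t a)
{
    return (r & 0x3ff) | ((g & 0x3ff) << 10) | ((b & 0x3ff) << 20) | ((a & 0x3) << 30);
}

/* What glCopyImageSubData needs: the 32 bits are preserved as-is. */
static uint32_t copy_raw(uint32_t texel)
{
    return texel;
}

/* What happens when data unpacked as 10-10-10-2 is stored as 16-16:
 * the R and G channel values land in 16-bit fields and B/A are lost. */
static uint32_t copy_reinterpreted(uint32_t texel)
{
    uint32_t r = texel & 0x3ff;
    uint32_t g = (texel >> 10) & 0x3ff;
    return (r & 0xffff) | ((g & 0xffff) << 16);
}
```

For the texel (R=1, G=2, B=3, A=1) the raw copy keeps `0x40300801`, while the reinterpreting store produces `0x00020001`.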
Marek Olšák
f7d391f851 ac: set the correct number of Z planes for ALLOW_EXPCLEAR
This is an old driver bug that could cause Z corruption on gfx8-11.5.

v2: handle allow_expclear differently

Cc: mesa-stable

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v2)
(cherry picked from commit 4cfe08e583)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Karol Herbst
d29063d4f2 nir: fix nir_round_int_to_float for fp16
fp16 has a quite limited value range and with bigger integers
nir_round_int_to_float might return Inf where it shouldn't depending on
the rounding mode.

Fixes conversions half_rt[npz]_(u)?(int|long) CL CTS tests.

Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
(cherry picked from commit e1ed7de274)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
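For the fp16 rounding fix above (d29063d4f2), the key fact is that the largest finite half value is 65504 (bit pattern 0x7bff), so under round-toward-zero a large integer must clamp there and never become Inf. A minimal sketch for unsigned inputs; `u32_to_half_bits_rtz` is a hypothetical helper and not the NIR implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Convert an unsigned integer to IEEE half bits, rounding toward zero. */
static uint16_t u32_to_half_bits_rtz(uint32_t v)
{
    if (v == 0)
        return 0;

    /* Position of the most significant set bit (0..31). */
    int p = 31;
    while (!(v >> p))
        p--;

    /* 2^16 and above exceed 65504: clamp to the largest finite half. */
    if (p > 15)
        return 0x7bff;

    /* Keep the 10 bits below the leading one; dropping the rest truncates. */
    uint32_t mant = p >= 10 ? (v >> (p - 10)) : (v << (10 - p));
    return (uint16_t)(((uint32_t)(p + 15) << 10) | (mant & 0x3ff));
}
```

Values between 65505 and 65535 also truncate to 0x7bff here, since the dropped low bits are simply discarded.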
Karol Herbst
3d8ff40d58 nir: fix nir_alu_type_range_contains_type_range for fp16 to int
The special value "Inf" doesn't fit into an int and therefore we have to
clamp regardless of whether all the other values would fit. And because
f2u32 and f2u64 define out-of-range conversions as UB in nir, we need to
clamp.

This change should have no effect for non saturating conversions.

Fixes "conversions long_sat_*half" CL CTS tests

Cc: mesa-stable
Suggested-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 8e8fb2ebaa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
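The reasoning in the fp16-to-int fix above (3d8ff40d58) is that every finite half value fits in the integer type, but Inf does not, so a saturating conversion still has to clamp. A sketch using `float` as a stand-in for half, since C has no portable half type; `f2i64_sat` is hypothetical, and mapping NaN to 0 is an assumption modelled on OpenCL saturating conversions:

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Saturating float to int64 conversion (illustrative, not NIR code). */
static int64_t f2i64_sat(float x)
{
    if (x != x)
        return 0; /* NaN: assumed to convert to 0 */
    if (x >= 9223372036854775808.0f)
        return INT64_MAX; /* 2^63 and above, including +Inf */
    if (x <= -9223372036854775808.0f)
        return INT64_MIN; /* -2^63 and below, including -Inf */
    return (int64_t)x;    /* in range: the plain cast is well defined */
}
```

Without the two range checks, the cast of an infinite value would be undefined behaviour in C, which mirrors the out-of-range rule the commit message cites for NIR.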
Boris Brezillon
7ee55d3a5f pan/kmod: Allow mmap() on foreign buffers
If the BO comes from a different subsystem
(args.extra_flags & DRM_PANTHOR_BO_IS_IMPORTED), we should normally
add extra DMA_BUF_IOCTL_SYNC calls around CPU accesses to ensure the
CPU mapping consistency, but this is something we never worried about
(we've always assumed exporters were exposing uncached mappings with
NOP {begin,end}_cpu_access() implementations), and it worked fine until
now.

The long term plan is to hook up DMA_BUF_IOCTL_SYNC, but this requires
more work, and we need a quick fix that can be backported easily, hence
this revert+FIXME.

Fixes: b5e47ba894 ("pan/kmod: Add new helpers to sync BO CPU mappings")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14963
Closes: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/282
Closes: https://gitlab.freedesktop.org/wayland/weston/-/issues/1101
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Acked-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 30f1d5bab9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Pierre-Eric Pelloux-Prayer
b299e0323a mesa: don't wraparound st_context::work_counter
st->release_counter is initialized to 0, so if we happen to call
st_add_releasebuf with a non-NULL releasebuf when st->work_counter
is 0 due to wraparound in st_context_add_work, we might end up never
calling st_prune_releasebufs.

Since st_context_add_work and st_add_releasebuf both use work_counter
as a "some work was done" and don't care about the actual value, we
can remove the wraparound which will fix the buffer not being released
issue.

Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14955
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14499
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 10d32feae8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Christoph Pillmayer
d6ea90b495 pan/bi: Move FAUs to memory for memory phis
We can have PHIs like this: m10 = PHI u2, 3.
For these, insert_coupling_code will spill the sources but that doesn't
work properly for FAU values before this commit because bi_index_as_mem
asserts that index.type == BI_INDEX_NORMAL and we also can't look up an
FAU index in ctx->S_exit or ctx->remat.

Fixes: 6c64ad93 ("panfrost: spill registers in SSA form")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 8a4d8d490b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Christoph Pillmayer
955a82bb83 pan/bi: Fix coupling spill placement
In the following arrangement the old logic leads to the following:
                       |
                       v
            +----------+------------+
            |block5                 |
            |m815 = PHI m1034, m860 |<-----------+
            |343 = FMA.f32 ...      |            |
            +----------+------------+            |
                       |                         |
        +--------------+                         |
        |              |                         |
        v              v                         |
     +-----+        +-----+                      |
     |b6   |        |b7,8 |                      |
     |     |        |     |                      |
     +-----+        +--+--+                      |
        |    +---+     |    +---+                |
        +----|b9 +-----+----|b10+---+            |
        v    +---+          +---+   v            |
+-------+-------------+     +-------+---------+  |
|block12              |     |block11          |  |
|m882 = PHI m815, m860|     |m860 = MEMMOV 343+--+
+---------+-----------+     +-----------------+
          v

The spill of / into m860 (corresponding to 343) ends up in block11 when
insert_coupling_code(succ=block5, pred=block11) because of the memory
phi in block5. Later, in insert_coupling_code(block12, block9), we
reject inserting the spill after ca9c9957. As a result, m860 is
undefined along block5 -> block7,8 -> block9 -> block12.

When the spill position is chosen first, ctx->block is block5 so
choose_spill_position falsely returns the fallback position. The issue
can be fixed by explicitly passing the "current block".

Fixes: ca9c9957 ("pan: Avoid some redundant SSA spills")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 09e1ba28e5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Timothy Arceri
734e53c96b glsl: relax precision matching on unused uniforms ES
0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
allowed precision mismatches on uniforms, however if you lower precision on
16-bit consts, then this error triggers instead.

So here we relax the type matching and just make sure we match int vs
float.

Fixes: 0886be09 ("glsl: Allow precision mismatch on dead data with GLSL ES 1.00")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5337
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 73bc604128)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Pavel Ondračka
02f422a145 r300: disable HiZ for PIPE_FUNC_ALWAYS
AMD docs support this:
R5xx Acceleration v1.5 says safest handling for ZFUNC changes is to disable
HiZ except specific LESS/LEQUAL and GREATER/GEQUAL transitions.
ATI OpenGL Programming and Optimization Guide advises avoiding ALWAYS when
trying to benefit from HiZ so that would imply fglrx also disables HiZ
there.

On RV530 this fixes the following dEQPs:
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.43
dEQP-GLES2.functional.fragment_ops.interaction.basic_shader.74

Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8093
(cherry picked from commit b0f019f8cf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
David Rosca
c001485f3b vl: Also disable MPEG2 Main profile when mpeg12 decode is disabled
Fixes: f4959c16c8 ("meson: add mpeg12dec as a video-codec")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
(cherry picked from commit 55bab89951)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Jose Maria Casanova Crespo
7d25d214f5 vc4: flush write jobs before BO replacement in DISCARD_WHOLE path
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.

This is the same bug that was fixed in v3d in the previous commit.

Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit ecb6c5d555)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Jose Maria Casanova Crespo
fb8f81a1d8 v3d: flush write jobs before BO replacement in DISCARD_WHOLE path
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.

This is addressed by flushing all pending write jobs affecting the
resource before replacing its BO.

This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previous GPU clear. The tests are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*

Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
(cherry picked from commit 1eaa46da09)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Jesse Natalie
5bf2bcd81e d3d12: Fix importing external resources
Fixes: 97061dd7 ("d3d12: Add support for Xbox GDK.")
(cherry picked from commit 9e277ed2b6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Samuel Pitoiset
f1f583b3bc radv: fix copying images with different swizzle modes on SDMA7
Swizzle modes must match on SDMA7 (GFX12), and the micro tile mode
doesn't exist.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit cc21e61e43)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:09 +01:00
Rhys Perry
223af79274 aco: perform dce for blocks skipped for process_block()
We might need to DCE users of dead instructions removed by
process_block().

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 9e8ba10447 ("aco/vn: remove dead instructions early")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 17b18496f6)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Erik Faye-Lund
6e5d08c8e5 gallium/dri: set LIBVA_DRIVERS_PATH in devenv
We're setting this in the non-DRI codepath, but this was missed when we
started embedding the VA driver into libgallium. This means we no longer
were able to use VA-API from meson devenv, like we could before.

Fixes: 212d57f7e6 ("targets/va: Build va driver into libgallium when building with dri")
(cherry picked from commit 7e4744909b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Patrick Lerda
6f28830365 r600: fix cs atomic operations when the shader is called multiple times
This change is useful when the compute shader is called multiple
times with the atomic operations enabled. It fixes some data
coherency issues. This is done by moving
evergreen_emit_atomic_buffer_setup() after r600_flush_emit().

This change is also a partial fix for compute_shader.pipeline-compute-chain.
In this specific case, it makes the memory barrier work.

This change was tested on cayman and barts; it makes these tests
fully deterministic:
khr-gl4[2-6]/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
khr-gles31/core/shader_atomic_counters/advanced-usage-many-dispatches: fail pass
deqp-gles31/functional/synchronization/inter_call/without_memory_barrier/atomic_counter_dispatch_.*_calls_.*_invocations: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
(cherry picked from commit dad942b468)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Pavel Ondračka
b1775f660a r300: copy target when merging alpha output instruction
The alpha instruction always wrote to the same rendertarget as the rgb and the
original target was ignored (surprisingly the HW docs explicitly allow rgb and
alpha to write to different targets). This makes tesseract rendering a bit
better, but there are still some remaining issues.

Fixes: 1c2c4ddbd1 ("r300g: copy the compiler from r300c")
Reviewed-by: Filip Gawin <filip@gawin.net>
(cherry picked from commit 87a881558f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Pierre-Eric Pelloux-Prayer
f1a3aa4036 frontends/va: fix undefined ref error
When building with "-Dvideo-codecs=h264dec,h265dec,av1dec" va/encode.c
won't be built but it's still required because it's used from
picture.c

Fixes: c4f05bdf60 ("frontends/va: include picture_*.c based on selected codec")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 82a51ba9b3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Mike Blumenkrantz
020b960d03 radv: fix multiview fast clears
this was only clearing layer0 because it was ignoring the viewmask

cc: mesa-stable

(cherry picked from commit b8ee6f3d30)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Lionel Landwerlin
195fbfb2f1 anv: dirty all push constant stages in simple shader
Above we're reprogramming push constants as well as a couple of
workarounds that require dirtying all stages.

cmd_buffer->state.gfx.push_constant_stages was already set in the
above function.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4fa1eddb4c ("anv: optimize binding table flushing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14953
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 38ef732169)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Icenowy Zheng
049caf1696 pvr: only specially handle gfx subcmd for BeginQuery
Among all subcommands, only gfx subcommands are bound to a query pool,
other subcommands seem to need no special handling.

In addition, if a ResetQuery is done before BeginQuery, the last
subcommand will be an event one, which fails the current assert that
assumes it's a gfx one.

Change the assertion of the subcommand being a gfx one to an additional
check of whether the subcommand is a gfx one.

This fixes crash of Vulkan CTS 1.4.5.1 test
dEQP-VK.query_pool.discard.normal.no_depth.none.discard .

Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 5a497316d4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Benjamin Cheng
3611d60cea radeonsi/vcn: Use full pitch for pre-encode input
In 1f83e73145, the pre-encode input picture size was also reduced.
However it was recently discovered that VCN FW uses the input picture
pitch as the pitch for this, which means that previous change broke
pre-encode.

Fixes: 1f83e73145 ("radeonsi/vcn: Reduce allocated size for pre-encode recon pics")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
(cherry picked from commit 2b2b1d405a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Connor Abbott
ebfdb11193 ir3: Fix constlen trimming when more than one stage is trimmed
The logic is supposed to find the stage with the maximum constlen to
trim for each time we have to trim a stage. But by not resetting
max_constlen each time, we would "trim" the same stage repeatedly,
leaving us thinking the total is below the limit when it actually isn't.

Cc: mesa-stable
(cherry picked from commit ae8928b638)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
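The constlen trimming fix above (ebfdb11193) is about re-running the "find the largest stage" search from scratch on every pass. A minimal sketch of the corrected loop shape; `trim_constlen`, its parameters, and the idea of trimming a stage down to a fixed `safe_len` are assumptions for illustration and not the ir3 code:

```c
#include <assert.h>
#include <stddef.h>

/* Trim the largest stage to safe_len until the total fits under limit.
 * Returns the final total. */
static unsigned trim_constlen(unsigned *constlen, size_t n,
                              unsigned safe_len, unsigned limit)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += constlen[i];

    while (total > limit) {
        size_t max_stage = n;
        unsigned max_constlen = 0; /* reset on every pass */

        for (size_t i = 0; i < n; i++) {
            if (constlen[i] > safe_len && constlen[i] > max_constlen) {
                max_stage = i;
                max_constlen = constlen[i];
            }
        }

        if (max_stage == n)
            break; /* nothing left that can be trimmed */

        total -= constlen[max_stage] - safe_len;
        constlen[max_stage] = safe_len;
    }
    return total;
}
```

If `max_constlen` kept its value from the previous pass, no other stage could ever compare greater, so the same stage would be selected again and the running total would no longer reflect reality.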
Connor Abbott
5840bf0b1b tu: Use HW offset 0 in 3d loads/clears with FDM
The HW uses ViewportIndex to select which GRAS_BIN_FOVEAT offset to use.
For normal 3d draws, either the ViewportIndex equals the view/layer or
we make the offset the same for all viewports/layers, but we aren't
aware of this in the 3d path and we always use viewport 0.

Use the HW offset 0 when subtracting the HW offset. This is a bit of a
hack, but it should work. This fixes LOAD_OP_LOAD with FDM.

Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")
(cherry picked from commit 68c0031f56)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Lionel Landwerlin
98ec831d58 anv: add missing handling for attachment locations in secondaries
Fixes:
  dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.interaction_with_shader_object
  dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.remap_single_attachment_shader_object

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 095c470d25)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Luigi Santivetti
ec658ea317 zink: fix format conversion logic for the alpha emulation case
cc: mesa-stable

Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Fixes: 252bff0f ("zink: use real A8_UNORM when possible")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit 640bc3bc53)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Georg Lehmann
6c7f739b9d aco/insert_fp_mode: don't skip setting round for fract
fract(-FLT_MIN) is < 1.0 with rtz but 1.0 with rtne.

Fixes: 7212a75c5e ("aco/insert_fp_mode: exclude some instructions that will never round")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 8f4de30d05)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
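The one-line claim in the fract fix above (6c7f739b9d) can be checked by hand, assuming fract(x) is defined as x minus floor(x) and single precision is used:

```latex
\begin{align*}
\operatorname{fract}(x) &= x - \lfloor x \rfloor \\
x = -\mathrm{FLT\_MIN} = -2^{-126} &\;\Longrightarrow\; \lfloor x \rfloor = -1 \\
\operatorname{fract}(x) &= 1 - 2^{-126} \quad \text{(exact, not representable)} \\
1 - 2^{-24} \;<\; 1 - 2^{-126} \;&<\; 1 \quad \text{(neighbouring floats)} \\
\text{rtne:}\quad \operatorname{fract}(x) &\to 1.0 \\
\text{rtz:}\quad \operatorname{fract}(x) &\to 1 - 2^{-24} < 1.0
\end{align*}
```

The exact result lies far closer to 1.0 than to the float just below it, so round-to-nearest-even returns 1.0 while round-toward-zero returns the float just below 1.0.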
Mike Blumenkrantz
f1a64582dd st/bitmap: only release YUV samplerviews
this is consistent with other callers of st_get_sampler_views() and
avoids desync in the sv cache

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14934
Fixes: 73da0dcddc ("gallium: eliminate frontend refcounting from samplerviews")

Acked-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 1a5c660ef5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Mike Blumenkrantz
0486f6bf8b zink: add TRANSFER_WRITE -> HOST_READ sync to end of batch
this is technically required by spec, even though at a practical level
it probably has no effect

cc: mesa-stable

(cherry picked from commit 3ba275aa70)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Georg Lehmann
5a61e04572 ci: disable debian-ppc64el and debian-s390x
They failed a lot today, no idea why. But having flakes in pre merge CI sucks.

(cherry picked from commit b05271f16c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:08 +01:00
Eric Engestrom
6788336325 .pick_status.json: Update to 73dba1e151
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-03-11 23:21:07 +01:00
Eric Engestrom
b602b7f01e fixup! docs: add release notes for 26.0.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
2026-02-26 19:18:41 +01:00
Eric Engestrom
51fe0abad8 docs: add sha sum for 26.0.1
2026-02-25 17:35:18 +01:00
Eric Engestrom
bf5998be6e VERSION: bump for 26.0.1 2026-02-25 16:54:24 +01:00
Eric Engestrom
ed6f967681 docs: add release notes for 26.0.1 2026-02-25 16:54:23 +01:00
Leon Perianu
f1a2f841f2 pvr: fix format table properties duplicate
- RGBA8888_* is a preprocessor alias for R8G8B8A8_* in u_format.yaml.
- Both entries in the format tables collide on the same enum value, and
   RGBA8888 overwrites R8G8B8A8.
- The fix reverts to the version from commit 39e949434c, which used a
   different format that did not cause any collisions.

dEQP fixes:
   dEQP-VK.api.info.format_properties.r8g8b8a8_sint
   dEQP-VK.api.info.format_properties.r8g8b8a8_snorm
   dEQP-VK.api.info.format_properties.r8g8b8a8_uint
   dEQP-VK.api.info.format_properties.r8g8b8a8_unorm

Fixes: 9f740b26a6 ("pvr: Fix bugs in the format table")
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 7c6dbb099a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:24 +01:00
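The collision described in the pvr format table fix above (f1a2f841f2) follows from how C handles designated initializers: when two designators resolve to the same index, the later one overrides the earlier one. A sketch with hypothetical `ex_`/`EX_` names and made-up flag values, not the pvr tables:

```c
#include <assert.h>

enum ex_format {
    EX_FORMAT_R8G8B8A8_UNORM,
    EX_FORMAT_COUNT,
};

/* Preprocessor alias: both names expand to the same enum value. */
#define EX_FORMAT_RGBA8888_UNORM EX_FORMAT_R8G8B8A8_UNORM

struct ex_format_props {
    const char *name;
    unsigned flags;
};

/* Both designators hit index 0, so the second entry silently wins. */
static const struct ex_format_props ex_table[EX_FORMAT_COUNT] = {
    [EX_FORMAT_R8G8B8A8_UNORM] = { "R8G8B8A8_UNORM", 0x7 },
    [EX_FORMAT_RGBA8888_UNORM] = { "RGBA8888_UNORM", 0x1 },
};

static const struct ex_format_props *ex_lookup(enum ex_format f)
{
    return &ex_table[f];
}
```

Looking up `EX_FORMAT_R8G8B8A8_UNORM` here returns the properties written for the alias. GCC can flag this pattern with `-Woverride-init`.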
Mary Guillemard
977f0409b2 hk: Fix crash in hk_handle_passthrough_gs
We should be returning if no GS is needed and no GS shader is bound.
This fixes various segfaults introduced by the original fix.

Signed-off-by: Mary Guillemard <mary@mary.zone>
Fixes: e10f29399f ("hk: fix passthrough GS key invalidation")
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 6d040df750)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:24 +01:00
Lionel Landwerlin
03847a6f0b anv: remove snprintf for aux op transition
With perfetto that string is processed later leading to
use-after-free.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 413e169f45)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Lionel Landwerlin
77f3279c37 anv: dirty descriptors after blorp operations
Blorp emits 3DSTATE_BINDING_TABLE_POINTER_* instructions in 3D mode.

At the moment we're saved by the push constants reemitting the btp but
we'll drop that in the next commit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 533c748b34)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Samuel Pitoiset
87a238d829 radv: fix potential GPU hangs with secondaries on transfer queue
Cache flushes should be skipped on SDMA. In practice,
radv_emit_cache_flush() should only be called on GFX/ACE.

SDMA NOP packets are emitted in barriers directly.

This fixes recent VKCTS coverage
dEQP-VK.api.command_buffers.secondary_on_transfer_queue.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit c4d5090d69)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Samuel Pitoiset
6d7f2e3fbd ac/nir: fix writemask for dual source blending on GFX11+
This should definitely be an OR operation if MRT0 and MRT1 don't write
the same channels. This also requires setting the writemask manually
because when it's 0 (in case a dual-source output is missing), the
intrinsic computes the mask itself with the number of components.
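
A minimal sketch of the masking argument above, assuming 4-bit writemasks with one bit per RGBA channel. `dual_src_writemask` is a hypothetical name used for illustration, not the helper touched by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a Mesa function: combine the channel
 * writemasks of the two dual-source outputs (one bit per RGBA channel). */
static inline uint32_t
dual_src_writemask(uint32_t mrt0_mask, uint32_t mrt1_mask)
{
   assert(mrt0_mask <= 0xfu && mrt1_mask <= 0xfu);

   /* OR keeps every channel written by either output. AND would drop
    * channels that only one of them writes. */
   return mrt0_mask | mrt1_mask;
}
```

With MRT0 writing only RG (0x3) and MRT1 writing only BA (0xc), the OR gives 0xf, whereas an AND would give 0 and no channel would be exported.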

No fossils-db changes on NAVI33.

Fixes: 45d8cd037a ("ac/nir: rewrite ac_nir_lower_ps epilog to fix dual src blending with mono PS")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14878
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 2eb9420061)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Nick Hamilton
6edfd32388 pvr: Add support for fragment pass through shader
On the Rogue architecture add support for using a fragment passthrough
shader when there is no fragment shader present in a graphics
pipeline but the sample mask is required.

fix:
dEQP-VK.pipeline.monolithic.empty_fs.masked_samples

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Co-authored-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 14508b4c9a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Nick Hamilton
23cd27b129 pvr: Update CI fails list after render pass fixes
Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b87d995d32)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Jarred Davies
5000c31573 pvr: Add missing support for tile buffers to SPM EOT programs
Configure the EOT setup for SPM EOT programs so that the generated
programs load the tile buffer into the output buffer before doing
the emit

Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit d1f2ad17dd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Nick Hamilton
022e34b5f3 pvr: Add missing support for preserve attachments
In subpasses preserve attachments are not used by the subpass but
their contents must be preserved throughout the subpass.

Add a list for the preserve attachments info specified by a subpass
and, when determining a subpass attachment's total uses, check the
preserve attachments list and add its uses to the total.

Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0e01b9ef2d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Nick Hamilton
a965c71ec6 pvr: Rename pvr_render_input_attachment
The struct will also be used for preserve attachments in the next
commit.

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit e18670347a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Jarred Davies
fb1ba13c57 pvr: Fix allocating the required scratch buffer space for tile buffers
When calculating the dwords per pixel the output registers should
always be taken into account in addition to the number of tile buffers.

Fixes incorrect scratch buffer space calculation when both output
registers and tile buffers are emitted by a render.

Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71

Fixes: 3457f8083a ("pvr: Acquire scratch buffer on framebuffer creation.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit df445dc9b9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Nick Hamilton
9ad2e48819 pvr: Fix incorrect subpass merging optimisation
The subpass merging optimisation check for when subpasses are using
tile buffers was in the incorrect location.

The current check is in a function called from two places but only
the first of these should have been doing the optimisation check.

This was incorrectly affecting the number of renders that subpass
merging could avoid.

Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71

Fixes: 10b6a0d567 ("pvr: Add support for generating render pass hw setup data.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0640ac7e3b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Danylo Piliaiev
ac49313d06 ir3: Align TCS per-patch output to 64 bytes to prevent stale reads
Empirically, TCS outputs have to be aligned to 64 bytes,
otherwise stale data may be read in rare cases. The exact
reason is not clear, but tests and proprietary driver behavior
strongly point at the need for 64 byte alignment.

Fixes tessellation issues in at least "Conan Exiles" but likely in many
more cases.

CC: mesa-stable

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit 47251b2e2d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Rhys Perry
ba82a16761 aco: resolve hazards before calls
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 613b4fe407)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Rhys Perry
697fbaddb5 aco: reset all vgpr_used_by_vmem_ in resolve_all_gfx11
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 26.0
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit dfda890ae8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Benjamin Otte
d7607b6a4e lavapipe: Fix features for nonsubsampled ycbcr formats
The Vulkan spec says about VkFormatFeatureFlagBits:

  If a format does not incorporate chroma downsampling (it is
  not a “422” or “420” format) but the implementation supports
  sampler Y′CBCR conversion for this format, the implementation
  must set VK_FORMAT_FEATURE_MIDPOINT_CHROMA_SAMPLES_BIT.

Fixes: af062126ae
Signed-off-by: Benjamin Otte <otte@redhat.com>
(cherry picked from commit 0b6dd167ac)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Robert Mader
a163dec3ff lavapipe: enable dmabuf import for planar drm formats
Like e.g. NV12. This just requires some minor fixes around offset
handling.

(cherry picked from commit 0b6340fd94)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Mike Blumenkrantz
499a74569f zink: only do pre-sync transfer barrier after a renderpass
this is otherwise pointless and (for swapchain images) broken
(because they may never have acquired an image)

discovered by @valentine

cc: mesa-stable

(cherry picked from commit d47ba92d42)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Samuel Pitoiset
545509553a radv/meta: fix depth/stencil resolves with different regions
This is possible since VK_KHR_maintenance10.

This fixes new VKCTS coverage in
dEQP-VK.pipeline.*.multisample.m10_resolve.*.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ab6147e8ef)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Tapani Pälli
befb9af14b util: bring back fix to avoid strict aliasing bugs in xxhash
This is commit b9e163fa67 that got lost in xxhash upgrade 070bf8986c.

Fixes graphics artifacts seen in multiple workloads with Intel driver
when using clang compiler.

Fixes also CTS tests:

 dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_cubemap
 dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_3d
 dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_array
 dEQP-GLES31.functional.geometry_shading.layered.fragment_layer_2d_multisample_array

v2: pass arguments from meson.build instead of hardcoding
    (Eric Engestrom)

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14684
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14107
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13895
Fixes: 070bf8986c ("util: Upgrade xxhash.h to v0.8.3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit d2351b3d04)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
a457021d67 panvk: Also load output attachments with LOAD_OP_NONE+STORE_OP_NONE
We already had this for LOAD_OP_DONT_CARE but we also need it for
LOAD_OP_NONE.

Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 44ff0c4707)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
262e7feab9 panvk/jm: Refactor BeginRendering()
The old code was all out of order and made no sense.  There's a reason
it made no sense. It was wrong.  Cleaning this up fixes a solid 1/3 of
the remaining Bifrost CTS fails in CI.

Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 962d1f33e1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
e29de2865e panvk/preload: Stop assuming 32 registers
cc: mesa-stable

Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 3bb7d929f4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
37191db342 panvk: Create both Z/S descriptors, even for separate Z/S
The Vulkan spec says that aspects are ignored for Z/S attachments so we
shouldn't consider that as a factor when deciding whether or not to
create other aspect descriptors.  This will be irrelevant in a couple of
commits but we need it for the backport anyway.

Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 19ad26a8de)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
3a92074d8c nir/gather_info: Add support for panfrost tile load/store intrinsics
Fixes: 6fc1030e4f ("nir: Add some new panfrost fragment shader intrinsics")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 88ad8bc75d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:23 +01:00
Faith Ekstrand
897f5814ed pan/clear: Stop packing undefined bits in colors
The util code doesn't actually fill things with zeros so the high bits
are undefined.  If we really want things replicated, we need to mask off
just the bits we care about.
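
A rough sketch of the "mask, then replicate" idea, assuming the packed clear value occupies the low 8 or 16 bits of a 32-bit word and should be repeated across the whole word. `replicate_packed` is a hypothetical name, not the function changed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: repeat the low `bits` bits of `packed` across a
 * 32-bit word, ignoring whatever the packer left in the high bits. */
static inline uint32_t
replicate_packed(uint32_t packed, unsigned bits)
{
   assert(bits == 8 || bits == 16 || bits == 32);
   if (bits == 32)
      return packed;

   /* Mask first: the high bits are undefined and must not leak in. */
   uint32_t value = packed & ((1u << bits) - 1u);

   /* Double the populated width until all 32 bits are filled. */
   for (unsigned shift = bits; shift < 32; shift *= 2)
      value |= value << shift;

   return value;
}
```

Without the mask, garbage in the upper bits would be ORed into every replicated copy.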

Cc: mesa-stable
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
(cherry picked from commit 4d8551552e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Emma Anholt
61f09295f3 ir3/ra: Fix DOUBLE_ONLY limit pressure computation.
As the comment says, we want to limit our pressure based on underlying HW
reg file size, not max it out to HW reg file size.  This caused us to not
spill when we should when the HW reg size was bigger than the ISA reg file
size, leading to OOB writes in RA when it tried to allocate to the limit
pressure we spilled to.

Fixes segfaults in llama.cpp's test-backend-ops.

Fixes: e6e34883a9 ("ir3: Add wavesize control")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14846
(cherry picked from commit 0c6da326f8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
José Roberto de Souza
b7752ddbc3 intel/perf: Add HSW verx10 to intel_perf_query_result_write_mdapi()
HSW is verx10 75, and when we switched from ver to verx10 I forgot to add
case 75.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a097a3d214 ("intel/perf: Change mdapi switch cases from ver to verx")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14902
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit 48c685ee39)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Natalie Vock
71145cb846 radv/nir: Correctly handle workgroup sizes not aligned to 32
Since the stride is always 32 dwords, we need to treat the workgroup
size as multiples of that value. Using MAX2() only works for cases where
the workgroup size is less than 32, which was hit by some CTS with 1x1
workgroups.
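
To illustrate why MAX2() is not enough, here is a small sketch comparing it with rounding up to a multiple of the stride. The constant and helper names are assumptions made for the example, not the code from this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stride from the commit message: 32 dwords. */
#define STRIDE_DWORDS 32u

/* What MAX2(size, 32) computes. */
static inline uint32_t
max_u32(uint32_t a, uint32_t b)
{
   return a > b ? a : b;
}

/* Round `value` up to the next multiple of a power-of-two alignment. */
static inline uint32_t
align_pot(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For a workgroup size of 1 both give 32, which is the case the CTS exercised. For a size of 48, `max_u32(48, STRIDE_DWORDS)` stays at 48 while `align_pot(48, STRIDE_DWORDS)` yields 64, the multiple of the stride that the message asks for.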

Cc: mesa-stable
(cherry picked from commit b08f9f192c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Samuel Pitoiset
54293d4fdd radv: fix potential corruption after FMASK decompression on GFX6-8
While reworking image resolves completely in RADV, I found a very weird
bug where the only fix was to emit caches immediately after
decompressing the source resolve image (after FMASK_DECOMPRESS).

I struggled with this for a few hours and figured out that it was
something related to context rolls (i.e. as long as the context was
rolled, emitting the flushes immediately was required).

It turns out this is a known hardware bug on GFX6 that has a workaround
implemented in PAL. PAL only applies it on GFX6, but GFX7-8 are also
affected based on my testing. Note that RadeonSI flushes CB_META too.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 837078b8d5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Lionel Landwerlin
6f75431e98 anv: disable ccs modifier reporting when ccs modifiers are disabled
Reporting the modifiers when we're going to disable them in the back
hits various asserts in anv_image.c.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Helps: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14853
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4f38b5c888)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Lionel Landwerlin
5fa6c15b36 anv: apply the same ccs disabling for Xe3 than Xe2
The new compression scheme introduced in Xe2 also applies to Xe3, so
we're liable for the same bugs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2418c91537 ("anv/drirc: disable Xe2 CCS drm modifiers for GTK engine")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 4ac47f8dde)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Rhys Perry
849cdbcf72 aco: fix gfx6-8 store_scratch() with function calls
Might happen with radv_emulate_rt=true.

Fixes the_great_circle/a6079328b8df7712 with polaris10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: e006f68b11 ("aco/isel: Don't add scratch offset as gfx8- soffset if no offsets exist")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
(cherry picked from commit 75722da909)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
bfeb230f9b elk/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

No shader-db changes on any ELK platform. I suspect the problematic
cases only occur after scheduling has rearranged instructions. This is
likely the reason BRW didn't experience this problem until 09450faf.

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit da1fd9786b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
024c5de569 elk/cmod: Don't propagate from CMP to possible Inf + (-Inf)
This is a backport of BRW e26270249b.

shader-db:

All Intel platforms had similar results. (Broadwell shown)
total instructions in shared programs: 18623918 -> 18624594 (<.01%)
instructions in affected programs: 125179 -> 125855 (0.54%)
helped: 0 / HURT: 139

total cycles in shared programs: 957073100 -> 957072484 (<.01%)
cycles in affected programs: 16534168 -> 16533552 (<.01%)
helped: 42 / HURT: 68

Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit bdbfe8de4d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Ian Romanick
d68b3091b2 brw/cmod: Don't propagate from CMP to ADD if there is a write between
If either source of the CMP is modified before an appropriate ADD is
found, the ADD and the CMP will not have the same result.

shader-db:

Lunar Lake
total instructions in shared programs: 17098815 -> 17098818 (<.01%)
instructions in affected programs: 1187 -> 1190 (0.25%)
helped: 0 / HURT: 3

total cycles in shared programs: 876858960 -> 876858968 (<.01%)
cycles in affected programs: 6878 -> 6886 (0.12%)
helped: 0 / HURT: 1

Meteor Lake, DG2, Tiger Lake, Ice Lake, and Skylake had similar results. (Meteor Lake shown)
total instructions in shared programs: 20034973 -> 20034984 (<.01%)
instructions in affected programs: 4599 -> 4610 (0.24%)
helped: 0 / HURT: 11

total cycles in shared programs: 881033088 -> 881033108 (<.01%)
cycles in affected programs: 57872 -> 57892 (0.03%)
helped: 0 / HURT: 5

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 918873064 -> 918873269 (+0.00%)
CodeSize: 14747338416 -> 14747339360 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104141836677 -> 104141840371 (+0.00%); split: -0.00%, +0.00%

Totals from 205 (0.01% of 2011421) affected shaders:
Instrs: 290415 -> 290620 (+0.07%)
CodeSize: 4280704 -> 4281648 (+0.02%); split: -0.01%, +0.03%
Cycle count: 18166526 -> 18170220 (+0.02%); split: -0.00%, +0.02%

Closes: #14874
Fixes: 020b0055e7 ("i965/fs: Propagate conditional modifiers from compares to adds")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit d1614cd6db)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Frank Binns
e1ae66262f pvr: Fix alloc callbacks usage when freeing frame buffers
When creating frame buffers the alloc callbacks are used in the host
allocations, those same alloc callbacks need to be used when freeing
those allocations but are missing in some places causing the CTS to
report memory leaks in certain test cases.

Fixes: 146364ab9f ("pvr: add support for VK_KHR_dynamic_rendering")

fix:
dEQP-VK.api.object_management.alloc_callback_fail.framebuffer
dEQP-VK.api.object_management.single_alloc_callbacks.framebuffer

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 05ef9f01a7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Frank Binns
dea37352ba pvr/ci: move some timing out tests from fails to skips
Some of these test cases were already in the skip list.

Signed-off-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 74fd985c6c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Yiwei Zhang
22c27bd3ea venus: sync protocol for strict aliasing compliance
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124148 for details.

Backport log: headers are generated from the protocol used by 26.0
              branch with the strict aliasing fix

(cherry picked from commit 6411ee0c2d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Aitor Camacho
40cf87c35a kk: Fix graphics pipeline serialization
Bundles all graphics pipeline creation information required by Metal into
the vertex shader so we can later rebuild the pipeline. This allows us to
correctly create pipelines from caches that were loaded from files.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit cdbf7242f3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Aitor Camacho
358c8f257a kk: Move gfx pipeline data to the info struct within kk_shader
Makes it easier to serialize and add data specific to the gfx pipeline.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 99d8246d1c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:22 +01:00
Aitor Camacho
6152bf1cfb kk: Fix compute pipeline cache
When deserializing the compute shader from a blob, we need to recreate the
pipeline because the blob may have been loaded from file and therefore the
reference to the Metal resource will be invalid.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 75f6f46c0f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Aitor Camacho
024143cca4 kk: Correctly release pipeline handles at shader destroy
The condition to release Metal pipelines incorrectly checks which shader
stage we are destroying, leading to leaks when graphics pipelines had to
be released.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 622ebba476)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Aitor Camacho
9a63c20469 kk: Fix shader uint32_t value serialization
We need to write with blob_write_uint32 if we are using blob_read_uint32

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 15c0dd39fc)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Aitor Camacho
a3f872630b kk: Fill pipelineUUID
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit b350f059f5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Natalie Vock
6f88b07e5d radv: Initialize nir_lower_io_to_scalar progress variable
The NIR_PASS macro only overwrites this when the pass actually makes
progress. If the pass doesn't make progress, the variable stays
uninitialized.

Clang correctly spots this and warns about it.
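
The pattern can be sketched as follows. `RUN_PASS` is a simplified stand-in written for this example, assuming only the behavior described above (the flag is written solely when the pass reports progress); it is not the real NIR_PASS definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for a pass-runner macro: `progress` is only
 * written when the pass returns true, never cleared. */
#define RUN_PASS(progress, pass, arg) \
   do {                               \
      if (pass(arg))                  \
         (progress) = true;           \
   } while (0)

/* Two toy passes: one that never changes anything, one that always does. */
static bool noop_pass(int *v) { (void)v; return false; }
static bool inc_pass(int *v) { (*v)++; return true; }

static bool
progress_after(bool pass_makes_progress)
{
   int shader = 0;
   /* Must be initialized: if the pass makes no progress, the macro
    * leaves this variable untouched and it is read below. */
   bool progress = false;

   if (pass_makes_progress)
      RUN_PASS(progress, inc_pass, &shader);
   else
      RUN_PASS(progress, noop_pass, &shader);

   return progress;
}
```

Dropping the `= false` initializer makes the no-progress path read an indeterminate value, which is the kind of read the compiler warning points at.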

Cc: mesa-stable
(cherry picked from commit 47e4a68a83)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Mike Blumenkrantz
641a3ea0d9 zink: fix broken compiler assert
cc: mesa-stable

(cherry picked from commit 44f2c40830)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Natalie Vock
c4bb652871 radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode
The hardware only provides 13 bits for encoding the stack base (in
dwords). That translates to the stack base being required to be below
8192 dwords, or 32kB. It's possible to exceed this - LDS is 64kB after
all. Add an explicit check to make sure we don't end up with offsets
that overflow the hw's address fields. This fixes Metro Exodus Enhanced
Edition, which was using ray queries in a 1024-thread sized workgroup,
resulting in exactly 64kB of LDS being required for the stack.

This check isn't required for RT pipelines as we always use 32 or 64
wide workgroups with no other LDS used, so it's impossible to reach this
stack base limit.
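
A sketch of the range check described above, assuming the stack base is given in bytes and is dword-aligned. The names are hypothetical and only the 13-bit limit is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* From the commit message: 13 bits encode the stack base, in dwords. */
#define STACK_BASE_BITS 13u

/* Hypothetical check: can this LDS stack base be encoded at all? */
static inline bool
stack_base_encodable(uint32_t stack_base_bytes)
{
   assert(stack_base_bytes % 4 == 0);
   uint32_t stack_base_dwords = stack_base_bytes / 4;

   /* 2^13 = 8192 dwords = 32kB; anything at or above that overflows. */
   return stack_base_dwords < (1u << STACK_BASE_BITS);
}
```

A base of 32764 bytes (8191 dwords) still fits, while a base at exactly 32kB does not, so a caller would have to fall back to a path that does not use the instruction.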

Cc: mesa-stable
(cherry picked from commit 59a397793e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Olivia Lee
47caf527e3 hk: fix passthrough GS key invalidation
Just seeing that a passthrough GS was already bound is not sufficient to
know that it is a *matching* passthrough GS. If the application binds a
new VS that requires a different passthrough GS key than the previous
VS, then we need to bind a different passthrough GS.

Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
(cherry picked from commit e10f29399f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Janne Grunau
3397d3995f hk: Use aligned vector fill in hk_CmdFillBuffer if possible
30% faster with 16KB buffers, more than twice as fast with 8MB and
larger buffers.

(cherry picked from commit 651a321ee2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Janne Grunau
1ce5b5b361 asahi: Implement clear_buffer using libagx_fill*
Use either libagx_fill_uint4 or libagx_fill based on size and object
alignment for clear_sizes which are a power of two up to 16.
Reported fill rate for 256MB buffers on an M1 Ultra (G13D) in
gpu-ratemeter is 355 GB/s for 16 byte aligned buffers and 155 GB/s for
4 byte aligned buffers.

Signed-off-by: Janne Grunau <janne-fdr@jannau.net>
(cherry picked from commit 5c2d62c030)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Janne Grunau
37a269e303 asahi: Use GPU for buffer copies in resource_copy_region()
Use a compute shader to copy PIPE_BUFFERs. Based on hk's hk_cmd_copy().
For large copy sizes (>= 128MB) it achieves 3/4 of the available memory
bandwidth on an M1 Ultra (G13D). `gpu-ratemeter gl.bufbw` reports
~625 GB/s for 256MB buffer size. Apple specifies the memory bandwidth of
the M1 Ultra as 819.2 GB/s.

Signed-off-by: Janne Grunau <j@jannau.net>
(cherry picked from commit 3f5497ded8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Pavel Ondračka
0f21dc1bd4 mesa: implement FRAMEBUFFER_RENDERABLE internalformat query
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik-faye-lund@collabora.com>
Cc: mesa-stable
(cherry picked from commit 2b76f2e4a7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Jianxun Zhang
372c7545e6 anv: Limit modifier disabling workaround to specific GTK versions
The issue that caused us to add a switch to disable (Xe2) DRM modifiers
in 2418c91537 is fixed in GTK 4.20.3,
so we can enable the modifiers with this and newer GTK releases.

GTK https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/9164:
b2a42d5a6e Revert "vulkan: Wait for device to be idle before
           create/recreating swapchain"
270735a151 vulkan: Rework swapchain present implementation

The hex values represent the GTK version range: [4.0.0, 4.20.2] for
VK_MAKE_VERSION(), refer to:
f493f5c88d
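
For reference, the version range can be worked out as below. `make_version` mirrors the bit layout of VK_MAKE_VERSION so the example builds without Vulkan headers; `gtk_needs_modifier_workaround` is a hypothetical name, and how the driver actually obtains and compares the version is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
static inline uint32_t
make_version(uint32_t major, uint32_t minor, uint32_t patch)
{
   assert(major < (1u << 10) && minor < (1u << 10) && patch < (1u << 12));
   return (major << 22) | (minor << 12) | patch;
}

/* Hypothetical predicate: true for GTK versions in [4.0.0, 4.20.2]. */
static inline bool
gtk_needs_modifier_workaround(uint32_t gtk_version)
{
   return gtk_version >= make_version(4, 0, 0) &&
          gtk_version <= make_version(4, 20, 2);
}
```

With this layout 4.0.0 packs to 0x01000000 and 4.20.2 packs to 0x01014002, so 4.20.3 and later fall outside the range.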

Cc: mesa-stable
Signed-off-by: Jianxun Zhang <jianxun.zhang@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit df7d333656)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Wei Hao
f60b93b454 radeonsi: fix threaded shader compilation finishing after context is destroyed
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit ec6d077351)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Ryan Zhang
96ee7156af panvk: guard against NULL pointers to avoid crash
The VKCTS simulate_oom caselist deliberately makes allocations fail,
which caused panvk to crash. Guard the driver against dereferencing
NULL pointers.

Fixes: 598a8d9d11 ("panvk: Collect allocated push sets at the command level")

Fixed:
dEQP-VK.wsi.wayland.swapchain.simulate_oom.*

Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
(cherry picked from commit 418e6c4ed9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Lars-Ivar Hesselberg Simonsen
11db64a7d3 pan/genxml/v13: Fix HSR Prepass typo
Fixes: ece01443e1 ("pan/genxml: Add v13 definition")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 71500a32fa)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Lars-Ivar Hesselberg Simonsen
43b9a2ea5e panvk: Fix dcd_flags1 dirty bit
dcd_flags1 was not counted as dirty in case the color attachment map was
updated. This could lead to an outdated value for render_target_mask.

Fixes: a4670a67e0 ("panvk/csf: Set the correct DCD_FLAGS_1.render_rarget_mask")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 75242b1862)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Pavel Ondračka
98e2234eb4 r300: align macro-tiled stride-addressed textures in X
Odd macro-tile counts in X trigger flaky rendering/readback in
parallel stress runs with macro-tiled NPOT textures (for example
piglit draw-pixel-with-texture -auto -fbo).

When a texture is macro-tiled and uses stride addressing, align the
width to two macro tiles. This keeps the stride at an even number of
macro tiles in X and avoids the corruption without disabling
macrotiling.

I was not able to find anything about this in the docs.

Cc: mesa-stable
(cherry picked from commit 0763fb947a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Yiwei Zhang
7c0b97be73 venus: workaround a gcc-15 dead store elimination (DSE) bug
No issue with clang or gcc-14.x (or earlier versions). The issue only
shows up since gcc-15.1. The compiler somehow fails to consider those
cs helpers dereferencing the pointer from the pNext chain for reads,
and thus has falsely optimized away the pNext store. This change works
around this with a no-op memory clobber.
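
For readers unfamiliar with the technique, a no-op memory clobber can be sketched like this. The struct and function names are hypothetical, and the extended asm syntax is specific to GCC and Clang; the actual venus change may place the barrier differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical Vulkan-style chained struct. */
struct chained {
   const void *pNext;
   int value;
};

/* No-op compiler barrier (GCC/Clang extended asm). It emits no
 * instructions but tells the compiler that memory may be read or written,
 * so earlier stores cannot be discarded as dead. */
static inline void
compiler_memory_clobber(void)
{
   __asm__ volatile("" ::: "memory");
}

/* Link `ext` into `base`, then read `ext` back through the chain. */
static int
link_and_read(struct chained *base, const struct chained *ext)
{
   assert(base != NULL && ext != NULL);
   base->pNext = ext;
   compiler_memory_clobber(); /* keep the pNext store visible */
   return ((const struct chained *)base->pNext)->value;
}
```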

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13242
Cc: mesa-stable
(cherry picked from commit b0397b967d)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Timothy Arceri
6fb7a07c79 st/glsl_to_nir: make sure the variant has the correct locations set
For drivers that set allow_st_finalize_nir_twice, locations are set
when the variable is created. For variants, however, we update the
locations here in case a parameter opt pass or something else changed
them.

Fixes: 891d46f517 ("st/glsl_to_nir: dont add duplicate state tokens")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14837

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit a6fcc2835e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Timothy Arceri
d7fa6a4deb mesa: add _mesa_lookup_state_param_idx() helper
This will be used in the following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit c3aae0714c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:21 +01:00
Ian Romanick
0710d042db elk: Call nir_opt_algebraic_late in elk_postprocess_nir
Make sure that lowerings undone in elk_nir_optimize are reapplied.

No shader-db or fossil-db changes on any Intel platform. This is most
likely to impact either Gfx8 on ANV or Gfx7.5 on HASVK. I don't
fossil-db test either of those platforms.

I tried doing a similar thing here as is done in BRW (previous commit),
but that caused a couple Haswell shaders to fall off a performance
cliff:

total spills in shared programs: 8247 -> 8311 (0.78%)
spills in affected programs: 6 -> 70 (1066.67%)
helped: 0 / HURT: 2

total fills in shared programs: 8558 -> 8910 (4.11%)
fills in affected programs: 6 -> 358 (5866.67%)
helped: 0 / HURT: 2

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit df704bd38e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
1f65b768a1 brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
Move the call to nir_opt_algebraic_late after the last time
brw_nir_optimize might be called. nir_opt_algebraic_distribute_src_mods
works together with the late algebraic optimizations, so move it also.

shader-db:

Lunar Lake
total instructions in shared programs: 17081222 -> 17080842 (<.01%)
instructions in affected programs: 419931 -> 419551 (-0.09%)
helped: 545 / HURT: 826

total cycles in shared programs: 878437752 -> 879236226 (0.09%)
cycles in affected programs: 506003142 -> 506801616 (0.16%)
helped: 3091 / HURT: 3189

LOST:   18
GAINED: 16

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19994270 -> 19993231 (<.01%)
instructions in affected programs: 490499 -> 489460 (-0.21%)
helped: 660 / HURT: 800

total cycles in shared programs: 882498776 -> 882834186 (0.04%)
cycles in affected programs: 477858602 -> 478194012 (0.07%)
helped: 3458 / HURT: 3564

total fills in shared programs: 4371 -> 4370 (-0.02%)
fills in affected programs: 7 -> 6 (-14.29%)
helped: 1 / HURT: 0

LOST:   28
GAINED: 10

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
total instructions in shared programs: 19943849 -> 19942782 (<.01%)
instructions in affected programs: 467384 -> 466317 (-0.23%)
helped: 655 / HURT: 796

total cycles in shared programs: 860085674 -> 861410289 (0.15%)
cycles in affected programs: 426900998 -> 428225613 (0.31%)
helped: 3250 / HURT: 3441

LOST:   19
GAINED: 14

fossil-db:

Lunar Lake
Totals:
Instrs: 926472091 -> 926204838 (-0.03%); split: -0.04%, +0.01%
CodeSize: 14845921056 -> 14842776112 (-0.02%); split: -0.10%, +0.08%
Send messages: 41459570 -> 41459574 (+0.00%); split: -0.00%, +0.00%
Cycle count: 104481085069 -> 104583692712 (+0.10%); split: -0.14%, +0.24%
Spill count: 3454651 -> 3457340 (+0.08%); split: -0.15%, +0.23%
Fill count: 4958779 -> 4958487 (-0.01%); split: -0.46%, +0.45%
Max live registers: 193805970 -> 193839002 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 49114416 -> 49113776 (-0.00%); split: +0.01%, -0.01%
Non SSA regs after NIR: 142953905 -> 142800740 (-0.11%); split: -0.12%, +0.01%

Totals from 420256 (20.80% of 2020128) affected shaders:
Instrs: 448571327 -> 448304074 (-0.06%); split: -0.09%, +0.03%
CodeSize: 7312002800 -> 7308857856 (-0.04%); split: -0.21%, +0.17%
Send messages: 17716494 -> 17716498 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52178854998 -> 52281462641 (+0.20%); split: -0.28%, +0.48%
Spill count: 2945654 -> 2948343 (+0.09%); split: -0.17%, +0.26%
Fill count: 4404768 -> 4404476 (-0.01%); split: -0.51%, +0.51%
Max live registers: 60875448 -> 60908480 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 9455280 -> 9454640 (-0.01%); split: +0.04%, -0.04%
Non SSA regs after NIR: 60542740 -> 60389575 (-0.25%); split: -0.28%, +0.02%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 1000081384 -> 999726726 (-0.04%); split: -0.05%, +0.01%
CodeSize: 16764458080 -> 16761624256 (-0.02%); split: -0.09%, +0.07%
Subgroup size: 27599528 -> 27599544 (+0.00%)
Send messages: 45538933 -> 45538951 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93303830912 -> 93370118192 (+0.07%); split: -0.19%, +0.26%
Spill count: 3739306 -> 3739719 (+0.01%); split: -0.22%, +0.23%
Fill count: 5089719 -> 5083626 (-0.12%); split: -0.56%, +0.44%
Max live registers: 122041364 -> 122055848 (+0.01%); split: -0.00%, +0.01%
Max dispatch width: 38117296 -> 38127200 (+0.03%); split: +0.06%, -0.03%
Non SSA regs after NIR: 164296197 -> 164299306 (+0.00%); split: -0.01%, +0.01%

Totals from 338754 (14.82% of 2285730) affected shaders:
Instrs: 452723479 -> 452368821 (-0.08%); split: -0.10%, +0.03%
CodeSize: 7861878032 -> 7859044208 (-0.04%); split: -0.19%, +0.16%
Subgroup size: 16 -> 32 (+100.00%)
Send messages: 17050010 -> 17050028 (+0.00%); split: -0.00%, +0.00%
Cycle count: 52881801997 -> 52948089277 (+0.13%); split: -0.33%, +0.46%
Spill count: 3271458 -> 3271871 (+0.01%); split: -0.25%, +0.26%
Fill count: 4628422 -> 4622329 (-0.13%); split: -0.61%, +0.48%
Max live registers: 30738902 -> 30753386 (+0.05%); split: -0.01%, +0.06%
Max dispatch width: 4787264 -> 4797168 (+0.21%); split: +0.47%, -0.26%
Non SSA regs after NIR: 61748026 -> 61751135 (+0.01%); split: -0.03%, +0.03%

Tiger Lake
Totals:
Instrs: 1011068379 -> 1010977290 (-0.01%); split: -0.03%, +0.02%
CodeSize: 14197751744 -> 14197683040 (-0.00%); split: -0.07%, +0.07%
Send messages: 46431228 -> 46431220 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85066526419 -> 85085088071 (+0.02%); split: -0.16%, +0.18%
Spill count: 3853750 -> 3855185 (+0.04%); split: -0.15%, +0.19%
Fill count: 6716746 -> 6719594 (+0.04%); split: -0.25%, +0.29%
Max live registers: 122307387 -> 122326083 (+0.02%); split: -0.00%, +0.02%
Max dispatch width: 38009632 -> 38003280 (-0.02%); split: +0.03%, -0.05%
Non SSA regs after NIR: 158403572 -> 158415390 (+0.01%); split: -0.01%, +0.02%

Totals from 277728 (12.17% of 2281577) affected shaders:
Instrs: 349206856 -> 349115767 (-0.03%); split: -0.07%, +0.05%
CodeSize: 5042621104 -> 5042552400 (-0.00%); split: -0.20%, +0.20%
Send messages: 13132243 -> 13132235 (-0.00%); split: -0.00%, +0.00%
Cycle count: 36183327716 -> 36201889368 (+0.05%); split: -0.38%, +0.43%
Spill count: 2210072 -> 2211507 (+0.06%); split: -0.26%, +0.33%
Fill count: 4188439 -> 4191287 (+0.07%); split: -0.39%, +0.46%
Max live registers: 24956695 -> 24975391 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3948832 -> 3942480 (-0.16%); split: +0.32%, -0.48%
Non SSA regs after NIR: 45616425 -> 45628243 (+0.03%); split: -0.04%, +0.06%

Ice Lake
Totals:
Instrs: 1009584306 -> 1009411757 (-0.02%); split: -0.02%, +0.01%
CodeSize: 12593466880 -> 12592958096 (-0.00%); split: -0.01%, +0.01%
Send messages: 47274203 -> 47274171 (-0.00%); split: -0.00%, +0.00%
Cycle count: 84920281455 -> 84914027301 (-0.01%); split: -0.05%, +0.04%
Spill count: 2988523 -> 2986191 (-0.08%); split: -0.14%, +0.07%
Fill count: 5296078 -> 5288737 (-0.14%); split: -0.21%, +0.07%
Max live registers: 125429384 -> 125444786 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 41269072 -> 41267312 (-0.00%); split: +0.03%, -0.03%
Non SSA regs after NIR: 163223895 -> 163236623 (+0.01%); split: -0.01%, +0.02%

Totals from 243818 (10.45% of 2334244) affected shaders:
Instrs: 296953759 -> 296781210 (-0.06%); split: -0.08%, +0.02%
CodeSize: 3643224480 -> 3642715696 (-0.01%); split: -0.04%, +0.03%
Send messages: 11518671 -> 11518639 (-0.00%); split: -0.00%, +0.00%
Cycle count: 33065548412 -> 33059294258 (-0.02%); split: -0.13%, +0.11%
Spill count: 1346515 -> 1344183 (-0.17%); split: -0.32%, +0.15%
Fill count: 2537906 -> 2530565 (-0.29%); split: -0.43%, +0.14%
Max live registers: 21476776 -> 21492178 (+0.07%); split: -0.02%, +0.09%
Max dispatch width: 3727288 -> 3725528 (-0.05%); split: +0.31%, -0.35%
Non SSA regs after NIR: 41050474 -> 41063202 (+0.03%); split: -0.04%, +0.07%

Skylake
Totals:
Instrs: 513573157 -> 513462971 (-0.02%); split: -0.02%, +0.00%
CodeSize: 5950280672 -> 5950001392 (-0.00%); split: -0.01%, +0.00%
Send messages: 24909757 -> 24909758 (+0.00%); split: -0.00%, +0.00%
Cycle count: 57636102242 -> 57634726342 (-0.00%); split: -0.03%, +0.03%
Spill count: 627286 -> 627241 (-0.01%); split: -0.01%, +0.00%
Fill count: 837888 -> 837804 (-0.01%); split: -0.01%, +0.00%
Max live registers: 87272271 -> 87284192 (+0.01%); split: -0.00%, +0.02%
Max dispatch width: 32278832 -> 32271800 (-0.02%); split: +0.02%, -0.04%
Non SSA regs after NIR: 87387713 -> 87387614 (-0.00%); split: -0.00%, +0.00%

Totals from 177432 (10.30% of 1722906) affected shaders:
Instrs: 127170648 -> 127060462 (-0.09%); split: -0.10%, +0.01%
CodeSize: 1443406368 -> 1443127088 (-0.02%); split: -0.03%, +0.01%
Send messages: 5444220 -> 5444221 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15423028495 -> 15421652595 (-0.01%); split: -0.10%, +0.10%
Spill count: 235844 -> 235799 (-0.02%); split: -0.03%, +0.01%
Fill count: 333783 -> 333699 (-0.03%); split: -0.03%, +0.01%
Max live registers: 13765573 -> 13777494 (+0.09%); split: -0.01%, +0.10%
Max dispatch width: 3086880 -> 3079848 (-0.23%); split: +0.24%, -0.47%
Non SSA regs after NIR: 17623772 -> 17623673 (-0.00%); split: -0.00%, +0.00%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 11b96a84b0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
2874160ce2 brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
Make sure that lowerings undone in brw_nir_optimize are reapplied.

No shader-db changes on any Intel platform.

Why are there fossil-db changes on platforms that don't support ray tracing?

Lunar Lake
Totals:
Instrs: 926636441 -> 926636313 (-0.00%); split: -0.00%, +0.00%
Send messages: 41510729 -> 41510723 (-0.00%); split: -0.00%, +0.00%
Cycle count: 104509492613 -> 104509490569 (-0.00%); split: -0.00%, +0.00%
Max live registers: 193792922 -> 193792890 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 150091934 -> 150092170 (+0.00%); split: -0.00%, +0.00%

Totals from 10 (0.00% of 2020428) affected shaders:
Instrs: 8142 -> 8014 (-1.57%); split: -3.14%, +1.57%
Send messages: 192 -> 186 (-3.12%); split: -7.29%, +4.17%
Cycle count: 131892 -> 129848 (-1.55%); split: -6.93%, +5.38%
Max live registers: 1442 -> 1410 (-2.22%); split: -3.05%, +0.83%
Non SSA regs after NIR: 950 -> 1186 (+24.84%); split: -26.95%, +51.79%

Meteor Lake
Totals:
Instrs: 1000805547 -> 1000805543 (-0.00%); split: -0.00%, +0.00%
Cycle count: 93131592265 -> 93131619619 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122081268 -> 122081244 (-0.00%); split: -0.00%, +0.00%

Totals from 16 (0.00% of 2286241) affected shaders:
Instrs: 18652 -> 18648 (-0.02%); split: -1.39%, +1.37%
Cycle count: 369520 -> 396874 (+7.40%); split: -2.94%, +10.34%
Max live registers: 1350 -> 1326 (-1.78%); split: -4.15%, +2.37%

DG2
Totals:
Instrs: 999834626 -> 999834651 (+0.00%); split: -0.00%, +0.00%
Send messages: 45719398 -> 45719403 (+0.00%); split: -0.00%, +0.00%
Cycle count: 93118238139 -> 93118269557 (+0.00%); split: -0.00%, +0.00%
Max live registers: 122098944 -> 122098936 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 169413734 -> 169413661 (-0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2286795) affected shaders:
Instrs: 18799 -> 18824 (+0.13%); split: -1.04%, +1.18%
Send messages: 492 -> 497 (+1.02%); split: -2.44%, +3.46%
Cycle count: 352838 -> 384256 (+8.90%); split: -1.08%, +9.98%
Max live registers: 1237 -> 1229 (-0.65%); split: -2.91%, +2.26%
Non SSA regs after NIR: 2191 -> 2118 (-3.33%); split: -20.86%, +17.53%

Tiger Lake
Totals:
Instrs: 1011816778 -> 1011816714 (-0.00%); split: -0.00%, +0.00%
Send messages: 46515289 -> 46515285 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85148902406 -> 85148894668 (-0.00%); split: -0.00%, +0.00%
Max live registers: 122362180 -> 122362172 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 38036160 -> 38036176 (+0.00%)
Non SSA regs after NIR: 160317521 -> 160317649 (+0.00%); split: -0.00%, +0.00%

Totals from 6 (0.00% of 2282318) affected shaders:
Instrs: 9204 -> 9140 (-0.70%); split: -1.43%, +0.74%
Send messages: 258 -> 254 (-1.55%); split: -3.10%, +1.55%
Cycle count: 287652 -> 279914 (-2.69%); split: -3.29%, +0.60%
Max live registers: 552 -> 544 (-1.45%); split: -2.90%, +1.45%
Max dispatch width: 48 -> 64 (+33.33%)
Non SSA regs after NIR: 914 -> 1042 (+14.00%); split: -14.00%, +28.01%

Ice Lake
Totals:
Instrs: 1012203285 -> 1012203249 (-0.00%); split: -0.00%, +0.00%
Send messages: 47358859 -> 47358858 (-0.00%); split: -0.00%, +0.00%
Cycle count: 85112165276 -> 85112171905 (+0.00%); split: -0.00%, +0.00%
Max live registers: 125545002 -> 125544992 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 41335696 -> 41335656 (-0.00%)
Non SSA regs after NIR: 166448597 -> 166448602 (+0.00%); split: -0.00%, +0.00%

Totals from 13 (0.00% of 2335519) affected shaders:
Instrs: 16486 -> 16450 (-0.22%); split: -1.67%, +1.46%
Send messages: 368 -> 367 (-0.27%); split: -4.89%, +4.62%
Cycle count: 347643 -> 354272 (+1.91%); split: -1.34%, +3.25%
Max live registers: 1104 -> 1094 (-0.91%); split: -3.80%, +2.90%
Max dispatch width: 192 -> 152 (-20.83%)
Non SSA regs after NIR: 2100 -> 2105 (+0.24%); split: -21.76%, +22.00%

Skylake
Totals:
Instrs: 504548665 -> 504548057 (-0.00%); split: -0.00%, +0.00%
Send messages: 24479148 -> 24479118 (-0.00%); split: -0.00%, +0.00%
Cycle count: 57575198140 -> 57575179256 (-0.00%); split: -0.00%, +0.00%
Max live registers: 85570671 -> 85570575 (-0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 85097646 -> 85098486 (+0.00%); split: -0.00%, +0.00%

Totals from 22 (0.00% of 1703671) affected shaders:
Instrs: 19866 -> 19258 (-3.06%); split: -3.72%, +0.66%
Send messages: 464 -> 434 (-6.47%); split: -8.19%, +1.72%
Cycle count: 250854 -> 231970 (-7.53%); split: -9.23%, +1.70%
Max live registers: 2024 -> 1928 (-4.74%); split: -5.53%, +0.79%
Non SSA regs after NIR: 2498 -> 3338 (+33.63%); split: -8.33%, +41.95%

Fixes: 442daeb54a ("nir/opt_algebraic: use fcanonicalize")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 5af0b8bd09)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Konstantin Seurer
3ffe4b257b vulkan/cmd_queue: Fixup stride for multi draws
Copying the draw infos packs them so the stride needs to be set to the
struct size.
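
A small sketch of why the stride changes. The record type is modelled loosely on VkMultiDrawInfoEXT and the function is hypothetical; it is not the vulkan/cmd_queue code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical draw record, modelled on VkMultiDrawInfoEXT. */
struct draw_info {
   uint32_t first_vertex;
   uint32_t vertex_count;
};

/* Copy `count` records that sit `src_stride` bytes apart into the tightly
 * packed array `dst`, and return the stride that matches the copy. */
static uint32_t
pack_draw_infos(struct draw_info *dst, const void *src,
                uint32_t count, uint32_t src_stride)
{
   assert(src_stride >= sizeof(struct draw_info));
   const uint8_t *bytes = src;
   for (uint32_t i = 0; i < count; i++)
      memcpy(&dst[i], bytes + (size_t)i * src_stride, sizeof(dst[i]));

   /* The copy is packed, so the caller's stride no longer applies. */
   return (uint32_t)sizeof(struct draw_info);
}
```

Replaying the packed copy with the caller's original stride would read past the first record into unrelated data.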

cc: mesa-stable

Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
(cherry picked from commit be5ab80de1)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
45ce75f3bc nir: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.
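
The general pattern behind this change (small buffer on the stack, heap fallback for large counts) can be sketched as below. This is not Mesa's STACK_ARRAY macro; the threshold and the function are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_COUNT 8 /* illustrative threshold */

/* Sum `count` field sizes using scratch storage that lives on the stack
 * for small counts and on the heap for large ones. Returns -1 on
 * allocation failure. */
static int64_t
sum_field_sizes(const uint32_t *sizes, size_t count)
{
   assert(sizes != NULL || count == 0);

   uint32_t stack_buf[SMALL_COUNT];
   uint32_t *scratch = stack_buf;

   if (count > SMALL_COUNT) {
      scratch = malloc(count * sizeof(*scratch));
      if (scratch == NULL)
         return -1;
   }

   int64_t total = 0;
   for (size_t i = 0; i < count; i++) {
      scratch[i] = sizes[i];
      total += scratch[i];
   }

   /* Only the heap fallback needs to be released. */
   if (scratch != stack_buf)
      free(scratch);
   return total;
}
```

Unlike a VLA or alloca, a shader-controlled count can only grow the heap allocation, never the stack frame.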

Fixes: c11833ab24 ("nir,spirv: Rework function calls")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9017d37e84)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Ian Romanick
978fd42b4b spirv: Use STACK_ARRAY instead of NIR_VLA
The number of fields comes from the shader, so it could be a value large
enough that using alloca would be problematic.

Fixes: 2a023f30a6 ("nir/spirv: Add basic support for types")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 3da828d2dd)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Jesse Natalie
5048a2ed1c meson: Include DirectX-Headers dependency for all VK Windows builds
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14839
Cc: mesa-stable
Reviewed-by: Eric Engestrom <eric@igalia.com>
(cherry picked from commit f0066a3150)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Alyssa Rosenzweig
806f0a35a4 brw: drop buggy SLM optimization
This was incorrect for OpenCL due to the possibility of variable shared memory
existing despite shared_size == 0. Fortunately the optimization it was trying to
do should be done in NIR via nir_opt_barrier_modes so we can just drop the brw
code and move on with our merry lives. Fixes OpenCL tests on Iris:

non_uniform_work_group non_uniform_3d_barriers
basic async_strided_copy_local_to_global

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit bd5ebbb2f8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Anna Maniscalco
6278aa107a freedreno/common: set has_astc_hdr true for a7xx targets
Fixes: dc07473524 ("freedreno/fdl: add astc hdr formats")
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
(cherry picked from commit e959dd0dd7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Daniel Schürmann
7fda785505 nir/clone: Fix cloning indirect call instructions
Fixes: bb40284f76 ('nir: Add indirect calls')
(cherry picked from commit 88b4221519)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Samuel Pitoiset
a2ad1789fa ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.

is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 1be4ffdff9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Eric Engestrom
88e238de07 .pick_status.json: Mark 7dd7731ac7 as denominated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Aitor Camacho
4229b57783 wsi/metal: Expose additional color spaces if instance extension enabled
Caught through VVL test NegativeWsi.SwapchainImageFormatList. The test
would try to create a swapchain with a color space from
VK_EXT_swapchain_colorspace without enabling the extension. This is
because wsi would expose those color spaces even when the extension was
not enabled.

Fixes: fd045ac99c ("wsi/metal: add support for color spaces")

Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit e6f118f12b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
1994d93542 isl: fix 32bit math with 4GB buffer size
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit d956957153)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
af97f7fe38 anv: add missing constant cache invalidation for descriptor buffers
A descriptor buffer promoted to push constants requires a constant
cache invalidation if it is modified on the device.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 42b70cf05a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
12da136c07 anv: fix nested command buffer relocations
When executing 3 command buffers :

vkCmdExecuteCommands(CB_B, CB_C);
vkCmdExecuteCommands(CB_A, CB_B);

vkQueueSubmit(CB_A);

We're not correctly transferring the relocations of CB_C from CB_B to
CB_A.
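
A toy model of the expected behaviour, assuming a command buffer is reduced to a list of referenced buffer handles. All types and helpers here are hypothetical and much simpler than anv's relocation tracking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RELOCS 16 /* illustrative fixed capacity */

/* Hypothetical command buffer that only tracks the buffers it references. */
struct cmd_buffer {
   uint32_t relocs[MAX_RELOCS];
   size_t num_relocs;
};

static void
add_reloc(struct cmd_buffer *cb, uint32_t handle)
{
   assert(cb->num_relocs < MAX_RELOCS);
   cb->relocs[cb->num_relocs++] = handle;
}

/* Record `secondary` into `primary`. Every relocation the secondary holds,
 * including ones it inherited from its own secondaries, is carried over. */
static void
execute_commands(struct cmd_buffer *primary,
                 const struct cmd_buffer *secondary)
{
   for (size_t i = 0; i < secondary->num_relocs; i++)
      add_reloc(primary, secondary->relocs[i]);
}

/* Returns 1 if `cb` references `handle`, 0 otherwise. */
static int
has_reloc(const struct cmd_buffer *cb, uint32_t handle)
{
   for (size_t i = 0; i < cb->num_relocs; i++) {
      if (cb->relocs[i] == handle)
         return 1;
   }
   return 0;
}
```

After executing CB_C into CB_B and then CB_B into CB_A, CB_A must reference everything CB_C referenced, otherwise the submission misses those buffers.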

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e64889635c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Konstantin Seurer
f8ce75c40c radv: Fix setting the viewport for depth stencil FS resolves
Fixes: 704fbbb ("radv/meta: rework depth/stencil resolves using graphics")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit f574de2249)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Lionel Landwerlin
2abdb028dd anv: flush render caches on first pipeline select
Given a situation like this :
  - CB_A: begin, renderDepthA, end
  - CB_B: begin, computeA, barrier (depth), computeB, end

The depth cache is not being flushed between renderDepthA & computeB
because :
  - it's not flushed at the end of CB_A (it's not required)
  - when CB_B starts, we're still on GFX pipeline mode but do not
    flush render caches because pipeline mode is unknown
  - when the barrier in CB_B is executed, we're already in compute
    pipeline mode and HW cannot flush depth.

The fix is to flush RT/depth caches when switching from unknown
pipeline mode to any pipeline mode.
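
The rule can be sketched as a tiny state machine. The enum, struct, and counter are hypothetical, and flushes that anv performs on other transitions are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pipeline modes; UNKNOWN is the state at the start of a
 * command buffer, before any pipeline select has been recorded. */
enum pipeline_mode {
   PIPELINE_UNKNOWN,
   PIPELINE_GFX,
   PIPELINE_COMPUTE,
};

struct cmd_state {
   enum pipeline_mode mode;
   unsigned render_cache_flushes; /* counts emitted flushes */
};

/* Select `mode`. When the previous mode is unknown, flush the render
 * target and depth caches first, since an earlier command buffer may have
 * left dirty data there. */
static void
select_pipeline(struct cmd_state *state, enum pipeline_mode mode)
{
   assert(state != NULL);
   if (state->mode == mode)
      return;

   if (state->mode == PIPELINE_UNKNOWN)
      state->render_cache_flushes++;

   state->mode = mode;
}
```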

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e6dae6ef5f ("vulkan: Optimize implicit end_subpass barrier")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14816
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: David Gow <david@davidgow.net>
(cherry picked from commit 888ac904a3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Juston Li
6dbd6ee94b anv: set missing protected bit for protected depth/stencil surfaces
This bit is set in mocs for other protected attachment types by
anv_image_fill_surface_state(), but it was omitted for depth/stencil
attachments here.

Without the protected bit set, it causes heavy black artifacting when
attaching a protected depth attachment image to a framebuffer.

Fixes: 794b0496e9 ("anv: enable protected memory")
Signed-off-by: Juston Li <justonli@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit f84ed620c2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
2c250e6235 elk/cse: use copies in operands_match instead of in-place modification
`operands_match` was modifying instruction source operands in-place
(through the `elk_fs_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.
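
A simplified sketch of the "compare on local copies" approach. The operand struct and the exact matching rule are assumptions for illustration and are far smaller than the real `elk_fs_reg` handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified source operand. */
struct src_reg {
   unsigned nr;
   bool negate;
};

static bool
reg_equals(const struct src_reg *a, const struct src_reg *b)
{
   return a->nr == b->nr && a->negate == b->negate;
}

/* Two MULs match if their operands are equal once negations are stripped,
 * in either order. `*negate` reports whether the two products differ in
 * sign. The inputs are const; only local copies are modified. */
static bool
mul_operands_match(const struct src_reg a[2], const struct src_reg b[2],
                   bool *negate)
{
   assert(negate != NULL);
   struct src_reg x[2] = { a[0], a[1] };
   struct src_reg y[2] = { b[0], b[1] };

   *negate = (x[0].negate != x[1].negate) != (y[0].negate != y[1].negate);

   for (int i = 0; i < 2; i++) {
      x[i].negate = false;
      y[i].negate = false;
   }

   return (reg_equals(&x[0], &y[0]) && reg_equals(&x[1], &y[1])) ||
          (reg_equals(&x[0], &y[1]) && reg_equals(&x[1], &y[0]));
}
```

Because nothing is written through the input pointers, there is no state to restore on any return path.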

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 14c65322e8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
03e6f285e5 elk/cse: fix operands_match corrupting non-IMM register data
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.

Guard all `.f` accesses with `.file == IMM` checks.
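
The aliasing problem and the guard can be illustrated with a reduced register type. The layout below is hypothetical; it only mirrors the idea that `.f` shares storage with non-immediate bookkeeping.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

enum reg_file { FILE_GRF, FILE_IMM };

/* Hypothetical register: `f` is only meaningful for immediates; for other
 * files the same bytes hold register bookkeeping such as `nr`. */
struct reg {
   enum reg_file file;
   union {
      float f;
      uint32_t nr;
   };
};

/* Take the absolute value of an immediate; leave other files untouched so
 * that bit 31 of the aliased data is never cleared by accident. */
static void
abs_if_immediate(struct reg *r)
{
   assert(r != NULL);
   if (r->file == FILE_IMM)
      r->f = fabsf(r->f);
}
```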

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit 93f39f87c4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
2b221e5a1a brw/cse: use copies in operands_match instead of in-place modification
`operands_match` was modifying instruction source operands in-place
(through the `brw_reg *src` pointer member) and relying on a
save/restore pattern to undo the modifications. Work on local copies
instead, which is simpler and avoids mutating shared state in a
comparison function.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit b302faad8b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:20 +01:00
Matt Turner
14c7d820cd brw/cse: fix operands_match corrupting non-IMM register data
The MUL case in `operands_match` was reading and writing the `.f` union
member unconditionally, even when the register's `.file != IMM`. In that
case `.f` aliases the struct containing `.nr`/`.swizzle`/etc, so the
`fabsf()` call could corrupt the `.nr` by clearing bit 31.

Guard all `.f` accesses with `.file == IMM` checks.

Fixes: 47c4b38540 ("i965/fs: Allow CSE to handle MULs with negated arguments.")
(cherry picked from commit f5e0f63216)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:19 +01:00
Eric Engestrom
547fd52a66 pick-ui: add Backport-to: * as a synonym to Cc: mesa-stable
(cherry picked from commit b2d99b9378)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:19 +01:00
Eric Engestrom
9a0d13be9a bin/gen_release_notes: fix support for python 3.14
There is no default event loop anymore, we need to make one if we want
one now.

Cc: mesa-stable
(cherry picked from commit c7603a11de)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:19 +01:00
Eric Engestrom
e5fb4a0682 .pick_status.json: Update to 03d2cc2b2a
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
2026-02-25 14:22:19 +01:00
Eric Engestrom
8794fced82 docs: add sha sum for 26.0.0
2026-02-11 19:19:12 +01:00
Eric Engestrom
c10cba7efa VERSION: bump for 26.0.0 2026-02-11 19:07:29 +01:00
Eric Engestrom
e0f7bc0024 docs: add release notes for 26.0.0 2026-02-11 19:07:29 +01:00
Georg Lehmann
3062621cf6 aco/opt_postRA: don't optimize across calls
Could do better by checking which registers are clobbered/preserved,
but that's unlikely to be useful anyway.

Backport-to: 26.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit fc7b5d7eed)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Georg Lehmann
33ca80ea38 aco: handle all SALU that modifies PC in needs_exec_mask
Calls use swappc.

Backport-to: 26.0

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 10b12a6ee2)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Georg Lehmann
d8acb10c56 aco/lower_branches: consider jump target of conditional branches based on vcc
Cc: mesa-stable

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit 421a4dacf0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Karol Herbst
acdbdcc53b vtn: set default fp_math_ctrl values for kernels
The kernel capability has the `FPFastMathMode` decoration, but not the
`FPFastMathDefault` execution mode, so a SPIR-V module not using
`SPV_KHR_float_controls2` has no way of setting any defaults.

Fixes: 9da2d21804 ("vtn: implement default fp_math_ctrl without using execution mode")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit faf3a93e8f)

[Eric: adjusted commit because of missing 46a617884e, as suggested by the author
at https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39790#note_3325830]

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Dave Airlie
b53dbb573a gallivm: handle u16 correct on const loads.
I somehow screwed this up on my previous attempt at fixing this bug.

This should fix the loop limiter bug on big endian properly.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Cc: mesa-stable
Fixes: e28cfb2bad ("gallivm: handle u8/u16 const loads properly on big-endian.")
(cherry picked from commit c016346b50)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Eric R. Smith
160efe917e mesa: do not unbind general point when different indexed points are deleted
When a buffer is deleted, we have to remove it from all binding points.
We were re-using the code for BindBufferRange for this; however, this
caused the general binding point to be unbound (bound to NULL)
unconditionally, even if a different buffer is bound there. Fix this by
inlining the various bind calls into the delete buffers code.
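
The intended behaviour can be sketched with a reduced binding model. The struct, the number of indexed points, and the use of buffer names instead of objects are all assumptions for illustration, not the Mesa implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_INDEXED 4 /* illustrative number of indexed binding points */

/* Hypothetical binding state: one general binding point plus indexed ones,
 * each holding a buffer name (0 means unbound). */
struct binding_state {
   unsigned general;
   unsigned indexed[NUM_INDEXED];
};

/* Remove buffer `name` from every binding point that refers to it, and
 * leave every other binding alone. */
static void
unbind_deleted_buffer(struct binding_state *state, unsigned name)
{
   assert(state != NULL && name != 0);

   for (size_t i = 0; i < NUM_INDEXED; i++) {
      if (state->indexed[i] == name)
         state->indexed[i] = 0;
   }

   /* Only clear the general point if it holds the deleted buffer. */
   if (state->general == name)
      state->general = 0;
}
```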

cc: mesa-stable

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14755
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit fa418f1e73)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Samuel Pitoiset
5f6d1e4b44 radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE
Only for partial copies because image stores don't decompress on writes
(ie. HTILE isn't updated by image stores).

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 9f5a20abde)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Karol Herbst
dc8a39037b vtn/opencl: flush denorms for cbrt()
libclc doesn't, so we have to. Fixes math_bruteforce cbrt on Iris.

Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit af954427bf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
OPNA2608
890ff49038 rocket: Fix printing of rknpu_mem_create.dma_addr
The Linux kernel's __u64 isn't always implemented as a long long, and there's no nice define for printing it like with uint64_t.

(cherry picked from commit 41b9dc3a2c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
OPNA2608
00db096003 vc4: Fix printing of get_tiling.modifier
The Linux kernel's __u64 isn't always implemented as a long long, and there's no nice define for printing it like with uint64_t.

(cherry picked from commit 4c699087d4)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
José Expósito
ede2a1ce84 venus: Fix error log on PPC
On the ppc64le architecture, the error log fails to compile with this error:

    ../src/virtio/vulkan/vn_renderer_virtgpu.c: In function ‘virtgpu_ioctl_map’:
    ../src/virtio/vulkan/vn_renderer_virtgpu.c:751:66: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 6 has type ‘__u64’ {aka ‘long unsigned int’} [-Werror=format=]
    751 |          "mmap failed: gpu_fd=%d, handle=%u, size=%zu, offset=%llu, err=%s",
        |                                                               ~~~^
        |                                                                  |
        |                                                                  long long unsigned int
        |                                                               %lu
    752 |          gpu->fd, gem_handle, size, args.offset, strerror(errno));
        |                                     ~~~~~~~~~~~
        |                                         |
        |                                         __u64 {aka long unsigned int}
    cc1: some warnings being treated as errors

Parse the parameters to fix the failure.
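One common way to print such a value portably is an explicit cast to the type the format specifier expects. The sketch below is only an illustration: `kernel_u64` and `format_offset` are hypothetical names, with `kernel_u64` standing in for the kernel's `__u64`, and this is not necessarily the exact change made in this commit.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the kernel's __u64, which is `unsigned long` on some
 * 64-bit ABIs and `unsigned long long` on others (assumption for this
 * sketch; the real type comes from <linux/types.h>). */
typedef unsigned long kernel_u64;

/* Format an offset without depending on the underlying type of
 * kernel_u64: cast to the exact type that %llu expects. */
static int
format_offset(char *buf, size_t len, kernel_u64 offset)
{
   assert(buf != NULL);
   return snprintf(buf, len, "offset=%llu", (unsigned long long)offset);
}
```

The cast keeps `-Werror=format=` quiet whichever underlying integer type the ABI chooses.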

Fixes: a49b7adad8 ("venus: add error log coverage for virtgpu backend")
(cherry picked from commit dd3fe2d671)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
José Expósito
ee5075f221 winsys/amdgpu: Fix userq job info log on PPC
On the ppc64le architecture the macro printing the userq job info fails
to compile with error:

   In file included from ../src/gallium/winsys/amdgpu/drm/amdgpu_cs.cpp:11:
   ../src/gallium/winsys/amdgpu/drm/amdgpu_cs.cpp: In function ‘int amdgpu_cs_submit_ib_userq(amdgpu_userq*, amdgpu_cs*, uint32_t*, unsigned int, uint32_t*, unsigned int, uint64_t*, uint64_t)’:
   ../src/gallium/winsys/amdgpu/drm/amdgpu_cs.cpp:1652:20: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 6 has type ‘__u64’ {aka ‘long unsigned int’} [-Werror=format=]
   1652 |          mesa_logi("amdgpu: uq_log: %s:  num_wait_fences=%d  uq_va=%llx  job=%llx\n",
   1653 |                    amdgpu_userq_str[acs->queue_index], userq_wait_data.num_fences, fence_info[i].va,
         |                                                                                    ~~~~~~~~~~~~~~~~
         |                                                                                                  |
         |                                                                                                  __u64 {aka long unsigned int}
   ../src/util/log.h:78:70: note: in definition of macro ‘mesa_logi’
      78 | #define mesa_logi(fmt, ...) mesa_log(MESA_LOG_INFO, (MESA_LOG_TAG), (fmt), ##__VA_ARGS__)
         |                                                                      ^~~
   ../src/gallium/winsys/amdgpu/drm/amdgpu_cs.cpp:1652:20: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 7 has type ‘__u64’ {aka ‘long unsigned int’} [-Werror=format=]
   1652 |          mesa_logi("amdgpu: uq_log: %s:  num_wait_fences=%d  uq_va=%llx  job=%llx\n",
   1653 |                    amdgpu_userq_str[acs->queue_index], userq_wait_data.num_fences, fence_info[i].va,
   1654 |                    fence_info[i].value);
         |                    ~~~~~~~~~~~~~~~~~~~
         |                                  |
         |                                  __u64 {aka long unsigned int}
   ../src/util/log.h:78:70: note: in definition of macro ‘mesa_logi’
      78 | #define mesa_logi(fmt, ...) mesa_log(MESA_LOG_INFO, (MESA_LOG_TAG), (fmt), ##__VA_ARGS__)
         |                                                                      ^~~

Parse the parameters to fix the failure.

Fixes: 2547fd0f59 ("winsys/amdgpu: print userq job info")
(cherry picked from commit 757ae04bd9)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Caio Oliveira
99bb93440f brw: Fix cooperative matrix constant sources other than src0
Code was wrongly using src0 to pick the constant value.

Fixes: bf9ad36f2d ("brw: Properly handle cooperative matrices created with constants")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 6b0e29bc77)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Faith Ekstrand
e0682b4317 pan/bi: Don't attempt to fuse AND(ICMP, ICMP) if the AND is swizzled
There might be cases under which we can make this work but they're
tricky at best.  For now, don't even try.

Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 918624174b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Faith Ekstrand
4dd76ad0a1 pan/bi: Run lower_alu_width after opt_algebraic_late
It can generate extract instructions which we expect to be scalar.

Cc: mesa-stable
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit deb9244436)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Faith Ekstrand
6eded1a7d0 nir/lower_bool_to_bit_size: Use the correct num_components for conversions
There's a nice little comment here saying we use the same write mask (an
out of date term in NIR) and swizzle but we're no longer actually doing
that.  Depending on nir_builder magic, we may actually generate a scalar
when we really want a vector.  The fix is to use more builder helpers
and just eat the potential copy.

Fixes: 3180656bbc ("nir: don't use nir_build_alu() with incomplete sources")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
(cherry picked from commit 711b3358a8)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Karol Herbst
c0e5d821e1 rusticl/mesa: only use resource_from_user_memory if the cap is advertised
Fixes some buffer tests on some iris configurations.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Tested-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
(cherry picked from commit 240bae6b23)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:48 +00:00
Alyssa Rosenzweig
cf716d4586 nir: disable fast-math for lowering conversions
the lowerings for e.g. f2f16_rtp have carefully written sequences using
Infinity. nir_opt_algebraic will stomp right through this. `feq x, inf`
without an exact flag is basically always a bug. Disable fast math here.
Fixes OpenCL CTS test_half on Iris.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
(cherry picked from commit 91550d0709)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Yiwei Zhang
74d8362ebd pan/kmod: drop pan_kmod_bo_check_import_flags validation
The passed flags are always zero on the import paths:
- panfrost_bo_import
- panvk_AllocateMemory
- panvk_GetMemoryFdPropertiesKHR

Fixes: 1c7793ea0b ("panvk: Advertise a HOST_CACHED memory type if we have WC maps")
Tested-by: Valentine Burley <valentine.burley@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit 8d25f9821b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Vinson Lee
f25e220841 freedreno/decode: Fix const correctness in get_tex_count
Fix compiler error:

../src/freedreno/decode/cffdec.c:580:7: error: assigning to 'char *'
from 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
  580 |    p = strstr(name, "CONST");
      |      ^ ~~~~~~~~~~~~~~~~~~~~~

glibc now provides C23-style type-generic string functions. strstr
returns const char * when passed a const char * argument. Update p
declaration to const since it's only used for offset calculation.
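A minimal sketch of the pattern, assuming a hypothetical helper `find_offset` (not a function from cffdec.c): the `strstr` result is stored in a pointer-to-const, which is enough when it is only used to compute an offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the byte offset of `needle` inside `name`, or -1 if absent.
 * The result of strstr() is kept in a pointer-to-const, which compiles
 * both with the classic prototype and with C23-style const-preserving
 * definitions. */
static ptrdiff_t
find_offset(const char *name, const char *needle)
{
   assert(name != NULL && needle != NULL);
   const char *p = strstr(name, needle);
   return p ? p - name : -1;
}
```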

Fixes: 1ea4ef0d3b ("freedreno: slurp in decode tools")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
(cherry picked from commit bc34a122f3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Khem Raj
bdfae9cc8b glx: fix const qualifier warnings found with C23 glibc support
glibc master has been C23'fying these functions, which results in errors.

Several functions assigned results of bsearch/strstr/strpbrk/memchr to
non-const pointers, triggering -Wincompatible-pointer-types-discards-qualifiers
under clang/gcc with -Werror. Cast bsearch return values where needed and
propagate const correctness for strstr/strpbrk/memchr results.

Removes build failures with strict warning flags without changing behavior.

Signed-off-by: Khem Raj <raj.khem@gmail.com>

[Eric: changed the glxglvnd.c hunk to add the missing `const` instead of casting it away]

Cc: mesa-stable
(cherry picked from commit 268e19378f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Samuel Pitoiset
6febbade40 radv: fix late decompressions for fbfetch with more corner cases
With layers, or custom sample locations for depth.
Found this by inspection.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ce3539b54f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Iago Toral Quiroga
6e899b3eba nir/opt_vectorize_load_store: allow sizes unaligned with high offset for loads
This was added specifically for vectorized stores, so allow for loads.

Without this, the pass will fail to vectorize 2 consecutive 16-bit loads
into a single 32-bit load.

Fixes: 2ed79f80ba ("nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
(cherry picked from commit f6a2d14008)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
3442ccdcba anv: skip compressed flag for bo if not supported by modifier
This was not a problem before the compression hint was given to the
kernel, but now that we set it, we hit problems when allocating a bo if
the modifier does not support compression.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14625
Fixes: f91de58818 ("anv: Add support to DRM_XE_GEM_CREATE_FLAG_NO_COMPRESSION")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
(cherry picked from commit fc814fa828)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Reilly Brogan
66cd1f224c amd,compiler: fix const errors found with C23 glibc support
In glibc 2.43 the strstr function now propagates const to the output, triggering -Wincompatible-pointer-types-discards-qualifiers
under clang/gcc with -Werror.

Fix two of these cases by adding the const qualifier.

cc: mesa-stable

(cherry picked from commit ece5f671b3)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Vinson Lee
a871c42e39 compiler/clc: Fix const correctness in libclc_add_generic_variants
Fix compiler error:

../src/compiler/clc/nir_load_libclc.c:266:13: error: initializing
'char *' with an expression of type 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
  266 |       char *U3AS1 = strstr(func->name, "U3AS1");
      |             ^       ~~~~~~~~~~~~~~~~~~~~~~~~~~~

glibc now provides C23-style type-generic string functions. strstr
returns const char * when passed a const char * argument. Update U3AS1
declaration to const since it's only used for offset calculation.

Fixes: 4a08ee7ecf ("spirv/libclc: Add generic versions of arithmetic functions")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 85fd63068e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Christian Gmeiner
cf20e1610c pan/compiler: Fix progress reporting in pan_nir_lower_store_component
lower_store_component() always returns false even though it modifies
NIR instructions (rewrites sources, creates new SSA defs, removes
previous stores). This triggers the "NIR changed but no progress
reported" assertion in nir_shader_intrinsics_pass.

Return true when a store_output or store_per_view_output intrinsic is
processed, since the function always modifies the shader in that case.
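The progress convention can be illustrated with a toy pass. Everything below is a simplified stand-in: `struct instr`, `enum op` and `run_pass` are hypothetical and are not the NIR API. The callback returns true exactly when it rewrites an instruction, and the driver accumulates that result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for NIR concepts. */
enum op { OP_STORE_OUTPUT, OP_OTHER };
struct instr { enum op op; int component; };

/* Per-instruction callback: returns true only when it changed something. */
static bool
lower_store_component(struct instr *ins)
{
   if (ins->op != OP_STORE_OUTPUT)
      return false;

   ins->component = 0;   /* the rewrite itself */
   return true;          /* report the modification as progress */
}

/* Driver loop: accumulates progress across all instructions. */
static bool
run_pass(struct instr *instrs, size_t count)
{
   assert(instrs != NULL || count == 0);
   bool progress = false;
   for (size_t i = 0; i < count; i++)
      progress |= lower_store_component(&instrs[i]);
   return progress;
}
```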

Closes: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/274
Cc: mesa-stable
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 4938ad435e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Yiwei Zhang
3235b9a3dc venus: remove obsolete asserts for ANB image creation
Those have long been supported by vn_image_deferred_info_init because of
AHB support. For non-aliased ANB images, those are directly passed from
the platform swapchain create info as well. So we just need to drop the
obsolete asserts to make newer Android platforms and ANGLE happy.

Cc: mesa-stable
(cherry picked from commit 091c4f43ff)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Rudi Heitbaum
6de99bcc31 mesa: retain const qualifier from pointer
Since glibc-2.43:

For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr, strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return pointers into their input arrays now have definitions as macros that return a pointer to a const-qualified type when the input argument is a pointer to a const-qualified type.

https://lists.gnu.org/archive/html/info-gnu/2026-01/msg00005.html

Resolves the following warnings:
    src/mesa/glapi/glapi/gen/enums.c: In function '_mesa_enum_to_string':
    src/mesa/glapi/glapi/gen/enums.c:7799:8: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
     7799 |    elt = bsearch(& nr, enum_string_table_offsets,
          |        ^

    ../src/egl/main/egldispatchstubs.c: In function 'FindProcIndex':
    ../src/egl/main/egldispatchstubs.c:52:7: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
       52 |       bsearch(name, __EGL_DISPATCH_FUNC_NAMES, __EGL_DISPATCH_COUNT,
          |       ^~~~~~~

Signed-off-by: Rudi Heitbaum <rudi@heitbaum.com>
(cherry picked from commit 1acc96b8cb)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Arjob Mukherjee
e8ec898e98 pvr: Fixup for deqp-vk.api 2d.optimal.* conformance
It's no longer an error for depth and stencil formats to have an invalid
accumulator format.

Fixes the following tests:
* dEQP-VK.api.info.image_format_properties.2d.optimal.d16_unorm
* dEQP-VK.api.info.image_format_properties.2d.optimal.d24_unorm_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.x8_d24_unorm_pack32

Backport-to: 26.0
Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 58c7437d3a)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Samuel Pitoiset
5773c7bda6 radv/meta: fix the key for DCC decompress on compute
This could return the graphics DCC pipeline if it was created before,
and crash or potentially hang the GPU.

Found this while working on in-progress VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit ad7151f4bf)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Zan Dobersek
aba5409384 tu/kgsl: wait-only submit handling should not ignore sparse bind commands
Commit cf4bd2e412 added a fast path for handling no-command submits to
accommodate a kernel behavior quirk. Sparse support was complete before
that change but landed afterwards, leaving sparse submits that don't have
command buffers but do have sparse bind commands to take that fast path,
leaving the bind commands unhandled. The condition for the fast path is
fixed to address that.
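The corrected condition can be sketched as follows. `struct submit` and its fields are hypothetical names for illustration, not the turnip data structures: a submit qualifies for the wait-only fast path only if it has neither command buffers nor sparse bind commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a queue submission. */
struct submit {
   unsigned cmd_buffer_count;
   unsigned sparse_bind_count;
};

/* A submit may take the wait-only fast path only when it carries no work
 * at all: neither command buffers nor sparse bind commands. */
static bool
is_wait_only_submit(const struct submit *s)
{
   assert(s != NULL);
   return s->cmd_buffer_count == 0 && s->sparse_bind_count == 0;
}
```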

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 71ef46717c ("tu/kgsl: Add support for sparse binding")
(cherry picked from commit 5b33ee9f0b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Calder Young
cd67481fd2 anv: Avoid dumping BVH before command buffer is submitted
Fixes a race condition where a BVH will be dumped before its command buffer is
actually submitted if a different command buffer completes between the time the
BVH dump is recorded and the time the command buffer is actually submitted.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Fixes: 1b55f101 ("anv/bvh: Dump BVH synchronously upon command buffer completion")
(cherry picked from commit 95e471e558)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Mel Henning
c87b0a77a1 zink: Emit float controls for preserve_denorms too
Fixes: 6afa1b3bad ("zink: handle denorm preserve execution modes")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 9189a70598)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
bb9fef071e iris: set DisableAnyMCTRresponsefix to zero on init
This is to make sure early culling related Wa_16020518922 is enabled
properly.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 331238e44e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
e1446659f8 anv: set DisableAnyMCTRresponsefix to zero on init
This is to make sure early culling related Wa_16020518922 is enabled
properly.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14204
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 9aaed82543)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Tapani Pälli
3988bebbe9 intel/genxml: add CHICKEN_RASTER_2 with required bit for Xe3
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 61b5e91bba)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Mary Guillemard
3e4b65fc7a nvk: Reenable compression support with nouveau 1.4.2
Now that the small/large pages race is fixed, we can safely re-enable it
when the kernel side reports 1.4.2 support.

Fixes: f3c53cf66b ("nvk: Disable large pages for now")
Signed-off-by: Mary Guillemard <mary@mary.zone>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit b524bf368e)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Aitor Camacho
eefede5549 kk: Fix disabling workaround 4
Fixes: 67d05f71e9 ("kk: Track fragment helper status since Metal does not correctly demote them")

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 29900e8229)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
2026-02-11 14:54:47 +00:00
Erico Nunes
5b7b66e43b Revert "ci: lima farm maintenance"
This reverts commit ca1d59d813.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39696>
(cherry picked from commit f3131bc145)
2026-02-11 15:51:45 +01:00
Eric Engestrom
3fe056f178 .pick_status.json: Update to d7814bcad0 2026-02-11 14:21:56 +01:00
Eric Engestrom
4ac24ba7e5 VERSION: bump for 26.0.0-rc3
2026-02-04 19:35:07 +01:00
Bernd Kuhls
f83e86c29f blake3: add blake3_neon.c only for little endian archs
Fixes build error on big endian archs:

Build machine cpu family: x86_64
Build machine cpu: x86_64
Host machine cpu family: aarch64
Host machine cpu: cortex-a53
Target machine cpu family: aarch64
Target machine cpu: cortex-a53
[...]
../src/util/blake3/blake3_neon.c:6:2: error: #error "This implementation only supports little-endian ARM."
    6 | #error "This implementation only supports little-endian ARM."

as detected by buildroot autobuilders:
https://autobuild.buildroot.net/results/efd/efd07d97df4e0c1ceb07fc26e17898afef5435b9/build-end.log

For reference:
$ grep -i endian output/build/mesa3d-25.3.4/buildroot-build/cross-compilation.conf
endian = 'big'

Signed-off-by: Bernd Kuhls <bernd@kuhls.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39681>
(cherry picked from commit 248b818407)
2026-02-04 18:39:35 +01:00
Samuel Pitoiset
1e415d1bdf radv: emit pending flushes after late decompressions with fbfetch
If the rendering state is inherited in the secondary, otherwise nothing
waits for the pending flushes after a decompression pass. One more
argument to stop delaying this.

Fixes
dEQP-VK.renderpasses.dynamic_rendering.partial_secondary_cmd_buff.local_read.*

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39678>
(cherry picked from commit 13c9e529bd)
2026-02-04 18:39:35 +01:00
Samuel Pitoiset
870140c527 radv: disable unordered submits when SQTT queue events are enabled
Otherwise the QueuePresent event is missing and RGP is confused.

Fixes: 82d06b58ad ("radv: use vk_drm_syncobj_copy_payloads")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39158>
(cherry picked from commit 83ca338e37)
2026-02-04 18:39:35 +01:00
Hyunjun Ko
a7d0da012e anv/video: disable encoder on untested platforms
Not tested enough on platforms beyond Gen12.
Turns out to be not working on DG2, for example.

Cc: mesa-stable
Closes: #14449

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39676>
(cherry picked from commit d2c24a0d8b)
2026-02-04 18:39:35 +01:00
Loïc Molinari
51ed940bb8 panfrost: Fix clean_pixel_write_enable forced check for AFBC
Clean tiles must actually be written back for AFBC buffers (color,
z/s) when either of the effective tile size dimensions is smaller
than the superblock dimension. This commit fixes the current check
which compares the effective tile size to the superblock size.
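The difference between the two checks can be sketched as below. The types and function names are hypothetical, and reading "compares the effective tile size" as a comparison of total area is an assumption made for this illustration; the actual panfrost code may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 2D extent in pixels. */
struct extent { unsigned width, height; };

/* Clean tiles must be written back when either tile dimension is smaller
 * than the matching superblock dimension. */
static bool
needs_clean_tile_write(struct extent tile, struct extent superblock)
{
   assert(superblock.width > 0 && superblock.height > 0);
   return tile.width < superblock.width || tile.height < superblock.height;
}

/* For contrast (assumed form of the old check): comparing total sizes
 * misses tiles that are narrower or shorter than the superblock but
 * cover the same area. */
static bool
size_only_check(struct extent tile, struct extent superblock)
{
   return tile.width * tile.height < superblock.width * superblock.height;
}
```

For a 32x8 tile and a 16x16 superblock, the per-dimension check fires while the area comparison does not.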

Fixes: 762a0f4133 ("panfrost: Add the concept of render block")
Signed-off-by: Loïc Molinari <loic.molinari@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38422>
(cherry picked from commit 098b69a05c)
2026-02-04 18:39:35 +01:00
Valentine Burley
db6cbb8410 tu: Fix memory leak of patchpoints_ctx in dynamic rendering
tu_CmdBeginRendering was unconditionally allocating a new
patchpoints_ctx. When resuming a render pass chain, this overwrote the
existing context from the suspended pass, leaking it and all associated
FDM patchpoints.
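A minimal sketch of the non-leaking pattern, with a hypothetical `struct cmd_buffer` and `begin_rendering` standing in for the real turnip code: a new context is allocated only when a chain starts, and a resumed pass keeps the context of the suspended one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical command buffer holding one patchpoint context. */
struct cmd_buffer { void *patchpoints_ctx; };

/* Allocate a context only when starting a new render pass chain; when
 * resuming, keep the context created by the suspended pass. */
static void
begin_rendering(struct cmd_buffer *cmd, bool resuming)
{
   assert(cmd != NULL);
   if (resuming && cmd->patchpoints_ctx != NULL)
      return;
   /* Placeholder allocation; a real context would be released when the
    * render pass chain ends. */
   cmd->patchpoints_ctx = malloc(1);
}
```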

Fixes: 0dd06c74d6 ("tu: Fix FDM patchpoint memory leak")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39639>
(cherry picked from commit d4ad50752f)
2026-02-04 18:39:35 +01:00
Konstantin Seurer
b32cd7c265 radv/bvh: Make sure internal nodes are collapsed when possible
Avoiding NaNs should have the same effect but it's good practice to not
rely on float OPs for correctness.

Fixes: 95a89f7 ("radv: Report smaller bvh sizes when possible")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39640>
(cherry picked from commit 24a1e3d8c2)
2026-02-04 18:39:35 +01:00
Konstantin Seurer
4b86b5e53d vulkan: Make sure no NaNs end up in the BVH
Fixes: 2032268 ("vulkan: Avoid NAN in the IR BVH")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39640>
(cherry picked from commit 60c1e4e3e6)
2026-02-04 18:39:35 +01:00
Konstantin Seurer
cd1a3b7482 radv/rra: Fix nullptr dereference
cc: mesa-stable

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39640>
(cherry picked from commit 2f3a9c10f4)
2026-02-04 18:39:35 +01:00
Lucas Stach
4cae263356 etnaviv: idle the pipe before flushing texture caches
As seen in the Vivante kernel driver function gckHARDWARE_Flush(),
GPUs without gcvFEATURE_TEX_CACHE_FLUSH_FIX, which translates to
all GPUs before halti5, need a full stall of the GPU pipeline
before flushing the texture caches.

This fixes sporadic GPU hangs observed in use-cases where texture
data updates are intermixed with draws without any state changes
that might necessitate a stall.

Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39673>
(cherry picked from commit 643ba9a784)
2026-02-04 18:39:35 +01:00
Emma Anholt
4107091cfe ci/tu: Clear stale xfails from the nightlies.
Fixes: 63243bcc3e ("tu: Fix TU_DRAW_STATE_VB size")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39568>
(cherry picked from commit c0a4d3ef1e)
2026-02-04 18:39:35 +01:00
Emma Anholt
26a8c34ff4 lima/ci: Remove erroneous skips.
When you get UnexpectedResult(skip), that means take your xfail out
because it's now skipping.  Which is the fix, instead of "take the xfail
out and add it to manual skips".

Fixes: e54440d15e ("Uprev Piglit to a3826de3c26a279599d15b018a9a3e75ca46f4f8")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39568>
(cherry picked from commit 42e17a948e)
2026-02-04 18:39:35 +01:00
Juan A. Suarez Romero
86f442db75 broadcom/cle: bump up gen version for v3d
The generation version for V3D XML package was marked as 3.3, but
actually we removed all the code supporting this generation, and the
generations we support now are from 4.2 onwards.

So we bump up the generation version.

Fixes: 9c4829473a ("broadcom/cle: remove v33 and v41 from xml definition")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39577>
(cherry picked from commit 5a85b3d9f4)
2026-02-04 18:39:35 +01:00
Qiang Yu
67ad90c108 radeonsi: fix mesh shader outputs kill
Mesh shader uses store per vertex output for point size
and store per primitive output for layer id.

This fixes gpu-ratemeter running slowly in the kill point size
and layer id cases when a mono shader is used, which is expected
to kill these outputs.

Also gather fragment shader per primitive input info
to kill mesh shader per primitive output.

Fixes: e6e21dfbf2 ("radeonsi: kill outputs for mesh shader")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39644>
(cherry picked from commit f20cd07e21)
2026-02-04 18:39:34 +01:00
Nanley Chery
a7ace43e9a anv: Don't set the display flag on WSI blit sources
These images are never used with scanout hardware.

Fixes: 2c00b7d1e6 ("anv: flag WSI images as scanout images for ISL")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit c429d7479e)
2026-02-04 18:39:34 +01:00
Nanley Chery
d6d5071a84 anv: Treat non-WSI PRESENT_SRC as TRANSFER_SRC
For non-WSI images, explicitly map VK_IMAGE_LAYOUT_PRESENT_SRC_KHR to
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL in anv_layout_to_aux_state().

Before this patch, the function passed PRESENT_SRC into
vk_image_layout_to_usage_flags() and got a return value of 0 from it
(that function expects that layout to be explicitly handled by the
caller). This caused the logic dependent on the return value to be
unreliable.

Fixes: c5cad407f8 ("anv: handle non-wsi images in anv_layout_to_aux_state")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit f616d4fb2a)
2026-02-04 18:39:34 +01:00
Nanley Chery
7571128959 anv: Fix clear state of WSI blit sources during presentation
On gfx12+, this fixes assert failures in hybrid GPU scenarios.

Fixes: 811c413f98 ("anv: Don't return the Xe2+ fast-clear type early")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39618>
(cherry picked from commit 476f461ce7)
2026-02-04 18:39:34 +01:00
Nanley Chery
f4e0da9e07 anv: Don't return the Xe2+ fast-clear type early
Don't return early from anv_layout_to_fast_clear_type() for Xe2+. We'll
need to make more use of the function for some MCS changes in later
commits.

Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit 811c413f98)
2026-02-04 18:39:34 +01:00
Patrick Lerda
14615a0add r600: improve vs_as_ls switch reliability
This change updates the vs_as_ls switch logic to make it
reliable. It resets the dirty flag when the switch
happens. It also uses evergreen_emit_vs_constant_buffers()
to try to update again some of the states which could be
lost otherwise.

This change fixes some "flakes". These tests previously needed
to be executed twice to set the hardware in the proper state
for the test to pass. It also fixes the main issue of the
texture_view.view_sampling test.

This change was tested on palm and cayman. Here are the tests
which are now fully fixed:
khr-gl4[3-6]/stencil_texturing/functional: fail pass
khr-gl4[4-6]/texture_cube_map_array/texture_size_tesselation_ev_sh: fail pass
khr-gles31/core/texture_cube_map_array/texture_size_tesselation_ev_sh: fail pass
khr-glesext/texture_cube_map_array/texture_size_tesselation_ev_sh: fail pass

Fixes: 25f96c1120 ("r600: hook up constants/samplers/sampler view for tessellation")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Acked-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39269>
(cherry picked from commit 9c5e15e6f5)
2026-02-04 18:39:34 +01:00
Christian Gmeiner
8f6282d846 meson: Restore .clang-format for ninja clang-format target
The empty .clang-format file in the project root is required for meson
to generate the clang-format target. It was accidentally deleted.

Fixes: efe60d2940 ("intel: remove unused show_shader_stage debug option")

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39648>
(cherry picked from commit 449261b6ba)
2026-02-04 18:39:34 +01:00
Mel Henning
05889250e6 nvk: Report additional host_image_copy layouts
Fixes dEQP-VK.image.host_image_copy.properties.properties
on VK CTS 1.4.5

Fixes: d5df263ac9 ("nvk: Enable VK_EXT_host_image_copy")
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39634>
(cherry picked from commit 0e9d29f518)
2026-02-04 18:39:34 +01:00
Natalie Vock
872c536a88 radv/rt: Fix discardable attributes on chit and traversal shaders
It was incorrect to mark chit/miss arguments as discardable without
the equivalent in the traversal shader. Also, tail calls with modified
parameters that aren't marked discardable are incorrect.

This could lead to random corruption by clobbering parameter values
across two levels of nested calls: A Raygen shader calls traversal,
expecting e.g. the ray tMax parameter to be preserved. Traversal
overwrites the parameter's register with the hit t and tail-calls chit,
which immediately returns to raygen. Now the raygen shader still has the
clobbered tMax (which is actually the ray hit t) - if it calls traversal
multiple times, the second traversal iteration may use the previous
ray's hit t as tMax instead of the intended value.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
(cherry picked from commit 3275be503c)
2026-02-04 18:39:34 +01:00
Natalie Vock
8ab3b18cd7 radv/rt: Fix some tail-call compatibility checks
There were two issues here:
1. Tail calls where the tail-callee receives modified parameters are
hazardous and only work if the parameter is return or discardable.
Otherwise, the caller of the function that executes the tail-call may
not expect some of the parameters to be clobbered.
2. There was also an indexing confusion with the call instruction vs.
call signature parameters. The call instruction has not been adapted
to the new lowered signatures, where the system args are prepended. To
make things clearer, split the loop into two, one iterating over
parameters in the call signature and one for parameters of the call
instruction.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
(cherry picked from commit 0d7705c206)
2026-02-04 18:39:34 +01:00
Natalie Vock
0753012766 aco: Don't exclude discardable parameters from register preservation
The original semantic of discardable parameters was "okay, nothing
actually uses this parameter, feel free to clobber it", but we were
only using it with tail calls from a function without discardable
parameters, which was broken.

Instead, slightly change the use-case and utilize the "discardable"
attribute to mark parameters that the callee will clobber in a tail
call. This makes doing tail calls safe when the tail callee receives a
modified set of parameters.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
(cherry picked from commit ad23e02a28)
2026-02-04 18:39:34 +01:00
Natalie Vock
7b1c9adfea radv/rt: Refactor shader group stack size calculation to include traversal stack
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39579>
(cherry picked from commit 62254ab0be)
2026-02-04 18:39:34 +01:00
Mel Henning
d46967fafb nvk: Initialize SET_ALPHA_TO_COVERAGE_OVERRIDE
This matches the initialization that the proprietary driver does.

Fixes dEQP-VK.query_pool.discard.*.alpha_to_coverage* on vk cts 1.4.5

Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39621>
(cherry picked from commit 8d7f14620b)
2026-02-04 18:39:34 +01:00
Konstantin Seurer
55043ae265 vulkan: Limit the number of LBVH invocations
Fixes: 0817551 ("vulkan: Handle inactive primitives with LBVH builds")
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39569>
(cherry picked from commit 529c83a134)
2026-02-04 18:39:34 +01:00
Valentine Burley
102b3d8008 tu: Handle VkDrmFormatModifierPropertiesList2EXT
Expose DRM format modifiers via VkDrmFormatModifierPropertiesList2EXT.
VVL is one notable user.

This is required for VK_EXT_image_drm_format_modifier when
VK_KHR_format_feature_flags2 is supported.

Cc: mesa-stable
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39600>
(cherry picked from commit e185f40fc3)
2026-02-04 18:39:34 +01:00
Karol Herbst
79f909808c clc: fix compile compatibility with LLVM-22
See d090311aa7

Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
(cherry picked from commit dc03f94e07)
2026-02-04 18:39:34 +01:00
Karol Herbst
ca428e3b3c nir: fix nir_fixup_is_exported for LLVM-22
Starting with LLVM-22 we won't see the kernel wrapper anymore, and this
is a trivial fix to get around this.

See: 5458eb2511

Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
(cherry picked from commit 24d20df3d6)
2026-02-04 18:39:34 +01:00
Karol Herbst
84566763c2 clc: enable generic address space and seq_cst and device scope atomic features
This is going to be required with LLVM-22.

See 423bdb2bf2

Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
(cherry picked from commit 6eda573a8a)
2026-02-04 18:39:33 +01:00
Karol Herbst
05c679d37b clc: support some atomic and generic address space features
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
(cherry picked from commit 01e1392139)
2026-02-04 18:39:33 +01:00
Karol Herbst
c6f8d2ef92 clc: reorder headers to fix compilation errors due to UNUSED
Cc: mesa-stable
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39374>
(cherry picked from commit 7f9a7ed553)
2026-02-04 18:39:33 +01:00
Lars-Ivar Hesselberg Simonsen
4a3a3a7d84 panfrost/bi: Fix unbound texel buffers
In case of texel buffers that are read in the shader, but not bound by
the application, the current implementation would incorrectly try to
read from non-existent buffers.

To ensure this does not happen, this change sets the format for any
unbound attributes to CONST_0000, which will kill any actual
reads/writes and always return zeroes.

This fixes the following two tests:
- spec@arb_shading_language_420pack@active sampler conflict
- spec@arb_texture_buffer_object@render-no-bo

Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39431>
(cherry picked from commit aec8132b8b)
2026-02-04 18:39:33 +01:00
David Rosca
525cce7c2a radv/video: Fix maxActiveReferencePictures for H265 decode
Also change to use H265 constant for maxDpbSlots (both values for H264 and H265
are the same).

Fixes: ee535aa039 ("radv: video: rework maxActiveReferenceSlot/MaxDpbSlots")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39609>
(cherry picked from commit 7607aeefa6)
2026-02-04 18:39:33 +01:00
Eric Engestrom
5ec65a4378 Revert "meson: static link spirv-tools for darwin"
This reverts commit f21d0f2cbe.

This causes issues with other platforms trying to do static builds.

A better option is for everyone to use `meson setup --prefer-static`.

Fixes: f21d0f2cbe ("meson: static link spirv-tools for darwin")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14751
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39613>
(cherry picked from commit 342a5ba44e)
2026-02-04 18:39:33 +01:00
Samuel Pitoiset
a5d92b79fa radv: fix tracking of pipelines used in secondaries
This is just wrong if the secondary uses ESO because the emitted
pipelines would be NULL in the secondary, but if the app re-binds
the same pipeline in the primary it would consider it as already
emitted. A sequence like this would break:

CmdBindPipeline(compute)
CmdDispatch()
CmdExecuteCommands() --> with ESO compute
CmdBindPipeline(compute)
CmdDispatch()

This tracking is probably useless anyways because it's unlikely that
apps will rebind the same pipeline right after CmdExecuteCommands() but
let's keep it because this is a bugfix.
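
One way to read the failure is with a toy model of the "already emitted" cache. This is only a sketch under assumptions: the `toy_*` names are hypothetical and not radv code, and the `always_inherit` flag merely contrasts a stale cache with an invalidated one; it is not a claim about how the actual fix is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state, not a radv structure. */
struct toy_cmd_buffer {
    const void *bound_pipeline;   /* what the app last bound */
    const void *emitted_pipeline; /* what the driver believes the HW holds */
    unsigned emits;               /* how many times state was really emitted */
};

static void toy_bind(struct toy_cmd_buffer *cb, const void *pipeline)
{
    assert(cb != NULL);
    cb->bound_pipeline = pipeline;
}

static void toy_dispatch(struct toy_cmd_buffer *cb)
{
    assert(cb != NULL);
    /* Skip the emit when the cache says the pipeline is already live. */
    if (cb->emitted_pipeline != cb->bound_pipeline) {
        cb->emitted_pipeline = cb->bound_pipeline;
        cb->emits++;
    }
}

/* secondary_emitted is NULL when the secondary used shader objects (ESO).
 * With always_inherit == false the NULL is ignored and the primary keeps
 * a stale cache entry; with true the cache is invalidated. */
static void toy_execute_secondary(struct toy_cmd_buffer *cb,
                                  const void *secondary_emitted,
                                  bool always_inherit)
{
    assert(cb != NULL);
    if (always_inherit || secondary_emitted != NULL)
        cb->emitted_pipeline = secondary_emitted;
}
```

Replaying the sequence above (bind, dispatch, execute with a NULL secondary pipeline, bind the same pipeline, dispatch) leaves `emits` at 1 in the stale variant and raises it to 2 in the invalidating variant.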

Fixes
dEQP-VK.api.command_buffers.pipeline_shader_object_mix_with_secondaries.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39587>
(cherry picked from commit 9ad02b5724)
2026-02-04 18:39:33 +01:00
Samuel Pitoiset
7eb9d75017 radv: zero-initialize image view objects
Mostly to make sure that color/depth descriptors are zero-initialized
in case applications are missing the usage flags. In this case, they
will be considered as null descriptors.

This hides the issue in
https://gitlab.freedesktop.org/mesa/mesa/-/issues/14637
but the real fix has to be in the Steam Overlay.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39585>
(cherry picked from commit fa4da581c6)
2026-02-04 18:39:33 +01:00
Hyunjun Ko
e73e4e1554 anv/video: Compute AV1 tile positions internally
The pMiColStarts/pMiRowStarts arrays from applications may have
incorrect units. Instead of using them directly, compute the tile
start positions in superblock units internally based on the tile
dimensions.
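
A minimal sketch of the idea, assuming the per-tile sizes are available as "size in superblocks minus one" values (the helper name and parameter layout are hypothetical, not the anv code): the start of each tile is the running sum of the sizes of the tiles before it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: derive tile start positions in superblock (SB)
 * units from per-tile sizes, instead of trusting app-provided starts.
 * starts_out must have room for tile_count + 1 entries; the last entry
 * is the end position of the final tile. */
static void toy_tile_starts_in_sb(const uint16_t *size_in_sbs_minus_1,
                                  uint32_t tile_count,
                                  uint32_t *starts_out)
{
    assert(tile_count > 0);

    uint32_t pos = 0;
    for (uint32_t i = 0; i < tile_count; i++) {
        starts_out[i] = pos;
        pos += (uint32_t)size_in_sbs_minus_1[i] + 1;
    }
    starts_out[tile_count] = pos;
}
```

With tile widths of 4, 4 and 2 superblocks (minus-one values 3, 3, 1), the starts come out as 0, 4 and 8, with 10 as the end position.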

Cc: mesa-stable
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
(cherry picked from commit 8e9fec8e40)
2026-02-04 18:39:33 +01:00
Hyunjun Ko
162ef4da2c anv/video: fix a typo in Vulkan AV1 decoding.
Cc: mesa-stable
Fixes: e510efed05d ("anv: support in-loop super resolution for AV1 decoding")
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39471>
(cherry picked from commit 8004f46466)
2026-02-04 18:39:33 +01:00
Rhys Perry
d3a67ee1d9 radv: fix when incomplete rt pipeline libraries are loaded from cache
It might be that the radv_pipeline_cache_lookup_nir_handle() in
radv_ray_tracing_pipeline_cache_search() fails but we will later need the
NIR. If rt_stages[i].shader was non-NULL, then we would not have created
the NIR.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.2
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38263>
(cherry picked from commit 89eefdcadb)
2026-02-04 18:39:33 +01:00
Olivia Lee
dc140f5500 hk: fix hk_passthrough_gs_key size computation
The non-dynamic members of xfb_info are already included in
sizeof(hk_passthrough_gs_key), so adding nir_xfb_info_size counts them
twice. Because of this we were including uninitialized memory in the key
in hk_handle_passthrough_gs, which is undefined behavior.
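
The double counting can be illustrated with a toy layout. This is a sketch under assumptions: the `toy_*` structs are hypothetical stand-ins, and `toy_xfb_size()` only models a size helper that returns "fixed header plus trailing array".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real structures. */
struct toy_output { uint16_t buffer; uint16_t offset; };

struct toy_xfb_fixed {
    uint32_t output_count; /* trailing toy_output entries follow the key */
};

struct toy_key {
    uint64_t outputs_written;
    struct toy_xfb_fixed xfb; /* fixed part already lives inside the key */
};

/* Models a size helper: fixed header plus the dynamic array. */
static size_t toy_xfb_size(uint32_t n)
{
    return sizeof(struct toy_xfb_fixed) + n * sizeof(struct toy_output);
}

/* Counts the fixed xfb header twice: once via sizeof(struct toy_key)
 * and once via toy_xfb_size(). */
static size_t toy_key_size_double_counted(uint32_t n)
{
    return sizeof(struct toy_key) + toy_xfb_size(n);
}

/* Adds only the dynamic part on top of the key. */
static size_t toy_key_size(uint32_t n)
{
    size_t size = sizeof(struct toy_key) + n * sizeof(struct toy_output);
    assert(size >= sizeof(struct toy_key));
    return size;
}
```

The two results differ by exactly `sizeof(struct toy_xfb_fixed)`; when the larger size is used for hashing or comparing a key, those extra trailing bytes are never written.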

Fixes: 5bc8284816 ("hk: add Vulkan driver for Apple GPUs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39574>
(cherry picked from commit d6745b358d)
2026-02-04 18:39:33 +01:00
Tapani Pälli
d89eceaa2c anv: route clear operations on compute to companion
This fixes a bunch of CTS tests hitting issues when attempting
anv_image_mcs_op with compute.

Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39581>
(cherry picked from commit 85978ccd28)
2026-02-04 18:39:33 +01:00
Zan Dobersek
be9d5d6508 tu: allocate transient attachments used for LRZ
When proceeding with rendering, any transient attachment that will be used
as LRZ buffer should also be allocated. With GMEM rendering, these
attachments otherwise remained unloaded and subsequent LRZ clears produced
GPU faults.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Fixes: #14604
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39535>
(cherry picked from commit b6a049ea4b)
2026-02-04 18:39:33 +01:00
Mike Blumenkrantz
5dddf74a34 ntv: emit ViewIndex with flat for fragment stage
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39606>
(cherry picked from commit 999aaac12e)
2026-02-04 18:39:33 +01:00
Nick Hamilton
d7a47c1627 pvr: Fix the isp samples per tile calculation
The samples per tile calculation was incorrect for sample count 4 and 8.

Fix:
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.depth.samples_4.*
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.stencil.samples_4.*

Backport-to: 26.0

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39580>
(cherry picked from commit 9f9788330e)
2026-02-04 18:39:33 +01:00
Lionel Landwerlin
844a79b474 vulkan/wsi/direct: remove VkDisplay created from GetDrmDisplayEXT on ReleaseDisplayEXT
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39556>
(cherry picked from commit 1112c1461d)
2026-02-04 18:39:33 +01:00
Georg Lehmann
1f5f2cc952 nir/opt_algebraic: use correct syntax to create exact fsat
Fixes: 3b06824e4c ("nir/opt_algebraic: optimize some post peephole select patterns")

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39586>
(cherry picked from commit d8ef28671d)
2026-02-04 18:39:33 +01:00
Tomeu Vizoso
8111b41eb4 dril: don't build a rocket_dri.so
As Rocket has no graphics capability.

Fixes: 5b829658f7 ("rocket: Initial commit of a driver for Rockchip's NPU")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38532>
(cherry picked from commit a5daecafd3)
2026-02-04 18:39:32 +01:00
Eric Engestrom
1c4642663b .pick_status.json: Mark a66d19b691 as denominated 2026-02-04 18:39:32 +01:00
Eric Engestrom
7c3ff4cecc .pick_status.json: Update to 248b818407 2026-02-04 18:39:32 +01:00
Eric Engestrom
42f03572d1 VERSION: bump for 26.0.0-rc2
2026-01-28 17:41:52 +01:00
Ella Stanforth
8808ec23fa pvr/csbgen: fix packing multiple addresses
Cc: mesa-stable
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
(cherry picked from commit 7be87ca82a)
2026-01-28 16:18:00 +01:00
Nick Hamilton
28f3e82f2d pco: Fix for atomic operations on an image buffer
Within the driver, buffers are treated as 2D, as sampling them as 1D
will run into HW restrictions on max size.

The compiler does the same; however, for atomic image ops the address
is calculated manually, and doing this via the 2D path leads to
incorrect offsets.

The fix is to treat buffers as 1D for atomic ops, which yields
the correct offsets for the operations.

Fix deqp:
dEQP-VK.image.atomic_operations.add.buffer.*
dEQP-VK.image.atomic_operations.and.buffer.*
dEQP-VK.image.atomic_operations.compare_exchange.buffer.*
dEQP-VK.image.atomic_operations.dec.buffer.*
dEQP-VK.image.atomic_operations.exchange.buffer.*
dEQP-VK.image.atomic_operations.inc.buffer.*
dEQP-VK.image.atomic_operations.max.buffer.*
dEQP-VK.image.atomic_operations.min.buffer.*
dEQP-VK.image.atomic_operations.or.buffer.*
dEQP-VK.image.atomic_operations.sub.buffer.*
dEQP-VK.image.atomic_operations.xor.buffer.*

Fixes: 6dc5e1e109 ("pco: fully support Vulkan 1.2 image atomics")

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39521>
(cherry picked from commit 079377c767)
2026-01-28 16:18:00 +01:00
Olivia Lee
6f2d97ef41 Revert "panvk: advertise VK_EXT_primitives_generated_query on v10+"
This reverts commit 6eadcaa851.

VK_EXT_primitives_generated_query has a dependency on
VK_EXT_transform_feedback, which we do not implement yet. This is
breaking the android CTS. It will be reenabled once transform feedback
is in.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39547>
(cherry picked from commit 4959f45e99)
2026-01-28 16:17:59 +01:00
Iván Briano
cb8a069e24 brw: fix local_invocation_index with quad derivatives on mesh/task shaders
For mesh/task shaders, the thread payload provides a local invocation
index, but it's always linear so it doesn't give the correct value when
quad derivatives are in use.
The lowering pass where all of this is done correctly for compute
shaders assumes load_local_invocation_index will be lowered in the
backend for mesh/task. It calculates the values for the quads correctly
but then avoids replacing the original intrinsic, so we are left with
the wrong results.

Add an intel specific intrinsic and always lower the generic one to that
(or whatever else was calculated) to avoid ambiguities and fix the value
for quad derivatives.

Fixes future CTS tests using mesh/task shaders under:
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*

Fixes: d89bfb1ff7 ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>
(cherry picked from commit 5b48805b42)
2026-01-28 16:17:59 +01:00
Georg Lehmann
8d9349e75b aco: disable DPP for rev integer subs and shifts
It is not documented anywhere, but at least on gfx12 and gfx10.3
DPP is applied to src1 instead of src0.
This might be useful for shifts, but to be safe just disable DPP
completely for now.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14739

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
(cherry picked from commit 140ca3bb50)
2026-01-28 16:17:59 +01:00
Georg Lehmann
6553c4ce40 aco: add a helper function for non supported DPP opcodes
Cc: mesa-stable

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
(cherry picked from commit 8e99bf5380)
2026-01-28 16:17:59 +01:00
Eric Engestrom
e68f96eb1f nir/meson: fix cpp_args of nir_opt_algebraic_pattern_tests
Fixes: 4c30c44b75 ("nir: Generate unit tests for nir_opt_algebraic")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39550>
(cherry picked from commit d12e3454e6)
2026-01-28 16:17:59 +01:00
Nanley Chery
c2eca1a1cc anv: Fix the fast clear type for FCV writes
We started allowing non-default clear colors with FCV in commit
cd8e120b97. When rendering to an image with FCV, set the fast-clear
type to ANV_FAST_CLEAR_ANY if the image properties allow such
fast-clears.

Fixes: cd8e120b97 ("anv: Allow more single subresource fast-clears with FCV")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit ce196c9de5)
2026-01-28 16:17:59 +01:00
Nanley Chery
f3db65d95e anv: Update predicated resolve documentation
* Don't mention gfx7-8 due to the hasvk split.
* Account for the array of clear colors.

Fixes: 0e6b132a75 ("anv: Access more colors in fast_clear_memory_range")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit e7854d06a5)
2026-01-28 16:17:59 +01:00
Nanley Chery
943fd8152a iris: Use the CLEAR state on Xe2+ for MCS
On Xe2+, HSD 14011946253 and the related documents explain that MCS
still only supports a single clear color.

Fixes: df006bba02 ("iris: Update aux state for color fast clears (xe2)")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit 6c6b2d8f30)
2026-01-28 16:17:59 +01:00
Nanley Chery
f3adaccb4b iris: Set missing flags on clear color changes
When changing the clear color without a fast clear, use dirty bits to
ensure that surfaces with inline clear colors are updated and that
partial resolves are done as needed.

Remove the flags at the bottom of fast_clear_color() as
blorp_fast_clear() already sets them for us.

Fixes: 64d861b700 ("iris: Skip some fast-clears even on color changes")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit 3b642f7456)
2026-01-28 16:17:59 +01:00
Nanley Chery
a680c20d40 intel/isl: Fix QPitch of arrayed MCS
From RENDER_SURFACE_STATE::AuxiliarySurfaceQPitch on BDW+,

   This field must be set to an integer multiple of the Surface
   Vertical Alignment

Accomplish this by aligning the height of each MCS layer to the main
surface's vertical alignment. Prevents the following test group from
failing on Xe2 when a future commit enables multi-layer fast-clears in
anv:

   dEQP-VK.api.image_clearing.*.
   clear_color_attachment.multiple_layers.
   *_clamp_input_sample_count_*
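
As a rough sketch of the alignment step, assuming heights are counted in rows and using a hypothetical vertical alignment of 4 (the `toy_*` helpers are illustrative, not isl functions):

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment. */
static uint32_t toy_align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: per-layer pitch of the auxiliary surface, padded
 * so that it is an integer multiple of the main surface's vertical
 * alignment. */
static uint32_t toy_mcs_qpitch_rows(uint32_t layer_height_rows,
                                    uint32_t main_valign)
{
    return toy_align_up(layer_height_rows, main_valign);
}
```

A layer height of 11 rows, as in the 64x11 test case, would be padded to 12 under that assumed alignment, while a height of 12 would stay unchanged.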

The main test I used to debug this:

   dEQP-VK.api.image_clearing.core.
   clear_color_attachment.multiple_layers.
   a8b8g8r8_unorm_pack32_64x11_clamp_input_sample_count_2

Backport-to: 25.3
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37660>
(cherry picked from commit eb4a581e44)
2026-01-28 16:17:59 +01:00
Mel Henning
d20d30442c nvk: Disable large pages for now
Reviewed-by: Mary Guillemard <mary@mary.zone>
Fixes: cabfdb4404 ("nvk: Enable compression")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39364>
(cherry picked from commit f3c53cf66b)
2026-01-28 16:17:59 +01:00
Georg Lehmann
7e42c6e949 aco: fix demote in header of single iteration loop
The control is not divergent before a divergent break in a single iteration loop,
but we already pushed the loop mask on the stack.

Fixes: 90faadae72 ("aco/insert_exec_mask: don't disable dead quads on demote in divergent CF")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14733
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39528>
(cherry picked from commit 4b1996b1c7)
2026-01-28 16:17:59 +01:00
Tapani Pälli
41026e14f9 blorp: fix asserts hit with msaa blorp blits on xe3
Tested on PTL, fixes various copy_and_blit tests that utilize compute
after ab9d3528dc that exposed this to them.

Fixes: ab9d3528dc ("anv: fix queue check in anv_blorp_execute_on_companion on xe3")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39548>
(cherry picked from commit bb84773c81)
2026-01-28 16:17:59 +01:00
Caterina Shablia
174aa7ed66 panvk: fix sparse image non-opaque binds
I have no idea how this passed CTS.

Fixes: 5326c451 ("panvk/csf: implement sparse image non-opaque binds")
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39546>
(cherry picked from commit a3ec5ece8b)
2026-01-28 16:17:59 +01:00
Samuel Pitoiset
362faeb15e radv: add a workaround for a synchronization bug in Strange Brigade Vulkan
This game has broken synchronization reported by VVL and it indeed
doesn't wait for idle right before present. Work around this by
injecting a full barrier (easier than rewriting the dep struct).

This only applies to the Vulkan backend.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14705
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39480>
(cherry picked from commit 14d3fb5f1b)
2026-01-28 16:17:59 +01:00
Samuel Pitoiset
33fbf9bf61 radv: fix applying radv_ssbo_non_uniform=true for Crysis 2/3 remastered
These are DX11 games that use Vulkan interop for RT with a broken and too
generic app/engine name. This is very specific to these two games.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14718
Fixes: 56813236f4 ("radv: use app names instead of exec name for shader based drirc workarounds")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39518>
(cherry picked from commit d679236e09)
2026-01-28 16:17:59 +01:00
Rob Clark
cda3f42323 freedreno/a6xx: Better program state size calc
Most of the time we were significantly over-allocating the size of
program stateobjs.  Except when the shader had a very large # of
immediates, in which case we were under-allocating (and crashing).

Fixes: 598928d7e7 ("nir/loop_analyze: determine whether all control flow gets eliminated upon loop unrolling")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14731
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39545>
(cherry picked from commit 670ded35c1)
2026-01-28 16:17:59 +01:00
Konstantin Seurer
3ef0b4b27a vulkan: Avoid NAN in the IR BVH
Build and encoding stages should be able to assume that AABBs don't have
NANs. This commit covers all possible sources of NAN.
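
The invariant can be expressed as a small check. This is only a sketch with a hypothetical `toy_aabb` type; it shows how the "no NaN" assumption could be verified, not how the commit removes the NaN sources.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical AABB layout used only for this illustration. */
struct toy_aabb {
    float min[3];
    float max[3];
};

/* Returns true if any bound of the box is NaN. */
static bool toy_aabb_has_nan(const struct toy_aabb *box)
{
    assert(box != NULL);
    for (int i = 0; i < 3; i++) {
        if (isnan(box->min[i]) || isnan(box->max[i]))
            return true;
    }
    return false;
}
```

A check like this matters because every ordered comparison against NaN is false, so min/max reductions over bounds can silently produce boxes that later stages cannot reason about.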

Fixes: 091b43b ("radv: Use HPLOC for TLAS builds")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14696
Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39508>
(cherry picked from commit 20322687e0)
2026-01-28 16:17:59 +01:00
Konstantin Seurer
1f1da9bc5a vulkan: Handle inactive primitives with LBVH builds
cc: mesa-stable

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39378>
(cherry picked from commit 0817551f00)
2026-01-28 16:17:59 +01:00
Nanley Chery
0d3857c832 blorp: Fix Tile64 clear redescription assertion
Prevent assert failures in a future commit where Tile64 will be selected
more often.

Fixes: 42ef23ecd1 ("intel/blorp: Don't redescribe some Tile64 clears")
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
(cherry picked from commit 6fc0e5c0aa)
2026-01-28 16:17:59 +01:00
Nanley Chery
cec72c7a29 intel/isl: Fix miptail selection for compressed textures
When determining if an LOD can fit within a miptail, we must minify in
pixel space and then convert to elements.
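
The order of operations matters because of rounding. A minimal sketch, assuming one dimension and a block width in pixels (the `toy_*` helpers are hypothetical, not isl functions):

```c
#include <assert.h>
#include <stdint.h>

/* Size of a dimension at a given mip level, never below 1. */
static uint32_t toy_minify(uint32_t value, uint32_t level)
{
    assert(level < 32);
    uint32_t m = value >> level;
    return m ? m : 1;
}

static uint32_t toy_div_round_up(uint32_t n, uint32_t d)
{
    assert(d != 0);
    return (n + d - 1) / d;
}

/* Minify in pixel space first, then convert to elements (blocks). */
static uint32_t toy_level_width_el(uint32_t width_px, uint32_t level,
                                   uint32_t block_width_px)
{
    return toy_div_round_up(toy_minify(width_px, level), block_width_px);
}

/* The other order: convert to elements first, then minify. */
static uint32_t toy_level_width_el_wrong(uint32_t width_px, uint32_t level,
                                         uint32_t block_width_px)
{
    return toy_minify(toy_div_round_up(width_px, block_width_px), level);
}
```

For a 36-pixel-wide surface with 8-pixel-wide blocks at level 1, the pixel-space order yields 3 elements (18 pixels rounded up to blocks), while the element-space order yields 2, which under-counts the level.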

Prevents the following test case from failing when Yf is force-enabled:

   dEQP-VK.image.texel_view_compatible.graphic.extended.3d_image.texture_read.astc_8x5_srgb_block.r32g32b32a32_uint

Fixes: 46f45d62d1 ("intel/isl: Start using miptails")
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38063>
(cherry picked from commit add742fca6)
2026-01-28 16:17:59 +01:00
Mike Blumenkrantz
e2bf4b9007 ntv: emit demote extension/capability when emitting demote
this is cleaner and more accurate

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39540>
(cherry picked from commit a842e641d9)
2026-01-28 16:17:59 +01:00
Mel Henning
03c90bcd1f nvk: Ignore meta ops in occlusion queries
Fixes: 052bbd65c9 ("nvk: Implement pipeline statistics and occlusion queries")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39510>
(cherry picked from commit e32bfc5efe)
2026-01-28 16:17:59 +01:00
Faith Ekstrand
e8f33e8ffb nvk: Enable ZPASS_PIXEL_COUNT in draw_state_init()
Fixes: 052bbd65c9 ("nvk: Implement pipeline statistics and occlusion queries")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39510>
(cherry picked from commit c081ab864f)
2026-01-28 16:17:59 +01:00
Patrick Lerda
4a1133e769 r600: update cubearray imagesize calculation
The previous method to calculate imageSize().z was
incorrect for a cubearray view.

This change was tested on palm and cayman. Here is the test fixed:
spec/arb_texture_view/rendering-layers-image/layers rendering of imagecubearray: fail pass

Fixes: 6c1432f0be ("r600/eg: fix cube map array buffer images.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39063>
(cherry picked from commit 0b8d8f2b17)
2026-01-28 16:17:59 +01:00
Benjamin Cheng
4e1f5fda4a radv/video: Use a more reliable way of computing tile sizes
Some apps (old FFmpeg, contemporary CTS) send down pMi{Col,Row}Starts in
SB units, not MI units. Instead of depending on those values, which
could be unreliable, derive the tile sizes in SB using other parameters.

Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39492>
(cherry picked from commit c10ebb0fda)
2026-01-28 16:17:59 +01:00
Patrick Lerda
fd0ec1af2b r600: fix rv770 clamp to max_texel_buffer_elements
This change fixes the clamp to max_texel_buffer_elements
issue related to rv770 and older gpus.

Here are the tests fixed on rv770:
spec/arb_texture_buffer_object/texture-buffer-size-clamp/r8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rg8ui_texture_buffer_size_via_sampler: fail pass
spec/arb_texture_buffer_object/texture-buffer-size-clamp/rgba8ui_texture_buffer_size_via_sampler: fail pass

Fixes: 1a441ad5cb ("r600: clamp to max_texel_buffer_elements")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39385>
(cherry picked from commit afcead9158)
2026-01-28 16:17:58 +01:00
Patrick Lerda
161f3c2144 r600: make vertex r10g10b10a2_sscaled conformant on palm and beyond
This is a gl4.3 issue very similar to e8fa3b4950.

The mode r10g10b10a2_sscaled processed as vertex on palm at the
hardware level doesn't follow the current standard. Indeed, the .w
component (2-bits) is not calculated as expected. The table below
describes the situation.

This change fixes this issue by adding two gpu instructions at
the vertex fetch shader stage. An equivalent C representation and
a gpu asm dump of the generated sequence are available below.

.w(2-bits)	expected	palm		cypress
0		 0		0		 0
1		 1		1		 1
2		-2		2		-2
3		-1		3		-1

w_out = w_in - (w_in > 1. ? 4. : 0.);

0002 00000024 A0040000  ALU 2 @72
 0072 801F2C0A 600004C0     1 w:     SETGT*4                __.w,  R10.w, 1.0
 0074 839FCC0A 61400010     2 w:     ADD                    R10.w,  R10.w, -PV.w

Note: cypress returns the expected value, and does not need
this correction.

This change was tested on palm, barts and cayman. Here are the tests fixed:
khr-gl4[3-6]/vertex_attrib_binding/basic-input-case6: fail pass
khr-gles31/core/vertex_attrib_binding/basic-input-case6: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38849>
(cherry picked from commit 2ed761021f)
2026-01-28 16:17:58 +01:00
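
The .w correction described in the r10g10b10a2_sscaled commit above can be sketched in Python. This is only an illustration of the arithmetic given in the commit message (the C expression and the two ALU instructions), not the driver code; the function names are hypothetical and exist only for this example.

```python
def setgt_times4(w_in: float) -> float:
    """Mirror of 'SETGT*4 __.w, R10.w, 1.0': 4.0 if w_in > 1.0, else 0.0."""
    return (1.0 if w_in > 1.0 else 0.0) * 4.0


def fix_sscaled_w(w_in: float) -> float:
    """Mirror of 'ADD R10.w, R10.w, -PV.w': w_out = w_in - (w_in > 1 ? 4 : 0)."""
    return w_in - setgt_times4(w_in)


def signed_2bit(raw: int) -> int:
    """Reference: interpret a 2-bit field as a two's complement integer."""
    raw &= 0x3
    return raw - 4 if raw >= 2 else raw


if __name__ == "__main__":
    # Columns: raw 2-bit value, corrected value, two's complement reference.
    for raw in range(4):
        print(raw, fix_sscaled_w(float(raw)), signed_2bit(raw))
```

For the four possible inputs the corrected value matches the "expected" column of the table in the commit message, which is the two's complement reading of the 2-bit field.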
Patrick Lerda
88ae449dbc r600: fix rv770 dot4 operations
Using a PV register other than PV.x after a dot4 operation
does not work on rv770. It does work on evergreen, but that
behavior is not documented.

This change applies the restriction to all the r600 gpus,
which fixes the issue on rv770. It also covers max4, which has
the same requirement, in case max4 gets implemented.

Here are some of the affected tests on rv770:
piglit/bin/fp-abs-01 -auto -fbo
glcts --deqp-case=KHR-GL31.buffer_objects.triangles
piglit/bin/shader_runner generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-distance-vec2-vec2.shader_test -auto -fbo

Fixes: 942e6af40b ("r600/sfn: use PS and PV inline registers when possible")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39101>
(cherry picked from commit da1108dcc4)
2026-01-28 16:17:58 +01:00
Patrick Lerda
3231523878 r600: fix cayman msaa shading behavior
The functionality worked properly at glMinSampleShading(0.)
and glMinSampleShading(1.); the issue was with intermediate
values. This change makes this function compatible with the
evergreen setup.

Note: this was one of the few functionalities which were working
properly on evergreen but not on cayman.

Here are the tests fixed:
spec/arb_sample_shading/samplemask 4 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 4/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 6 all/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 6 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 6/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 6/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 8 all/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 8 all/0.500000 partition: fail pass
spec/arb_sample_shading/samplemask 8/0.250000 partition: fail pass
spec/arb_sample_shading/samplemask 8/0.500000 partition: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_4: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_rbo_8: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_4: fail pass
deqp-gles31/functional/shaders/sample_variables/sample_mask_in/bit_count_per_two_samples/multisample_texture_8: fail pass

Fixes: f7796a966d ("radeonsi: add basic code for overrasterization")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38615>
(cherry picked from commit d5d844bfc4)
2026-01-28 16:17:58 +01:00
Georg Lehmann
6303313da0 aco/optimizer: fix parsing salu p_insert as shift
Fixes: 88f7e3fff3 ("aco/optimizer: parse pseudo alu instructions")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
(cherry picked from commit ba73792de0)
2026-01-28 16:17:58 +01:00
Rhys Perry
ca22a66dd9 aco/insert_fp_mode: remove incorrect assertion
This can happen if a loop has no continues, and the later code should work
fine in this situation.

This fixes war_thunder/0013a69e097b2471 on navi21.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 6b9d28ab9b ("aco/insert_fp_mode: insert fp mode in reverse")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39481>
(cherry picked from commit e59a0df302)
2026-01-28 16:17:58 +01:00
Zan Dobersek
cfdaa05349 tu: handle DS_DEPTH_BOUNDS_TEST_BOUNDS state under TU_DYNAMIC_STATE_RB_DEPTH_CNTL
MESA_VK_DYNAMIC_DS_DEPTH_BOUNDS_TEST_BOUNDS state should be emitted as part
of TU_DYNAMIC_STATE_RB_DEPTH_CNTL along with other depth state, and not as
part of dynamic stencil state.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 979cf7bac0 ("tu: Merge depth/stencil draw states")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39323>
(cherry picked from commit 3cb4776ede)
2026-01-28 16:17:58 +01:00
Sushma Venkatesh Reddy
6c6ed2a9e6 brw: Use lookup tables for Gfx12+ 3src type encoding/decoding
The previous Gfx12+ implementation, which used bit masking, fails for FP8
types, so replace it with explicit lookup tables.
For float types, the encoding now aligns with brw_data_type_float, ensuring
correct behavior for DPAS and other 3-source instructions.

Fixes: d1d4e3d530 ("brw: Add EU assembler support for float8")

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39448>
(cherry picked from commit 0ce4e8ba6f)
2026-01-28 16:17:58 +01:00
Calder Young
0148f7f746 Revert "anv,brw: Allow multiple ray queries without spilling to a shadow stack"
This optimization doesn't work when the ray query index isn't uniform across
the subgroup, which is something the spec allows. While there are some smart
ways to fix this and still avoid unnecessary spilling, its not worth investing
the time until we find a realtime raytracing workload that actually needs to
use multiple live ray queries for something.

Fixes: 1f1de7eb ("anv,brw: Allow multiple ray queries without spilling to a shadow stack")
Acked-by: Sagar Ghuge <sagar.ghuge@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39445>
(cherry picked from commit 895ff7fe92)
2026-01-28 16:17:58 +01:00
Rob Clark
14887b7f03 freedreno/lrz: Correct lrz fc layout for gen8
Fixes: 14a23e8b3e ("freedreno/lrz: Add gen8 lrz layout support")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39375>
(cherry picked from commit 1d715662de)
2026-01-28 16:17:58 +01:00
Gurchetan Singh
98afd0c2f7 gallium: fix sometimes-uninitialized warning
Otherwise:

gallium/auxiliary/gallivm/lp_bld_nir_soa.c:2394:7:
 error: variable 'opname' is used uninitialized whenever switch default is taken

is observed.

Reviewed-by: @LingMan
Fixes: 12bceb228a ("gallivm: let reduce ops use llvm intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39418>
(cherry picked from commit 0f582b0268)
2026-01-28 16:17:58 +01:00
Danylo Piliaiev
ca25229f90 tu: Fix typo in min bounds calculation of FDM scissors
Fixes: fec372dfa5 ("tu: Implement FDM viewport patching")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39461>
(cherry picked from commit 1d6fe66989)
2026-01-28 16:17:58 +01:00
Rob Clark
4aa5731f09 freedreno: Force single wavesize if double threadsize is unsupported
Turns out ir3 isn't enforcing this itself.

Fixes: c323848b0b ("ir3, tu: Plumb through support for per-shader robustness")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39470>
(cherry picked from commit 455b692e4f)
2026-01-28 16:17:58 +01:00
Rob Clark
e1dae01299 freedreno/common: Fix gen8 EFU float control
This reg should be programmed to zero like previous gens.

Fixes: 6e3598177b ("freedreno/common: Add A840 and X2-85")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39467>
(cherry picked from commit 53b879ac58)
2026-01-28 16:17:58 +01:00
Silvio Vilerino
00632c8dfc d3d12: Add HAVE_GALLIUM_D3D12_VIDEO guards for d3d12_video_encoder_set_max_async_queue_depth/d3d12_video_encoder_get_last_slice_completion_fence
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14709
Fixes: e55b2b5064 ("d3d12: Add get_video_enc_last_slice_completion_fence interop")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39457>
(cherry picked from commit 4b366f8824)
2026-01-28 16:17:58 +01:00
Silvio Vilerino
944bcc85a0 d3d12: Add missing using Microsoft::WRL::ComPtr in d3d12_context_common
Fixes: b06b2fbaba ("d3d12: Remove Agility v717 guards for features now available in v618")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39457>
(cherry picked from commit 237313a243)
2026-01-28 16:17:58 +01:00
Lionel Landwerlin
fefa2b1e68 iris: fix incorrect intrinsic usage on ELK
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: faa857a061 ("intel: rework push constant handling")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14708
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39443>
(cherry picked from commit 21661f66fc)
2026-01-28 16:17:58 +01:00
Nick Hamilton
861c689517 pvr: Temporarily disable the buffer device address extension
The extension is optional in Vulkan 1.2 and is causing crashes in
multiple CTS tests.

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Backport-to: 26.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39351>
(cherry picked from commit 3aacc324bc)
2026-01-28 16:17:58 +01:00
Natalie Vock
b055af7ceb aco: Fix parameter stack size calculation
This only accounted for 1/32 (or 1/64) of the actual parameter size. In
some cases this meant that some threads were smashing other threads'
stacks.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39455>
(cherry picked from commit 15328a5ef3)
2026-01-28 16:17:58 +01:00
Mike Blumenkrantz
b12d9282c9 zink: re-allow transient images during blitting
now that transient images are a more complete mechanism, this should
in theory be okay and also accounts for the case where
a framebuffer contains mixed msrtt textures and plain multisampled textures

(cherry picked from commit 6474af3b42)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39469>
2026-01-28 16:17:58 +01:00
Yiwei Zhang
2f53818f7a venus: refactor Android ANB tracking to avoid confusions with WSI
WSI used to track something similar for aliased wsi image creation, but that
was later deprecated. So let's rename wsi.memory to wsi.anb_mem and drop
wsi.memory_owned to avoid confusion with common wsi related tracking.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit 481df22209)
2026-01-28 16:17:58 +01:00
Yiwei Zhang
f299be5193 venus: properly handle wsi implicit in-fence
Vulkan is supposed to operate in explicit synchronization mode. However,
for legacy compositors that only support implicit fencing, we have to
extract the compositor implicit fence (release fence) and resolve it
properly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit 849e3552e8)
2026-01-28 16:17:58 +01:00
Yiwei Zhang
e0af337416 venus: refactor vn_AcquireNextImage2KHR
Prepare for valid implicit in-fence.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit 211c21725c)
2026-01-28 16:17:58 +01:00
Yiwei Zhang
29b37e4484 venus: add vn_renderer_bo_export_sync_file helper
...and a renderer internal helper shared by virtgpu and vtest backend
when supported.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit 9718847dbf)
2026-01-28 16:17:58 +01:00
Yiwei Zhang
960a4d667b venus: track dedicated image during mem alloc
Need this because the new common wsi interface only returns the wsi
memory from the acquired image index.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit 3fca8423c9)
2026-01-28 16:17:58 +01:00
Yiwei Zhang
48c28ee238 venus: track prime blit dst buffer memory in the wsi image
This is to prepare for handling WSI implicit acquire fence.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39401>
(cherry picked from commit eb709cba47)
2026-01-28 16:17:58 +01:00
Simon Perretta
1b1229d3b2 pco: update formatless skip check
The skip check should only be checking the format rather than the entire
packed word.

Fixes: 52ddc40a75 ("pco: restrict shadow sampler comparator clamping to unorm formats")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39428>
(cherry picked from commit c5b70dcb48)
2026-01-28 16:17:58 +01:00
Samuel Pitoiset
f585d2fadc vulkan: fix missing begin debug marker for HPLOC
This fixes capturing with RGP.

Fixes: 091b43b970 ("radv: Use HPLOC for TLAS builds")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39427>
(cherry picked from commit 873008f274)
2026-01-28 16:17:58 +01:00
Kitlith
a09bbbf3e1 pvr: Free drm device in can_present_on_device
Fixes: 6bda88bfdb ("pvr: copy WSI can_present_on_device function from PanVK")
Signed-off-by: Kitlith <kitlith@kitl.pw>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39415>
(cherry picked from commit b18b52e61d)
2026-01-28 16:17:57 +01:00
Kitlith
6d4b68c748 panvk: Free drm device in can_present_on_device
Fixes: 08da41f2f1 ("panvk: override can_present_on_device")
Signed-off-by: Kitlith <kitlith@kitl.pw>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39415>
(cherry picked from commit 4de41bf27d)
2026-01-28 16:17:57 +01:00
jaap aarts
700f6c3214 radv/sqtt: Prevent concurrent submit when sqtt is enabled
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39090>
(cherry picked from commit 8f7941f92d)
2026-01-28 16:17:57 +01:00
Aitor Camacho
f4e56b61da hk: Handle unbound sets that contain dynamic buffers
The offset for the dynamic buffers needs to be computed with the currently
bound pipeline layout. This change fixes incorrectly selecting the offset
for a dynamic buffer if a descriptor with a lower index than the one currently
being bound contains a dynamic buffer but said descriptor hasn't been
bound yet. It also prevents the binding from overriding the dynamic buffers in
order to preserve the already bound dynamic descriptors.

Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
(cherry picked from commit aaf4405507)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39440>
2026-01-28 16:17:57 +01:00
Aitor Camacho
d2bc79c260 nvk: Handle unbound sets that contain dynamic buffers
The offset for the dynamic buffers needs to be computed with the currently
bound pipeline layout. This change fixes incorrectly selecting the offset
for a dynamic buffer if a descriptor with a lower index than the one currently
being bound contains a dynamic buffer but said descriptor hasn't been
bound yet. It also prevents the binding from overriding the dynamic buffers in
order to preserve the already bound dynamic descriptors.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
(cherry picked from commit 80a076f5d0)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39440>
2026-01-28 16:17:57 +01:00
Dylan Baker
1ed4f69065 bin/pick: When the main widget is replaced, trigger a redraw
The docs clearly say this, and though it used to just work that seems to
have been a coincidence rather than being correct.

CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39459>
(cherry picked from commit 0380c1228e)
2026-01-28 16:17:57 +01:00
Eric Engestrom
b317162543 pick-ui: update for python 3.14 support
```
Traceback (most recent call last):
  File "bin/pick-ui.py", line 31, in <module>
    loop = urwid.MainLoop(u.render(), PALETTE, event_loop=evl, handle_mouse=False)
                          ~~~~~~~~^^
  File "bin/pick/ui.py", line 196, in render
    asyncio.ensure_future(self.update())
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.14/asyncio/tasks.py", line 730, in ensure_future
    loop = events.get_event_loop()
  File "/usr/lib64/python3.14/asyncio/events.py", line 715, in get_event_loop
    raise RuntimeError('There is no current event loop in thread %r.'
                       % threading.current_thread().name)
RuntimeError: There is no current event loop in thread 'MainThread'.
```

Of the 3 dependencies, only urwid actually needs to be updated, but
while at it let's pick the latest of each.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39452>
(cherry picked from commit 21829c9f7e)
2026-01-28 16:17:57 +01:00
Eric Engestrom
4141851e8a .pick_status.json: Update to bed1576b14 2026-01-28 16:17:57 +01:00
Eric Engestrom
a2b03c2117 VERSION: bump for 26.0.0-rc1
2026-01-21 19:28:32 +01:00
529 changed files with 50927 additions and 5656 deletions

0
.clang-format Normal file

@ -241,6 +241,7 @@ include:
# changed, else we'll just use the already-built containers # changed, else we'll just use the already-built containers
- if: *is-merge-attempt - if: *is-merge-attempt
changes: &image_tags_path changes: &image_tags_path
- .gitlab-ci.yml
- .gitlab-ci/image-tags.yml - .gitlab-ci/image-tags.yml
when: on_success when: on_success
# Skip everything for pre-merge and merge pipelines which don't change # Skip everything for pre-merge and merge pipelines which don't change


@ -774,7 +774,7 @@ debian-riscv64:
# While s390 is dead, s390x is very much alive, and one of the last major # While s390 is dead, s390x is very much alive, and one of the last major
# big-endian platforms, so it provides useful coverage. # big-endian platforms, so it provides useful coverage.
# In case of issues with this job, contact @ajax # In case of issues with this job, contact @ajax
debian-s390x: .debian-s390x:
extends: extends:
- .meson-cross - .meson-cross
- .use-debian/s390x_build - .use-debian/s390x_build
@ -789,7 +789,7 @@ debian-s390x:
DRI_LOADERS: DRI_LOADERS:
-D glvnd=disabled -D glvnd=disabled
debian-ppc64el: .debian-ppc64el:
extends: extends:
- .meson-cross - .meson-cross
- .use-debian/ppc64el_build - .use-debian/ppc64el_build


@@ -14,7 +14,7 @@ export LD_LIBRARY_PATH=$LIBDIR
 cd /usr/local/shader-db
-for driver in freedreno intel lima v3d vc4; do
+for driver in freedreno lima v3d vc4; do
     section_start shader-db-${driver} "Running shader-db for $driver"
     env LD_PRELOAD="$LIBDIR/lib${driver}_noop_drm_shim.so" \
         ./run -j"${FDO_CI_CONCURRENT:-4}" ./shaders \

36942
.pick_status.json Normal file

File diff suppressed because it is too large


@@ -1 +1 @@
-26.0.0-devel
+26.0.5


@@ -385,5 +385,5 @@ async def main() -> None:
 if __name__ == "__main__":
-    loop = asyncio.get_event_loop()
+    loop = asyncio.new_event_loop()
     loop.run_until_complete(main())


@@ -27,7 +27,9 @@ from pick.ui import UI, PALETTE
 if __name__ == "__main__":
     u = UI()
-    evl = urwid.AsyncioEventLoop(loop=asyncio.new_event_loop())
+    asyncio_loop = asyncio.new_event_loop()
+    asyncio.set_event_loop(asyncio_loop)
+    evl = urwid.AsyncioEventLoop(loop=asyncio_loop)
     loop = urwid.MainLoop(u.render(), PALETTE, event_loop=evl, handle_mouse=False)
     u.mainloop = loop
     loop.run()
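
A minimal sketch of the pattern these changes move to, using only the standard library: create the loop explicitly, register it as the current loop, and only then schedule work with `asyncio.ensure_future()`. As the traceback in the "pick-ui: update for python 3.14 support" commit shows, `ensure_future()` looks up the current loop and raises `RuntimeError` on Python 3.14 when none has been set. The coroutine `answer` and the helper `run_with_explicit_loop` are hypothetical names for this example, and urwid is left out so the snippet runs on its own.

```python
import asyncio


async def answer() -> int:
    """Trivial coroutine standing in for real work such as UI.update()."""
    await asyncio.sleep(0)
    return 42


def run_with_explicit_loop() -> int:
    # Create the loop ourselves instead of relying on get_event_loop()
    # to create one implicitly.
    loop = asyncio.new_event_loop()
    # Register it so that code which looks up "the current loop"
    # (for example ensure_future() without a running loop) finds it.
    asyncio.set_event_loop(loop)
    try:
        task = asyncio.ensure_future(answer())
        return loop.run_until_complete(task)
    finally:
        # Unregister and close the loop so nothing leaks between runs.
        asyncio.set_event_loop(None)
        loop.close()


if __name__ == "__main__":
    print(run_with_explicit_loop())
```

Passing the same loop object to both `set_event_loop()` and the component that drives it is the important part; creating one loop for the driver and leaving the current loop unset is the situation the traceback describes.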


@@ -52,7 +52,7 @@ IS_FIX = re.compile(r'^\s*fixes:\s*([a-f0-9]{6,40})', flags=re.MULTILINE | re.IG
 IS_CC = re.compile(r'^\s*cc:\s*["\']?([0-9]{2}\.[0-9])?["\']?\s*["\']?([0-9]{2}\.[0-9])?["\']?\s*\<?mesa-stable',
                    flags=re.MULTILINE | re.IGNORECASE)
 IS_REVERT = re.compile(r'This reverts commit ([0-9a-f]{40})')
-IS_BACKPORT = re.compile(r'^\s*backport-to:\s*(\d{2}\.\d),?\s*(\d{2}\.\d)?',
+IS_BACKPORT = re.compile(r'^\s*backport-to:\s*(?:(\d{2}\.\d),?\s*(\d{2}\.\d)?|(\*))',
                          flags=re.MULTILINE | re.IGNORECASE)
 # XXX: hack
@@ -295,7 +295,7 @@ async def resolve_nomination(commit: 'Commit', version: str) -> 'Commit':
     if backport_to := IS_BACKPORT.findall(commit_message):
         for match in backport_to:
-            if any(Version(version) >= Version(backport_version)
+            if any(backport_version == '*' or Version(version) >= Version(backport_version)
                    for backport_version in match if backport_version != ''):
                 commit.nominated = True
                 commit.nomination_type = NominationType.BACKPORT
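
The effect of the new `IS_BACKPORT` pattern can be checked in isolation. The sketch below copies the regular expression from the new side of the hunk above and reproduces the nomination test in a small helper. Two things are assumptions made for this example only: `is_nominated` is a hypothetical stand-in for the relevant part of `resolve_nomination`, and `as_tuple` replaces `packaging.version.Version` with a plain integer tuple so the snippet needs nothing beyond the standard library.

```python
import re

# Pattern copied from the new side of the hunk above.
IS_BACKPORT = re.compile(r'^\s*backport-to:\s*(?:(\d{2}\.\d),?\s*(\d{2}\.\d)?|(\*))',
                         flags=re.MULTILINE | re.IGNORECASE)


def as_tuple(version: str) -> tuple:
    """Stand-in for packaging.version.Version: '26.0' -> (26, 0)."""
    return tuple(int(part) for part in version.split('.'))


def is_nominated(commit_message: str, version: str) -> bool:
    """True if any Backport-to tag covers `version` or is the '*' wildcard."""
    for match in IS_BACKPORT.findall(commit_message):
        # Each match is a 3-tuple: (first version, second version, wildcard);
        # groups that did not participate come back as empty strings.
        if any(target == '*' or as_tuple(version) >= as_tuple(target)
               for target in match if target != ''):
            return True
    return False


if __name__ == "__main__":
    print(IS_BACKPORT.findall('Backport-to: 19.1, 19.2'))
    print(IS_BACKPORT.findall('Backport-to: *'))
    print(is_nominated('Backport-to: *', '0.0'))
```

Adding the third capture group is why every expected tuple in the regex tests gains a trailing empty string: `findall` returns one entry per group, including groups from the alternative that did not match.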


@ -263,7 +263,7 @@ class TestRE:
""") """)
backport_to = core.IS_BACKPORT.findall(message) backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.2', '')] assert backport_to == [('19.2', '', '')]
def test_multiple_release_space(self): def test_multiple_release_space(self):
"""Tests commit with more than one branch specified""" """Tests commit with more than one branch specified"""
@ -278,7 +278,7 @@ class TestRE:
""") """)
backport_to = core.IS_BACKPORT.findall(message) backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.1', '19.2')] assert backport_to == [('19.1', '19.2', '')]
def test_multiple_release_comma(self): def test_multiple_release_comma(self):
"""Tests commit with more than one branch specified""" """Tests commit with more than one branch specified"""
@ -293,7 +293,7 @@ class TestRE:
""") """)
backport_to = core.IS_BACKPORT.findall(message) backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.1', '19.2')] assert backport_to == [('19.1', '19.2', '')]
def test_multiple_release_lines(self): def test_multiple_release_lines(self):
"""Tests commit with more than one branch specified in mulitple tags""" """Tests commit with more than one branch specified in mulitple tags"""
@ -305,7 +305,7 @@ class TestRE:
""") """)
backport_to = core.IS_BACKPORT.findall(message) backport_to = core.IS_BACKPORT.findall(message)
assert backport_to == [('19.0', ''), ('19.1', '19.2')] assert backport_to == [('19.0', '', ''), ('19.1', '19.2', '')]
class TestResolveNomination: class TestResolveNomination:
@ -405,6 +405,17 @@ class TestResolveNomination:
assert c.nominated assert c.nominated
assert c.nomination_type is core.NominationType.BACKPORT assert c.nomination_type is core.NominationType.BACKPORT
@pytest.mark.asyncio
async def test_backport_all_is_nominated(self):
s = self.FakeSubprocess(b'Backport-to: *')
c = core.Commit('abcdef1234567890', 'a commit')
with mock.patch('bin.pick.core.asyncio.create_subprocess_exec', s.mock):
await core.resolve_nomination(c, '0.0')
assert c.nominated
assert c.nomination_type is core.NominationType.BACKPORT
@pytest.mark.asyncio @pytest.mark.asyncio
async def test_backport_is_nominated_after(self): async def test_backport_is_nominated_after(self):
s = self.FakeSubprocess(b'Backport-to: 16.2') s = self.FakeSubprocess(b'Backport-to: 16.2')


@@ -1,3 +1,3 @@
-attrs==23.1.0
-packaging==25.0
-urwid==2.1.2
+attrs==25.4.0
+packaging==26.0
+urwid==3.0.3


@ -224,6 +224,7 @@ class UI:
if commit.nominated and commit.resolution is core.Resolution.UNRESOLVED: if commit.nominated and commit.resolution is core.Resolution.UNRESOLVED:
b = urwid.AttrMap(CommitWidget(self, commit), None, focus_map='reversed') b = urwid.AttrMap(CommitWidget(self, commit), None, focus_map='reversed')
self.commit_list.append(b) self.commit_list.append(b)
self.mainloop.draw_screen()
self.save() self.save()
async def feedback(self, text: str) -> None: async def feedback(self, text: str) -> None:
@ -236,6 +237,7 @@ class UI:
if c.base_widget is commit: if c.base_widget is commit:
del self.commit_list[i] del self.commit_list[i]
break break
self.mainloop.draw_screen()
def save(self): def save(self):
core.save(itertools.chain(self.new_commits, self.previous_commits)) core.save(itertools.chain(self.new_commits, self.previous_commits))
@ -246,6 +248,7 @@ class UI:
def reset_cb(_) -> None: def reset_cb(_) -> None:
self.mainloop.widget = o self.mainloop.widget = o
self.mainloop.draw_screen()
async def apply_cb(edit: urwid.Edit) -> None: async def apply_cb(edit: urwid.Edit) -> None:
text: str = edit.get_edit_text() text: str = edit.get_edit_text()
@ -263,6 +266,7 @@ class UI:
raise RuntimeError(f"Couldn't find {sha}") raise RuntimeError(f"Couldn't find {sha}")
await commit.apply(self) await commit.apply(self)
self.mainloop.draw_screen()
q = urwid.Edit("Commit sha\n") q = urwid.Edit("Commit sha\n")
ok_btn = urwid.Button('Ok') ok_btn = urwid.Button('Ok')
@ -279,12 +283,14 @@ class UI:
self.mainloop.widget = urwid.Overlay( self.mainloop.widget = urwid.Overlay(
urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50) urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50)
) )
self.mainloop.draw_screen()
def chp_failed(self, commit: 'CommitWidget', err: str) -> None: def chp_failed(self, commit: 'CommitWidget', err: str) -> None:
o = self.mainloop.widget o = self.mainloop.widget
def reset_cb(_) -> None: def reset_cb(_) -> None:
self.mainloop.widget = o self.mainloop.widget = o
self.mainloop.draw_screen()
t = urwid.Text(textwrap.dedent(f""" t = urwid.Text(textwrap.dedent(f"""
Failed to apply {commit.commit.sha} {commit.commit.description} with the following error: Failed to apply {commit.commit.sha} {commit.commit.description} with the following error:
@ -313,3 +319,4 @@ class UI:
self.mainloop.widget = urwid.Overlay( self.mainloop.widget = urwid.Overlay(
urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50) urwid.Filler(box), o, 'center', ('relative', 50), 'middle', ('relative', 50)
) )
self.mainloop.draw_screen()


@ -3,6 +3,12 @@ Release Notes
The release notes summarize what's new or changed in each Mesa release. The release notes summarize what's new or changed in each Mesa release.
- :doc:`26.0.5 release notes <relnotes/26.0.5>`
- :doc:`26.0.4 release notes <relnotes/26.0.4>`
- :doc:`26.0.3 release notes <relnotes/26.0.3>`
- :doc:`26.0.2 release notes <relnotes/26.0.2>`
- :doc:`26.0.1 release notes <relnotes/26.0.1>`
- :doc:`26.0.0 release notes <relnotes/26.0.0>`
- :doc:`25.3.3 release notes <relnotes/25.3.3>` - :doc:`25.3.3 release notes <relnotes/25.3.3>`
- :doc:`25.3.2 release notes <relnotes/25.3.2>` - :doc:`25.3.2 release notes <relnotes/25.3.2>`
- :doc:`25.2.8 release notes <relnotes/25.2.8>` - :doc:`25.2.8 release notes <relnotes/25.2.8>`
@ -473,6 +479,12 @@ The release notes summarize what's new or changed in each Mesa release.
:maxdepth: 1 :maxdepth: 1
:hidden: :hidden:
26.0.5 <relnotes/26.0.5>
26.0.4 <relnotes/26.0.4>
26.0.3 <relnotes/26.0.3>
26.0.2 <relnotes/26.0.2>
26.0.1 <relnotes/26.0.1>
26.0.0 <relnotes/26.0.0>
25.3.3 <relnotes/25.3.3> 25.3.3 <relnotes/25.3.3>
25.3.2 <relnotes/25.3.2> 25.3.2 <relnotes/25.3.2>
25.2.8 <relnotes/25.2.8> 25.2.8 <relnotes/25.2.8>

4765
docs/relnotes/26.0.0.rst Normal file

File diff suppressed because it is too large

247
docs/relnotes/26.0.1.rst Normal file

@ -0,0 +1,247 @@
Mesa 26.0.1 Release Notes / 2026-02-25
======================================
Mesa 26.0.1 is a bug fix release which fixes bugs found since the 26.0.0 release.
Mesa 26.0.1 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 26.0.1 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::
SHA256: bb5104f9f9a46c9b5175c24e601e0ef1ab44ce2d0fdbe81548b59adc8b385dcc mesa-26.0.1.tar.xz
SHA512: d47072257035acfa8a5594c0cda831b4e5178169dea8a06c6657268a441e32271f8798486e837cea23f35ce3f0b4b9520a4ea4ed26b0e1267b02da4c649bc9f9 mesa-26.0.1.tar.xz
New features
------------
- None
Bug fixes
---------
- Missing Haswell case after a097a3d214eda7fb7b9ff63176754b7260e09e03 leads to bogus assert in intel_perf_mdapi.c
- Question: Does building Lavapipe on Windows require building "microsoft-experimental" as well?
- [ANV]: Regression in dxvk Greedfall
- [ANV][BMG] Building Mesa with Clang causes Missing Skin Textures in UE games - Tekken 8
- [ANV][DG2][Regression]: Flickering water "boxes" in Civilization VII
- [RADV] Killer7 has a blue tint with RDNA3/4
- [bisected] Xe3 regression with piglit tess/barrier-patch.shader_test after cmod prop change
- [radeonsi] Regression: GL_FEEDBACK returns 0.0 for X-coordinates (Legacy GL)
- anv, bisected: Genshin Impact wrong shadows, flickering grass
- turnip: llama.cpp: Running test-backend-ops results in segmentation fault
- venus crashes in vn_CreateDevice() with latest mesa/main [bisected]
Changes
-------
Aitor Camacho (7):
- wsi/metal: Expose additional color spaces if instance extension enabled
- kk: Fill pipelineUUID
- kk: Fix shader uint32_t value serialization
- kk: Correctly release pipeline handles at shader destroy
- kk: Fix compute pipeline cache
- kk: Move gfx pipeline data to the info struct within kk_shader
- kk: Fix graphics pipeline serialization
Alyssa Rosenzweig (1):
- brw: drop buggy SLM optimization
Anna Maniscalco (1):
- freedreno/common: set has_astc_hdr true for a7xx targets
Benjamin Otte (1):
- lavapipe: Fix features for nonsubsampled ycbcr formats
Daniel Schürmann (1):
- nir/clone: Fix cloning indirect call instructions
Danylo Piliaiev (1):
- ir3: Align TCS per-patch output to 64 bytes to prevent stale reads
Emma Anholt (1):
- ir3/ra: Fix DOUBLE_ONLY limit pressure computation.
Eric Engestrom (5):
- docs: add sha sum for 26.0.0
- .pick_status.json: Update to 03d2cc2b2ae5341409ee1fab74e98134a6df0511
- bin/gen_release_notes: fix support for python 3.14
- pick-ui: add \`Backport-to: \*` as a synonym to \`Cc: mesa-stable`
- .pick_status.json: Mark 7dd7731ac710b0c7213f6bb466b33f62eca80604 as denominated
Faith Ekstrand (6):
- pan/clear: Stop packing undefined bits in colors
- nir/gather_info: Add support for panfrost tile load/store intrinsics
- panvk: Create both Z/S descriptors, even for separate Z/S
- panvk/preload: Stop assuming 32 registers
- panvk/jm: Refactor BeginRendering()
- panvk: Also load output attachments with LOAD_OP_NONE+STORE_OP_NONE
Frank Binns (2):
- pvr/ci: move some timing out tests from fails to skips
- pvr: Fix alloc callbacks usage when freeing frame buffers
Ian Romanick (8):
- spirv: Use STACK_ARRAY instead of NIR_VLA
- nir: Use STACK_ARRAY instead of NIR_VLA
- brw: Call nir_opt_algebraic_late in brw_nir_create_raygen_trampoline
- brw: Call nir_opt_algebraic_late later in brw_postprocess_nir_opts
- elk: Call nir_opt_algebraic_late in elk_postprocess_nir
- brw/cmod: Don't propagate from CMP to ADD if there is a write between
- elk/cmod: Don't propagate from CMP to possible Inf + (-Inf)
- elk/cmod: Don't propagate from CMP to ADD if there is a write between
Janne Grunau (3):
- asahi: Use GPU for buffer copies in resource_copy_region()
- asahi: Implement clear_buffer using libagx_fill*
- hk: Use aligned vector fill in hk_CmdFillBuffer if possible
Jarred Davies (2):
- pvr: Fix allocating the required scratch buffer space for tile buffers
- pvr: Add missing support for tile buffers to SPM EOT programs
Jesse Natalie (1):
- meson: Include DirectX-Headers dependency for all VK Windows builds
Jianxun Zhang (1):
- anv: Limit modifier disabling workaround to specific GTK versions
José Roberto de Souza (1):
- intel/perf: Add HSW verx10 to intel_perf_query_result_write_mdapi()
Juston Li (1):
- anv: set missing protected bit for protected depth/stencil surfaces
Konstantin Seurer (2):
- radv: Fix setting the viewport for depth stencil FS resolves
- vulkan/cmd_queue: Fixup stride for multi draws
Lars-Ivar Hesselberg Simonsen (2):
- panvk: Fix dcd_flags1 dirty bit
- pan/genxml/v13: Fix HSR Prepass typo
Leon Perianu (1):
- pvr: fix format table properties duplicate
Lionel Landwerlin (8):
- anv: flush render caches on first pipeline select
- anv: fix nested command buffer relocations
- anv: add missing constant cache invalidation for descriptor buffers
- isl: fix 32bit math with 4GB buffer size
- anv: apply the same ccs disabling for Xe3 than Xe2
- anv: disable ccs modifier reporting when ccs modifiers are disabled
- anv: dirty descriptors after blorp operations
- anv: remove snprintf for aux op transition
Mary Guillemard (1):
- hk: Fix crash in hk_handle_passthrough_gs
Matt Turner (4):
- brw/cse: fix \`operands_match` corrupting non-IMM register data
- brw/cse: use copies in \`operands_match` instead of in-place modification
- elk/cse: fix \`operands_match` corrupting non-IMM register data
- elk/cse: use copies in \`operands_match` instead of in-place modification
Mike Blumenkrantz (2):
- zink: fix broken compiler assert
- zink: only do pre-sync transfer barrier after a renderpass
Natalie Vock (3):
- radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode
- radv: Initialize nir_lower_io_to_scalar progress variable
- radv/nir: Correctly handle workgroup sizes not aligned to 32
Nick Hamilton (5):
- pvr: Fix incorrect subpass merging optimisation
- pvr: Rename pvr_render_input_attachment
- pvr: Add missing support for preserve attachments
- pvr: Update CI fails list after render pass fixes
- pvr: Add support for fragment pass through shader
Olivia Lee (1):
- hk: fix passthrough GS key invalidation
Pavel Ondračka (2):
- r300: align macro-tiled stride-addressed textures in X
- mesa: implement FRAMEBUFFER_RENDERABLE internalformat query
Rhys Perry (3):
- aco: fix gfx6-8 store_scratch() with function calls
- aco: reset all vgpr_used_by_vmem\_ in resolve_all_gfx11
- aco: resolve hazards before calls
Robert Mader (1):
- lavapipe: enable dmabuf import for planar drm formats
Ryan Zhang (1):
- panvk: guard against NULL pointers to avoid crash
Samuel Pitoiset (5):
- ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
- radv: fix potential corruption after FMASK decompression on GFX6-8
- radv/meta: fix depth/stencil resolves with different regions
- ac/nir: fix writemask for dual source blending on GFX11+
- radv: fix potential GPU hangs with secondaries on transfer queue
Tapani Pälli (1):
- util: bring back fix to avoid strict aliasing bugs in xxhash
Timothy Arceri (2):
- mesa: add _mesa_lookup_state_param_idx() helper
- st/glsl_to_nir: make sure the variant has the correct locations set
Wei Hao (1):
- radeonsi: fix threaded shader compilation finishing after context is destroyed
Yiwei Zhang (2):
- venus: workaround a gcc-15 dead store elimination (DSE) bug
- venus: sync protocol for strict aliasing compliance

239
docs/relnotes/26.0.2.rst Normal file
@@ -0,0 +1,239 @@
Mesa 26.0.2 Release Notes / 2026-03-12
======================================
Mesa 26.0.2 is a bug fix release which fixes bugs found since the 26.0.1 release.
Mesa 26.0.2 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 26.0.2 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
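Because the reported version depends on the driver, an application usually has to decode whatever the driver returns rather than assume 4.6 or 1.4. The sketch below is a minimal, hypothetical illustration and not part of Mesa. `parse_gl_version` assumes the `GL_VERSION` string starts with `major.minor`, optionally preceded by `OpenGL ES `. `decode_vk_api_version` unpacks the major, minor and patch bit fields of a packed Vulkan `apiVersion` value. The sample strings in the usage note are illustrative, not captured driver output.

```python
import re


def parse_gl_version(version_string):
    """Extract (major, minor) from a GL_VERSION-style string.

    Assumes the string begins with "major.minor", optionally prefixed
    by "OpenGL ES " as on OpenGL ES contexts.
    """
    match = re.match(r"(?:OpenGL ES )?(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def decode_vk_api_version(api_version):
    """Split a packed Vulkan apiVersion into (major, minor, patch).

    Bit layout: major in bits 22-28, minor in bits 12-21,
    patch in bits 0-11.
    """
    major = (api_version >> 22) & 0x7F
    minor = (api_version >> 12) & 0x3FF
    patch = api_version & 0xFFF
    return major, minor, patch
```

For example, `parse_gl_version("4.6 (Core Profile) Mesa 26.0.2")` yields `(4, 6)`, and `decode_vk_api_version((1 << 22) | (4 << 12))` yields `(1, 4, 0)`.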
SHA checksums
-------------
::
SHA256: 973f535221be211c6363842b4cce9ef8e9b3e1d5ea86c5450ca86060163c7346 mesa-26.0.2.tar.xz
SHA512: 0a7b9fc9b09e40345cc22d246dc1656900d74754c093882f6a39623af17fddc5f4a0c7938207c784ccf7306c5ed497be6a02c36f95c6548e01a2faa085e04c35 mesa-26.0.2.tar.xz
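A downloaded tarball can be checked against the published SHA256 value before it is unpacked. The following is a minimal sketch using only Python's standard `hashlib`. The helper names `sha256_of` and `matches_release_checksum` are hypothetical and not part of any Mesa tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until read() returns b"" so large tarballs are not
        # loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_checksum(path, expected_hex):
    """Compare a file's digest with a checksum from the release notes."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Assuming the tarball is in the current directory, `matches_release_checksum("mesa-26.0.2.tar.xz", "973f535221be211c6363842b4cce9ef8e9b3e1d5ea86c5450ca86060163c7346")` should return `True` for an intact download.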
New features
------------
- None
Bug fixes
---------
- 26.0.1 fails to build: \`create_context.c: error: 'struct glx_screen' has no member named 'frontend_screen'`
- A770: Counter-Strike 2 visual glitches (regression)
- Bisected regression: Assertion texObj->pt == view->texture failed.
- Kodi regression with panthor >= 1.7 after updating to Linux 7.0-rc1
- MDK2 HD (opengl) has most elements rendered as black
- Mesa 25.3 amdgpu memory issue
- OpenGL 4.1 VRAM Memory Leak with setting uniform variables
- Panfrost Bifrost compiler assertion failure: wrong vectorization in bi_alu_src_index (Mesa 26.0.0)
- RADV: RDNA4 visual corruption in DX11 (DXVK) Mafia III character model glitches, AMDVLK renders correctly (9070XT)
- [radeonsi] Regression: GL_FEEDBACK returns 0.0 for X-coordinates (Legacy GL)
- glsl: spec\@glsl-es-1.00\@linker\@glsl-mismatched-uniform-precision-unused broken
- ir3: ir3_get_predicate() vs &ctx->build
- r300 , regression , bisected : Glitches with Sauerbraten
- r300: HiZ related dEQP failures
Changes
-------
Anna Maniscalco (1):
- zink: don't care about generated gs output primitive
Benjamin Cheng (1):
- radeonsi/vcn: Use full pitch for pre-encode input
Boris Brezillon (1):
- pan/kmod: Allow mmap() on foreign buffers
Caio Oliveira (4):
- spirv: Refactor ALU opcode translation to take bit sizes
- spirv: Pull constant source fixup to the existing loop
- spirv: Fix spec constant to handle Select for non-native floats
- nir: Fix constant folding for iadd_sat
Christoph Pillmayer (2):
- pan/bi: Fix coupling spill placement
- pan/bi: Move FAUs to memory for memory phis
Connor Abbott (4):
- tu: Use HW offset 0 in 3d loads/clears with FDM
- ir3: Fix constlen trimming when more than one stage is trimmed
- tu: Set polygon mode when blitting
- tu: Fix setting will_be_resolved with MSRTSS
Danylo Piliaiev (2):
- tu: Store gmem attachments after custom resolve in dyn RP
- tu: Don't read .patch_input_gmem of unused attachment
David Rosca (1):
- vl: Also disable MPEG2 Main profile when mpeg12 decode is disabled
Eric Engestrom (3):
- docs: add sha sum for 26.0.1
- fixup! docs: add release notes for 26.0.1
- .pick_status.json: Update to 73dba1e15173ff6109925de9615f9d9f5cccc2be
Eric R. Smith (1):
- pco: fix a typo in the check for optimization looping
Erik Faye-Lund (1):
- gallium/dri: set LIBVA_DRIVERS_PATH in devenv
Faith Ekstrand (3):
- etnaviv: Call lower_bool_to_int32 not to_bitsize
- nir/lower_bool_to_bitsize: Make all bN_csel sources match
- pan/bi: Be more careful about bit sizes in b2f lowering
Georg Lehmann (3):
- ci: disable debian-ppc64el and debian-s390x
- aco/insert_fp_mode: don't skip setting round for fract
- nir/opt_algebraic: fix frsq clamp pattern
Ian Romanick (5):
- brw: Don't mark_invalid in update_for_reads for non-VGRF destination
- brw: Use brw_reg_is_arf in update_for_reads
- brw: Also check for ADDRESS file in update_for_reads
- brw/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
- elk/algebraic: Don't optimize SEL.L.SAT or SEL.G.SAT
Icenowy Zheng (1):
- pvr: only specially handle gfx subcmd for BeginQuery
Iván Briano (1):
- anv: don't try to fast clear D/S with multiview
Jesse Natalie (1):
- d3d12: Fix importing external resources
Job Noorman (2):
- ir3: update context builder after ir3_get_predicate
- ir3: don't predicate vote_all/vote_any
Jose Maria Casanova Crespo (3):
- v3d: flush write jobs before BO replacement in DISCARD_WHOLE path
- vc4: flush write jobs before BO replacement in DISCARD_WHOLE path
- v3d: reject fast TLB blit when RT formats don't match
Karol Herbst (2):
- nir: fix nir_alu_type_range_contains_type_range for fp16 to int
- nir: fix nir_round_int_to_float for fp16
Lionel Landwerlin (2):
- anv: add missing handling for attachment locations in secondaries
- anv: dirty all push constant stages in simple shader
Lucas Fryzek (5):
- drisw: Properly mark shmid as -1 when alloc fails
- x11: Add helper util to check for xshm support
- egl/dri: Check that xshm can be attached
- glx: Check that xshm can be attached
- vulkan/wsi: Check that xshm can be attached
Luigi Santivetti (1):
- zink: fix format conversion logic for the alpha emulation case
Marek Olšák (1):
- ac: set the correct number of Z planes for ALLOW_EXPCLEAR
Mary Guillemard (1):
- vulkan: Do not override the shader_flags in case of no task shader
Mel Henning (1):
- driconf: force_vk_vendor on No Man's Sky + NVK
Mike Blumenkrantz (4):
- zink: add TRANSFER_WRITE -> HOST_READ sync to end of batch
- st/bitmap: only release YUV samplerviews
- radv: fix multiview fast clears
- egl/device: fix the fix for explicit sw rejection in non-sw EGL_PLATFORM=device
Patrick Lerda (1):
- r600: fix cs atomic operations when the shader is called multiple times
Pavel Ondračka (3):
- r300: copy target when merging alpha output instruction
- r300: disable HiZ for PIPE_FUNC_ALWAYS
- r300: disable clip-discard watermark for triangles
Pierre-Eric Pelloux-Prayer (2):
- frontends/va: fix undefined ref error
- mesa: don't wraparound st_context::work_counter
Rhys Perry (2):
- aco: perform dce for blocks skipped for process_block()
- nir/range_analysis: set deleted key
Sagar Ghuge (1):
- anv: Fix Wa_14021821874, Wa_14018813551, Wa_14026600921
Samuel Pitoiset (4):
- radv: fix copying images with different swizzle modes on SDMA7
- radv: fix a GPU hang with PS epilogs and secondary command buffers
- radv: fix local invocation index for mesh/task and quad derivatives on GFX12
- radv: fix missing L2 cache invalidation with streamout on GFX12
Tapani Pälli (2):
- intel/dev: update mesa_defs.json from workaround database
- anv: add handling for Wa_14026600921
Timothy Arceri (5):
- glsl: relax precision matching on unused uniforms ES
- glsl: add workaround for MDK2 HD
- mesa/st: use same path for setting state ref locations
- st/glsl_to_nir: update state var locations earlier
- glx: guard glx_screen frontend_screen member
Yiwei Zhang (2):
- pan: fix to not clear out of bitset range
- lvp: avoid advertising dmabuf support for kms_swrast

113
docs/relnotes/26.0.3.rst Normal file
@@ -0,0 +1,113 @@
Mesa 26.0.3 Release Notes / 2026-03-18
======================================
Mesa 26.0.3 is a bug fix release which fixes bugs found since the 26.0.2 release.
Mesa 26.0.3 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 26.0.3 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::
SHA256: ddb7443d328e89aa45b4b6b80f077bf937f099daeca8ba48cabe32aab769e134 mesa-26.0.3.tar.xz
SHA512: 82a33d0fa0c2855a63f599e38753126a2195025a13e45f38e14fda7aa008cb05925bb74088e4a1e199c9237d9388f4d4408a2c95c1d7fe79d8e6e6f27c84187b mesa-26.0.3.tar.xz
New features
------------
- None
Bug fixes
---------
- Portal hard locks the machine on rv350.
- Turnip crash with lazy depth textures: GPUMEM_BIND_RANGES failed (Not a typewriter)
- [regression] Left 4 Dead 2 crashing when joining or starting survival with "Official Dedicated" servers
- lavapipe: crash in caselist
- zink: mesh shaders broken
Changes
-------
Connor Abbott (2):
- vtn: Fix vtn_mediump_downconvert_value() for transposed matrices
- vtn: Fix vtn_mediump_upconvert_value() with transposed matrices
Danylo Piliaiev (1):
- tu/kgsl: Better detection of sparse support
David Rosca (2):
- radv/video: Fix AV1 encode min tile size
- radv/video: Fix coding pic_parameter_set_id in H264 slice header
Eric Engestrom (3):
- docs: add sha sum for 26.0.2
- .pick_status.json: Update to 70a487adfb42e3f9ed3b182a37133aed991fcf63
- .pick_status.json: Mark f2f792996dffd97092f18961b44d71b568cd8551 as denominated
Faith Ekstrand (1):
- pan/compiler: Handle store_per_view_output in collect_varyings()
Ian Douglas Scott (1):
- wsi/wayland: Use \`wl_fixes` to destroy \`wl_registry`
Mary Guillemard (1):
- nvk/mme: Add missing nullcheck in nvk_mme_test_state_state
Mike Blumenkrantz (13):
- zink: reapply zsbuf state after unordered blits
- zink: allow renderpass termination for clears with ZINK_DEBUG=rp and GENERAL layouts
- zink: run opt_combine_stores when optimizing
- nir: fix nir_is_io_compact for mesh shaders
- mesa/st: fix unlower_io_to_vars to work with mesh shaders
- zink: work around drivers with broken mesh shader properties
- llvmpipe: save mesh shader when calling u_blitter
- lavapipe: fix mesh property exports
- mesa/st: make st_texture_get_current_sampler_view static
- mesa/st/sampler_view: use a local variable for buffer sv format
- mesa/st/sampler_view: use a local variable for texture sv format
- mesa/st/sampler_view: eliminate st_sampler_view::srgb_skip_decode
- mesa/st/samplerview: explicitly block releasing in-use samplerviews
Natalie Vock (2):
- radv/rt: Bump ray query stack base limit for GFX12
- radv/rt: Fix shared ray query stack on top of application LDS
Pavel Ondračka (1):
- r300: pad short vertex shaders to avoid R3xx hangs
Rob Clark (2):
- freedreno/fdl: Use 4k alignment for tiled
- freedreno/drm: Fix bo_flush race
Ryan Zhang (1):
- panvk/csf: use DEFERRED_FLUSH for fragment job cache flush
Yiwei Zhang (1):
- venus: force prime blit on Nvidia GPU

273
docs/relnotes/26.0.4.rst Normal file
@@ -0,0 +1,273 @@
Mesa 26.0.4 Release Notes / 2026-04-01
======================================
Mesa 26.0.4 is a bug fix release which fixes bugs found since the 26.0.3 release.
Mesa 26.0.4 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 26.0.4 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::
SHA256: 6d91541e086f29bb003602d2c81070f2be4c0693a90b181ca91e46fa3953fe78 mesa-26.0.4.tar.xz
SHA512: ddb59df633116a7ccd9d2d3a2e2009945909e3f774956efcbc032a2f963641cce50d0f319bebdc041df17700aa827aa2ccbc61c9e40b4020de9ff027eab27e23 mesa-26.0.4.tar.xz
New features
------------
- None
Bug fixes
---------
- Accumulation of black squares with OpenGL applications at high resolutions (hiz-related)
- RADV: Invalid hitAttributeEXT value when using function-call RT pipelines
- Segmentation fault in gm200_validate_sample_locations with Firefox on GTX 1070 Ti (nouveau)
- Vulkan CTS regression bisected to 5d2c17a5fdce ("vtn: skip make-available/visible for shared")
- [anv] Intel ARC B390 | Horizon Forbidden West | DX12 | Flashing effects
- [radeonsi] Missing ground texture in Lethis Path of the Progress
- amdgpu reset/crash when simulating stereo camera
- building mesa_clc on ubutu-26.04 with gcc-16 fails link
- util: Build regression with MSYS2 MinGW-W64 x64 clang 21 on 26.0.0-rc3
- wsi: \`assert(chain->dxgi);` may failed under venus for win32
Changes
-------
Adam Simpkins (1):
- iris: fix a crash in disable_rb_aux_buffer
Alyssa Milburn (1):
- nv50,nvc0: Avoid uninitialized cbuf reads in blits
Alyssa Rosenzweig (1):
- nir: add nir_get_io_data_src
Dave Airlie (1):
- st/mesh: handle mesh shader point size
David Rosca (2):
- frontends/va: Fix leaking H264/5 PPS/SPS objects when decoder wasn't created
- frontends/va: Fix leaks when create_video_codec fails
Eric Engestrom (9):
- docs: add sha sum for 26.0.3
- .pick_status.json: Update to 48c086cb4203d1a8e7458e0d0a85cfffc5b4bfe5
- .pick_status.json: Mark 26b19e355fefcd6a8325924e6a391dd67a675c34 as denominated
- .pick_status.json: Mark 32a818d11d3d60ebbc23a62127e988d17e742b79 as denominated
- .pick_status.json: Mark d38916d673e6d2359e96fed45ebd83ca026dfcb5 as denominated
- .pick_status.json: Mark 384d12816459996fbac5722e9fdb29527662cafb as denominated
- ci: changing .gitlab-ci.yml itself also means the container jobs must exist
- .pick_status.json: Mark 538c3ee6c7a419d5c55bef2294ca10166f8d9af4 as denominated
- [26.0 only] venus/ci: mark a test as fixed
Eric Guo (1):
- panfrost: Fix NULL pointer dereference in panfrost_emit_images
Eric R. Smith (2):
- panfrost: fix texel buffer calculations
- panfrost: fix typos in architecture detection
Erik Faye-Lund (5):
- pan/genxml: remove non-existent YUV Enable for AFRC
- pan/lib: do not try to use stencil-aspect of color attachment
- pan/lib: set srgb-flag for afrc render-targets
- pan/lib: divide extent by tile-extend, not itself
- panvk: remove unused flag
Faith Ekstrand (4):
- nak: Report progress from nak_nir_rematerialize_load_const()
- nir: Consider if uses in nir_def_all_uses_*
- pan/bi: v2x16 conversions don't replicate
- pan/buffer: Add the offset to the size for buffer textures
Georg Lehmann (2):
- gallivm: don't optimize fadd(a, 0.0) with signed zero preserve
- nir/lower_non_uniform_access: fix fusing loops for same index but different array variable
Hyunjun Ko (1):
- anv: Add dummy workload for AV1 decode on affected platforms (Wa_1508208842)
Ian Romanick (2):
- brw/algebraic: Allow mixed types in saturate constant folding
- brw: Handle scalars and swizzles correctly in is_const_zero
Icenowy Zheng (8):
- vulkan/wsi/headless: properly use CPU images for CPU devices
- pco: fix encoding of fred's s0abs bit
- pvr: Align width for PBE write when creating linear image
- pvr: fix "obb" typo in oob_buffer_size when building vertex pds data
- pvr: save vertex attribute size for DMA checking
- pvr: move PVR_BUFFER_MEMORY_PADDING_SIZE definition to pvr_buffer.h
- pvr: consider the size of DMA request when setting msize of DDMADT
- pvr: fix dirty tracking for stencil ops
Iván Briano (2):
- anv: fix anv_is_dual_src_blend_equation
- brw: do not omit RT writes if dual_src_blend is on
Job Noorman (1):
- ir3/legalize: don't drop sync flags on removed predt/predf
Jose Maria Casanova Crespo (1):
- broadcom/common: fix V3D 7.1 TFU ICFG IFORMAT values
Juan A. Suarez Romero (1):
- vc4: fix unwanted buffer release on uploader
Lionel Landwerlin (3):
- anv: add an analysis pass to detect compute shaders clearing data
- anv: add drirc option to workaround missing application barriers on typed/untyped data
- brw: fence SLM writes between workgroups
Liviu Prodea (2):
- clc: Fix static link with clang>=22
- util: Fix use of undeclared identifier 'NULL' in src/util/os_misc.h when compiling with clang
Luigi Santivetti (2):
- pvr: expose partial usc mrt init routine
- pvr: keep compiler resources in sync with attachments
Marek Olšák (3):
- radeonsi: recompute IO bases after optimizations
- radeonsi: fix blits via util_blitter_draw_rectangle
- radeonsi: disable streamout queries for u_blitter
Mario Kleiner (1):
- dri: Fix "cosmetic" undefined behaviour warning for RGB[A]16_UNORM formats.
Mary Guillemard (5):
- nvk: Move viewport and scissor emit to their own function
- nvk: Broacast viewport0 and scissor0 in case of FSR on Turing
- nir/dead_cf: Add missing load_ssbo_ir3 handling
- nir/dead_cf: Add missing load_global_bounded handling
- nak: Do not allow load_helper_invocation reordering
Mike Blumenkrantz (3):
- ntv: always emit const coord components for fbfetch loads
- mesa/renderbuffer: always add PIPE_BIND_SAMPLER_VIEW to rendering textures
- llvmpipe: fix color fbfetch
Natalie Vock (1):
- vulkan: Bump MAX_ENCODE_PASSES
Nick Hamilton (1):
- pvr: Fix for multiple attachments being assigned to the same tile buffer.
Pavel Ondračka (5):
- r300: fix bias presubtract algebraic transformation
- r300: don't apply odd macroblock rounding to 3D textures
- r300: disable zmask clears for large surfaces
- r300: add shared HyperZ pipe-count helper
- r300: split large HiZ clears into multiple packets
Pierre-Eric Pelloux-Prayer (3):
- radeonsi: move spi_shader_*_format to si_shader_variant_info
- radeonsi: account for outputs_written when updating spi_shader_col_format
- gallium/u_blitter: add a new fs_color_clear variant
Radu Costas (1):
- pco: Amend errant nir_move_option
Rhys Perry (3):
- aco/tests: fix assembler tests with LLVM 22
- aco/tests: fix assembler/isel tests with LLVM 23
- radv: fix memory leak in radv_rt_nir_to_asm
Robert Mader (1):
- llvmpipe: Stop aligning height to raster block size for unbacked handles
Ryan Zhang (1):
- panvk: trivial fix to remove repeated assignment
Samuel Pitoiset (2):
- radv/amdgpu: free the VA range in case the BO allocation failed
- radv: emit BOP events after every draw to workaround a VRS bug on GFX12
Simon Perretta (1):
- pco: use vm/icm for tile buffer store coverage mask
Timothy Arceri (2):
- mesa: add force_explicit_uniform_loc_zero workaround
- util/driconf: add workarounds for Lethis - Path Of Progress
Valentine Burley (7):
- tu/drm/virtio: Add missing lock to virtio_bo_init_dmabuf
- tu/drm/virtio: Move set_iova into success path of virtio_bo_init_dmabuf
- tu/drm/virtio: Avoid freeing zombified tu_sparse_vma
- tu/drm/virtio: Do not free iova from heap for lazy BOs
- tu/drm/virtio: Fix GEM handle leak in tu_bo_init error path
- tu/drm/virtio: Fix GEM handle leak on failed dmabuf res_id lookup
- ci: Drop duplicate Intel shader-db run
Yiwei Zhang (3):
- venus: fix to relax the KHR_external_memory_fd requirement
- vulkan/wsi/win32: add wsi_win32_find_idle_image helper
- vulkan/wsi/win32: respect acquire timeout for sw wsi
emre (1):
- nvk: fix barrier cache invalidation
juntak0916 (1):
- nvk: fix BindImageMemory2 per-bind status result
kingstom.chen (1):
- radv/rt: only run move_rt_instructions() for CPS shaders
utzcoz (1):
- gfxstream: Fix vkSetDebugUtilsObjectNameEXT crash for unwrapped objects

177
docs/relnotes/26.0.5.rst Normal file
@@ -0,0 +1,177 @@
Mesa 26.0.5 Release Notes / 2026-04-15
======================================
Mesa 26.0.5 is a bug fix release which fixes bugs found since the 26.0.4 release.
Mesa 26.0.5 implements the OpenGL 4.6 API, but the version reported by
glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
Some drivers don't support all the features required in OpenGL 4.6. OpenGL
4.6 is **only** available if requested at context creation.
Compatibility contexts may report a lower version depending on each driver.
Mesa 26.0.5 implements the Vulkan 1.4 API, but the version reported by
the apiVersion property of the VkPhysicalDeviceProperties struct
depends on the particular driver being used.
SHA checksums
-------------
::
TBD.
New features
------------
- None
Bug fixes
---------
- Is maxFragmentCombinedOutputResources=16 in Honeykrisp reflects an actual HW limit?
- Mesa LLVMpipe Memory Leak
Changes
-------
Ahmed Hesham (1):
- rusticl: fix flag validation when creating an image
Daniel Schürmann (1):
- aco/lower_branches: Don't remove branches which jump over loops
David Rosca (1):
- radeonsi: Set multi plane format also for imported textures
Eric Engestrom (4):
- docs: add sha sum for 26.0.4
- .pick_status.json: Update to 7e163fb79377c0fdf6d4e99ca4775fa7e1a4299e
- .pick_status.json: Mark 9ff879441f91a8296891e2e13264a7a015a11a7d as denominated
- .pick_status.json: Mark 4b3bd6b0b54d998a31356bf049911004683ea64f as denominated
Eric Guo (1):
- panfrost: disable round_to_nearest_even for NEAREST samplers
Faith Ekstrand (6):
- pan/bi: Support more swizzle aliases in the bifrost pack code
- pan/bi: Delete a few instruction encodings
- pan/bi/ra: Allow offsets on tied sources
- pan/bi: Use bi_half() for texture MS indices
- pan/bi: Add BI_SWIZZLE_NONE
- pan/bi: Support all the swizzles in the packer
Georg Lehmann (2):
- nir/opt_load_skip_helpers: don't skip helpers for store_scratch data
- aco/optimizer: do not try to create 3 byte constant operands
Ian Romanick (2):
- brw/const: Don't allow type changes when accumulators are involved
- brw: brw_reg::nr for an accumulator is not part of the offset
Icenowy Zheng (2):
- pvr: fix pvr_clear_vdm_state_get_size_in_dw() inverted feature condition
- pvr: set has_usc_alu_roundingmode_rne for all B-series Rogue cores
Janne Grunau (1):
- hk: Increase maxFragmentCombinedOutputResources to HK_MAX_DESCRIPTORS
Job Noorman (4):
- nir/opt_varyings: fix alu def cloning
- nir/gather_info: clear interpolation qualifiers before gathering
- ir3: fix handle_partial_const with vectorized src
- nir/opt_uniform_subgroup: fix ballot_bit_count components
Karol Herbst (4):
- radeonsi: set valid_buffer_range for CL buffers
- radeonsi: properly report unified memory on APUs
- rusticl/kernel: implement CL_KERNEL_GLOBAL_WORK_SIZE for custom devices
- rusticl/device: Fix reporting of global memory on mixed memory devices
Konstantin Seurer (1):
- radv/bvh: Prefer selecting quads as the first pair of a HW node
Lionel Landwerlin (3):
- anv: don't relocate memory from blob
- brw: don't support frontfacing ternary optimization on != 32bit
- elk: don't support frontfacing ternary optimization on != 32bit
Marc Alcala Prieto (1):
- pan/cs: Fix cs_run_fragment() calls with swapped arguments
Mary Guillemard (2):
- nvk: Adjust maxFragmentCombinedOutputResources to match max descriptors limit
- hk: Add HK_MAX_RTS to maxFragmentCombinedOutputResources
Mixie (1):
- xlib: clear currentDpy when releasing the current context
Natalie Vock (1):
- radv/rt: Don't enable midpoint sorting
Olivia Lee (1):
- panfrost: don't try to emit varying shader stats on v12+
Pavel Ondračka (2):
- st/bitmap: release the temporary bitmap sampler view
- gallium/u_blitter: remove unused CONST declaration when using IMM
Rhys Perry (3):
- util: fix UBSan error with _mesa_bfloat16_bits_to_float
- ir3/array_to_ssa: skip remove_trivial_phi for non-array phis
- ir3/ra: fix copy-paste error
Samuel Pitoiset (3):
- spirv: fix OpUntypedVariableKHR with optional data type parameter
- radv/meta: fix computing extent for image->image with both compressed formats
- vulkan: mark RP attachments as invalid when no rendering create info
Timothy Arceri (1):
- radeonsi: add Gun Godz workaround
Valentine Burley (2):
- zink/ci: Move zink-tu-a618 to sc7180-trogdor-kingoftown
- ci/freedreno: Move remaining lazor a618 jobs, retire device type
Vinson Lee (1):
- d3d12: Fix MinGW cross-build error in resource_state_if_promoted
Wujian Sun (1):
- mesa: Fix inconsistent multisampled CopyTexImage checks
Xianzhong Li (1):
- panfrost: Fix GEM handle refcount leak in panfrost_bo_import
Yuxuan Shui (1):
- wsi/display: initialize Xlib display connector property IDs in all cases

@@ -1,32 +0,0 @@
VK_KHR_relaxed_block_layout on pvr
VK_KHR_storage_buffer_storage_class on pvr
VK_EXT_external_memory_acquire_unmodified on panvk
VK_EXT_discard_rectangles on NVK
VK_KHR_present_id on HoneyKrisp
VK_KHR_present_id2 on HoneyKrisp
VK_KHR_present_wait on HoneyKrisp
VK_KHR_present_wait2 on HoneyKrisp
VK_KHR_maintenance10 on ANV, NVK, RADV
VK_EXT_shader_uniform_buffer_unsized_array on ANV, HK, NVK, RADV
VK_EXT_device_memory_report on panvk
VK_VALVE_video_encode_rgb_conversion on radv
VK_EXT_custom_resolve on RADV
GL_EXT_shader_pixel_local_storage on Panfrost v6+
VK_EXT_image_drm_format_modifier on panvk/v7
VK_KHR_sampler_ycbcr_conversion on panvk/v7
sparseResidencyImage2D on panvk v10+
sparseResidencyStandard2DBlockShape on panvk v10+
VK_KHR_surface_maintenance1 promotion everywhere EXT is exposed
VK_KHR_swapchain_maintenance1 promotion everywhere EXT is exposed
VK_KHR_dynamic_rendering on PowerVR
VK_EXT_multisampled_render_to_single_sampled on panvk
VK_KHR_pipeline_binary on HoneyKrisp
VK_KHR_incremental_present on pvr
VK_KHR_xcb_surface on pvr
VK_KHR_xlib_surface on pvr
VK_KHR_robustness2 on panvk v10+
VK_KHR_robustness2 on HoneyKrisp
VK_KHR_robustness2 on hasvk
VK_KHR_robustness2 on NVK
VK_KHR_robustness2 on Turnip
VK_KHR_robustness2 on lavapipe

@@ -197,6 +197,9 @@ following example::
 This will backport the commit to the 21.0 branch, as well as any more recent
 stable branch. Multiple ``Backport-to:`` lines are allowed, but only the
 lowest number mentioned actually matters, so for clarity, please only use one.
+You can also use the special ``Backport-to: *`` which will nominate the commit
+to be backported to every active stable branch, making it a synonym to the ``Cc:
+mesa-stable`` below.
 The last option is deprecated and mostly here for historical reasons
 dating back to when patch submission was done via emails: using a ``Cc:``
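The nomination rule described in this hunk can be sketched in a few lines. This is a hypothetical illustration and not the actual pick-ui implementation. The function name `backport_targets`, the regular expression, and the assumption that branch names are dotted numbers such as `26.0` are all invented for the example.

```python
import re

# Hypothetical trailer pattern; one "Backport-to:" trailer per line.
TRAILER = re.compile(r"^Backport-to:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def _version_key(branch):
    """Turn "26.0" into (26, 0) so branches compare numerically."""
    return tuple(int(part) for part in branch.split("."))


def backport_targets(commit_message, active_branches):
    """Return the active stable branches a commit is nominated for.

    Only the lowest version mentioned matters: the commit goes to that
    branch and every more recent one. "Backport-to: *" nominates the
    commit for every active stable branch.
    """
    targets = TRAILER.findall(commit_message)
    if not targets:
        return []
    if "*" in targets:
        return list(active_branches)
    lowest = min(_version_key(target) for target in targets)
    return [b for b in active_branches if _version_key(b) >= lowest]
```

With active branches `["25.2", "25.3", "26.0"]`, a message carrying `Backport-to: 26.0` maps to `["26.0"]`, while `Backport-to: *` maps to all three.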

@@ -642,7 +642,7 @@ if with_dri
 endif
 dep_dxheaders = null_dep
-if with_gallium_d3d12 or with_microsoft_clc or with_microsoft_vk or with_gfxstream_vk and host_machine.system() == 'windows'
+if with_gallium_d3d12 or with_microsoft_clc or with_microsoft_vk or (with_any_vk and host_machine.system() == 'windows')
 dep_dxheaders = dependency('directx-headers', required : false)
 if not dep_dxheaders.found()
 dep_dxheaders = dependency('DirectX-Headers',
@@ -1931,7 +1931,6 @@ dep_spirv_tools = dependency(
 'SPIRV-Tools',
 required : with_spirv_tools,
 version : '>= 2024.1',
-static : host_machine.system() == 'darwin',
 )
 if dep_spirv_tools.found()
 pre_args += '-DHAVE_SPIRV_TOOLS'
@@ -1959,6 +1958,9 @@ if with_clc
 if dep_llvm.version().version_compare('>= 18.0')
 clang_modules += 'clangAPINotes'
 endif
+if dep_llvm.version().version_compare('>= 22.0')
+clang_modules += ['clangAnalysisLifetimeSafety', 'clangOptions']
+endif
 dep_clang = []
 foreach m : clang_modules

View file

@ -401,7 +401,6 @@ spec@egl 1.4@eglterminate then unbind context,Fail
spec@egl_khr_surfaceless_context@viewport,Fail spec@egl_khr_surfaceless_context@viewport,Fail
spec@egl_mesa_configless_context@basic,Fail spec@egl_mesa_configless_context@basic,Fail
spec@ext_external_objects@vk-ping-pong-single-sem,Crash spec@ext_external_objects@vk-ping-pong-single-sem,Crash
spec@glsl-es-1.00@linker@glsl-mismatched-uniform-precision-unused,Fail
spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail
spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail
spec@khr_texture_compression_astc@miptree-gles srgb-fp,Fail spec@khr_texture_compression_astc@miptree-gles srgb-fp,Fail

View file

@ -46,7 +46,6 @@ api@clgetdeviceinfo,Fail
api@clgetextensionfunctionaddressforplatform,Fail api@clgetextensionfunctionaddressforplatform,Fail
api@clgetkernelarginfo,Fail api@clgetkernelarginfo,Fail
api@cllinkprogram,Fail api@cllinkprogram,Fail
custom@r600 create release buffer bug,Fail
program@build@vector-data-types,Fail program@build@vector-data-types,Fail
program@execute@builtin@builtin-float-nextafter-1.0.generated,Fail program@execute@builtin@builtin-float-nextafter-1.0.generated,Fail
program@execute@builtin@builtin-float-nextafter-1.0.generated@nextafter float1,Fail program@execute@builtin@builtin-float-nextafter-1.0.generated@nextafter float1,Fail
@ -71,5 +70,4 @@ program@run kernel with max work item sizes,Fail
# uprev Piglit in Mesa # uprev Piglit in Mesa
spec@ext_external_objects@vk-semaphores,Crash spec@ext_external_objects@vk-semaphores,Crash
spec@ext_external_objects@vk-semaphores-2,Crash spec@ext_external_objects@vk-semaphores-2,Crash
spec@glsl-es-1.00@linker@glsl-mismatched-uniform-precision-unused,Fail

View file

@ -121,7 +121,6 @@ spec@ext_texture_srgb@texwrap formats-s3tc bordercolor-swizzled@GL_COMPRESSED_SR
spec@ext_texture_srgb@texwrap formats-s3tc bordercolor-swizzled@GL_COMPRESSED_SRGB_S3TC_DXT1_EXT- swizzled- border color only,Fail spec@ext_texture_srgb@texwrap formats-s3tc bordercolor-swizzled@GL_COMPRESSED_SRGB_S3TC_DXT1_EXT- swizzled- border color only,Fail
spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip_adjacency ffs,Fail spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip_adjacency ffs,Fail
spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip_adjacency other,Fail spec@glsl-1.50@execution@geometry@tri-strip-ordering-with-prim-restart gl_triangle_strip_adjacency other,Fail
spec@glsl-es-1.00@linker@glsl-mismatched-uniform-precision-unused,Fail
spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail
spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail
spec@khr_texture_compression_astc@miptree-gl srgb-fp,Fail spec@khr_texture_compression_astc@miptree-gl srgb-fp,Fail

View file

@ -14,7 +14,6 @@ spec@egl_khr_surfaceless_context@viewport,Fail
spec@ext_external_objects@vk-image-display,Crash spec@ext_external_objects@vk-image-display,Crash
spec@ext_external_objects@vk-semaphores,Crash spec@ext_external_objects@vk-semaphores,Crash
spec@ext_external_objects@vk-semaphores-2,Crash spec@ext_external_objects@vk-semaphores-2,Crash
spec@glsl-es-1.00@linker@glsl-mismatched-uniform-precision-unused,Fail
spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@fs-packhalf2x16,Fail
spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail spec@glsl-es-3.00@execution@built-in-functions@vs-packhalf2x16,Fail
spec@khr_texture_compression_astc@miptree-gles srgb-fp,Fail spec@khr_texture_compression_astc@miptree-gles srgb-fp,Fail

View file

@ -222,9 +222,11 @@ static uint32_t
ac_sdma_get_tiled_info_dword(const struct radeon_info *info, ac_sdma_get_tiled_info_dword(const struct radeon_info *info,
const struct ac_sdma_surf_tiled *tiled) const struct ac_sdma_surf_tiled *tiled)
{ {
const uint32_t swizzle_mode = tiled->surf->has_stencil ? tiled->surf->u.gfx9.zs.stencil_swizzle_mode const uint32_t swizzle_mode =
tiled->is_stencil ? tiled->surf->u.gfx9.zs.stencil_swizzle_mode
: tiled->surf->u.gfx9.swizzle_mode; : tiled->surf->u.gfx9.swizzle_mode;
const uint16_t epitch = tiled->surf->has_stencil ? tiled->surf->u.gfx9.zs.stencil_epitch const uint16_t epitch =
tiled->is_stencil ? tiled->surf->u.gfx9.zs.stencil_epitch
: tiled->surf->u.gfx9.epitch; : tiled->surf->u.gfx9.epitch;
const enum gfx9_resource_type dimension = const enum gfx9_resource_type dimension =
ac_sdma_get_tiled_resource_dim(info->sdma_ip_version, tiled); ac_sdma_get_tiled_resource_dim(info->sdma_ip_version, tiled);

View file

@ -61,6 +61,7 @@ struct ac_sdma_surf_tiled {
uint64_t va; uint64_t va;
enum pipe_format format; enum pipe_format format;
uint32_t bpp; uint32_t bpp;
bool is_stencil;
struct { struct {
uint32_t x; uint32_t x;

View file

@ -1055,8 +1055,15 @@ ac_init_ds_surface(const struct radeon_info *info, const struct ac_ds_state *sta
static unsigned static unsigned
ac_get_decompress_on_z_planes(const struct radeon_info *info, enum pipe_format format, uint8_t log_num_samples, ac_get_decompress_on_z_planes(const struct radeon_info *info, enum pipe_format format, uint8_t log_num_samples,
bool htile_stencil_disabled, bool no_d16_compression) bool tc_compat_htile_enabled, bool htile_stencil_disabled, bool no_d16_compression,
bool z_allow_expclear)
{ {
if (info->gfx_level < GFX8)
return 0;
if (!tc_compat_htile_enabled)
return z_allow_expclear ? 15 : 0;
uint32_t max_zplanes = 0; uint32_t max_zplanes = 0;
if (info->gfx_level >= GFX9) { if (info->gfx_level >= GFX9) {
@ -1073,6 +1080,7 @@ ac_get_decompress_on_z_planes(const struct radeon_info *info, enum pipe_format f
max_zplanes = 1; max_zplanes = 1;
max_zplanes++; max_zplanes++;
assert(max_zplanes != 1); /* 1 is invalid and can cause corruption on gfx11-11.5 */
} else { } else {
if (format == PIPE_FORMAT_Z16_UNORM && no_d16_compression) { if (format == PIPE_FORMAT_Z16_UNORM && no_d16_compression) {
/* Do not enable Z plane compression for 16-bit depth /* Do not enable Z plane compression for 16-bit depth
@ -1093,6 +1101,7 @@ ac_get_decompress_on_z_planes(const struct radeon_info *info, enum pipe_format f
} }
} }
assert(max_zplanes != 10 && max_zplanes != 13); /* disallowed values */
return max_zplanes; return max_zplanes;
} }
@ -1115,14 +1124,18 @@ ac_set_mutable_ds_surface_fields(const struct radeon_info *info, const struct ac
log_num_samples = G_028040_NUM_SAMPLES(ds->db_z_info); log_num_samples = G_028040_NUM_SAMPLES(ds->db_z_info);
} }
bool z_allow_expclear = info->gfx_level <= GFX11_5 &&
G_028038_ALLOW_EXPCLEAR(ds->db_z_info);
const uint32_t max_zplanes = const uint32_t max_zplanes =
ac_get_decompress_on_z_planes(info, state->format, log_num_samples, ac_get_decompress_on_z_planes(info, state->format, log_num_samples,
tile_stencil_disable, state->no_d16_compression); state->tc_compat_htile_enabled, tile_stencil_disable,
state->no_d16_compression, z_allow_expclear);
if (info->gfx_level >= GFX9) { if (info->gfx_level >= GFX9) {
if (state->tc_compat_htile_enabled) {
ds->db_z_info |= S_028038_DECOMPRESS_ON_N_ZPLANES(max_zplanes); ds->db_z_info |= S_028038_DECOMPRESS_ON_N_ZPLANES(max_zplanes);
if (state->tc_compat_htile_enabled) {
if (info->gfx_level >= GFX10) { if (info->gfx_level >= GFX10) {
const bool iterate256 = log_num_samples >= 1; const bool iterate256 = log_num_samples >= 1;
@ -1138,12 +1151,13 @@ ac_set_mutable_ds_surface_fields(const struct radeon_info *info, const struct ac
ds->db_z_info |= S_028038_ZRANGE_PRECISION(state->zrange_precision); ds->db_z_info |= S_028038_ZRANGE_PRECISION(state->zrange_precision);
} else { } else {
if (state->tc_compat_htile_enabled) { if (info->gfx_level >= GFX8)
ds->u.gfx6.db_htile_surface |= S_028ABC_TC_COMPATIBLE(1);
ds->db_z_info |= S_028040_DECOMPRESS_ON_N_ZPLANES(max_zplanes); ds->db_z_info |= S_028040_DECOMPRESS_ON_N_ZPLANES(max_zplanes);
} else {
if (state->tc_compat_htile_enabled)
ds->u.gfx6.db_htile_surface |= S_028ABC_TC_COMPATIBLE(1);
else
ds->u.gfx6.db_depth_info |= S_02803C_ADDR5_SWIZZLE_MASK(1); ds->u.gfx6.db_depth_info |= S_02803C_ADDR5_SWIZZLE_MASK(1);
}
ds->db_z_info |= S_028040_ZRANGE_PRECISION(state->zrange_precision); ds->db_z_info |= S_028040_ZRANGE_PRECISION(state->zrange_precision);
} }

View file

@ -1096,6 +1096,13 @@ ac_query_gpu_info(int fd, void *dev_p, struct radeon_info *info,
info->family == CHIP_NAVI22 || info->family == CHIP_NAVI22 ||
info->family == CHIP_VANGOGH; info->family == CHIP_VANGOGH;
/* GFX12 is affected by random GPU hangs when VRS rates are exported by the
* last VGT stage under some conditions that are unclear. One possible
* workaround is to emit BOP events after every draw that exports VRS
* rates.
*/
info->has_vrs_export_bug = info->gfx_level == GFX12;
/* HW bug workaround when CS threadgroups > 256 threads and async compute /* HW bug workaround when CS threadgroups > 256 threads and async compute
* isn't used, i.e. only one compute job can run at a time. If async * isn't used, i.e. only one compute job can run at a time. If async
* compute is possible, the threadgroup size must be limited to 256 threads * compute is possible, the threadgroup size must be limited to 256 threads

View file

@ -229,6 +229,7 @@ struct radeon_info {
bool has_attr_ring_wait_bug; bool has_attr_ring_wait_bug;
bool cp_dma_supports_sparse; bool cp_dma_supports_sparse;
bool has_vrs_ds_export_bug; bool has_vrs_ds_export_bug;
bool has_vrs_export_bug;
bool has_taskmesh_indirect0_bug; bool has_taskmesh_indirect0_bug;
bool sdma_supports_sparse; /* Whether SDMA can safely access sparse resources. */ bool sdma_supports_sparse; /* Whether SDMA can safely access sparse resources. */
bool sdma_supports_compression; /* Whether SDMA supports DCC and HTILE. */ bool sdma_supports_compression; /* Whether SDMA supports DCC and HTILE. */

View file

@ -49,6 +49,8 @@ ac_sqtt_get_data_va(const struct radeon_info *rad_info, const struct ac_sqtt *da
void void
ac_sqtt_init(struct ac_sqtt *data) ac_sqtt_init(struct ac_sqtt *data)
{ {
simple_mtx_init(&data->lock, mtx_plain);
list_inithead(&data->rgp_pso_correlation.record); list_inithead(&data->rgp_pso_correlation.record);
simple_mtx_init(&data->rgp_pso_correlation.lock, mtx_plain); simple_mtx_init(&data->rgp_pso_correlation.lock, mtx_plain);
@ -71,6 +73,8 @@ ac_sqtt_init(struct ac_sqtt *data)
void void
ac_sqtt_finish(struct ac_sqtt *data) ac_sqtt_finish(struct ac_sqtt *data)
{ {
simple_mtx_destroy(&data->lock);
assert(data->rgp_pso_correlation.record_count == 0); assert(data->rgp_pso_correlation.record_count == 0);
simple_mtx_destroy(&data->rgp_pso_correlation.lock); simple_mtx_destroy(&data->rgp_pso_correlation.lock);

View file

@ -15,6 +15,7 @@
#include "ac_pm4.h" #include "ac_pm4.h"
#include "ac_rgp.h" #include "ac_rgp.h"
#include "amd_family.h" #include "amd_family.h"
#include "util/simple_mtx.h"
#define SQTT_BUFFER_ALIGN_SHIFT 12 #define SQTT_BUFFER_ALIGN_SHIFT 12
@ -61,6 +62,8 @@ struct ac_sqtt {
struct rgp_clock_calibration rgp_clock_calibration; struct rgp_clock_calibration rgp_clock_calibration;
struct hash_table_u64 *pipeline_bos; struct hash_table_u64 *pipeline_bos;
simple_mtx_t lock;
}; };
struct ac_sqtt_data_info { struct ac_sqtt_data_info {

View file

@ -443,10 +443,14 @@ emit_ps_color_export(nir_builder *b, lower_ps_state *s, unsigned output_index, u
} }
} }
s->exp[s->exp_num++] = nir_export_amd(b, nir_vec(b, outputs, 4), nir_intrinsic_instr *exp = nir_export_amd(b, nir_vec(b, outputs, 4),
.base = target, .base = target,
.write_mask = write_mask,
.flags = flags); .flags = flags);
/* Set the writemask explicitly because write_mask=0 means full write mask. */
nir_intrinsic_set_write_mask(exp, write_mask);
s->exp[s->exp_num++] = exp;
return true; return true;
} }
@ -483,7 +487,7 @@ emit_ps_dual_src_blend_swizzle(nir_builder *b, lower_ps_state *s, unsigned first
uint32_t mrt0_write_mask = nir_intrinsic_write_mask(mrt0_exp); uint32_t mrt0_write_mask = nir_intrinsic_write_mask(mrt0_exp);
uint32_t mrt1_write_mask = nir_intrinsic_write_mask(mrt1_exp); uint32_t mrt1_write_mask = nir_intrinsic_write_mask(mrt1_exp);
uint32_t write_mask = mrt0_write_mask & mrt1_write_mask; uint32_t write_mask = mrt0_write_mask | mrt1_write_mask;
nir_def *mrt0_arg = mrt0_exp->src[0].ssa; nir_def *mrt0_arg = mrt0_exp->src[0].ssa;
nir_def *mrt1_arg = mrt1_exp->src[0].ssa; nir_def *mrt1_arg = mrt1_exp->src[0].ssa;
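Note on the `mrt0_write_mask & mrt1_write_mask` to `mrt0_write_mask | mrt1_write_mask` change above: a minimal sketch of the bit arithmetic only, assuming 4-bit per-component masks with bit 0 = x through bit 3 = w. The helper names `combine_and` and `combine_or` are hypothetical and not part of Mesa. Intersecting the two masks drops any channel that only one of the exports writes, whereas the union keeps every channel written by either one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not Mesa API. Masks are 4 bits wide: bit 0 = x ... bit 3 = w.
constexpr uint32_t combine_and(uint32_t mrt0, uint32_t mrt1) { return mrt0 & mrt1; }
constexpr uint32_t combine_or(uint32_t mrt0, uint32_t mrt1) { return mrt0 | mrt1; }

// Example inputs: MRT0 writes xyz, MRT1 writes yzw.
constexpr uint32_t kMrt0Mask = 0x7;
constexpr uint32_t kMrt1Mask = 0xE;

// The intersection keeps only yz, losing x (MRT0 only) and w (MRT1 only).
static_assert(combine_and(kMrt0Mask, kMrt1Mask) == 0x6, "intersection keeps yz only");
// The union keeps every channel that either export writes.
static_assert(combine_or(kMrt0Mask, kMrt1Mask) == 0xF, "union keeps xyzw");
```

The checks are compile-time `static_assert`s, so the snippet verifies itself as soon as it is compiled.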

View file

@ -216,6 +216,11 @@ the correct layout is:
VOP2 `v_pk_fmac_f16`. But like all other packed math opcodes, DPP does not function in practice. VOP2 `v_pk_fmac_f16`. But like all other packed math opcodes, DPP does not function in practice.
RDNA1 and RDNA2 support `v_pk_fmac_f16_dpp`. RDNA1 and RDNA2 support `v_pk_fmac_f16_dpp`.
## DPP with integer `subrev` and shifts
No documentation mentions this, but DPP is seemingly applied to src1 instead of src0 for
integer reverse subtract and shift opcodes.
## ds_swizzle_b32 rotate/fft modes ## ds_swizzle_b32 rotate/fft modes
These are first mentioned in the GFX9 (Vega) ISA doc, information from the LLVM bug tracker These are first mentioned in the GFX9 (Vega) ISA doc, information from the LLVM bug tracker

View file

@ -1867,6 +1867,8 @@ resolve_all_gfx11(State& state, NOP_ctx_gfx11& ctx,
ctx.vgpr_used_by_vmem_bvh.any()) { ctx.vgpr_used_by_vmem_bvh.any()) {
waitcnt_depctr &= 0xffe3; waitcnt_depctr &= 0xffe3;
ctx.vgpr_used_by_vmem_load.reset(); ctx.vgpr_used_by_vmem_load.reset();
ctx.vgpr_used_by_vmem_sample.reset();
ctx.vgpr_used_by_vmem_bvh.reset();
ctx.vgpr_used_by_vmem_store.reset(); ctx.vgpr_used_by_vmem_store.reset();
ctx.vgpr_used_by_ds.reset(); ctx.vgpr_used_by_ds.reset();
} }
@ -1912,7 +1914,9 @@ handle_block(Program* program, Ctx& ctx, Block& block)
Handle(state, ctx, instr, block.instructions); Handle(state, ctx, instr, block.instructions);
/* Resolve all possible hazards (we don't know what s_setpc_b64 jumps to). */ /* Resolve all possible hazards (we don't know what s_setpc_b64 jumps to). */
if (instr->opcode == aco_opcode::s_setpc_b64) { if (instr->opcode == aco_opcode::s_setpc_b64 || instr->opcode == aco_opcode::s_swappc_b64 ||
instr->opcode == aco_opcode::s_call_b64) {
found_end |= instr->opcode == aco_opcode::s_setpc_b64;
block.instructions.emplace_back(std::move(instr)); block.instructions.emplace_back(std::move(instr));
std::vector<aco_ptr<Instruction>> resolve_instrs; std::vector<aco_ptr<Instruction>> resolve_instrs;
@ -1920,8 +1924,6 @@ handle_block(Program* program, Ctx& ctx, Block& block)
block.instructions.insert(std::prev(block.instructions.end()), block.instructions.insert(std::prev(block.instructions.end()),
std::move_iterator(resolve_instrs.begin()), std::move_iterator(resolve_instrs.begin()),
std::move_iterator(resolve_instrs.end())); std::move_iterator(resolve_instrs.end()));
found_end = true;
continue; continue;
} }

View file

@ -484,10 +484,17 @@ process_instructions(exec_ctx& ctx, Block* block, std::vector<aco_ptr<Instructio
Operand exit_cond = Operand(exec, bld.lm); Operand exit_cond = Operand(exec, bld.lm);
if (state == Exact) { if (state == Exact) {
assert(info.exec.size() == 1); bld.sop2(Builder::s_andn2, Definition(exec, bld.lm), bld.def(s1, scc),
bld.sop2(Builder::s_andn2, Definition(exec, bld.lm), bld.def(s1, scc), info.exec[0].op, info.exec.back().op, src);
src); info.exec.back().op = Operand(exec, bld.lm);
info.exec[0].op = Operand(exec, bld.lm);
/* Although this is in uniform CF, it might be a loop without back-edge.
* Update the loop restore mask as well.
*/
for (unsigned i = 0; i < info.exec.size() - 1; i++) {
assert(info.exec[i + 1].type & mask_type_loop);
info.exec[i].op = bld.copy(bld.def(bld.lm), Operand(exec, bld.lm));
}
} else { } else {
Temp cond = bld.tmp(s1); Temp cond = bld.tmp(s1);
info.exec[0].op = bld.sop2(Builder::s_andn2, bld.def(bld.lm), Definition(cond, scc), info.exec[0].op = bld.sop2(Builder::s_andn2, bld.def(bld.lm), Definition(cond, scc),

View file

@ -233,9 +233,6 @@ instr_ignores_round_mode(const Instruction* instr)
case aco_opcode::v_rndne_f64: case aco_opcode::v_rndne_f64:
case aco_opcode::v_rndne_f32: case aco_opcode::v_rndne_f32:
case aco_opcode::v_rndne_f16: case aco_opcode::v_rndne_f16:
case aco_opcode::v_fract_f64:
case aco_opcode::v_fract_f32:
case aco_opcode::v_fract_f16:
case aco_opcode::s_min_f32: case aco_opcode::s_min_f32:
case aco_opcode::s_min_f16: case aco_opcode::s_min_f16:
case aco_opcode::s_max_f32: case aco_opcode::s_max_f32:
@ -454,8 +451,7 @@ emit_set_mode_block(fp_mode_ctx* ctx, Block* block)
for (uint32_t pred : block->linear_preds) for (uint32_t pred : block->linear_preds)
max_pred = MAX2(max_pred, pred); max_pred = MAX2(max_pred, pred);
assert(max_pred != 0); if (max_pred >= block->index) {
mode_mask to_set = 0; mode_mask to_set = 0;
/* Check if the any mode was changed during the loop. */ /* Check if the any mode was changed during the loop. */
u_foreach_bit (i, fp_state.required) { u_foreach_bit (i, fp_state.required) {
@ -465,6 +461,7 @@ emit_set_mode_block(fp_mode_ctx* ctx, Block* block)
if (to_set) if (to_set)
set_mode(ctx, block, fp_state, 0, to_set); set_mode(ctx, block, fp_state, 0, to_set);
} }
}
ctx->block_states[block->index] = fp_state; ctx->block_states[block->index] = fp_state;
} }
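Note on the `max_pred >= block->index` guard above, which replaces the old `assert(max_pred != 0)`: a minimal sketch of the idea as the hunk appears to use it, under the assumption that blocks are numbered in program order, so a predecessor whose index is not smaller than the block's own index can only arrive through a back-edge. `ToyBlock` and `has_back_edge` are hypothetical stand-ins, not ACO types or functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a CFG block; this is not ACO's Block type.
struct ToyBlock {
   uint32_t index;
   std::vector<uint32_t> linear_preds;
};

// Assumption: blocks are numbered in program order. Under that assumption a
// predecessor with index >= block.index reaches the block via a back-edge.
// For a non-empty predecessor list this is the same test as comparing the
// maximum predecessor index against block.index.
inline bool has_back_edge(const ToyBlock& block)
{
   return std::any_of(block.linear_preds.begin(), block.linear_preds.end(),
                      [&](uint32_t pred) { return pred >= block.index; });
}
```

With this shape, a loop header that has no back-edge simply skips the per-loop mode check instead of tripping an assertion.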

View file

@ -391,6 +391,65 @@ convert_to_SDWA(amd_gfx_level gfx_level, aco_ptr<Instruction>& instr)
return tmp; return tmp;
} }
bool
opcode_supports_dpp(amd_gfx_level gfx_level, aco_opcode opcode, bool vop3p)
{
switch (opcode) {
/* reverse integer subtract and shift seem to apply dpp to src1 instead of src0 */
case aco_opcode::v_subrev_co_u32:
case aco_opcode::v_subrev_co_u32_e64:
case aco_opcode::v_subbrev_co_u32:
case aco_opcode::v_subrev_u16:
case aco_opcode::v_subrev_u32:
case aco_opcode::v_ashrrev_i32:
case aco_opcode::v_lshrrev_b32:
case aco_opcode::v_lshlrev_b32:
case aco_opcode::v_ashrrev_i16:
case aco_opcode::v_lshrrev_b16:
case aco_opcode::v_lshlrev_b16:
case aco_opcode::v_ashrrev_i16_e64:
case aco_opcode::v_lshrrev_b16_e64:
case aco_opcode::v_lshlrev_b16_e64: return false;
case aco_opcode::v_pk_fmac_f16: return gfx_level < GFX11;
/* there are more cases but those all take 64-bit inputs */
case aco_opcode::v_madmk_f32:
case aco_opcode::v_madak_f32:
case aco_opcode::v_madmk_f16:
case aco_opcode::v_madak_f16:
case aco_opcode::v_fmamk_f32:
case aco_opcode::v_fmaak_f32:
case aco_opcode::v_fmamk_f16:
case aco_opcode::v_fmaak_f16:
case aco_opcode::v_readfirstlane_b32:
case aco_opcode::v_cvt_f64_i32:
case aco_opcode::v_cvt_f64_f32:
case aco_opcode::v_cvt_f64_u32:
case aco_opcode::v_mul_lo_u32:
case aco_opcode::v_mul_lo_i32:
case aco_opcode::v_mul_hi_u32:
case aco_opcode::v_mul_hi_i32:
case aco_opcode::v_qsad_pk_u16_u8:
case aco_opcode::v_mqsad_pk_u16_u8:
case aco_opcode::v_mqsad_u32_u8:
case aco_opcode::v_mad_u64_u32:
case aco_opcode::v_mad_i64_i32:
case aco_opcode::v_permlane16_b32:
case aco_opcode::v_permlanex16_b32:
case aco_opcode::v_permlane64_b32:
case aco_opcode::v_readlane_b32_e64:
case aco_opcode::v_writelane_b32_e64: return false;
/* simpler than listing all VOP3P opcodes which do not support DPP */
case aco_opcode::v_fma_mix_f32:
case aco_opcode::v_fma_mixlo_f16:
case aco_opcode::v_fma_mixhi_f16:
case aco_opcode::p_v_fma_mixlo_f16_rtz:
case aco_opcode::p_v_fma_mixhi_f16_rtz:
case aco_opcode::v_dot2_f32_f16:
case aco_opcode::v_dot2_f32_bf16: return gfx_level >= GFX11;
default: return !vop3p;
}
}
bool bool
can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp8) can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp8)
{ {
@ -433,41 +492,7 @@ can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp
if (instr->writes_exec()) if (instr->writes_exec())
return false; return false;
/* simpler than listing all VOP3P opcodes which do not support DPP */ return opcode_supports_dpp(gfx_level, instr->opcode, instr->isVOP3P());
if (instr->isVOP3P()) {
return instr->opcode == aco_opcode::v_fma_mix_f32 ||
instr->opcode == aco_opcode::v_fma_mixlo_f16 ||
instr->opcode == aco_opcode::v_fma_mixhi_f16 ||
instr->opcode == aco_opcode::p_v_fma_mixlo_f16_rtz ||
instr->opcode == aco_opcode::p_v_fma_mixhi_f16_rtz ||
instr->opcode == aco_opcode::v_dot2_f32_f16 ||
instr->opcode == aco_opcode::v_dot2_f32_bf16;
}
if (instr->opcode == aco_opcode::v_pk_fmac_f16)
return gfx_level < GFX11;
/* there are more cases but those all take 64-bit inputs */
return instr->opcode != aco_opcode::v_madmk_f32 && instr->opcode != aco_opcode::v_madak_f32 &&
instr->opcode != aco_opcode::v_madmk_f16 && instr->opcode != aco_opcode::v_madak_f16 &&
instr->opcode != aco_opcode::v_fmamk_f32 && instr->opcode != aco_opcode::v_fmaak_f32 &&
instr->opcode != aco_opcode::v_fmamk_f16 && instr->opcode != aco_opcode::v_fmaak_f16 &&
instr->opcode != aco_opcode::v_readfirstlane_b32 &&
instr->opcode != aco_opcode::v_cvt_f64_i32 &&
instr->opcode != aco_opcode::v_cvt_f64_f32 &&
instr->opcode != aco_opcode::v_cvt_f64_u32 && instr->opcode != aco_opcode::v_mul_lo_u32 &&
instr->opcode != aco_opcode::v_mul_lo_i32 && instr->opcode != aco_opcode::v_mul_hi_u32 &&
instr->opcode != aco_opcode::v_mul_hi_i32 &&
instr->opcode != aco_opcode::v_qsad_pk_u16_u8 &&
instr->opcode != aco_opcode::v_mqsad_pk_u16_u8 &&
instr->opcode != aco_opcode::v_mqsad_u32_u8 &&
instr->opcode != aco_opcode::v_mad_u64_u32 &&
instr->opcode != aco_opcode::v_mad_i64_i32 &&
instr->opcode != aco_opcode::v_permlane16_b32 &&
instr->opcode != aco_opcode::v_permlanex16_b32 &&
instr->opcode != aco_opcode::v_permlane64_b32 &&
instr->opcode != aco_opcode::v_readlane_b32_e64 &&
instr->opcode != aco_opcode::v_writelane_b32_e64;
} }
aco_ptr<Instruction> aco_ptr<Instruction>
@ -889,7 +914,9 @@ needs_exec_mask(const Instruction* instr)
if (instr->isSALU() || instr->isBranch() || instr->isSMEM() || instr->isBarrier()) if (instr->isSALU() || instr->isBranch() || instr->isSMEM() || instr->isBarrier())
return instr->opcode == aco_opcode::s_cbranch_execz || return instr->opcode == aco_opcode::s_cbranch_execz ||
instr->opcode == aco_opcode::s_cbranch_execnz || instr->opcode == aco_opcode::s_cbranch_execnz ||
instr->opcode == aco_opcode::s_setpc_b64 || instr->reads_exec(); instr->opcode == aco_opcode::s_setpc_b64 ||
instr->opcode == aco_opcode::s_swappc_b64 || instr->opcode == aco_opcode::s_call_b64 ||
instr->reads_exec();
if (instr->isPseudo()) { if (instr->isPseudo()) {
switch (instr->opcode) { switch (instr->opcode) {
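Note on the `opcode_supports_dpp()` refactor earlier in this file: the chained `!=` comparisons become a single `switch` with a deny-list, two generation-dependent groups, and a `default` that rejects any remaining VOP3P opcode. The sketch below shows only that control-flow shape. The opcode and generation enums are toy stand-ins invented for illustration; the real lists are the ones in the hunk.

```cpp
#include <cassert>

// Toy opcode set for illustration only; the real list lives in aco_opcode.
enum class ToyOp { add_f32, subrev_u32, lshlrev_b32, pk_fmac_f16, fma_mix_f32, pk_add_f16 };

// Toy generation enum; only the relative ordering matters here.
enum ToyGfx { TOY_GFX10 = 10, TOY_GFX11 = 11 };

inline bool toy_supports_dpp(ToyGfx gfx, ToyOp op, bool vop3p)
{
   switch (op) {
   // Deny-list: per the hunk's comment, DPP seems to apply to src1 here.
   case ToyOp::subrev_u32:
   case ToyOp::lshlrev_b32: return false;
   // Supported only before the newer generation.
   case ToyOp::pk_fmac_f16: return gfx < TOY_GFX11;
   // Allow-listed VOP3P opcode, supported only from the newer generation on.
   case ToyOp::fma_mix_f32: return gfx >= TOY_GFX11;
   // Everything else: fine unless it is a VOP3P opcode.
   default: return !vop3p;
   }
}
```

Keeping the opcode-only decision in its own function also lets callers that have no `Instruction` object at hand ask the same question, which matches the new declaration added to the header.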

View file

@ -2040,6 +2040,8 @@ bool can_use_opsel(amd_gfx_level gfx_level, aco_opcode op, int idx);
bool instr_is_16bit(amd_gfx_level gfx_level, aco_opcode op); bool instr_is_16bit(amd_gfx_level gfx_level, aco_opcode op);
uint8_t get_gfx11_true16_mask(aco_opcode op); uint8_t get_gfx11_true16_mask(aco_opcode op);
bool can_use_SDWA(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool pre_ra); bool can_use_SDWA(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool pre_ra);
bool opcode_supports_dpp(amd_gfx_level gfx_level, aco_opcode opcode, bool vop3p);
bool can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp8);
bool can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp8); bool can_use_DPP(amd_gfx_level gfx_level, const aco_ptr<Instruction>& instr, bool dpp8);
bool can_write_m0(const aco_ptr<Instruction>& instr); bool can_write_m0(const aco_ptr<Instruction>& instr);
/* updates "instr" and returns the old instruction (or NULL if no update was needed) */ /* updates "instr" and returns the old instruction (or NULL if no update was needed) */

View file

@ -298,7 +298,9 @@ eliminate_useless_exec_writes_in_block(branch_ctx& ctx, Block& block)
/* blocks_incoming_exec_used is initialized to true, so this is correct even for loops. */ /* blocks_incoming_exec_used is initialized to true, so this is correct even for loops. */
if (instr->opcode == aco_opcode::s_cbranch_scc0 || if (instr->opcode == aco_opcode::s_cbranch_scc0 ||
instr->opcode == aco_opcode::s_cbranch_scc1) { instr->opcode == aco_opcode::s_cbranch_scc1 ||
instr->opcode == aco_opcode::s_cbranch_vccz ||
instr->opcode == aco_opcode::s_cbranch_vccnz) {
exec_write_used |= ctx.blocks_incoming_exec_used[instr->salu().imm]; exec_write_used |= ctx.blocks_incoming_exec_used[instr->salu().imm];
} }
@ -377,6 +379,10 @@ can_remove_branch(branch_ctx& ctx, Block& block, Pseudo_branch_instruction* bran
if (uniform_branch && !ctx.program->blocks[i].instructions.empty()) if (uniform_branch && !ctx.program->blocks[i].instructions.empty())
return false; return false;
/* Don't enter loops with empty exec mask. */
if (ctx.program->blocks[i].loop_nest_depth > block.loop_nest_depth)
return false;
for (aco_ptr<Instruction>& instr : ctx.program->blocks[i].instructions) { for (aco_ptr<Instruction>& instr : ctx.program->blocks[i].instructions) {
if (instr->isSOPP()) { if (instr->isSOPP()) {
/* Discard early exits and loop breaks and continues should work fine with /* Discard early exits and loop breaks and continues should work fine with

View file

@ -22,7 +22,13 @@ enum aco_nir_function_attribs {
}; };
enum aco_nir_parameter_attribs { enum aco_nir_parameter_attribs {
/* Parameter value is not used by any callee and does not need to be preserved */ /* This parameter's value may not be preserved across a callee. Unlike return parameters, the
* parameter's value is undefined on return. Callers must back up values of discardable
* parameters separately.
* Mostly used for tail calls, where parameters to the tail callee have different values than
* for the caller. In that case, on function return, the parameters will have been overwritten
* with the tail callee parameter values.
*/
ACO_NIR_PARAM_ATTRIB_DISCARDABLE = 0x1, ACO_NIR_PARAM_ATTRIB_DISCARDABLE = 0x1,
}; };

View file

@ -427,6 +427,21 @@ process_block(vn_ctx& ctx, Block& block)
block.instructions = std::move(new_instructions); block.instructions = std::move(new_instructions);
} }
void
dce_instructions(vn_ctx& ctx, Block& block)
{
std::vector<aco_ptr<Instruction>> new_instructions;
new_instructions.reserve(block.instructions.size());
for (aco_ptr<Instruction>& instr : block.instructions) {
if (is_dead(ctx.uses, instr.get()))
continue;
new_instructions.emplace_back(std::move(instr));
}
block.instructions = std::move(new_instructions);
}
void void
rename_phi_operands(Block& block, aco::unordered_map<uint32_t, Temp>& renames) rename_phi_operands(Block& block, aco::unordered_map<uint32_t, Temp>& renames)
{ {
@ -467,10 +482,12 @@ value_numbering(Program* program)
if (block.logical_idom == (int)block.index) if (block.logical_idom == (int)block.index)
ctx.expr_values.clear(); ctx.expr_values.clear();
if (block.logical_idom != -1) if (block.logical_idom != -1) {
process_block(ctx, block); process_block(ctx, block);
else } else {
dce_instructions(ctx, block);
rename_phi_operands(block, ctx.renames); rename_phi_operands(block, ctx.renames);
}
/* increment exec_id when entering nested control flow */ /* increment exec_id when entering nested control flow */
if (block.kind & block_kind_branch || block.kind & block_kind_loop_preheader || if (block.kind & block_kind_branch || block.kind & block_kind_loop_preheader ||

View file

@ -1190,7 +1190,7 @@ alu_opt_gather_info(opt_ctx& ctx, Instruction* instr, alu_opt_info& info)
info.operands.push_back({instr->operands[0]}); info.operands.push_back({instr->operands[0]});
if (instr->definitions[0].regClass() == s1) { if (instr->definitions[0].regClass() == s1) {
info.defs.push_back(instr->definitions[1]); info.defs.push_back(instr->definitions[1]);
info.opcode = aco_opcode::v_lshl_b32; info.opcode = aco_opcode::s_lshl_b32;
info.format = Format::SOP2; info.format = Format::SOP2;
std::swap(info.operands[0], info.operands[1]); std::swap(info.operands[0], info.operands[1]);
} else { } else {
@ -1759,6 +1759,8 @@ pseudo_can_accept_constant(const aco_ptr<Instruction>& instr, unsigned operand)
assert(instr->operands.size() > operand); assert(instr->operands.size() > operand);
if (instr->operands[operand].isFixed()) if (instr->operands[operand].isFixed())
return false; return false;
if (!util_is_power_of_two_nonzero(instr->operands[operand].bytes()))
return false;
switch (instr->opcode) { switch (instr->opcode) {
case aco_opcode::p_extract_vector: case aco_opcode::p_extract_vector:
@ -2810,7 +2812,8 @@ label_instruction(opt_ctx& ctx, aco_ptr<Instruction>& instr)
instr->operands[0] = op; instr->operands[0] = op;
break; break;
} }
} else if (info.is_constant()) { } else if (info.is_constant() &&
util_is_power_of_two_nonzero(instr->definitions[0].bytes())) {
/* propagate constants */ /* propagate constants */
uint64_t mask = u_bit_consecutive64(0, instr->definitions[0].bytes() * 8u); uint64_t mask = u_bit_consecutive64(0, instr->definitions[0].bytes() * 8u);
uint64_t val = (info.val >> (dst_offset * 8u)) & mask; uint64_t val = (info.val >> (dst_offset * 8u)) & mask;
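Note on the constant-propagation guard above: the two context lines shift the known 64-bit constant by the destination's byte offset and mask it to the destination's size, and the new condition only does so when that size is a non-zero power of two. A minimal sketch of that arithmetic follows. `is_pow2_nonzero`, `low_bits_mask` and `extract_constant` are hypothetical helpers written out for this example; they merely mimic what `util_is_power_of_two_nonzero()` and `u_bit_consecutive64()` are used for in the hunk, and the extra `bytes <= 8` and `dst_offset < 8` limits are assumptions of the sketch, not something the diff states.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a "non-zero power of two" test, written out so the sketch is self-contained.
constexpr bool is_pow2_nonzero(unsigned v) { return v != 0 && (v & (v - 1)) == 0; }

// Mask with the low `bits` bits set; handles bits == 64 without an undefined shift.
constexpr uint64_t low_bits_mask(unsigned bits)
{
   return bits >= 64 ? ~0ull : ((1ull << bits) - 1);
}

// Extracts `bytes` bytes starting at byte `dst_offset` from `value`.
// Returns false (leaving *out untouched) for sizes such as 3 or 6 bytes.
inline bool extract_constant(uint64_t value, unsigned dst_offset, unsigned bytes, uint64_t* out)
{
   if (!is_pow2_nonzero(bytes) || bytes > 8 || dst_offset >= 8)
      return false;
   *out = (value >> (dst_offset * 8u)) & low_bits_mask(bytes * 8u);
   return true;
}
```

For example, extracting 2 bytes at byte offset 2 from `0x1122334455667788` yields `0x5566`, while a 3-byte request is rejected.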

View file

@ -142,6 +142,10 @@ save_reg_writes(pr_opt_ctx& ctx, aco_ptr<Instruction>& instr)
ctx.instr_idx_by_regs[ctx.current_block->index][instr->pseudo().scratch_sgpr] = ctx.instr_idx_by_regs[ctx.current_block->index][instr->pseudo().scratch_sgpr] =
overwritten_unknown_instr; overwritten_unknown_instr;
} }
if (instr->isCall()) {
std::fill(ctx.instr_idx_by_regs[ctx.current_block->index].begin(),
ctx.instr_idx_by_regs[ctx.current_block->index].end(), overwritten_unknown_instr);
}
} }
Idx Idx
@ -862,6 +866,8 @@ instr_overwrites(Instruction* instr, PhysReg reg, unsigned size)
if (scratch_reg >= reg && reg + size > scratch_reg) if (scratch_reg >= reg && reg + size > scratch_reg)
return true; return true;
} }
if (instr->isCall())
return true;
return false; return false;
} }

View file

@ -672,7 +672,7 @@ build_end_with_regs(isel_context* ctx, std::vector<Operand>& regs)
Instruction* Instruction*
add_startpgm(struct isel_context* ctx, bool is_callee) add_startpgm(struct isel_context* ctx, bool is_callee)
{ {
ctx->program->scratch_arg_size += ctx->callee_info.scratch_param_size; ctx->program->scratch_arg_size += ctx->callee_info.scratch_param_size * ctx->program->wave_size;
unsigned def_count = 0; unsigned def_count = 0;
for (unsigned i = 0; i < ctx->args->arg_count; i++) { for (unsigned i = 0; i < ctx->args->arg_count; i++) {
@ -1034,8 +1034,7 @@ find_param_regs(Program* program, const ABI& abi, callee_info& info,
param_demand += Temp(0, it2->rc); param_demand += Temp(0, it2->rc);
it2->dst_info->needs_explicit_preservation = it2->dst_info->needs_explicit_preservation = regs == clobbered_regs;
regs == clobbered_regs && !it2->dst_info->discardable;
it2->dst_info->def.setPrecolored(*next_reg); it2->dst_info->def.setPrecolored(*next_reg);
for (unsigned i = 0; i < it2->rc.size(); ++i) for (unsigned i = 0; i < it2->rc.size(); ++i)
BITSET_CLEAR(regs, next_reg->reg() + i); BITSET_CLEAR(regs, next_reg->reg() + i);
@ -1051,8 +1050,7 @@ find_param_regs(Program* program, const ABI& abi, callee_info& info,
next_reg = next_reg->advance(required_padding * 4); next_reg = next_reg->advance(required_padding * 4);
} }
if (next_reg) { if (next_reg) {
params.back().dst_info->needs_explicit_preservation = params.back().dst_info->needs_explicit_preservation = regs == clobbered_regs;
regs == clobbered_regs && !params.back().dst_info->discardable;
param_demand += Temp(0, params.back().rc); param_demand += Temp(0, params.back().rc);
params.back().dst_info->def.setPrecolored(*next_reg); params.back().dst_info->def.setPrecolored(*next_reg);
BITSET_CLEAR_COUNT(regs, next_reg->reg(), params.back().rc.size()); BITSET_CLEAR_COUNT(regs, next_reg->reg(), params.back().rc.size());

@ -3392,7 +3392,10 @@ visit_store_scratch(isel_context* ctx, nir_intrinsic_instr* instr)
offset = as_vgpr(ctx, offset); offset = as_vgpr(ctx, offset);
for (unsigned i = 0; i < write_count; i++) { for (unsigned i = 0; i < write_count; i++) {
aco_opcode op = get_buffer_store_op(write_datas[i].bytes()); aco_opcode op = get_buffer_store_op(write_datas[i].bytes());
Instruction* mubuf = bld.mubuf(op, rsrc, offset, ctx->program->scratch_offsets.back(), Operand soffset = Operand::c32(0);
if (!ctx->program->scratch_offsets.empty())
soffset = Operand(ctx->program->scratch_offsets.back());
Instruction* mubuf = bld.mubuf(op, rsrc, offset, soffset,
write_datas[i], offsets[i], true); write_datas[i], offsets[i], true);
mubuf->mubuf().sync = memory_sync_info(storage_scratch, semantic_private); mubuf->mubuf().sync = memory_sync_info(storage_scratch, semantic_private);
enum ac_access_type type = enum ac_access_type type =

@ -298,6 +298,10 @@ BEGIN_TEST(assembler.long_jump.constaddr)
if (!setup_cs(NULL, (amd_gfx_level)GFX10)) if (!setup_cs(NULL, (amd_gfx_level)GFX10))
return; return;
//! llvm_version: #llvm_ver
fprintf(output, "llvm_version: %u\n", LLVM_VERSION_MAJOR);
//; funcs['lit'] = lambda v: 'lit(%s)' % hex(int(v)) if llvm_ver >= 22 else v
//>> s_branch 16369 ; bf823ff1 //>> s_branch 16369 ; bf823ff1
bld.sopp(aco_opcode::s_branch, 2); bld.sopp(aco_opcode::s_branch, 2);
@ -309,7 +313,7 @@ BEGIN_TEST(assembler.long_jump.constaddr)
bld.reset(program->create_and_insert_block()); bld.reset(program->create_and_insert_block());
//>> s_getpc_b64 s[0:1] ; be801f00 //>> s_getpc_b64 s[0:1] ; be801f00
//! s_add_u32 s0, s0, 32 ; 8000ff00 00000020 //! s_add_u32 s0, s0, @lit(32) ; 8000ff00 00000020
bld.sop1(aco_opcode::p_constaddr_getpc, Definition(PhysReg(0), s2), Operand::zero()); bld.sop1(aco_opcode::p_constaddr_getpc, Definition(PhysReg(0), s2), Operand::zero());
bld.sop2(aco_opcode::p_constaddr_addlo, Definition(PhysReg(0), s1), bld.def(s1, scc), bld.sop2(aco_opcode::p_constaddr_addlo, Definition(PhysReg(0), s1), bld.def(s1, scc),
Operand(PhysReg(0), s1), Operand::zero(), Operand::zero()); Operand(PhysReg(0), s1), Operand::zero(), Operand::zero());
@ -424,12 +428,16 @@ BEGIN_TEST(assembler.p_constaddr)
dst0.setFixed(PhysReg(0)); dst0.setFixed(PhysReg(0));
dst1.setFixed(PhysReg(2)); dst1.setFixed(PhysReg(2));
//! llvm_version: #llvm_ver
fprintf(output, "llvm_version: %u\n", LLVM_VERSION_MAJOR);
//; funcs['lit'] = lambda v: 'lit(%s)' % hex(int(v)) if llvm_ver >= 22 else v
//>> s_getpc_b64 s[0:1] ; be801c00 //>> s_getpc_b64 s[0:1] ; be801c00
//! s_add_u32 s0, s0, 44 ; 8000ff00 0000002c //! s_add_u32 s0, s0, @lit(44) ; 8000ff00 0000002c
bld.pseudo(aco_opcode::p_constaddr, dst0, bld.def(s1, scc), Operand::zero()); bld.pseudo(aco_opcode::p_constaddr, dst0, bld.def(s1, scc), Operand::zero());
//! s_getpc_b64 s[2:3] ; be821c00 //! s_getpc_b64 s[2:3] ; be821c00
//! s_add_u32 s2, s2, 64 ; 8002ff02 00000040 //! s_add_u32 s2, s2, @lit(64) ; 8002ff02 00000040
bld.pseudo(aco_opcode::p_constaddr, dst1, bld.def(s1, scc), Operand::c32(32)); bld.pseudo(aco_opcode::p_constaddr, dst1, bld.def(s1, scc), Operand::c32(32));
aco::lower_to_hw_instr(program.get()); aco::lower_to_hw_instr(program.get());
@ -1056,20 +1064,23 @@ BEGIN_TEST(assembler.exp)
Operand op_m0(bld.tmp(s1)); Operand op_m0(bld.tmp(s1));
op_m0.setFixed(m0); op_m0.setFixed(m0);
//~gfx11>> exp mrt3 v1, v0, v3, v2 ; f800003f 02030001 //! mrt3: @match_func(mrt3)
//~gfx12>> export mrt3 v1, v0, v3, v2 ; f800003f 02030001 fprintf(output, "mrt3: mrt3%s\n", LLVM_VERSION_MAJOR >= 23 ? "," : "");
//~gfx11>> exp @mrt3 v1, v0, v3, v2 ; f800003f 02030001
//~gfx12>> export @mrt3 v1, v0, v3, v2 ; f800003f 02030001
bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], 0xf, 3); bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], 0xf, 3);
//~gfx11! exp mrt3 v1, off, v0, off ; f8000035 80008001 //~gfx11! exp @mrt3 v1, off, v0, off ; f8000035 80008001
//~gfx12! export mrt3 v1, off, v0, off ; f8000035 80008001 //~gfx12! export @mrt3 v1, off, v0, off ; f8000035 80008001
bld.exp(aco_opcode::exp, op[1], Operand(v1), op[0], Operand(v1), 0x5, 3); bld.exp(aco_opcode::exp, op[1], Operand(v1), op[0], Operand(v1), 0x5, 3);
//~gfx11! exp mrt3 v1, v0, v3, v2 done ; f800083f 02030001 //~gfx11! exp @mrt3 v1, v0, v3, v2 done ; f800083f 02030001
//~gfx12! export mrt3 v1, v0, v3, v2 done ; f800083f 02030001 //~gfx12! export @mrt3 v1, v0, v3, v2 done ; f800083f 02030001
bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], 0xf, 3, false, true); bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], 0xf, 3, false, true);
//~gfx11! exp mrt3 v1, v0, v3, v2 row_en ; f800203f 02030001 //~gfx11! exp @mrt3 v1, v0, v3, v2 row_en ; f800203f 02030001
//~gfx12! export mrt3 v1, v0, v3, v2 row_en ; f800203f 02030001 //~gfx12! export @mrt3 v1, v0, v3, v2 row_en ; f800203f 02030001
bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], op_m0, 0xf, 3)->exp().row_en = true; bld.exp(aco_opcode::exp, op[1], op[0], op[3], op[2], op_m0, 0xf, 3)->exp().row_en = true;
finish_assembler_test(); finish_assembler_test();

@ -172,11 +172,14 @@ BEGIN_TEST(isel.discard_early_exit.mrtz)
} }
); );
//! mrtz: @match_func(mrtz)
fprintf(output, "mrtz: mrtz%s\n", LLVM_VERSION_MAJOR >= 23 ? "," : "");
/* On GFX11, the discard early exit must use mrtz if the shader exports only depth. */ /* On GFX11, the discard early exit must use mrtz if the shader exports only depth. */
//>> exp mrtz v#_, off, off, off done ; $_ $_ //>> exp @mrtz v#_, off, off, off done ; $_ $_
//! s_endpgm ; $_ //! s_endpgm ; $_
//! BB1: //! BB1:
//! exp mrtz off, off, off, off done ; $_ $_ //! exp @mrtz off, off, off, off done ; $_ $_
//! s_endpgm ; $_ //! s_endpgm ; $_
PipelineBuilder pbld(get_vk_device(GFX11)); PipelineBuilder pbld(get_vk_device(GFX11));
@ -197,11 +200,14 @@ BEGIN_TEST(isel.discard_early_exit.mrt0)
} }
); );
//! mrt0: @match_func(mrt0)
fprintf(output, "mrt0: mrt0%s\n", LLVM_VERSION_MAJOR >= 23 ? "," : "");
/* On GFX11, the discard early exit must use mrt0 if the shader exports color. */ /* On GFX11, the discard early exit must use mrt0 if the shader exports color. */
//>> exp mrt0 v#x, v#x, v#x, v#x done ; $_ $_ //>> exp @mrt0 v#x, v#x, v#x, v#x done ; $_ $_
//! s_endpgm ; $_ //! s_endpgm ; $_
//! BB1: //! BB1:
//! exp mrt0 off, off, off, off done ; $_ $_ //! exp @mrt0 off, off, off, off done ; $_ $_
//! s_endpgm ; $_ //! s_endpgm ; $_
PipelineBuilder pbld(get_vk_device(GFX11)); PipelineBuilder pbld(get_vk_device(GFX11));

@ -145,7 +145,7 @@ main()
ir_id_to_offset(children[i]))).aabb; ir_id_to_offset(children[i]))).aabb;
float surface_area = aabb_surface_area(bounds); float surface_area = aabb_surface_area(bounds);
if (surface_area > largest_surface_area) { if (surface_area > largest_surface_area || collapsed_child_index == -1) {
largest_surface_area = surface_area; largest_surface_area = surface_area;
collapsed_child_index = i; collapsed_child_index = i;
} }

@ -328,9 +328,23 @@ main()
vertex_used[i] = false; vertex_used[i] = false;
} }
} else { } else {
uint32_t chosen_invocation = uint32_t candidate_mask = radv_ballot(cluster, !assigned && required_bit_size == min_required_bit_size);
findMSB(radv_ballot(cluster, !assigned && required_bit_size == min_required_bit_size));
if (cluster.invocation_index != chosen_invocation && !assigned) { /* Always choose a quad as the first node to make sure that a potential single triangle node will have the
* highest hw_node_index.
*/
if (assigned_mask == 0) {
uint32_t quad_mask = radv_ballot(cluster, !assigned && pair_index_node_index1 != RADV_BVH_INVALID_NODE);
if (quad_mask != 0) {
uint32_t combined_mask = candidate_mask & quad_mask;
if (combined_mask != 0)
candidate_mask = combined_mask;
else
candidate_mask = quad_mask;
}
}
if (cluster.invocation_index != findMSB(candidate_mask) && !assigned) {
vertex_indices = UNASSIGNED_VERTEX_INDICES; vertex_indices = UNASSIGNED_VERTEX_INDICES;
for (uint32_t i = 0; i < 6; i++) for (uint32_t i = 0; i < 6; i++)
vertex_used[i] = false; vertex_used[i] = false;
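The selection rule introduced in this hunk can be sketched on the CPU as plain mask arithmetic. This is a minimal illustration only, not the shader code: `find_msb` and `choose_invocation` are hypothetical helpers, the three masks stand in for the `radv_ballot` results and `assigned_mask`, and the sketch assumes the final candidate mask is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the most significant set bit. The mask must be non-zero
 * (GLSL findMSB would return -1 for zero; this sketch does not model that). */
static int find_msb(uint32_t mask)
{
   assert(mask != 0);
   int index = 31;
   while (!(mask & (1u << index)))
      index--;
   return index;
}

/* Hypothetical helper: returns the invocation that gets to emit the next node.
 * candidate_mask: unassigned invocations with the minimum required bit size.
 * quad_mask:      unassigned invocations that form a quad (have a pair node).
 * assigned_mask:  invocations already assigned; zero means this is the first node. */
static int choose_invocation(uint32_t candidate_mask, uint32_t quad_mask, uint32_t assigned_mask)
{
   /* For the first node, prefer a quad so that a leftover single triangle
    * ends up with the highest hw_node_index. */
   if (assigned_mask == 0 && quad_mask != 0) {
      uint32_t combined_mask = candidate_mask & quad_mask;
      candidate_mask = combined_mask != 0 ? combined_mask : quad_mask;
   }
   return find_msb(candidate_mask);
}
```

With candidates `0x8` and quads `0x3`, the first pick falls back to the quad mask and returns invocation 1. Once any invocation is assigned, the same inputs return invocation 3.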

@ -778,8 +778,11 @@ sqtt_QueueSubmit2(VkQueue _queue, uint32_t submitCount, const VkSubmitInfo2 *pSu
if (queue->sqtt_present) if (queue->sqtt_present)
return radv_sqtt_wsi_submit(_queue, submitCount, pSubmits, _fence); return radv_sqtt_wsi_submit(_queue, submitCount, pSubmits, _fence);
if (instance->vk.trace_per_submit) if (instance->vk.trace_per_submit) {
/* Make sure to lock in case of multithreaded submissions. */
simple_mtx_lock(&device->sqtt.lock);
radv_sqtt_start_capturing(queue); radv_sqtt_start_capturing(queue);
}
for (uint32_t i = 0; i < submitCount; i++) { for (uint32_t i = 0; i < submitCount; i++) {
const VkSubmitInfo2 *pSubmit = &pSubmits[i]; const VkSubmitInfo2 *pSubmit = &pSubmits[i];
@ -863,12 +866,17 @@ sqtt_QueueSubmit2(VkQueue _queue, uint32_t submitCount, const VkSubmitInfo2 *pSu
"radv: Failed to capture RGP for this submit because the buffer is too small and auto-resizing " "radv: Failed to capture RGP for this submit because the buffer is too small and auto-resizing "
"is disabled. See RADV_THREAD_TRACE_BUFFER_SIZE for increasing the size.\n"); "is disabled. See RADV_THREAD_TRACE_BUFFER_SIZE for increasing the size.\n");
} }
simple_mtx_unlock(&device->sqtt.lock);
} }
return result; return result;
fail: fail:
FREE(new_cmdbufs); FREE(new_cmdbufs);
if (instance->vk.trace_per_submit) {
simple_mtx_unlock(&device->sqtt.lock);
}
return result; return result;
} }

@ -0,0 +1,31 @@
/*
* Copyright © 2026 Valve Corporation
*
* SPDX-License-Identifier: MIT
*/
#include "radv_cmd_buffer.h"
#include "radv_device.h"
#include "radv_entrypoints.h"
VKAPI_ATTR void VKAPI_CALL
strange_brigade_CmdPipelineBarrier2(VkCommandBuffer commandBuffer, const VkDependencyInfo *pDependencyInfo)
{
VK_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
struct radv_device *device = radv_cmd_buffer_device(cmd_buffer);
for (uint32_t i = 0; i < pDependencyInfo->imageMemoryBarrierCount; i++) {
VkImageMemoryBarrier2 *barrier = (VkImageMemoryBarrier2 *)&pDependencyInfo->pImageMemoryBarriers[i];
if (barrier->newLayout == VK_IMAGE_LAYOUT_PRESENT_SRC_KHR &&
barrier->srcAccessMask == VK_ACCESS_COLOR_ATTACHMENT_READ_BIT) {
/* This game has a broken barrier right before present that causes rendering issues. Fix it
* by modifying the src access mask.
*/
barrier->srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
break;
}
}
device->layer_dispatch.app.CmdPipelineBarrier2(commandBuffer, pDependencyInfo);
}

@ -22,6 +22,7 @@ radv_entrypoints_gen_command += [
'--device-prefix', 'rage2', '--device-prefix', 'rage2',
'--device-prefix', 'quantic_dream', '--device-prefix', 'quantic_dream',
'--device-prefix', 'no_mans_sky', '--device-prefix', 'no_mans_sky',
'--device-prefix', 'strange_brigade',
# Command buffer annotation layer entrypoints # Command buffer annotation layer entrypoints
'--device-prefix', 'annotate', '--device-prefix', 'annotate',
@ -42,6 +43,7 @@ libradv_files = files(
'layers/radv_rage2.c', 'layers/radv_rage2.c',
'layers/radv_quantic_dream.c', 'layers/radv_quantic_dream.c',
'layers/radv_no_mans_sky.c', 'layers/radv_no_mans_sky.c',
'layers/radv_strange_brigade.c',
'layers/radv_rmv_layer.c', 'layers/radv_rmv_layer.c',
'layers/radv_rra_layer.c', 'layers/radv_rra_layer.c',
'layers/radv_sqtt_layer.c', 'layers/radv_sqtt_layer.c',

@ -97,6 +97,7 @@ enum radv_meta_object_key_type {
RADV_META_OBJECT_KEY_CLEAR_HIZ, RADV_META_OBJECT_KEY_CLEAR_HIZ,
RADV_META_OBJECT_KEY_FAST_CLEAR_ELIMINATE, RADV_META_OBJECT_KEY_FAST_CLEAR_ELIMINATE,
RADV_META_OBJECT_KEY_DCC_DECOMPRESS, RADV_META_OBJECT_KEY_DCC_DECOMPRESS,
RADV_META_OBJECT_KEY_DCC_DECOMPRESS_CS,
RADV_META_OBJECT_KEY_DCC_RETILE, RADV_META_OBJECT_KEY_DCC_RETILE,
RADV_META_OBJECT_KEY_HTILE_EXPAND_GFX, RADV_META_OBJECT_KEY_HTILE_EXPAND_GFX,
RADV_META_OBJECT_KEY_HTILE_EXPAND_CS, RADV_META_OBJECT_KEY_HTILE_EXPAND_CS,

@ -1475,7 +1475,8 @@ radv_can_fast_clear_color(struct radv_cmd_buffer *cmd_buffer, const struct radv_
static void static void
radv_fast_clear_color(struct radv_cmd_buffer *cmd_buffer, const struct radv_image_view *iview, radv_fast_clear_color(struct radv_cmd_buffer *cmd_buffer, const struct radv_image_view *iview,
const VkClearAttachment *clear_att, const VkClearRect *clear_rect, const VkClearAttachment *clear_att, const VkClearRect *clear_rect,
enum radv_cmd_flush_bits *pre_flush, enum radv_cmd_flush_bits *post_flush) enum radv_cmd_flush_bits *pre_flush, enum radv_cmd_flush_bits *post_flush,
uint32_t view_mask)
{ {
struct radv_device *device = radv_cmd_buffer_device(cmd_buffer); struct radv_device *device = radv_cmd_buffer_device(cmd_buffer);
const struct radv_physical_device *pdev = radv_device_physical(device); const struct radv_physical_device *pdev = radv_device_physical(device);
@ -1488,7 +1489,8 @@ radv_fast_clear_color(struct radv_cmd_buffer *cmd_buffer, const struct radv_imag
.baseMipLevel = iview->vk.base_mip_level, .baseMipLevel = iview->vk.base_mip_level,
.levelCount = iview->vk.level_count, .levelCount = iview->vk.level_count,
.baseArrayLayer = iview->vk.base_array_layer + clear_rect->baseArrayLayer, .baseArrayLayer = iview->vk.base_array_layer + clear_rect->baseArrayLayer,
.layerCount = clear_rect->layerCount, /* radv_can_fast_clear_color blocks multiview fast clears unless the viewmask contains all layers */
.layerCount = view_mask ? iview->vk.layer_count : clear_rect->layerCount,
}; };
if (pre_flush) { if (pre_flush) {
@ -1575,7 +1577,7 @@ emit_clear(struct radv_cmd_buffer *cmd_buffer, const VkClearAttachment *clear_at
if (radv_can_fast_clear_color(cmd_buffer, color_att->iview, color_att->layout, clear_rect, clear_value, if (radv_can_fast_clear_color(cmd_buffer, color_att->iview, color_att->layout, clear_rect, clear_value,
view_mask)) { view_mask)) {
radv_fast_clear_color(cmd_buffer, color_att->iview, clear_att, clear_rect, pre_flush, post_flush); radv_fast_clear_color(cmd_buffer, color_att->iview, clear_att, clear_rect, pre_flush, post_flush, view_mask);
} else { } else {
emit_color_clear(cmd_buffer, clear_att, clear_rect, view_mask); emit_color_clear(cmd_buffer, clear_att, clear_rect, view_mask);
} }
@ -1877,7 +1879,7 @@ radv_fast_clear_range(struct radv_cmd_buffer *cmd_buffer, struct radv_image *ima
if (vk_format_is_color(format)) { if (vk_format_is_color(format)) {
if (radv_can_fast_clear_color(cmd_buffer, &iview, image_layout, &clear_rect, clear_att.clearValue.color, 0)) { if (radv_can_fast_clear_color(cmd_buffer, &iview, image_layout, &clear_rect, clear_att.clearValue.color, 0)) {
radv_fast_clear_color(cmd_buffer, &iview, &clear_att, &clear_rect, NULL, NULL); radv_fast_clear_color(cmd_buffer, &iview, &clear_att, &clear_rect, NULL, NULL, 0);
fast_cleared = true; fast_cleared = true;
} }
} else { } else {

@ -144,6 +144,40 @@ gfx_or_compute_copy_memory_to_image(struct radv_cmd_buffer *cmd_buffer, uint64_t
(use_compute ? RADV_META_SAVE_COMPUTE_PIPELINE : RADV_META_SAVE_GRAPHICS_PIPELINE) | (use_compute ? RADV_META_SAVE_COMPUTE_PIPELINE : RADV_META_SAVE_GRAPHICS_PIPELINE) |
RADV_META_SAVE_CONSTANTS | RADV_META_SAVE_DESCRIPTORS); RADV_META_SAVE_CONSTANTS | RADV_META_SAVE_DESCRIPTORS);
if (use_compute) {
/* For partial copies, HTILE must be decompressed beforehand because image stores don't write
* the uncompressed DWORD to HTILE. HTILE is then re-initialized to its uncompressed state
* after the copy.
*/
const bool is_partial_copy = region->imageOffset.x || region->imageOffset.y || region->imageOffset.z ||
region->imageExtent.width != image->vk.extent.width ||
region->imageExtent.height != image->vk.extent.height ||
region->imageExtent.depth != image->vk.extent.depth;
uint32_t queue_mask = radv_image_queue_family_mask(image, cmd_buffer->qf, cmd_buffer->qf);
if (radv_layout_is_htile_compressed(device, image, region->imageSubresource.mipLevel, layout, queue_mask) &&
is_partial_copy) {
radv_describe_barrier_start(cmd_buffer, RGP_BARRIER_UNKNOWN_REASON);
u_foreach_bit (i, region->imageSubresource.aspectMask) {
unsigned aspect_mask = 1u << i;
radv_expand_depth_stencil(
cmd_buffer, image,
&(VkImageSubresourceRange){
.aspectMask = aspect_mask,
.baseMipLevel = region->imageSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = region->imageSubresource.baseArrayLayer,
.layerCount = vk_image_subresource_layer_count(&image->vk, &region->imageSubresource),
},
NULL);
}
radv_describe_barrier_end(cmd_buffer);
}
}
/** /**
* From the Vulkan 1.0.6 spec: 18.3 Copying Data Between Images * From the Vulkan 1.0.6 spec: 18.3 Copying Data Between Images
* extent is the size in texels of the source image to copy in width, * extent is the size in texels of the source image to copy in width,
@ -222,6 +256,27 @@ gfx_or_compute_copy_memory_to_image(struct radv_cmd_buffer *cmd_buffer, uint64_t
slice_array++; slice_array++;
} }
if (use_compute) {
/* Fixup HTILE after a copy on compute. */
uint32_t queue_mask = radv_image_queue_family_mask(image, cmd_buffer->qf, cmd_buffer->qf);
if (radv_layout_is_htile_compressed(device, image, region->imageSubresource.mipLevel, layout, queue_mask)) {
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_CS_PARTIAL_FLUSH | RADV_CMD_FLAG_INV_VCACHE;
VkImageSubresourceRange range = {
.aspectMask = region->imageSubresource.aspectMask,
.baseMipLevel = region->imageSubresource.mipLevel,
.levelCount = 1,
.baseArrayLayer = region->imageSubresource.baseArrayLayer,
.layerCount = vk_image_subresource_layer_count(&image->vk, &region->imageSubresource),
};
uint32_t htile_value = radv_get_htile_initial_value(device, image);
cmd_buffer->state.flush_bits |= radv_clear_htile(cmd_buffer, image, &range, htile_value, false);
}
}
radv_meta_restore(&saved_state, cmd_buffer); radv_meta_restore(&saved_state, cmd_buffer);
} }
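The `is_partial_copy` predicate used above can be read in isolation as follows. This is a sketch under assumptions: `offset3d`, `extent3d` and the free function `is_partial_copy` are stand-ins for `VkOffset3D`, `VkExtent3D` and the inline expression in the hunk, not names from the source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for VkOffset3D / VkExtent3D (hypothetical names for this sketch). */
struct offset3d { int32_t x, y, z; };
struct extent3d { uint32_t width, height, depth; };

/* A copy is partial when it does not start at the origin or does not cover
 * the full image extent in every dimension. */
static bool is_partial_copy(struct offset3d offset, struct extent3d copy, struct extent3d image)
{
   return offset.x || offset.y || offset.z ||
          copy.width != image.width ||
          copy.height != image.height ||
          copy.depth != image.depth;
}
```

Only copies for which this predicate holds take the extra HTILE expand step before the compute copy. The HTILE re-initialization after the copy runs for every compute copy into an HTILE-compressed layout.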
@ -704,7 +759,14 @@ radv_CmdCopyImage2(VkCommandBuffer commandBuffer, const VkCopyImageInfo2 *pCopyI
const enum util_format_layout format_layout = radv_format_description(dst_image->vk.format)->layout; const enum util_format_layout format_layout = radv_format_description(dst_image->vk.format)->layout;
for (unsigned r = 0; r < pCopyImageInfo->regionCount; r++) { for (unsigned r = 0; r < pCopyImageInfo->regionCount; r++) {
VkExtent3D dst_extent = pCopyImageInfo->pRegions[r].extent; VkExtent3D dst_extent = pCopyImageInfo->pRegions[r].extent;
if (src_image->vk.format != dst_image->vk.format) {
/* The Vulkan spec 1.4.347 says:
*
* "VUID-VkCopyImageInfo2-srcImage-09247
* If the VkFormat of each of srcImage and dstImage is a compressed image format, the
* formats must have the same texel block extent"
*/
if (vk_format_is_compressed(src_image->vk.format) != vk_format_is_compressed(dst_image->vk.format)) {
dst_extent.width = dst_extent.width / vk_format_get_blockwidth(src_image->vk.format) * dst_extent.width = dst_extent.width / vk_format_get_blockwidth(src_image->vk.format) *
vk_format_get_blockwidth(dst_image->vk.format); vk_format_get_blockwidth(dst_image->vk.format);
dst_extent.height = dst_extent.height / vk_format_get_blockheight(src_image->vk.format) * dst_extent.height = dst_extent.height / vk_format_get_blockheight(src_image->vk.format) *
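The extent rescaling applied when exactly one of the two formats is compressed is integer arithmetic on block dimensions. A minimal sketch, assuming a hypothetical helper `scale_extent` in place of the inline expression and plain integers in place of the `vk_format_get_blockwidth` / `vk_format_get_blockheight` results:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts an extent expressed in source texels into
 * destination texels, given the block size of each format along one axis.
 * Uncompressed formats have a block size of 1. */
static uint32_t scale_extent(uint32_t extent, uint32_t src_block, uint32_t dst_block)
{
   assert(src_block != 0);
   /* Divide first, as in the hunk: a partial source block is truncated. */
   return extent / src_block * dst_block;
}
```

For a 4x4 compressed source copied to an uncompressed destination, a width of 16 texels becomes 4. In the other direction, a width of 4 becomes 16.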

@ -8,6 +8,7 @@
#include <stdbool.h> #include <stdbool.h>
#include "nir/radv_meta_nir.h" #include "nir/radv_meta_nir.h"
#include "radv_cs.h"
#include "radv_meta.h" #include "radv_meta.h"
enum radv_color_op { enum radv_color_op {
@ -19,7 +20,7 @@ enum radv_color_op {
static VkResult static VkResult
get_dcc_decompress_compute_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out) get_dcc_decompress_compute_pipeline(struct radv_device *device, VkPipeline *pipeline_out, VkPipelineLayout *layout_out)
{ {
enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_DCC_DECOMPRESS; enum radv_meta_object_key_type key = RADV_META_OBJECT_KEY_DCC_DECOMPRESS_CS;
VkResult result; VkResult result;
const VkDescriptorSetLayoutBinding bindings[] = { const VkDescriptorSetLayoutBinding bindings[] = {
@ -241,6 +242,7 @@ radv_process_color_image_layer(struct radv_cmd_buffer *cmd_buffer, struct radv_i
const VkImageSubresourceRange *range, int level, int layer, enum radv_color_op op) const VkImageSubresourceRange *range, int level, int layer, enum radv_color_op op)
{ {
struct radv_device *device = radv_cmd_buffer_device(cmd_buffer); struct radv_device *device = radv_cmd_buffer_device(cmd_buffer);
const struct radv_physical_device *pdev = radv_device_physical(device);
struct radv_image_view iview; struct radv_image_view iview;
uint32_t width, height; uint32_t width, height;
@ -303,9 +305,23 @@ radv_process_color_image_layer(struct radv_cmd_buffer *cmd_buffer, struct radv_i
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0); radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
if (op == FMASK_DECOMPRESS || op == DCC_DECOMPRESS) if (op == FMASK_DECOMPRESS || op == DCC_DECOMPRESS) {
/* On GFX6-8, the CB FMASK cache writes corrupted data if cache lines are flushed after their
* context has been retired. To avoid this, we must flush the CB metadata caches immediately
* after every FMASK decompress.
*
* PAL only applies this workaround on GFX6 but GFX7-8 are also affected and that matches
* RadeonSI.
*/
if (pdev->info.gfx_level <= GFX8 && op == FMASK_DECOMPRESS) {
radeon_begin(cmd_buffer->cs);
radeon_event_write(V_028A90_FLUSH_AND_INV_CB_META);
radeon_end();
}
cmd_buffer->state.flush_bits |= radv_src_access_flush(cmd_buffer, VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT, cmd_buffer->state.flush_bits |= radv_src_access_flush(cmd_buffer, VK_PIPELINE_STAGE_2_ALL_COMMANDS_BIT,
VK_ACCESS_2_COLOR_ATTACHMENT_WRITE_BIT, 0, image, range); VK_ACCESS_2_COLOR_ATTACHMENT_WRITE_BIT, 0, image, range);
}
const VkRenderingEndInfoKHR end_info = { const VkRenderingEndInfoKHR end_info = {
.sType = VK_STRUCTURE_TYPE_RENDERING_END_INFO_KHR, .sType = VK_STRUCTURE_TYPE_RENDERING_END_INFO_KHR,

@ -467,7 +467,9 @@ radv_meta_resolve_depth_stencil_cs(struct radv_cmd_buffer *cmd_buffer, struct ra
radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer), VK_PIPELINE_BIND_POINT_COMPUTE, pipeline); radv_CmdBindPipeline(radv_cmd_buffer_to_handle(cmd_buffer), VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
const uint32_t push_constants[2] = {region->srcOffset.x, region->srcOffset.y}; const uint32_t push_constants[5] = {
region->srcOffset.x, region->srcOffset.y, region->dstOffset.x, region->dstOffset.y, region->dstOffset.z,
};
const VkPushConstantsInfoKHR pc_info = { const VkPushConstantsInfoKHR pc_info = {
.sType = VK_STRUCTURE_TYPE_PUSH_CONSTANTS_INFO_KHR, .sType = VK_STRUCTURE_TYPE_PUSH_CONSTANTS_INFO_KHR,

@ -669,8 +669,8 @@ radv_meta_resolve_depth_stencil_fs(struct radv_cmd_buffer *cmd_buffer, struct ra
radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, radv_CmdSetViewport(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1,
&(VkViewport){ &(VkViewport){
.x = region->srcOffset.x, .x = region->dstOffset.x,
.y = region->srcOffset.y, .y = region->dstOffset.y,
.width = region->extent.width, .width = region->extent.width,
.height = region->extent.height, .height = region->extent.height,
.minDepth = 0.0f, .minDepth = 0.0f,
@ -679,6 +679,22 @@ radv_meta_resolve_depth_stencil_fs(struct radv_cmd_buffer *cmd_buffer, struct ra
radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &resolve_area); radv_CmdSetScissor(radv_cmd_buffer_to_handle(cmd_buffer), 0, 1, &resolve_area);
const uint32_t push_constants[2] = {
region->srcOffset.x - region->dstOffset.x,
region->srcOffset.y - region->dstOffset.y,
};
const VkPushConstantsInfoKHR push_constants_info = {
.sType = VK_STRUCTURE_TYPE_PUSH_CONSTANTS_INFO,
.layout = layout,
.stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT,
.offset = 0,
.size = sizeof(push_constants),
.pValues = push_constants,
};
radv_CmdPushConstants2(radv_cmd_buffer_to_handle(cmd_buffer), &push_constants_info);
radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0); radv_CmdDraw(radv_cmd_buffer_to_handle(cmd_buffer), 3, 1, 0, 0);
const VkRenderingEndInfoKHR end_info = { const VkRenderingEndInfoKHR end_info = {

@ -1395,19 +1395,21 @@ radv_meta_nir_build_depth_stencil_resolve_compute_shader(struct radv_device *dev
nir_def *global_id = radv_meta_nir_get_global_ids(&b, 3); nir_def *global_id = radv_meta_nir_get_global_ids(&b, 3);
nir_def *offset = nir_load_push_constant(&b, 2, 32, nir_imm_int(&b, 0), .range = 8); nir_def *src_offset = nir_load_push_constant(&b, 2, 32, nir_imm_int(&b, 0), .range = 8);
nir_def *dst_offset = nir_load_push_constant(&b, 3, 32, nir_imm_int(&b, 8), .range = 20);
nir_def *resolve_coord = nir_iadd(&b, nir_trim_vector(&b, global_id, 2), offset); nir_def *src_coord = nir_iadd(&b, nir_trim_vector(&b, global_id, 2), src_offset);
nir_def *dst_coord = nir_iadd(&b, global_id, dst_offset);
nir_def *img_coord = nir_def *src_img_coord =
nir_vec3(&b, nir_channel(&b, resolve_coord, 0), nir_channel(&b, resolve_coord, 1), nir_channel(&b, global_id, 2)); nir_vec3(&b, nir_channel(&b, src_coord, 0), nir_channel(&b, src_coord, 1), nir_channel(&b, global_id, 2));
nir_deref_instr *input_img_deref = nir_build_deref_var(&b, input_img); nir_deref_instr *input_img_deref = nir_build_deref_var(&b, input_img);
nir_def *outval = nir_txf_ms(&b, img_coord, nir_imm_int(&b, 0), .texture_deref = input_img_deref); nir_def *outval = nir_txf_ms(&b, src_img_coord, nir_imm_int(&b, 0), .texture_deref = input_img_deref);
if (resolve_mode != VK_RESOLVE_MODE_SAMPLE_ZERO_BIT) { if (resolve_mode != VK_RESOLVE_MODE_SAMPLE_ZERO_BIT) {
for (int i = 1; i < samples; i++) { for (int i = 1; i < samples; i++) {
nir_def *si = nir_txf_ms(&b, img_coord, nir_imm_int(&b, i), .texture_deref = input_img_deref); nir_def *si = nir_txf_ms(&b, src_img_coord, nir_imm_int(&b, i), .texture_deref = input_img_deref);
switch (resolve_mode) { switch (resolve_mode) {
case VK_RESOLVE_MODE_AVERAGE_BIT: case VK_RESOLVE_MODE_AVERAGE_BIT:
@ -1435,8 +1437,8 @@ radv_meta_nir_build_depth_stencil_resolve_compute_shader(struct radv_device *dev
outval = nir_fdiv_imm(&b, outval, samples); outval = nir_fdiv_imm(&b, outval, samples);
} }
nir_def *coord = nir_vec4(&b, nir_channel(&b, img_coord, 0), nir_channel(&b, img_coord, 1), nir_def *coord = nir_vec4(&b, nir_channel(&b, dst_coord, 0), nir_channel(&b, dst_coord, 1),
nir_channel(&b, img_coord, 2), nir_undef(&b, 1, 32)); nir_channel(&b, dst_coord, 2), nir_undef(&b, 1, 32));
nir_image_deref_store(&b, &nir_build_deref_var(&b, output_img)->def, coord, nir_undef(&b, 1, 32), outval, nir_image_deref_store(&b, &nir_build_deref_var(&b, output_img)->def, coord, nir_undef(&b, 1, 32), outval,
nir_imm_int(&b, 0), .image_dim = GLSL_SAMPLER_DIM_2D, .image_array = true); nir_imm_int(&b, 0), .image_dim = GLSL_SAMPLER_DIM_2D, .image_array = true);
return b.shader; return b.shader;
@ -1495,10 +1497,11 @@ radv_meta_nir_build_depth_stencil_resolve_fragment_shader(struct radv_device *de
fs_out->data.location = index == RADV_META_DEPTH_RESOLVE ? FRAG_RESULT_DEPTH : FRAG_RESULT_STENCIL; fs_out->data.location = index == RADV_META_DEPTH_RESOLVE ? FRAG_RESULT_DEPTH : FRAG_RESULT_STENCIL;
nir_def *pos_in = nir_trim_vector(&b, nir_load_frag_coord(&b), 2); nir_def *pos_in = nir_trim_vector(&b, nir_load_frag_coord(&b), 2);
nir_def *src_offset = nir_load_push_constant(&b, 2, 32, nir_imm_int(&b, 0), .range = 8);
nir_def *pos_int = nir_f2i32(&b, pos_in); nir_def *pos_int = nir_f2i32(&b, pos_in);
nir_def *img_coord = nir_trim_vector(&b, pos_int, 2); nir_def *img_coord = nir_trim_vector(&b, nir_iadd(&b, pos_int, src_offset), 2);
nir_deref_instr *input_img_deref = nir_build_deref_var(&b, input_img); nir_deref_instr *input_img_deref = nir_build_deref_var(&b, input_img);
nir_def *outval = nir_txf_ms(&b, img_coord, nir_imm_int(&b, 0), .texture_deref = input_img_deref); nir_def *outval = nir_txf_ms(&b, img_coord, nir_imm_int(&b, 0), .texture_deref = input_img_deref);
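The fragment-shader resolve path now renders at the destination offset and reads the source through a relative offset. The arithmetic can be sketched on the CPU as below. This is an illustration only: `coord2d`, `fs_push_constant` and `fs_source_texel` are hypothetical names, and unsigned 32-bit wraparound stands in for the shader's `nir_iadd`.

```c
#include <stdint.h>

/* Hypothetical 2D coordinate type for this sketch. */
struct coord2d { uint32_t x, y; };

/* CPU side: the push constant is srcOffset - dstOffset. The subtraction may
 * wrap modulo 2^32 when the destination offset is larger. */
static struct coord2d fs_push_constant(struct coord2d src_offset, struct coord2d dst_offset)
{
   return (struct coord2d){src_offset.x - dst_offset.x, src_offset.y - dst_offset.y};
}

/* Shader side: a fragment at frag_pos (inside the viewport placed at
 * dstOffset) fetches the source texel at frag_pos + push_constant. The
 * wrapping addition undoes the wrapping subtraction above. */
static struct coord2d fs_source_texel(struct coord2d frag_pos, struct coord2d push_constant)
{
   return (struct coord2d){frag_pos.x + push_constant.x, frag_pos.y + push_constant.y};
}
```

With a source offset of (2, 3) and a destination offset of (10, 20), the fragment at (15, 27) reads source texel (7, 10), which is the source offset plus the same (5, 7) displacement inside the resolve area.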

@ -114,11 +114,32 @@ gather_tail_call_instrs_block(nir_function *caller, const struct nir_block *bloc
if (call->callee->num_params != caller->num_params) if (call->callee->num_params != caller->num_params)
return; return;
for (unsigned i = 0; i < call->num_params; ++i) { for (unsigned i = 0; i < call->callee->num_params; ++i) {
if (call->callee->params[i].is_return != caller->params[i].is_return) if (call->callee->params[i].is_return != caller->params[i].is_return)
return; return;
if ((call->callee->params[i].driver_attributes & ACO_NIR_PARAM_ATTRIB_DISCARDABLE) &&
!(caller->params[i].driver_attributes & ACO_NIR_PARAM_ATTRIB_DISCARDABLE))
return;
bool has_preserved_regs =
(caller->driver_attributes & ACO_NIR_FUNCTION_ATTRIB_ABI_MASK) == ACO_NIR_CALL_ABI_AHIT_ISEC;
if (has_preserved_regs && ((call->callee->params[i].driver_attributes & ACO_NIR_PARAM_ATTRIB_DISCARDABLE) !=
(caller->params[i].driver_attributes & ACO_NIR_PARAM_ATTRIB_DISCARDABLE)))
return;
if (call->callee->params[i].is_uniform != caller->params[i].is_uniform)
return;
if (call->callee->params[i].bit_size != caller->params[i].bit_size)
return;
if (call->callee->params[i].num_components != caller->params[i].num_components)
return;
}
/* The call instruction itself has not been lowered to the new signature yet, so do this in a separate loop and
* adjust parameter indices for the caller.
*/
for (unsigned i = 0; i < call->num_params; ++i) {
unsigned caller_param_idx = i + ACO_NIR_CALL_SYSTEM_ARG_COUNT;
/* We can only do tail calls if the caller returns exactly the callee return values */ /* We can only do tail calls if the caller returns exactly the callee return values */
if (caller->params[i].is_return) { if (caller->params[caller_param_idx].is_return) {
assert(nir_def_as_deref_or_null(call->params[i].ssa)); assert(nir_def_as_deref_or_null(call->params[i].ssa));
nir_deref_instr *deref_root = nir_def_as_deref(call->params[i].ssa); nir_deref_instr *deref_root = nir_def_as_deref(call->params[i].ssa);
while (nir_deref_instr_parent(deref_root)) while (nir_deref_instr_parent(deref_root))
@ -129,16 +150,18 @@ gather_tail_call_instrs_block(nir_function *caller, const struct nir_block *bloc
nir_intrinsic_instr *intrin = nir_def_as_intrinsic_or_null(deref_root->parent.ssa); nir_intrinsic_instr *intrin = nir_def_as_intrinsic_or_null(deref_root->parent.ssa);
if (!intrin || intrin->intrinsic != nir_intrinsic_load_param) if (!intrin || intrin->intrinsic != nir_intrinsic_load_param)
return; return;
/* The call parameters aren't lowered at this point, we need to add the call arg count here */ if (nir_intrinsic_param_idx(intrin) != caller_param_idx)
if (nir_intrinsic_param_idx(intrin) != i + ACO_NIR_CALL_SYSTEM_ARG_COUNT) return;
} else if (!(caller->params[caller_param_idx].driver_attributes & ACO_NIR_PARAM_ATTRIB_DISCARDABLE)) {
/* If the parameter is not marked as discardable, then we have to preserve the caller's value. Passing
* a modified value to a tail call leaves us unable to restore the original value, so bail out if we have
* modified parameters.
*/
nir_intrinsic_instr *intrin = nir_def_as_intrinsic_or_null(call->params[i].ssa);
if (!intrin || intrin->intrinsic != nir_intrinsic_load_param ||
nir_intrinsic_param_idx(intrin) != caller_param_idx)
return; return;
} }
if (call->callee->params[i].is_uniform != caller->params[i].is_uniform)
return;
if (call->callee->params[i].bit_size != caller->params[i].bit_size)
return;
if (call->callee->params[i].num_components != caller->params[i].num_components)
return;
} }
_mesa_set_add(tail_calls, instr); _mesa_set_add(tail_calls, instr);

@ -144,6 +144,7 @@ radv_get_ray_query_type()
struct ray_query_vars { struct ray_query_vars {
nir_variable *var; nir_variable *var;
bool use_bvh_stack_rtn;
bool shared_stack; bool shared_stack;
uint32_t shared_base; uint32_t shared_base;
uint32_t stack_entries; uint32_t stack_entries;
@ -162,13 +163,24 @@ init_ray_query_vars(nir_shader *shader, const glsl_type *opaque_type, struct ray
uint32_t shared_stack_entries = shader->info.ray_queries == 1 ? 16 : 8; uint32_t shared_stack_entries = shader->info.ray_queries == 1 ? 16 : 8;
/* ds_bvh_stack* instructions use a fixed stride of 32 dwords. */ /* ds_bvh_stack* instructions use a fixed stride of 32 dwords. */
if (radv_use_bvh_stack_rtn(pdev)) if (radv_use_bvh_stack_rtn(pdev))
workgroup_size = MAX2(workgroup_size, 32); workgroup_size = align(workgroup_size, 32);
uint32_t shared_stack_size = workgroup_size * shared_stack_entries * 4; uint32_t shared_stack_size = workgroup_size * shared_stack_entries * 4;
uint32_t shared_offset = align(shader->info.shared_size, 4); uint32_t shared_offset = align(shader->info.shared_size, 4);
if (shader->info.stage != MESA_SHADER_COMPUTE || glsl_type_is_array(opaque_type) || if (shader->info.stage != MESA_SHADER_COMPUTE || glsl_type_is_array(opaque_type) ||
shared_offset + shared_stack_size > pdev->max_shared_size) { shared_offset + shared_stack_size > pdev->max_shared_size) {
dst->stack_entries = MAX_SCRATCH_STACK_ENTRY_COUNT; dst->stack_entries = MAX_SCRATCH_STACK_ENTRY_COUNT;
} else { } else {
if (radv_use_bvh_stack_rtn(pdev)) {
/* The hardware ds_bvh_stack_rtn address can only encode a stack base up to 8191 dwords, or 16383 dwords on
* gfx12+.
*/
uint32_t num_wave32_groups = workgroup_size / 32;
uint32_t max_group_stack_base = (num_wave32_groups - 1) * 32 * shared_stack_entries;
uint32_t max_stack_base = (shared_offset / 4) + max_group_stack_base;
uint32_t max_hw_stack_base = pdev->info.gfx_level >= GFX12 ? 16384 : 8192;
dst->use_bvh_stack_rtn = max_stack_base < max_hw_stack_base;
}
dst->shared_stack = true; dst->shared_stack = true;
dst->shared_base = shared_offset; dst->shared_base = shared_offset;
dst->stack_entries = shared_stack_entries; dst->stack_entries = shared_stack_entries;
@ -303,7 +315,7 @@ lower_rq_initialize(nir_builder *b, nir_intrinsic_instr *instr, struct ray_query
if (vars->shared_stack) { if (vars->shared_stack) {
nir_def *stack_idx = nir_load_local_invocation_index(b); nir_def *stack_idx = nir_load_local_invocation_index(b);
if (radv_use_bvh_stack_rtn(pdev)) { if (vars->use_bvh_stack_rtn) {
uint32_t workgroup_size = uint32_t workgroup_size =
b->shader->info.workgroup_size[0] * b->shader->info.workgroup_size[1] * b->shader->info.workgroup_size[2]; b->shader->info.workgroup_size[0] * b->shader->info.workgroup_size[1] * b->shader->info.workgroup_size[2];
nir_def *addr = nir_def *addr =
@ -312,7 +324,6 @@ lower_rq_initialize(nir_builder *b, nir_intrinsic_instr *instr, struct ray_query
rq_store(b, rq, trav_stack_low_watermark, addr); rq_store(b, rq, trav_stack_low_watermark, addr);
} else { } else {
nir_def *base_offset = nir_imul_imm(b, stack_idx, sizeof(uint32_t)); nir_def *base_offset = nir_imul_imm(b, stack_idx, sizeof(uint32_t));
base_offset = nir_iadd_imm(b, base_offset, vars->shared_base);
rq_store(b, rq, trav_stack, base_offset); rq_store(b, rq, trav_stack, base_offset);
rq_store(b, rq, trav_stack_low_watermark, base_offset); rq_store(b, rq, trav_stack_low_watermark, base_offset);
} }
@ -482,7 +493,7 @@ store_stack_entry(nir_builder *b, nir_def *index, nir_def *value, const struct r
struct traversal_data *data = args->data; struct traversal_data *data = args->data;
if (data->vars->shared_stack) if (data->vars->shared_stack)
nir_store_shared(b, value, index, .base = 0, .align_mul = 4); nir_store_shared(b, value, index, .base = data->vars->shared_base, .align_mul = 4);
else else
nir_store_deref(b, nir_build_deref_array(b, rq_deref(b, data->rq, stack), index), value, 0x1); nir_store_deref(b, nir_build_deref_array(b, rq_deref(b, data->rq, stack), index), value, 0x1);
} }
@ -493,7 +504,7 @@ load_stack_entry(nir_builder *b, nir_def *index, const struct radv_ray_traversal
struct traversal_data *data = args->data; struct traversal_data *data = args->data;
if (data->vars->shared_stack) if (data->vars->shared_stack)
return nir_load_shared(b, 1, 32, index, .base = 0, .align_mul = 4); return nir_load_shared(b, 1, 32, index, .base = data->vars->shared_base, .align_mul = 4);
else else
return nir_load_deref(b, nir_build_deref_array(b, rq_deref(b, data->rq, stack), index)); return nir_load_deref(b, nir_build_deref_array(b, rq_deref(b, data->rq, stack), index));
} }
@ -563,19 +574,16 @@ lower_rq_proceed(nir_builder *b, nir_intrinsic_instr *instr, struct ray_query_va
}; };
if (vars->shared_stack) { if (vars->shared_stack) {
args.use_bvh_stack_rtn = radv_use_bvh_stack_rtn(pdev); args.use_bvh_stack_rtn = vars->use_bvh_stack_rtn;
if (args.use_bvh_stack_rtn) { if (args.use_bvh_stack_rtn) {
args.stack_stride = 1; args.stack_stride = 1;
args.stack_base = 0;
} else { } else {
uint32_t workgroup_size = uint32_t workgroup_size =
b->shader->info.workgroup_size[0] * b->shader->info.workgroup_size[1] * b->shader->info.workgroup_size[2]; b->shader->info.workgroup_size[0] * b->shader->info.workgroup_size[1] * b->shader->info.workgroup_size[2];
args.stack_stride = workgroup_size * 4; args.stack_stride = workgroup_size * 4;
args.stack_base = vars->shared_base;
} }
} else { } else {
args.stack_stride = 1; args.stack_stride = 1;
args.stack_base = 0;
} }
rq_store(b, rq, break_flag, nir_imm_false(b)); rq_store(b, rq, break_flag, nir_imm_false(b));
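The bound check added to init_ray_query_vars() above can be tried in isolation. This is a minimal sketch, not the driver code: `can_use_bvh_stack_rtn` and `align_pot` are hypothetical standalone helpers, and the limits (8192 dwords, 16384 on gfx12+) are taken from the hunk as shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t
align_pot(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper mirroring the arithmetic in the hunk: returns true if the
 * highest stack base (in dwords) that any wave32 group would use stays below
 * the limit the hunk gives for the ds_bvh_stack_rtn address.
 */
static bool
can_use_bvh_stack_rtn(uint32_t workgroup_size, uint32_t shared_size_bytes, uint32_t stack_entries, bool is_gfx12)
{
   /* Fixed stride of 32 dwords: round up to whole groups of 32 invocations. */
   workgroup_size = align_pot(workgroup_size, 32);

   uint32_t shared_offset = align_pot(shared_size_bytes, 4);
   uint32_t num_wave32_groups = workgroup_size / 32;
   uint32_t max_group_stack_base = (num_wave32_groups - 1) * 32 * stack_entries;
   uint32_t max_stack_base = shared_offset / 4 + max_group_stack_base;
   uint32_t max_hw_stack_base = is_gfx12 ? 16384 : 8192;

   return max_stack_base < max_hw_stack_base;
}
```

With 1024 invocations and 8 entries per stack, the last group starts at 31 * 32 * 8 = 7936 dwords, which fits below 8192. Adding 1024 bytes (256 dwords) of existing shared memory pushes the base to 8192, so the check fails before gfx12 and the shader would take the plain shared-memory path instead.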

@ -560,15 +560,12 @@ create_bvh_descriptor(nir_builder *b, const struct radv_physical_device *pdev, s
/* Enable pointer flags on GFX11+ */ /* Enable pointer flags on GFX11+ */
dword3 |= BITFIELD_BIT(119 - 96); dword3 |= BITFIELD_BIT(119 - 96);
/* Instead of the default box sorting (closest point), use largest for terminate_on_first_hit rays and midpoint /* Instead of the default box sorting (closest point), use largest for terminate_on_first_hit rays;
* for closest hit; this makes it more likely that the ray traversal will visit fewer nodes. */ * this makes it more likely that the ray traversal will visit fewer nodes. */
const uint32_t box_sort_largest = 1; const uint32_t box_sort_largest = 1;
const uint32_t box_sort_midpoint = 2;
/* Only use largest/midpoint sorting when all invocations have the same ray flags, otherwise /* Only use largest sorting when all invocations have the same ray flags, otherwise
* fall back to the default closest point. */ * fall back to the default closest point. */
dword1 = nir_bcsel(b, nir_vote_any(b, 1, ray_flags->terminate_on_first_hit), dword1,
nir_imm_int(b, (box_sort_midpoint << 21) | sort_triangles_first | box_sort_enable));
dword1 = nir_bcsel(b, nir_vote_all(b, 1, ray_flags->terminate_on_first_hit), dword1 = nir_bcsel(b, nir_vote_all(b, 1, ray_flags->terminate_on_first_hit),
nir_imm_int(b, (box_sort_largest << 21) | sort_triangles_first | box_sort_enable), dword1); nir_imm_int(b, (box_sort_largest << 21) | sort_triangles_first | box_sort_enable), dword1);
} }
@ -878,7 +875,7 @@ radv_build_ray_traversal(struct radv_device *device, nir_builder *b, const struc
/* Early exit if we never overflowed the stack, to avoid having to backtrack to /* Early exit if we never overflowed the stack, to avoid having to backtrack to
* the root for no reason. */ * the root for no reason. */
if (!args->use_bvh_stack_rtn) { if (!args->use_bvh_stack_rtn) {
nir_push_if(b, nir_ilt_imm(b, nir_load_deref(b, args->vars.stack), args->stack_base + args->stack_stride)); nir_push_if(b, nir_ilt_imm(b, nir_load_deref(b, args->vars.stack), args->stack_stride));
{ {
nir_store_var(b, incomplete, nir_imm_false(b), 0x1); nir_store_var(b, incomplete, nir_imm_false(b), 0x1);
nir_jump(b, nir_jump_break); nir_jump(b, nir_jump_break);
@ -1174,7 +1171,7 @@ radv_build_ray_traversal_gfx12(struct radv_device *device, nir_builder *b, const
/* Early exit if we never overflowed the stack, to avoid having to backtrack to /* Early exit if we never overflowed the stack, to avoid having to backtrack to
* the root for no reason. */ * the root for no reason. */
if (!args->use_bvh_stack_rtn) { if (!args->use_bvh_stack_rtn) {
nir_push_if(b, nir_ilt_imm(b, nir_load_deref(b, args->vars.stack), args->stack_base + args->stack_stride)); nir_push_if(b, nir_ilt_imm(b, nir_load_deref(b, args->vars.stack), args->stack_stride));
{ {
nir_store_var(b, incomplete, nir_imm_false(b), 0x1); nir_store_var(b, incomplete, nir_imm_false(b), 0x1);
nir_jump(b, nir_jump_break); nir_jump(b, nir_jump_break);

@ -135,10 +135,9 @@ struct radv_ray_traversal_args {
struct radv_ray_traversal_vars vars; struct radv_ray_traversal_vars vars;
/* The increment/decrement used for radv_ray_traversal_vars::stack, and how many entries are /* The increment/decrement used for radv_ray_traversal_vars::stack, and how many entries are
* available. stack_base is the base address of the stack. */ * available. */
uint32_t stack_stride; uint32_t stack_stride;
uint32_t stack_entries; uint32_t stack_entries;
uint32_t stack_base;
uint32_t set_flags; uint32_t set_flags;
uint32_t unset_flags; uint32_t unset_flags;

@ -39,7 +39,7 @@ radv_nir_init_traversal_params(nir_function *function, unsigned payload_size)
function->params = rzalloc_array_size(function->shader, sizeof(nir_parameter), function->num_params); function->params = rzalloc_array_size(function->shader, sizeof(nir_parameter), function->num_params);
radv_nir_init_common_rt_params(function); radv_nir_init_common_rt_params(function);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_TRAVERSAL_ADDR, glsl_uint64_t_type(), true, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_TRAVERSAL_ADDR, glsl_uint64_t_type(), true, 0);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_SHADER_RECORD_PTR, glsl_uint64_t_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_SHADER_RECORD_PTR, glsl_uint64_t_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_ACCEL_STRUCT, glsl_uint64_t_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_ACCEL_STRUCT, glsl_uint64_t_type(), false, 0);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_CULL_MASK_AND_FLAGS, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_CULL_MASK_AND_FLAGS, glsl_uint_type(), false, 0);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_SBT_OFFSET, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_SBT_OFFSET, glsl_uint_type(), false, 0);
@ -49,12 +49,13 @@ radv_nir_init_traversal_params(nir_function *function, unsigned payload_size)
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_TMIN, glsl_float_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_TMIN, glsl_float_type(), false, 0);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_DIRECTION, glsl_vector_type(GLSL_TYPE_UINT, 3), false, radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_DIRECTION, glsl_vector_type(GLSL_TYPE_UINT, 3), false,
0); 0);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_TMAX, glsl_float_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_RAY_TMAX, glsl_float_type(), false,
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_PRIMITIVE_ADDR, glsl_uint64_t_type(), false, 0); ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_PRIMITIVE_ID, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_PRIMITIVE_ADDR, glsl_uint64_t_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_INSTANCE_ADDR, glsl_uint64_t_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_PRIMITIVE_ID, glsl_uint_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_GEOMETRY_ID_AND_FLAGS, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_INSTANCE_ADDR, glsl_uint64_t_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_HIT_KIND, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + TRAVERSAL_ARG_GEOMETRY_ID_AND_FLAGS, glsl_uint_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + TRAVERSAL_ARG_HIT_KIND, glsl_uint_type(), false, ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
for (unsigned i = 0; i < DIV_ROUND_UP(payload_size, 4); ++i) { for (unsigned i = 0; i < DIV_ROUND_UP(payload_size, 4); ++i) {
radv_nir_return_param_from_type(function->params + TRAVERSAL_ARG_PAYLOAD_BASE + i, glsl_uint_type(), false, 0); radv_nir_return_param_from_type(function->params + TRAVERSAL_ARG_PAYLOAD_BASE + i, glsl_uint_type(), false, 0);
} }
@ -128,15 +129,11 @@ radv_nir_init_rt_function_params(nir_function *function, mesa_shader_stage stage
radv_nir_init_common_rt_params(function); radv_nir_init_common_rt_params(function);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_TRAVERSAL_ADDR, glsl_uint64_t_type(), true, 0); radv_nir_param_from_type(function->params + CHIT_MISS_ARG_TRAVERSAL_ADDR, glsl_uint64_t_type(), true, 0);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SHADER_RECORD_PTR, glsl_uint64_t_type(), false, 0); radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SHADER_RECORD_PTR, glsl_uint64_t_type(), false, 0);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_ACCEL_STRUCT, glsl_uint64_t_type(), false, radv_nir_param_from_type(function->params + CHIT_MISS_ARG_ACCEL_STRUCT, glsl_uint64_t_type(), false, 0);
ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_CULL_MASK_AND_FLAGS, glsl_uint_type(), false, 0); radv_nir_param_from_type(function->params + CHIT_MISS_ARG_CULL_MASK_AND_FLAGS, glsl_uint_type(), false, 0);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SBT_OFFSET, glsl_uint_type(), false, radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SBT_OFFSET, glsl_uint_type(), false, 0);
ACO_NIR_PARAM_ATTRIB_DISCARDABLE); radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SBT_STRIDE, glsl_uint_type(), false, 0);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_SBT_STRIDE, glsl_uint_type(), false, radv_nir_param_from_type(function->params + CHIT_MISS_ARG_MISS_INDEX, glsl_uint_type(), false, 0);
ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_MISS_INDEX, glsl_uint_type(), false,
ACO_NIR_PARAM_ATTRIB_DISCARDABLE);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_RAY_ORIGIN, glsl_vector_type(GLSL_TYPE_UINT, 3), false, radv_nir_param_from_type(function->params + CHIT_MISS_ARG_RAY_ORIGIN, glsl_vector_type(GLSL_TYPE_UINT, 3), false,
0); 0);
radv_nir_param_from_type(function->params + CHIT_MISS_ARG_RAY_TMIN, glsl_float_type(), false, 0); radv_nir_param_from_type(function->params + CHIT_MISS_ARG_RAY_TMIN, glsl_float_type(), false, 0);

@ -1251,7 +1251,6 @@ radv_build_traversal(struct radv_device *device, struct radv_ray_tracing_pipelin
.vars = trav_vars_args, .vars = trav_vars_args,
.stack_stride = stack_stride, .stack_stride = stack_stride,
.stack_entries = MAX_STACK_ENTRY_COUNT, .stack_entries = MAX_STACK_ENTRY_COUNT,
.stack_base = 0,
.ignore_cull_mask = params->ignore_cull_mask, .ignore_cull_mask = params->ignore_cull_mask,
.set_flags = info ? info->set_flags : 0, .set_flags = info ? info->set_flags : 0,
.unset_flags = info ? info->unset_flags : 0, .unset_flags = info ? info->unset_flags : 0,

@ -9550,9 +9550,9 @@ radv_handle_color_fbfetch_output(struct radv_cmd_buffer *cmd_buffer, uint32_t in
radv_describe_barrier_start(cmd_buffer, RGP_BARRIER_UNKNOWN_REASON); radv_describe_barrier_start(cmd_buffer, RGP_BARRIER_UNKNOWN_REASON);
/* Force a transition to FEEDBACK_LOOP_OPTIMAL to decompress DCC. */ /* Force a transition to FEEDBACK_LOOP_OPTIMAL to decompress DCC. */
radv_handle_image_transition(cmd_buffer, att->iview->image, att->layout, radv_handle_rendering_image_transition(
VK_IMAGE_LAYOUT_ATTACHMENT_FEEDBACK_LOOP_OPTIMAL_EXT, RADV_QUEUE_GENERAL, cmd_buffer, att->iview, render->layer_count, render->view_mask, att->layout, VK_IMAGE_LAYOUT_UNDEFINED,
RADV_QUEUE_GENERAL, &range, NULL); VK_IMAGE_LAYOUT_ATTACHMENT_FEEDBACK_LOOP_OPTIMAL_EXT, VK_IMAGE_LAYOUT_UNDEFINED, NULL);
radv_describe_barrier_end(cmd_buffer); radv_describe_barrier_end(cmd_buffer);
@ -9597,9 +9597,10 @@ radv_handle_depth_fbfetch_output(struct radv_cmd_buffer *cmd_buffer)
radv_describe_barrier_start(cmd_buffer, RGP_BARRIER_UNKNOWN_REASON); radv_describe_barrier_start(cmd_buffer, RGP_BARRIER_UNKNOWN_REASON);
/* Force a transition to FEEDBACK_LOOP_OPTIMAL to decompress HTILE. */ /* Force a transition to FEEDBACK_LOOP_OPTIMAL to decompress HTILE. */
radv_handle_image_transition(cmd_buffer, att->iview->image, att->layout, radv_handle_rendering_image_transition(cmd_buffer, att->iview, render->layer_count, render->view_mask, att->layout,
VK_IMAGE_LAYOUT_ATTACHMENT_FEEDBACK_LOOP_OPTIMAL_EXT, RADV_QUEUE_GENERAL, att->stencil_layout, VK_IMAGE_LAYOUT_ATTACHMENT_FEEDBACK_LOOP_OPTIMAL_EXT,
RADV_QUEUE_GENERAL, &range, NULL); VK_IMAGE_LAYOUT_ATTACHMENT_FEEDBACK_LOOP_OPTIMAL_EXT,
render->sample_locations.count > 0 ? &render->sample_locations : NULL);
radv_describe_barrier_end(cmd_buffer); radv_describe_barrier_end(cmd_buffer);
@ -9642,9 +9643,11 @@ radv_CmdExecuteCommands(VkCommandBuffer commandBuffer, uint32_t commandBufferCou
VK_FROM_HANDLE(radv_cmd_buffer, primary, commandBuffer); VK_FROM_HANDLE(radv_cmd_buffer, primary, commandBuffer);
struct radv_device *device = radv_cmd_buffer_device(primary); struct radv_device *device = radv_cmd_buffer_device(primary);
const struct radv_physical_device *pdev = radv_device_physical(device); const struct radv_physical_device *pdev = radv_device_physical(device);
const bool is_gfx_or_ace = primary->qf == RADV_QUEUE_GENERAL || primary->qf == RADV_QUEUE_COMPUTE;
assert(commandBufferCount > 0); assert(commandBufferCount > 0);
if (is_gfx_or_ace) {
radv_emit_mip_change_flush_default(primary); radv_emit_mip_change_flush_default(primary);
/* Emit pending flushes on primary prior to executing secondary */ /* Emit pending flushes on primary prior to executing secondary */
@ -9652,6 +9655,7 @@ radv_CmdExecuteCommands(VkCommandBuffer commandBuffer, uint32_t commandBufferCou
/* Make sure CP DMA is idle on primary prior to executing secondary. */ /* Make sure CP DMA is idle on primary prior to executing secondary. */
radv_cp_dma_wait_for_idle(primary); radv_cp_dma_wait_for_idle(primary);
}
for (uint32_t i = 0; i < commandBufferCount; i++) { for (uint32_t i = 0; i < commandBufferCount; i++) {
VK_FROM_HANDLE(radv_cmd_buffer, secondary, pCmdBuffers[i]); VK_FROM_HANDLE(radv_cmd_buffer, secondary, pCmdBuffers[i]);
@ -9694,6 +9698,9 @@ radv_CmdExecuteCommands(VkCommandBuffer commandBuffer, uint32_t commandBufferCou
if (primary->state.dirty & RADV_CMD_DIRTY_FBFETCH_OUTPUT) { if (primary->state.dirty & RADV_CMD_DIRTY_FBFETCH_OUTPUT) {
radv_handle_fbfetch_output(primary); radv_handle_fbfetch_output(primary);
primary->state.dirty &= ~RADV_CMD_DIRTY_FBFETCH_OUTPUT; primary->state.dirty &= ~RADV_CMD_DIRTY_FBFETCH_OUTPUT;
/* Emit pending flushes if a late decompression was performed. */
radv_emit_cache_flush(primary);
} }
if (primary->state.render.active && (primary->state.dirty & RADV_CMD_DIRTY_FRAMEBUFFER)) { if (primary->state.render.active && (primary->state.dirty & RADV_CMD_DIRTY_FRAMEBUFFER)) {
@ -9769,23 +9776,12 @@ radv_CmdExecuteCommands(VkCommandBuffer commandBuffer, uint32_t commandBufferCou
device->ws->cs_execute_secondary(primary_cs->b, secondary_cs->b, allow_ib2); device->ws->cs_execute_secondary(primary_cs->b, secondary_cs->b, allow_ib2);
/* When the secondary command buffer is compute only we don't
* need to re-emit the current graphics pipeline.
*/
if (secondary->state.emitted_graphics_pipeline) {
primary->state.emitted_graphics_pipeline = secondary->state.emitted_graphics_pipeline; primary->state.emitted_graphics_pipeline = secondary->state.emitted_graphics_pipeline;
}
/* When the secondary command buffer is graphics only we don't
* need to re-emit the current compute pipeline.
*/
if (secondary->state.emitted_compute_pipeline) {
primary->state.emitted_compute_pipeline = secondary->state.emitted_compute_pipeline; primary->state.emitted_compute_pipeline = secondary->state.emitted_compute_pipeline;
}
if (secondary->state.emitted_rt_pipeline) {
primary->state.emitted_rt_pipeline = secondary->state.emitted_rt_pipeline; primary->state.emitted_rt_pipeline = secondary->state.emitted_rt_pipeline;
}
primary->state.ps_epilog = secondary->state.ps_epilog;
primary->state.emitted_vs_prolog = secondary->state.emitted_vs_prolog;
if (secondary->state.last_ia_multi_vgt_param) { if (secondary->state.last_ia_multi_vgt_param) {
primary->state.last_ia_multi_vgt_param = secondary->state.last_ia_multi_vgt_param; primary->state.last_ia_multi_vgt_param = secondary->state.last_ia_multi_vgt_param;
@ -10389,13 +10385,17 @@ radv_cs_emit_compute_predication(const struct radv_device *device, struct radv_c
} }
ALWAYS_INLINE static void ALWAYS_INLINE static void
radv_gfx12_emit_hiz_wa(const struct radv_device *device, const struct radv_cmd_state *cmd_state, radv_gfx12_emit_wa(const struct radv_device *device, const struct radv_cmd_state *cmd_state, struct radv_cmd_stream *cs)
struct radv_cmd_stream *cs)
{ {
const struct radv_physical_device *pdev = radv_device_physical(device); const struct radv_physical_device *pdev = radv_device_physical(device);
const struct radv_rendering_state *render = &cmd_state->render; const struct radv_rendering_state *render = &cmd_state->render;
const bool hiz_partial_wa_enabled = pdev->gfx12_hiz_wa == RADV_GFX12_HIZ_WA_PARTIAL && render->gfx12_has_hiz;
const bool vrs_export_wa_enabled = pdev->info.has_vrs_export_bug && cmd_state->last_vgt_shader &&
cmd_state->last_vgt_shader->info.outinfo.writes_primitive_shading_rate;
if (pdev->gfx12_hiz_wa == RADV_GFX12_HIZ_WA_PARTIAL && render->gfx12_has_hiz) { /* Emit BOP events to mitigate some hardware bugs on GFX12. */
if (hiz_partial_wa_enabled || vrs_export_wa_enabled) {
assert(pdev->info.gfx_level == GFX12);
radeon_begin(cs); radeon_begin(cs);
radeon_emit(PKT3(PKT3_RELEASE_MEM, 6, 0)); radeon_emit(PKT3(PKT3_RELEASE_MEM, 6, 0));
radeon_emit(S_490_EVENT_TYPE(V_028A90_BOTTOM_OF_PIPE_TS) | S_490_EVENT_INDEX(5)); radeon_emit(S_490_EVENT_TYPE(V_028A90_BOTTOM_OF_PIPE_TS) | S_490_EVENT_INDEX(5));
@ -10421,7 +10421,7 @@ radv_cs_emit_draw_packet(struct radv_cmd_buffer *cmd_buffer, uint32_t vertex_cou
radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX | use_opaque); radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX | use_opaque);
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, &cmd_buffer->state, cs); radv_gfx12_emit_wa(device, &cmd_buffer->state, cs);
} }
/** /**
@ -10451,7 +10451,7 @@ radv_cs_emit_draw_indexed_packet(struct radv_cmd_buffer *cmd_buffer, uint64_t in
radeon_emit(V_0287F0_DI_SRC_SEL_DMA | S_0287F0_NOT_EOP(not_eop)); radeon_emit(V_0287F0_DI_SRC_SEL_DMA | S_0287F0_NOT_EOP(not_eop));
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, &cmd_buffer->state, cs); radv_gfx12_emit_wa(device, &cmd_buffer->state, cs);
} }
/* MUST inline this function to avoid massive perf loss in drawoverhead */ /* MUST inline this function to avoid massive perf loss in drawoverhead */
@ -10503,7 +10503,7 @@ radv_cs_emit_indirect_draw_packet(struct radv_cmd_buffer *cmd_buffer, bool index
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, &cmd_buffer->state, cs); radv_gfx12_emit_wa(device, &cmd_buffer->state, cs);
cmd_buffer->state.uses_draw_indirect = true; cmd_buffer->state.uses_draw_indirect = true;
} }
@ -10549,7 +10549,7 @@ radv_cs_emit_indirect_mesh_draw_packet(struct radv_cmd_buffer *cmd_buffer, uint3
radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX); radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX);
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, &cmd_buffer->state, cs); radv_gfx12_emit_wa(device, &cmd_buffer->state, cs);
} }
ALWAYS_INLINE static void ALWAYS_INLINE static void
@ -10633,7 +10633,7 @@ radv_cs_emit_dispatch_taskmesh_gfx_packet(const struct radv_device *device, cons
radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX); radeon_emit(V_0287F0_DI_SRC_SEL_AUTO_INDEX);
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, cmd_state, cs); radv_gfx12_emit_wa(device, cmd_state, cs);
} }
ALWAYS_INLINE static void ALWAYS_INLINE static void
@ -10937,7 +10937,7 @@ radv_cs_emit_mesh_dispatch_packet(struct radv_cmd_buffer *cmd_buffer, uint32_t x
radeon_emit(S_0287F0_SOURCE_SELECT(V_0287F0_DI_SRC_SEL_AUTO_INDEX)); radeon_emit(S_0287F0_SOURCE_SELECT(V_0287F0_DI_SRC_SEL_AUTO_INDEX));
radeon_end(); radeon_end();
radv_gfx12_emit_hiz_wa(device, &cmd_buffer->state, cs); radv_gfx12_emit_wa(device, &cmd_buffer->state, cs);
} }
ALWAYS_INLINE static void ALWAYS_INLINE static void
@ -15174,10 +15174,19 @@ radv_CmdBeginTransformFeedbackEXT(VkCommandBuffer commandBuffer, uint32_t firstC
assert(firstCounterBuffer + counterBufferCount <= MAX_SO_BUFFERS); assert(firstCounterBuffer + counterBufferCount <= MAX_SO_BUFFERS);
if (pdev->info.gfx_level >= GFX12) if (pdev->info.gfx_level >= GFX12) {
radv_init_streamout_state(cmd_buffer); radv_init_streamout_state(cmd_buffer);
else if (!pdev->use_ngg_streamout)
/* Invalidate L2 in case the buffer filled size needs to be saved because COPY_DATA isn't
* coherent with L2.
*/
if (pdev->info.cp_sdma_ge_use_system_memory_scope) {
cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_INV_L2;
radv_emit_cache_flush(cmd_buffer);
}
} else if (!pdev->use_ngg_streamout) {
radv_flush_vgt_streamout(cmd_buffer); radv_flush_vgt_streamout(cmd_buffer);
}
ASSERTED unsigned cdw_max = radeon_check_space(device->ws, cs->b, MAX_SO_BUFFERS * 10); ASSERTED unsigned cdw_max = radeon_check_space(device->ws, cs->b, MAX_SO_BUFFERS * 10);

@ -390,8 +390,8 @@ static void
radv_add_split_disasm(const char *disasm, uint64_t start_addr, unsigned *num, struct radv_shader_inst *instructions) radv_add_split_disasm(const char *disasm, uint64_t start_addr, unsigned *num, struct radv_shader_inst *instructions)
{ {
struct radv_shader_inst *last_inst = *num ? &instructions[*num - 1] : NULL; struct radv_shader_inst *last_inst = *num ? &instructions[*num - 1] : NULL;
char *next; const char *next;
char *repeat = strstr(disasm, "then repeated"); const char *repeat = strstr(disasm, "then repeated");
while ((next = strchr(disasm, '\n'))) { while ((next = strchr(disasm, '\n'))) {
struct radv_shader_inst *inst = &instructions[*num]; struct radv_shader_inst *inst = &instructions[*num];

@ -786,6 +786,8 @@ init_dispatch_tables(struct radv_device *device, struct radv_physical_device *pd
add_entrypoints(&b, &quantic_dream_device_entrypoints, RADV_APP_DISPATCH_TABLE); add_entrypoints(&b, &quantic_dream_device_entrypoints, RADV_APP_DISPATCH_TABLE);
} else if (!strcmp(instance->drirc.debug.app_layer, "no_mans_sky")) { } else if (!strcmp(instance->drirc.debug.app_layer, "no_mans_sky")) {
add_entrypoints(&b, &no_mans_sky_device_entrypoints, RADV_APP_DISPATCH_TABLE); add_entrypoints(&b, &no_mans_sky_device_entrypoints, RADV_APP_DISPATCH_TABLE);
} else if (!strcmp(instance->drirc.debug.app_layer, "strange_brigade")) {
add_entrypoints(&b, &strange_brigade_device_entrypoints, RADV_APP_DISPATCH_TABLE);
} }
if (instance->vk.trace_mode & RADV_TRACE_MODE_RGP) if (instance->vk.trace_mode & RADV_TRACE_MODE_RGP)
@ -1239,7 +1241,13 @@ radv_CreateDevice(VkPhysicalDevice physicalDevice, const VkDeviceCreateInfo *pCr
device->ws = pdev->ws; device->ws = pdev->ws;
device->vk.sync = device->ws->get_sync_provider(device->ws); device->vk.sync = device->ws->get_sync_provider(device->ws);
device->vk.copy_sync_payloads = pdev->ws->copy_sync_payloads;
/* Disable unordered submits when SQTT queue events are enabled because queue present events
* might be missing otherwise.
*/
device->vk.copy_sync_payloads = ((instance->vk.trace_mode & RADV_TRACE_MODE_RGP) && radv_sqtt_queue_events_enabled())
? NULL
: pdev->ws->copy_sync_payloads;
/* Enable the global BO list by default. */ /* Enable the global BO list by default. */
/* TODO: Remove the per cmdbuf BO list tracking after few Mesa releases if no blockers. */ /* TODO: Remove the per cmdbuf BO list tracking after few Mesa releases if no blockers. */

@ -500,9 +500,9 @@ radv_image_view_init(struct radv_image_view *iview, struct radv_device *device,
if (!extra_create_info || !extra_create_info->from_client) if (!extra_create_info || !extra_create_info->from_client)
assert(pCreateInfo->flags & VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA); assert(pCreateInfo->flags & VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA);
vk_image_view_init(&device->vk, &iview->vk, pCreateInfo);
memset(&iview->descriptor, 0, sizeof(iview->descriptor)); memset(iview, 0, sizeof(*iview));
vk_image_view_init(&device->vk, &iview->vk, pCreateInfo);
iview->image = image; iview->image = image;
iview->plane_id = radv_plane_from_aspect(pCreateInfo->subresourceRange.aspectMask); iview->plane_id = radv_plane_from_aspect(pCreateInfo->subresourceRange.aspectMask);
@ -664,13 +664,13 @@ radv_hiz_image_view_init(struct radv_image_view *iview, struct radv_device *devi
VK_FROM_HANDLE(radv_image, image, pCreateInfo->image); VK_FROM_HANDLE(radv_image, image, pCreateInfo->image);
assert(pCreateInfo->flags & VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA); assert(pCreateInfo->flags & VK_IMAGE_VIEW_CREATE_DRIVER_INTERNAL_BIT_MESA);
memset(iview, 0, sizeof(*iview));
vk_image_view_init(&device->vk, &iview->vk, pCreateInfo); vk_image_view_init(&device->vk, &iview->vk, pCreateInfo);
assert(vk_format_has_depth(image->vk.format) && vk_format_has_stencil(image->vk.format)); assert(vk_format_has_depth(image->vk.format) && vk_format_has_stencil(image->vk.format));
assert(iview->vk.aspects == VK_IMAGE_ASPECT_DEPTH_BIT); assert(iview->vk.aspects == VK_IMAGE_ASPECT_DEPTH_BIT);
memset(&iview->descriptor, 0, sizeof(iview->descriptor));
iview->image = image; iview->image = image;
const uint32_t type = const uint32_t type =
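The two image-view hunks above move the zeroing to the top of the function and widen it from the descriptor to the whole object. One plausible reading, which the diff itself does not confirm, is that members outside the descriptor could otherwise keep stale contents. A minimal sketch of the ordering, using hypothetical stand-in types rather than the real radv structures:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the common base object and the driver view. */
struct base_view {
   int format;
};

struct view {
   struct base_view vk;
   int descriptor[4];
   int plane_id;
};

/* Stand-in for the common init; it only fills in the base object. */
static void
base_view_init(struct base_view *base, int format)
{
   base->format = format;
}

/* Zero the whole object first, then run the base init. Doing the full memset
 * after base_view_init() would wipe what the base init just wrote.
 */
static void
view_init(struct view *v, int format)
{
   memset(v, 0, sizeof(*v));
   base_view_init(&v->vk, format);
}
```

Zeroing only `descriptor` after the base init, as the old side of the hunk did, would leave any member that is not explicitly assigned later with whatever the caller's storage held.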

@ -1662,7 +1662,7 @@ radv_graphics_shaders_link_varyings(struct radv_shader_stage *stages, enum amd_g
/* Scalarize all I/O, because nir_opt_varyings and nir_opt_vectorize_io expect all I/O to be scalarized. */ /* Scalarize all I/O, because nir_opt_varyings and nir_opt_vectorize_io expect all I/O to be scalarized. */
nir_variable_mode sca_mode = nir_var_shader_in; nir_variable_mode sca_mode = nir_var_shader_in;
bool sca_progress; bool sca_progress = false;
if (s != MESA_SHADER_FRAGMENT) if (s != MESA_SHADER_FRAGMENT)
sca_mode |= nir_var_shader_out; sca_mode |= nir_var_shader_out;

@ -409,16 +409,16 @@ radv_rt_nir_to_asm(struct radv_device *device, struct vk_pipeline_cache *cache,
stage->info.inline_push_constant_mask = stage->args.ac.inline_push_const_mask; stage->info.inline_push_constant_mask = stage->args.ac.inline_push_const_mask;
stage->info.type = radv_is_traversal_shader(stage->nir) ? RADV_SHADER_TYPE_RT_TRAVERSAL : RADV_SHADER_TYPE_DEFAULT; stage->info.type = radv_is_traversal_shader(stage->nir) ? RADV_SHADER_TYPE_RT_TRAVERSAL : RADV_SHADER_TYPE_DEFAULT;
/* Move ray tracing system values to the top that are set by rt_trace_ray
* to prevent them from being overwritten by other rt_trace_ray calls.
*/
NIR_PASS(_, stage->nir, move_rt_instructions);
uint32_t num_resume_shaders = 0; uint32_t num_resume_shaders = 0;
nir_shader **resume_shaders = NULL; nir_shader **resume_shaders = NULL;
void *mem_ctx = ralloc_context(NULL); void *mem_ctx = ralloc_context(NULL);
if (stage->stage != MESA_SHADER_INTERSECTION && mode == RADV_RT_LOWERING_MODE_CPS) { if (stage->stage != MESA_SHADER_INTERSECTION && mode == RADV_RT_LOWERING_MODE_CPS) {
/* Move ray tracing system values to the top that are set by rt_trace_ray
* to prevent them from being overwritten by other rt_trace_ray calls.
*/
NIR_PASS(_, stage->nir, move_rt_instructions);
nir_builder b = nir_builder_at(nir_after_impl(nir_shader_get_entrypoint(stage->nir))); nir_builder b = nir_builder_at(nir_after_impl(nir_shader_get_entrypoint(stage->nir)));
nir_rt_return_amd(&b); nir_rt_return_amd(&b);
@ -541,6 +541,7 @@ radv_rt_nir_to_asm(struct radv_device *device, struct vk_pipeline_cache *cache,
if (dump_shader) if (dump_shader)
simple_mtx_unlock(&instance->shader_dump_mtx); simple_mtx_unlock(&instance->shader_dump_mtx);
ralloc_free(mem_ctx);
free(binary); free(binary);
*out_shader = shader; *out_shader = shader;
@ -674,7 +675,7 @@ radv_rt_compile_shaders(struct radv_device *device, struct vk_pipeline_cache *ca
bool can_use_monolithic = !library && pipeline->stage_count < 50; bool can_use_monolithic = !library && pipeline->stage_count < 50;
for (uint32_t i = 0; i < pCreateInfo->stageCount; i++) { for (uint32_t i = 0; i < pCreateInfo->stageCount; i++) {
if (rt_stages[i].shader || rt_stages[i].nir) if (rt_stages[i].nir)
continue; continue;
int64_t stage_start = os_time_get_nano(); int64_t stage_start = os_time_get_nano();
@ -749,7 +750,7 @@ radv_rt_compile_shaders(struct radv_device *device, struct vk_pipeline_cache *ca
inline_any_hit_shaders |= raygen_lowering_mode == RADV_RT_LOWERING_MODE_MONOLITHIC && !raygen_imported; inline_any_hit_shaders |= raygen_lowering_mode == RADV_RT_LOWERING_MODE_MONOLITHIC && !raygen_imported;
for (uint32_t idx = 0; idx < pCreateInfo->stageCount; idx++) { for (uint32_t idx = 0; idx < pCreateInfo->stageCount; idx++) {
if (rt_stages[idx].shader || rt_stages[idx].nir) if (rt_stages[idx].nir)
continue; continue;
int64_t stage_start = os_time_get_nano(); int64_t stage_start = os_time_get_nano();
@ -1462,17 +1463,39 @@ radv_GetRayTracingShaderGroupStackSizeKHR(VkDevice device, VkPipeline _pipeline,
VK_FROM_HANDLE(radv_pipeline, pipeline, _pipeline); VK_FROM_HANDLE(radv_pipeline, pipeline, _pipeline);
struct radv_ray_tracing_pipeline *rt_pipeline = radv_pipeline_to_ray_tracing(pipeline); struct radv_ray_tracing_pipeline *rt_pipeline = radv_pipeline_to_ray_tracing(pipeline);
struct radv_ray_tracing_group *rt_group = &rt_pipeline->groups[group]; struct radv_ray_tracing_group *rt_group = &rt_pipeline->groups[group];
struct radv_ray_tracing_stage *shader_stage;
switch (groupShader) { switch (groupShader) {
case VK_SHADER_GROUP_SHADER_GENERAL_KHR: case VK_SHADER_GROUP_SHADER_GENERAL_KHR:
case VK_SHADER_GROUP_SHADER_CLOSEST_HIT_KHR: case VK_SHADER_GROUP_SHADER_CLOSEST_HIT_KHR:
return rt_pipeline->stages[rt_group->recursive_shader].stack_size; shader_stage = &rt_pipeline->stages[rt_group->recursive_shader];
break;
case VK_SHADER_GROUP_SHADER_ANY_HIT_KHR: case VK_SHADER_GROUP_SHADER_ANY_HIT_KHR:
return rt_pipeline->stages[rt_group->any_hit_shader].stack_size; /* If the any-hit shader is inlined into an intersection shader, there is no stack specific to the any-hit shader
* and all stack will be allocated for the intersection shader instead.
*/
if (rt_group->intersection_shader != VK_SHADER_UNUSED_KHR)
return 0;
shader_stage = &rt_pipeline->stages[rt_group->any_hit_shader];
break;
case VK_SHADER_GROUP_SHADER_INTERSECTION_KHR: case VK_SHADER_GROUP_SHADER_INTERSECTION_KHR:
return rt_pipeline->stages[rt_group->intersection_shader].stack_size; shader_stage = &rt_pipeline->stages[rt_group->intersection_shader];
break;
default: default:
return 0; return 0;
} }
uint32_t stack_size = shader_stage->stack_size;
/* Applications need to allocate stack for the traversal shader, too. The API doesn't intend for a constant
* traversal stack size, so add the stack size to every shader potentially called by the traversal shader.
* Applications are expected to max() shader stages together, so this shouldn't result in any unnecessary stack
* usage.
*/
if (shader_stage->stage == MESA_SHADER_CLOSEST_HIT || shader_stage->stage == MESA_SHADER_ANY_HIT ||
shader_stage->stage == MESA_SHADER_INTERSECTION || shader_stage->stage == MESA_SHADER_MISS)
stack_size += rt_pipeline->traversal_stack_size;
return stack_size;
} }
VKAPI_ATTR VkResult VKAPI_CALL VKAPI_ATTR VkResult VKAPI_CALL
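The stack-size rule introduced in the hunk above can be sketched in isolation. This is a minimal illustration, not the driver code: `sketch_stage_kind`, `rt_stage_sketch` and `reported_stack_size` are hypothetical stand-ins for the RADV types, and the any-hit early return for groups that also have an intersection shader is left out.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the shader stage enum and per-stage data. */
enum sketch_stage_kind {
   SKETCH_STAGE_RAYGEN,
   SKETCH_STAGE_CLOSEST_HIT,
   SKETCH_STAGE_ANY_HIT,
   SKETCH_STAGE_INTERSECTION,
   SKETCH_STAGE_MISS,
   SKETCH_STAGE_CALLABLE,
};

struct rt_stage_sketch {
   enum sketch_stage_kind kind;
   uint32_t stack_size;
};

/* Stages that may be called from the traversal shader also report the
 * traversal stack, so an application that takes the max() over stages
 * ends up reserving room for traversal as well.
 */
static uint32_t
reported_stack_size(const struct rt_stage_sketch *stage, uint32_t traversal_stack_size)
{
   uint32_t size = stage->stack_size;

   switch (stage->kind) {
   case SKETCH_STAGE_CLOSEST_HIT:
   case SKETCH_STAGE_ANY_HIT:
   case SKETCH_STAGE_INTERSECTION:
   case SKETCH_STAGE_MISS:
      size += traversal_stack_size;
      break;
   default:
      break;
   }

   return size;
}
```

With a traversal stack of 128 bytes, a miss shader that needs 32 bytes would report 160, while a raygen shader that needs 64 bytes would still report 64.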


@ -790,7 +790,7 @@ rra_map_accel_struct_data(struct rra_copy_context *ctx, uint32_t i)
if (radv_GetEventStatus(ctx->device, data->build_event) != VK_EVENT_SET) if (radv_GetEventStatus(ctx->device, data->build_event) != VK_EVENT_SET)
return NULL; return NULL;
if (data->buffer->memory) { if (data->buffer && data->buffer->memory) {
VkMemoryMapInfo memory_map_info = { VkMemoryMapInfo memory_map_info = {
.sType = VK_STRUCTURE_TYPE_MEMORY_MAP_INFO, .sType = VK_STRUCTURE_TYPE_MEMORY_MAP_INFO,
.memory = data->buffer->memory, .memory = data->buffer->memory,


@ -216,6 +216,7 @@ radv_sdma_get_surf(const struct radv_device *const device, const struct radv_ima
.texel_scale = radv_sdma_get_texel_scale(image), .texel_scale = radv_sdma_get_texel_scale(image),
.is_linear = surf->is_linear, .is_linear = surf->is_linear,
.is_3d = surf->u.gfx9.resource_type == RADEON_RESOURCE_3D, .is_3d = surf->u.gfx9.resource_type == RADEON_RESOURCE_3D,
.is_stencil = subresource.aspectMask == VK_IMAGE_ASPECT_STENCIL_BIT,
}; };
const uint64_t surf_offset = (subresource.aspectMask == VK_IMAGE_ASPECT_STENCIL_BIT) ? surf->u.gfx9.zs.stencil_offset const uint64_t surf_offset = (subresource.aspectMask == VK_IMAGE_ASPECT_STENCIL_BIT) ? surf->u.gfx9.zs.stencil_offset
@ -371,6 +372,7 @@ radv_sdma_emit_copy_tiled_sub_window(const struct radv_device *device, struct ra
.va = tiled->va, .va = tiled->va,
.format = radv_format_to_pipe_format(tiled->aspect_format), .format = radv_format_to_pipe_format(tiled->aspect_format),
.bpp = tiled->bpp, .bpp = tiled->bpp,
.is_stencil = tiled->is_stencil,
.offset = .offset =
{ {
.x = tiled_off.x, .x = tiled_off.x,
@ -414,6 +416,7 @@ radv_sdma_emit_copy_t2t_sub_window(const struct radv_device *device, struct radv
.va = src->va, .va = src->va,
.format = radv_format_to_pipe_format(src->aspect_format), .format = radv_format_to_pipe_format(src->aspect_format),
.bpp = src->bpp, .bpp = src->bpp,
.is_stencil = src->is_stencil,
.offset = .offset =
{ {
.x = src_off.x, .x = src_off.x,
@ -439,6 +442,7 @@ radv_sdma_emit_copy_t2t_sub_window(const struct radv_device *device, struct radv
.va = dst->va, .va = dst->va,
.format = radv_format_to_pipe_format(dst->aspect_format), .format = radv_format_to_pipe_format(dst->aspect_format),
.bpp = dst->bpp, .bpp = dst->bpp,
.is_stencil = dst->is_stencil,
.offset = .offset =
{ {
.x = dst_off.x, .x = dst_off.x,
@ -606,12 +610,6 @@ radv_sdma_use_t2t_scanline_copy(const struct radv_device *device, const struct r
return true; return true;
} }
/* The two images can have a different block size,
* but must have the same swizzle mode.
*/
if (src->micro_tile_mode != dst->micro_tile_mode)
return true;
/* The T2T subwindow copy packet only has fields for one metadata configuration. /* The T2T subwindow copy packet only has fields for one metadata configuration.
* It can either compress or decompress, or copy uncompressed images, but it * It can either compress or decompress, or copy uncompressed images, but it
* can't copy from a compressed image to another. * can't copy from a compressed image to another.
@ -619,6 +617,16 @@ radv_sdma_use_t2t_scanline_copy(const struct radv_device *device, const struct r
if (src->is_compressed && dst->is_compressed) if (src->is_compressed && dst->is_compressed)
return true; return true;
if (ver >= SDMA_7_0) {
/* No support for tiling format transformation at all. */
if (src->surf->u.gfx9.swizzle_mode != dst->surf->u.gfx9.swizzle_mode)
return true;
} else {
/* The two images can have a different block size, but must have the same swizzle mode. */
if (src->micro_tile_mode != dst->micro_tile_mode)
return true;
}
const bool needs_3d_alignment = src->is_3d && (src->micro_tile_mode == RADEON_MICRO_MODE_DISPLAY || const bool needs_3d_alignment = src->is_3d && (src->micro_tile_mode == RADEON_MICRO_MODE_DISPLAY ||
src->micro_tile_mode == RADEON_MICRO_MODE_STANDARD); src->micro_tile_mode == RADEON_MICRO_MODE_STANDARD);
const unsigned log2bpp = util_logbase2(src->bpp); const unsigned log2bpp = util_logbase2(src->bpp);


@ -31,6 +31,7 @@ struct radv_sdma_surf {
uint8_t texel_scale; /* Texel scale for 96-bit formats */ uint8_t texel_scale; /* Texel scale for 96-bit formats */
bool is_linear; /* Whether the image is linear. */ bool is_linear; /* Whether the image is linear. */
bool is_3d; /* Whether the image is 3-dimensional. */ bool is_3d; /* Whether the image is 3-dimensional. */
bool is_stencil; /* Whether the image is stencil only. */
union { union {
/* linear images only */ /* linear images only */


@ -655,15 +655,24 @@ radv_shader_spirv_to_nir(struct radv_device *device, const struct radv_shader_st
NIR_PASS(_, nir, nir_lower_compute_system_values, &csv_options); NIR_PASS(_, nir, nir_lower_compute_system_values, &csv_options);
} }
bool lower_local_invocation_index = false;
if (nir->info.derivative_group == DERIVATIVE_GROUP_QUADS &&
((nir->info.stage == MESA_SHADER_COMPUTE || nir->info.stage == MESA_SHADER_TASK ||
(nir->info.stage == MESA_SHADER_MESH && pdev->info.mesh_fast_launch_2)))) {
lower_local_invocation_index = true;
} else if (nir->info.stage == MESA_SHADER_COMPUTE &&
(((nir->info.workgroup_size[0] == 1) + (nir->info.workgroup_size[1] == 1) +
(nir->info.workgroup_size[2] == 1)) == 2)) {
lower_local_invocation_index = true;
}
nir_lower_compute_system_values_options csv_options = { nir_lower_compute_system_values_options csv_options = {
/* Mesh shaders run as NGG which can implement local_invocation_index from /* Mesh shaders run as NGG which can implement local_invocation_index from
* the wave ID in merged_wave_info, but they don't have local_invocation_ids on GFX10.3. * the wave ID in merged_wave_info, but they don't have local_invocation_ids on GFX10.3.
*/ */
.lower_cs_local_id_to_index = nir->info.stage == MESA_SHADER_MESH && !pdev->info.mesh_fast_launch_2, .lower_cs_local_id_to_index = nir->info.stage == MESA_SHADER_MESH && !pdev->info.mesh_fast_launch_2,
.lower_local_invocation_index = nir->info.stage == MESA_SHADER_COMPUTE && .lower_local_invocation_index = lower_local_invocation_index,
((((nir->info.workgroup_size[0] == 1) + (nir->info.workgroup_size[1] == 1) +
(nir->info.workgroup_size[2] == 1)) == 2) ||
nir->info.derivative_group == DERIVATIVE_GROUP_QUADS),
}; };
NIR_PASS(_, nir, nir_lower_compute_system_values, &csv_options); NIR_PASS(_, nir, nir_lower_compute_system_values, &csv_options);
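The workgroup-size test in the hunk above sums three boolean comparisons and checks for exactly two. A small sketch of that idiom, with the hypothetical helper name `exactly_two_dims_are_one`:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when exactly two of the three workgroup dimensions equal 1, i.e. the
 * workgroup is effectively one-dimensional. Each comparison evaluates to
 * 0 or 1 in C, so the sum counts the dimensions of size 1.
 */
static bool
exactly_two_dims_are_one(const uint16_t size[3])
{
   return ((size[0] == 1) + (size[1] == 1) + (size[2] == 1)) == 2;
}
```

Note that a 1x1x1 workgroup does not match, because all three comparisons are true and the sum is 3.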


@ -950,8 +950,8 @@ radv_GetPhysicalDeviceVideoCapabilitiesKHR(VkPhysicalDevice physicalDevice, cons
struct VkVideoDecodeH265CapabilitiesKHR *ext = struct VkVideoDecodeH265CapabilitiesKHR *ext =
vk_find_struct(pCapabilities->pNext, VIDEO_DECODE_H265_CAPABILITIES_KHR); vk_find_struct(pCapabilities->pNext, VIDEO_DECODE_H265_CAPABILITIES_KHR);
pCapabilities->maxDpbSlots = RADV_VIDEO_H264_MAX_DPB_SLOTS; pCapabilities->maxDpbSlots = RADV_VIDEO_H265_MAX_DPB_SLOTS;
pCapabilities->maxActiveReferencePictures = RADV_VIDEO_H264_MAX_NUM_REF_FRAME; pCapabilities->maxActiveReferencePictures = RADV_VIDEO_H265_MAX_NUM_REF_FRAME;
/* for h265 on navi21+ separate dpb images should work */ /* for h265 on navi21+ separate dpb images should work */
if (radv_enable_tier2(pdev)) if (radv_enable_tier2(pdev))
pCapabilities->flags |= VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR; pCapabilities->flags |= VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR;
@ -1120,7 +1120,7 @@ radv_GetPhysicalDeviceVideoCapabilitiesKHR(VkPhysicalDevice physicalDevice, cons
enc_caps->encodeInputPictureGranularity = pCapabilities->pictureAccessGranularity; enc_caps->encodeInputPictureGranularity = pCapabilities->pictureAccessGranularity;
ext->maxTiles.width = 2; ext->maxTiles.width = 2;
ext->maxTiles.height = 16; ext->maxTiles.height = 16;
ext->minTileSize.width = 64; ext->minTileSize.width = pdev->enc_hw_ver >= RADV_VIDEO_ENC_HW_5 ? 256 : 128;
ext->minTileSize.height = 64; ext->minTileSize.height = 64;
ext->maxTileSize.width = 4096; ext->maxTileSize.width = 4096;
ext->maxTileSize.height = 4096; ext->maxTileSize.height = 4096;
@ -2320,22 +2320,6 @@ get_av1_msg(struct radv_device *device, struct radv_video_session *vid, struct v
result.tx_mode = pi->TxMode; result.tx_mode = pi->TxMode;
result.reference_mode = (pi->flags.reference_select == 1) ? 2 : 0; result.reference_mode = (pi->flags.reference_select == 1) ? 2 : 0;
if (pi->pTileInfo) {
result.tile_cols = pi->pTileInfo->TileCols;
result.tile_rows = pi->pTileInfo->TileRows;
result.tile_size_bytes = pi->pTileInfo->tile_size_bytes_minus_1;
result.context_update_tile_id = pi->pTileInfo->context_update_tile_id;
for (i = 0; i < result.tile_cols; i++)
result.tile_col_start_sb[i] = pi->pTileInfo->pMiColStarts[i];
result.tile_col_start_sb[result.tile_cols] =
result.tile_col_start_sb[result.tile_cols - 1] + pi->pTileInfo->pWidthInSbsMinus1[result.tile_cols - 1] + 1;
for (i = 0; i < pi->pTileInfo->TileRows; i++)
result.tile_row_start_sb[i] = pi->pTileInfo->pMiRowStarts[i];
result.tile_row_start_sb[result.tile_rows] =
result.tile_row_start_sb[result.tile_rows - 1] + pi->pTileInfo->pHeightInSbsMinus1[result.tile_rows - 1] + 1;
}
result.max_width = seq_hdr->max_frame_width_minus_1 + 1; result.max_width = seq_hdr->max_frame_width_minus_1 + 1;
result.max_height = seq_hdr->max_frame_height_minus_1 + 1; result.max_height = seq_hdr->max_frame_height_minus_1 + 1;
VkExtent2D frameExtent = frame_info->dstPictureResource.codedExtent; VkExtent2D frameExtent = frame_info->dstPictureResource.codedExtent;
@ -2351,6 +2335,44 @@ get_av1_msg(struct radv_device *device, struct radv_video_session *vid, struct v
result.superres_upscaled_width = frameExtent.width; result.superres_upscaled_width = frameExtent.width;
if (pi->pTileInfo) {
result.tile_cols = pi->pTileInfo->TileCols;
result.tile_rows = pi->pTileInfo->TileRows;
result.tile_size_bytes = pi->pTileInfo->tile_size_bytes_minus_1;
result.context_update_tile_id = pi->pTileInfo->context_update_tile_id;
/* pMi{Row,Col}Starts is unreliable, some apps send SB, some send MI, so use
* p{Width,Height}InSbsMinus1 instead. But for uniform_tile_spacing_flag,
* those are not defined by spec. */
if (pi->pTileInfo->flags.uniform_tile_spacing_flag) {
const unsigned sb_size = seq_hdr->flags.use_128x128_superblock ? 128 : 64;
const unsigned sb_width = DIV_ROUND_UP(result.width, sb_size);
const unsigned sb_height = DIV_ROUND_UP(result.height, sb_size);
const unsigned tile_width_sb = DIV_ROUND_UP(sb_width, result.tile_cols);
const unsigned tile_height_sb = DIV_ROUND_UP(sb_height, result.tile_rows);
result.tile_col_start_sb[0] = 0;
for (i = 1; i < result.tile_cols; ++i)
result.tile_col_start_sb[i] = result.tile_col_start_sb[i - 1] + tile_width_sb;
result.tile_col_start_sb[i] = sb_width;
result.tile_row_start_sb[0] = 0;
for (i = 1; i < result.tile_rows; ++i)
result.tile_row_start_sb[i] = result.tile_row_start_sb[i - 1] + tile_height_sb;
result.tile_row_start_sb[i] = sb_height;
} else {
result.tile_col_start_sb[0] = 0;
assert(pi->pTileInfo->pMiColStarts[0] == 0);
for (i = 0; i < result.tile_cols; ++i)
result.tile_col_start_sb[i + 1] = result.tile_col_start_sb[i] + pi->pTileInfo->pWidthInSbsMinus1[i] + 1;
result.tile_row_start_sb[0] = 0;
assert(pi->pTileInfo->pMiRowStarts[0] == 0);
for (i = 0; i < result.tile_rows; ++i)
result.tile_row_start_sb[i + 1] = result.tile_row_start_sb[i] + pi->pTileInfo->pHeightInSbsMinus1[i] + 1;
}
}
result.order_hint_bits = seq_hdr->order_hint_bits_minus_1 + 1; result.order_hint_bits = seq_hdr->order_hint_bits_minus_1 + 1;
/* The VCN FW will evict references that aren't specified in /* The VCN FW will evict references that aren't specified in
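The two ways of deriving tile start positions in the AV1 hunk above can be sketched as standalone helpers. This mirrors the arithmetic shown in the diff under stated assumptions: `uniform_tile_starts` and `explicit_tile_starts` are hypothetical names, `tiles` is assumed to be at least 1, and `starts` is assumed to have room for `tiles + 1` entries.

```c
#include <stdint.h>

#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Uniform spacing: every tile gets the same size in superblocks (rounded up),
 * and the final entry is clamped to the frame size in superblocks.
 */
static void
uniform_tile_starts(unsigned frame_px, unsigned sb_size, unsigned tiles, uint32_t *starts)
{
   const unsigned frame_sb = DIV_ROUND_UP(frame_px, sb_size);
   const unsigned tile_sb = DIV_ROUND_UP(frame_sb, tiles);
   unsigned i;

   starts[0] = 0;
   for (i = 1; i < tiles; ++i)
      starts[i] = starts[i - 1] + tile_sb;
   starts[i] = frame_sb;
}

/* Explicit spacing: a running sum over the per-tile "size in superblocks
 * minus 1" values, so no start offsets from the application are needed.
 */
static void
explicit_tile_starts(const uint16_t *size_in_sbs_minus1, unsigned tiles, uint32_t *starts)
{
   starts[0] = 0;
   for (unsigned i = 0; i < tiles; ++i)
      starts[i + 1] = starts[i] + size_in_sbs_minus1[i] + 1;
}
```

For a 1920-pixel-wide frame with 64x64 superblocks and 4 tile columns, the uniform helper yields starts of 0, 8, 16, 24 and a final boundary of 30.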


@ -1095,7 +1095,7 @@ radv_enc_slice_header(struct radv_cmd_buffer *cmd_buffer, const VkVideoEncodeInf
radv_enc_code_ue(cmd_buffer, 6); radv_enc_code_ue(cmd_buffer, 6);
break; break;
} }
radv_enc_code_ue(cmd_buffer, 0x0); radv_enc_code_ue(cmd_buffer, pic->pic_parameter_set_id);
unsigned int max_frame_num_bits = sps->log2_max_frame_num_minus4 + 4; unsigned int max_frame_num_bits = sps->log2_max_frame_num_minus4 + 4;
radv_enc_code_fixed_bits(cmd_buffer, pic->frame_num % (1 << max_frame_num_bits), max_frame_num_bits); radv_enc_code_fixed_bits(cmd_buffer, pic->frame_num % (1 << max_frame_num_bits), max_frame_num_bits);


@ -759,6 +759,7 @@ error_va_map:
ac_drm_bo_free(ws->dev, buf_handle); ac_drm_bo_free(ws->dev, buf_handle);
error_bo_alloc: error_bo_alloc:
ac_drm_va_range_free(va_handle);
free(ranges); free(ranges);
error_va_alloc: error_va_alloc:


@ -376,13 +376,15 @@ hk_bind_descriptor_sets(UNUSED struct hk_cmd_buffer *cmd,
* *
* This means that, if some earlier set gets bound in such a way that * This means that, if some earlier set gets bound in such a way that
* it changes set_dynamic_buffer_start[s], this binding is implicitly * it changes set_dynamic_buffer_start[s], this binding is implicitly
* invalidated. Therefore, we can always look at the current value * invalidated.
* of set_dynamic_buffer_start[s] as the base of our dynamic buffer
* range and it's only our responsibility to adjust all
* set_dynamic_buffer_start[p] for p > s as needed.
*/ */
uint8_t dyn_buffer_start = uint8_t dyn_buffer_start = 0u;
desc->root.set_dynamic_buffer_start[info->firstSet]; for (uint32_t i = 0u; i < info->firstSet; ++i) {
const struct hk_descriptor_set_layout *set_layout =
vk_to_hk_descriptor_set_layout(pipeline_layout->set_layouts[i]);
if (set_layout)
dyn_buffer_start += set_layout->dynamic_buffer_count;
}
uint32_t next_dyn_offset = 0; uint32_t next_dyn_offset = 0;
for (uint32_t i = 0; i < info->descriptorSetCount; ++i) { for (uint32_t i = 0; i < info->descriptorSetCount; ++i) {
@ -427,10 +429,6 @@ hk_bind_descriptor_sets(UNUSED struct hk_cmd_buffer *cmd,
assert(dyn_buffer_start <= HK_MAX_DYNAMIC_BUFFERS); assert(dyn_buffer_start <= HK_MAX_DYNAMIC_BUFFERS);
assert(next_dyn_offset <= info->dynamicOffsetCount); assert(next_dyn_offset <= info->dynamicOffsetCount);
for (uint32_t s = info->firstSet + info->descriptorSetCount; s < HK_MAX_SETS;
s++)
desc->root.set_dynamic_buffer_start[s] = dyn_buffer_start;
desc->root_dirty = true; desc->root_dirty = true;
} }
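The new computation of `dyn_buffer_start` in the hunk above is a prefix sum over the layouts of the sets that precede `firstSet`. A minimal sketch, assuming the per-set counts are available as a plain array (the helper name and the array are hypothetical; a set without a layout is represented here by a count of 0):

```c
#include <stdint.h>

/* Index of the first dynamic buffer belonging to set `first_set`: the sum of
 * the dynamic buffer counts of all earlier sets in the pipeline layout.
 */
static uint8_t
dyn_buffer_start_for_set(const uint8_t *dynamic_buffer_counts, uint32_t first_set)
{
   uint8_t start = 0;

   for (uint32_t i = 0; i < first_set; ++i)
      start += dynamic_buffer_counts[i];

   return start;
}
```

Because the result depends only on the pipeline layout, it does not rely on state left behind by earlier bind calls, which appears to be the point of dropping the cached `set_dynamic_buffer_start` lookup.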


@ -3212,6 +3212,9 @@ hk_handle_passthrough_gs(struct hk_cmd_buffer *cmd, struct agx_draw draw)
struct hk_graphics_state *gfx = &cmd->state.gfx; struct hk_graphics_state *gfx = &cmd->state.gfx;
struct hk_api_shader *gs = gfx->shaders[MESA_SHADER_GEOMETRY]; struct hk_api_shader *gs = gfx->shaders[MESA_SHADER_GEOMETRY];
if (!IS_SHADER_DIRTY(VERTEX) && !IS_SHADER_DIRTY(GEOMETRY))
return;
/* If there's an application geometry shader, there's nothing to un/bind */ /* If there's an application geometry shader, there's nothing to un/bind */
if (gs && !gs->is_passthrough) if (gs && !gs->is_passthrough)
return; return;
@ -3221,20 +3224,17 @@ hk_handle_passthrough_gs(struct hk_cmd_buffer *cmd, struct agx_draw draw)
uint32_t xfb_outputs = last_sw->info.xfb_info.output_count; uint32_t xfb_outputs = last_sw->info.xfb_info.output_count;
bool needs_gs = xfb_outputs; bool needs_gs = xfb_outputs;
/* If we already have a matching GS configuration, we're done */
if ((gs != NULL) == needs_gs)
return;
/* If we don't need a GS but we do have a passthrough, unbind it */ /* If we don't need a GS but we do have a passthrough, unbind it */
if (gs) { if (!needs_gs) {
assert(!needs_gs && gs->is_passthrough); if (gs != NULL) {
assert(gs->is_passthrough);
hk_cmd_bind_graphics_shader(cmd, MESA_SHADER_GEOMETRY, NULL); hk_cmd_bind_graphics_shader(cmd, MESA_SHADER_GEOMETRY, NULL);
}
return; return;
} }
/* Else, we need to bind a passthrough GS */ /* Else, we need to bind a passthrough GS */
size_t key_size = size_t key_size = hk_passthrough_gs_key_size(xfb_outputs);
sizeof(struct hk_passthrough_gs_key) + nir_xfb_info_size(xfb_outputs);
struct hk_passthrough_gs_key *key = alloca(key_size); struct hk_passthrough_gs_key *key = alloca(key_size);
*key = (struct hk_passthrough_gs_key){ *key = (struct hk_passthrough_gs_key){


@ -1493,7 +1493,12 @@ hk_CmdFillBuffer(VkCommandBuffer commandBuffer, VkBuffer dstBuffer,
uint64_t addr = uint64_t addr =
vk_meta_buffer_address(&dev->vk, dstBuffer, dstOffset, dstRange); vk_meta_buffer_address(&dev->vk, dstBuffer, dstOffset, dstRange);
if (util_is_aligned(addr, 16) && util_is_aligned(range, 16)) {
libagx_fill_uint4(cmd, agx_2d(range / 16, 1), AGX_BARRIER_ALL,
addr, 0, data, data, data, data);
} else {
libagx_fill(cmd, agx_1d(range / 4), AGX_BARRIER_ALL, addr, data); libagx_fill(cmd, agx_1d(range / 4), AGX_BARRIER_ALL, addr, data);
}
} }
VKAPI_ATTR void VKAPI_CALL VKAPI_ATTR void VKAPI_CALL


@ -725,7 +725,7 @@ hk_get_device_properties(const struct agx_device *dev,
.maxFragmentInputComponents = max_vgt_output_components, .maxFragmentInputComponents = max_vgt_output_components,
.maxFragmentOutputAttachments = HK_MAX_RTS, .maxFragmentOutputAttachments = HK_MAX_RTS,
.maxFragmentDualSrcAttachments = 1, .maxFragmentDualSrcAttachments = 1,
.maxFragmentCombinedOutputResources = 16, .maxFragmentCombinedOutputResources = HK_MAX_RTS + HK_MAX_DESCRIPTORS,
.maxComputeSharedMemorySize = HK_MAX_SHARED_SIZE, .maxComputeSharedMemorySize = HK_MAX_SHARED_SIZE,
.maxComputeWorkGroupCount = {0x7fffffff, 65535, 65535}, .maxComputeWorkGroupCount = {0x7fffffff, 65535, 65535},
.maxComputeWorkGroupInvocations = 1024, .maxComputeWorkGroupInvocations = 1024,


@ -387,8 +387,16 @@ struct hk_passthrough_gs_key {
/* Decomposed primitive */ /* Decomposed primitive */
enum mesa_prim prim; enum mesa_prim prim;
/* Transform feedback info. Must add nir_xfb_info_size to get the key size */ /* Transform feedback info. Must use hk_passthrough_gs_key_size to get the
* key size */
nir_xfb_info xfb_info; nir_xfb_info xfb_info;
}; };
static inline size_t
hk_passthrough_gs_key_size(uint16_t output_count)
{
return (sizeof(struct hk_passthrough_gs_key) - sizeof(nir_xfb_info)) +
nir_xfb_info_size(output_count);
}
void hk_nir_passthrough_gs(struct nir_builder *b, const void *key_); void hk_nir_passthrough_gs(struct nir_builder *b, const void *key_);
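The size helper added above avoids counting the fixed part of the trailing structure twice. The sketch below reproduces the arithmetic with hypothetical stand-in types (`tail_header`, `tail_item`, `sketch_key`); it does not use the real `nir_xfb_info` layout and models the variable-length part as items that follow the header in memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the fixed part of a variable-length trailing structure. */
struct tail_header {
   uint16_t count;
   uint16_t pad;
};

/* Stand-in for one variable-length entry that follows the header. */
struct tail_item {
   uint32_t value;
};

/* Total size of header plus n items, analogous to nir_xfb_info_size(). */
static size_t
tail_size(uint16_t n)
{
   return sizeof(struct tail_header) + (size_t)n * sizeof(struct tail_item);
}

/* Key whose last member is the trailing structure's header. */
struct sketch_key {
   uint32_t prim;
   struct tail_header tail; /* items follow in memory */
};

/* sizeof(struct sketch_key) already includes the header, so subtract it
 * before adding the full tail size. Adding tail_size(n) directly to
 * sizeof(struct sketch_key) would count the header twice.
 */
static size_t
sketch_key_size(uint16_t n)
{
   return (sizeof(struct sketch_key) - sizeof(struct tail_header)) + tail_size(n);
}
```

With zero items the result equals `sizeof(struct sketch_key)`, whereas the older style of expression would overshoot by `sizeof(struct tail_header)`.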


@ -765,9 +765,6 @@ spec@glsl-1.10@execution@glsl-vs-inline-explosion,Crash
# stipple # stipple
spec@!opengl 1.0@gl-1.0-no-op-paths,Fail spec@!opengl 1.0@gl-1.0-no-op-paths,Fail
# Bisected to b3133e250e1 ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
spec@!opengl 1.1@longprim,Crash
# fails on arm64, passes on armhf # fails on arm64, passes on armhf
spec@arb_depth_buffer_float@depthstencil-render-miplevels 1024 s=z24_s8_d=z32f,Fail spec@arb_depth_buffer_float@depthstencil-render-miplevels 1024 s=z24_s8_d=z32f,Fail
@ -853,7 +850,6 @@ spec@!opengl 1.1@polygon-mode-offset@config 6: Expected blue pixel in center,Fai
spec@!opengl 1.1@polygon-mode-offset@config 6: Expected white pixel on right edge,Fail spec@!opengl 1.1@polygon-mode-offset@config 6: Expected white pixel on right edge,Fail
spec@!opengl 1.1@polygon-mode-offset@config 6: Expected white pixel on top edge,Fail spec@!opengl 1.1@polygon-mode-offset@config 6: Expected white pixel on top edge,Fail
spec@!opengl 1.1@texsubimage-unpack,Fail
spec@!opengl 1.1@texwrap 2d proj,Fail spec@!opengl 1.1@texwrap 2d proj,Fail
spec@!opengl 1.1@texwrap 2d proj@GL_RGBA8- NPOT- projected,Fail spec@!opengl 1.1@texwrap 2d proj@GL_RGBA8- NPOT- projected,Fail
spec@!opengl 1.1@texwrap 2d proj@GL_RGBA8- projected,Fail spec@!opengl 1.1@texwrap 2d proj@GL_RGBA8- projected,Fail
@ -953,7 +949,6 @@ spec@arb_occlusion_query@occlusion_query_conform,Fail
spec@arb_occlusion_query@occlusion_query_conform@GetObjivAval_multi2,Fail spec@arb_occlusion_query@occlusion_query_conform@GetObjivAval_multi2,Fail
spec@arb_pixel_buffer_object@fbo-pbo-readpixels-small,Fail spec@arb_pixel_buffer_object@fbo-pbo-readpixels-small,Fail
spec@arb_pixel_buffer_object@pbo-getteximage,Fail spec@arb_pixel_buffer_object@pbo-getteximage,Fail
spec@arb_pixel_buffer_object@texsubimage-unpack pbo,Fail
spec@arb_point_sprite@arb_point_sprite-mipmap,Fail spec@arb_point_sprite@arb_point_sprite-mipmap,Fail
spec@arb_provoking_vertex@arb-provoking-vertex-render,Fail spec@arb_provoking_vertex@arb-provoking-vertex-render,Fail
spec@arb_sampler_objects@sampler-objects,Fail spec@arb_sampler_objects@sampler-objects,Fail


@ -861,93 +861,6 @@ ubsan-dEQP-VK.image.mutable.2d_array.r16g16b16a16_sfloat_r16g16b16a16_uint_draw_
ubsan-dEQP-VK.image.mutable.2d_array.r32_uint_r8g8b8a8_sint_draw_copy_resolve_mutable_color_att,Fail ubsan-dEQP-VK.image.mutable.2d_array.r32_uint_r8g8b8a8_sint_draw_copy_resolve_mutable_color_att,Fail
ubsan-dEQP-VK.pipeline.monolithic.logic_op_na_formats.r16g16_sfloat.nand_blend,Fail ubsan-dEQP-VK.pipeline.monolithic.logic_op_na_formats.r16g16_sfloat.nand_blend,Fail
# New failures with ES CTS 3.2.13.0
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32i_rgba32i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32i_rgba32i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16i_r16i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16i_r16i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16ui_r16ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16ui_r16ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8i_rg8i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8i_rg8i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8_rg8.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8_rg8.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8ui_rg8ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8ui_rg8ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16f.renderbuffer_to_texture2d,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_texture2d,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16ui.renderbuffer_to_texture2d,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.srgb8_alpha8_srgb8_alpha8.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16ui_rgba16ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16ui_rgba16ui.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8i_r8i.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8i_r8i.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8_r8.texture2d_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.renderbuffer_to_renderbuffer,Fail
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16i_r16i.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8_rg8.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8i_rg8i.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8ui_rg8ui.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16f.renderbuffer_to_texture2d,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_texture2d,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16ui.renderbuffer_to_texture2d,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.srgb8_alpha8_srgb8_alpha8.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.renderbuffer_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8_r8.texture2d_to_renderbuffer,Fail
arm32-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.renderbuffer_to_renderbuffer,Fail
ubsan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.renderbuffer_to_renderbuffer,Fail
ubsan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.renderbuffer_to_renderbuffer,Fail
ubsan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.texture2d_to_renderbuffer,Fail
ubsan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.texture2d_to_renderbuffer,Fail
ubsan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.renderbuffer_to_renderbuffer,Fail
 # SKQP failing tests
 ES2BlendWithNoTexture,Fail
 SRGBReadWritePixels,Fail

@@ -701,84 +701,6 @@ dEQP-VK.binding_model.unused_invalid_descriptor.write.unused.storage_buffer,Cras
 dEQP-VK.binding_model.unused_invalid_descriptor.write.unused.uniform_buffer,Crash
 asan-dEQP-VK.binding_model.unused_invalid_descriptor.write.invalid.combined_image_sampler,Crash
-# New failures with ES CTS 3.2.13.0
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32i_rgba32i.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16f.renderbuffer_to_texture2d,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_texture2d,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16ui_rgba16ui.texture2d_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.renderbuffer_to_renderbuffer,Fail
-asan-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32i_rgba32i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32i_rgba32i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_128_bits.rgba32ui_rgba32ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16i_r16i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16i_r16i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16ui_r16ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.r16ui_r16ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8_rg8.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8_rg8.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8i_rg8i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8i_rg8i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8ui_rg8ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_16_bits.rg8ui_rg8ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_24_bits.rgb8_rgb8.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32i_r32i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.r32ui_r32ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16i_rg16i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rg16ui_rg16ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2_rgb10_a2.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16f.renderbuffer_to_texture2d,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16i.renderbuffer_to_texture2d,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rg16ui.renderbuffer_to_texture2d,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgb10_a2ui_rgb10_a2ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8_rgba8.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8i_rgba8i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.rgba8ui_rgba8ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.srgb8_alpha8_srgb8_alpha8.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32i_rg32i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rg32ui_rg32ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16i_rgba16i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16ui_rgba16ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_64_bits.rgba16ui_rgba16ui.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8_r8.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8i_r8i.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8i_r8i.texture2d_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.renderbuffer_to_renderbuffer,Fail
-dEQP-GLES31.functional.copy_image.non_compressed.viewclass_8_bits.r8ui_r8ui.texture2d_to_renderbuffer,Fail
 # SKQP failing tests
 ES2BlendWithNoTexture,Fail
 SRGBReadWritePixels,Fail

@@ -1,4 +1,4 @@
-<vcxml gen="3.3" min_ver="42" max_ver="71">
+<vcxml gen="4.2" min_ver="42" max_ver="71">
   <enum name="Compare Function" prefix="V3D_COMPARE_FUNC">
     <value name="NEVER" value="0"/>

@@ -64,12 +64,12 @@
 #define V3D71_TFU_ICFG_OTYPE_SHIFT 16
 #define V3D71_TFU_ICFG_IFORMAT_SHIFT 23
 #define V3D71_TFU_ICFG_FORMAT_RASTER 0
-#define V3D71_TFU_ICFG_FORMAT_SAND_128 1
-#define V3D71_TFU_ICFG_FORMAT_SAND_256 2
-#define V3D71_TFU_ICFG_FORMAT_LINEARTILE 11
-#define V3D71_TFU_ICFG_FORMAT_UBLINEAR_1_COLUMN 12
-#define V3D71_TFU_ICFG_FORMAT_UBLINEAR_2_COLUMN 13
-#define V3D71_TFU_ICFG_FORMAT_UIF_NO_XOR 14
-#define V3D71_TFU_ICFG_FORMAT_UIF_XOR 15
+#define V3D71_TFU_ICFG_FORMAT_SAND 1
+#define V3D71_TFU_ICFG_FORMAT_CONSTANT_COLOUR 2
+#define V3D71_TFU_ICFG_FORMAT_LINEARTILE 3
+#define V3D71_TFU_ICFG_FORMAT_UBLINEAR_1_COLUMN 4
+#define V3D71_TFU_ICFG_FORMAT_UBLINEAR_2_COLUMN 5
+#define V3D71_TFU_ICFG_FORMAT_UIF_NO_XOR 6
+#define V3D71_TFU_ICFG_FORMAT_UIF_XOR 7
 #endif

@@ -50,9 +50,12 @@ enum clc_spirv_version {
 };
 struct clc_optional_features {
+   bool atomic_order_seq_cst;
+   bool atomic_scope_device;
    bool extended_bit_ops;
    bool fp16;
    bool fp64;
+   bool generic_address_space;
    bool int64;
    bool images;
    bool images_depth;

@@ -28,8 +28,6 @@
 #include <sstream>
 #include <mutex>
-#include "util/ralloc.h"
-#include "util/set.h"
 #include <llvm/ADT/ArrayRef.h>
 #include <llvm/IR/DiagnosticPrinter.h>
 #include <llvm/IR/DiagnosticInfo.h>
@@ -68,7 +66,17 @@
 #include <llvm/Support/VirtualFileSystem.h>
 #endif
+#if LLVM_VERSION_MAJOR >= 22
+#include <clang/Options/OptionUtils.h>
+#endif
+/* We have to include our own headers after LLVM/clang as they seem to use
+ * `UNUSED` within enum definitions:
+ * https://github.com/llvm/llvm-project/blob/ea443eeb2ab8ed49ffb783c2025fed6629a36f10/clang/include/clang/Basic/OffloadArch.h#L19
+ */
 #include "util/macros.h"
+#include "util/ralloc.h"
+#include "util/set.h"
 #include "util/u_dl.h"
 #include "glsl_types.h"
@@ -915,7 +923,9 @@ clc_compile_to_llvm_module(LLVMContext &llvm_ctx,
    // GetResourcePath is a way to retrieve the actual libclang resource dir based on a given binary
    // or library.
    auto tmp_res_path =
-#if LLVM_VERSION_MAJOR >= 20
+#if LLVM_VERSION_MAJOR >= 22
+      clang::GetResourcesPath(std::string(clang_path));
+#elif LLVM_VERSION_MAJOR >= 20
       Driver::GetResourcesPath(std::string(clang_path));
 #else
       Driver::GetResourcesPath(std::string(clang_path), CLANG_RESOURCE_DIR);
@@ -959,6 +969,12 @@ clc_compile_to_llvm_module(LLVMContext &llvm_ctx,
    c->getPreprocessorOpts().addMacroDef("cl_khr_expect_assume=1");
    bool needs_opencl_c_h = false;
+   if (args->features.atomic_order_seq_cst) {
+      c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+__opencl_c_atomic_order_seq_cst");
+   }
+   if (args->features.atomic_scope_device) {
+      c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+__opencl_c_atomic_scope_device");
+   }
    if (args->features.extended_bit_ops) {
       c->getPreprocessorOpts().addMacroDef("cl_khr_extended_bit_ops=1");
    }
@@ -969,6 +985,9 @@ clc_compile_to_llvm_module(LLVMContext &llvm_ctx,
       c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+cl_khr_fp64");
       c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+__opencl_c_fp64");
    }
+   if (args->features.generic_address_space) {
+      c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+__opencl_c_generic_address_space");
+   }
    if (args->features.int64) {
       c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+cles_khr_int64");
       c->getTargetOpts().OpenCLExtensionsAsWritten.push_back("+__opencl_c_int64");
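
Reviewer note: the three feature additions in this file share one pattern. Each boolean in `clc_optional_features` that is set appends a `+`-prefixed OpenCL C feature name to the target options. A minimal standalone sketch of that mapping in plain C follows. The `demo_features` struct and the `demo_collect_features()` helper are hypothetical stand-ins that do not exist in Mesa; the real code pushes onto clang's `OpenCLExtensionsAsWritten` vector instead of filling an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct clc_optional_features. */
struct demo_features {
   bool atomic_order_seq_cst;
   bool atomic_scope_device;
   bool generic_address_space;
};

/* Upper bound on the number of strings demo_collect_features() can emit. */
#define DEMO_MAX_FEATURES 3

/* Write one "+name" string per enabled feature into `out`, in the order
 * the fields are declared, and return how many entries were written. */
size_t
demo_collect_features(const struct demo_features *f,
                      const char *out[DEMO_MAX_FEATURES])
{
   size_t n = 0;

   assert(f != NULL && out != NULL);

   if (f->atomic_order_seq_cst)
      out[n++] = "+__opencl_c_atomic_order_seq_cst";
   if (f->atomic_scope_device)
      out[n++] = "+__opencl_c_atomic_scope_device";
   if (f->generic_address_space)
      out[n++] = "+__opencl_c_generic_address_space";

   return n;
}
```

Calling the helper with only `atomic_order_seq_cst` and `generic_address_space` set yields two entries; with every field false it yields none, which matches the opt-in behaviour of the hunks above.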

@@ -134,6 +134,11 @@ main(int argc, char **argv)
       .args = util_dynarray_begin(&clang_args),
       .num_args = util_dynarray_num_elements(&clang_args, char *),
       .c_compatible = true,
+      .features = {
+         .atomic_order_seq_cst = true,
+         .atomic_scope_device = true,
+         .generic_address_space = true,
+      },
    };
    /* Enable all features, we don't know the target here and it is the

@@ -263,7 +263,7 @@ libclc_add_generic_variants(nir_shader *shader)
       if (strstr(func->name, "async_work_group_strided_copy"))
          continue;
-      char *U3AS1 = strstr(func->name, "U3AS1");
+      const char *U3AS1 = strstr(func->name, "U3AS1");
       if (U3AS1 == NULL)
          continue;

@@ -3379,19 +3379,21 @@ static void
 apply_explicit_location(const struct ast_type_qualifier *qual,
                         ir_variable *var,
                         struct _mesa_glsl_parse_state *state,
-                        YYLTYPE *loc)
+                        YYLTYPE *loc, bool force_explict_uniform_loc_zero)
 {
    bool fail = false;
-   unsigned qual_location;
+   unsigned qual_location = 0;
    if (!process_qualifier_constant(state, loc, "location", qual->location,
-                                   &qual_location)) {
+                                   &qual_location) &&
+       !force_explict_uniform_loc_zero) {
       return;
    }
    /* Checks for GL_ARB_explicit_uniform_location. */
    if (qual->flags.q.uniform) {
-      if (!state->check_explicit_uniform_location_allowed(loc, var))
+      if (!force_explict_uniform_loc_zero &&
+          !state->check_explicit_uniform_location_allowed(loc, var))
          return;
       const struct gl_constants *consts = state->consts;
@@ -3919,8 +3921,13 @@ apply_layout_qualifier_to_variable(const struct ast_type_qualifier *qual,
                        qual_string);
    }
-   if (qual->flags.q.explicit_location) {
-      apply_explicit_location(qual, var, state, loc);
+   bool force_explict_uniform_loc_zero =
+      state->ctx->Const.ForceExplicitUniformLocZero && qual->flags.q.uniform &&
+      strcmp(state->ctx->Const.ForceExplicitUniformLocZero, var->name) == 0;
+   if (qual->flags.q.explicit_location || force_explict_uniform_loc_zero) {
+      apply_explicit_location(qual, var, state, loc,
+                              force_explict_uniform_loc_zero);
       if (qual->flags.q.explicit_component) {
          unsigned qual_component;
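
Reviewer note: the new `force_explict_uniform_loc_zero` condition reads as "a name is configured, the variable is a uniform, and the names match exactly". The sketch below restates that predicate as a standalone C function. `demo_force_uniform_loc_zero()` and its parameters are hypothetical names used only for illustration; the real code reads the configured name from `state->ctx->Const.ForceExplicitUniformLocZero` and the variable name from `var->name`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true when location 0 should be forced for this variable.
 * `configured_name` may be NULL, meaning the workaround is disabled. */
bool
demo_force_uniform_loc_zero(const char *configured_name,
                            bool is_uniform,
                            const char *var_name)
{
   return configured_name != NULL &&
          is_uniform &&
          strcmp(configured_name, var_name) == 0;
}
```

Because the NULL check comes first and `&&` short-circuits, `strcmp` is never called when no name is configured, which is the same ordering the hunk relies on.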
@@ -7667,6 +7674,7 @@ ast_process_struct_or_iface_block_members(ir_exec_list *instructions,
        * embedded structures in 1.10 only.
        */
       if (state->language_version != 110 &&
+          !state->allow_glsl_embedded_structure_declarations &&
           decl_list->type->specifier->structure != NULL)
          _mesa_glsl_error(&loc, state,
                           "embedded structure declarations are not allowed");

@@ -1684,6 +1684,20 @@ cross_validate_globals(void *mem_ctx, const struct gl_constants *consts,
                existing->data.mode == nir_var_mem_ssbo &&
                existing->data.from_ssbo_unsized_array &&
                glsl_get_gl_type(var->type) == glsl_get_gl_type(existing->type))) {
+            /* Relax precision matching on unused uniforms for early ES shaders */
+            if (prog->IsES && !var->interface_type &&
+                !(existing->data.used && var->data.used) &&
+                glsl_base_type_is_integer(glsl_get_gl_type(var->type)) == glsl_base_type_is_integer(glsl_get_gl_type(existing->type)) &&
+                glsl_base_type_is_float(glsl_get_gl_type(var->type)) == glsl_base_type_is_float(glsl_get_gl_type(existing->type)) &&
+                prog->GLSL_Version < 300) {
+               linker_warning(prog, "%s `%s' declared as type "
+                              "`%s' and type `%s'\n",
+                              gl_nir_mode_string(var),
+                              var->name, glsl_get_type_name(var->type),
+                              glsl_get_type_name(existing->type));
+            } else {
                linker_error(prog, "%s `%s' declared as type "
                             "`%s' and type `%s'\n",
                             gl_nir_mode_string(var),
@@ -1693,6 +1707,7 @@ cross_validate_globals(void *mem_ctx, const struct gl_constants *consts,
             }
          }
       }
+   }
    if (var->data.explicit_location) {
       if (existing->data.explicit_location
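
Reviewer note: the relaxed check above downgrades a type mismatch from a link error to a warning only when every listed condition holds. The sketch below collects those conditions into one predicate so the rule is easier to read in isolation. `struct demo_mismatch` and `demo_mismatch_is_only_warning()` are hypothetical; the real code reads the same facts from `prog`, `var` and `existing`.

```c
#include <stdbool.h>

/* Hypothetical summary of the facts the linker inspects for one pair of
 * mismatched declarations of the same global. */
struct demo_mismatch {
   bool is_es;              /* prog->IsES */
   unsigned glsl_version;   /* prog->GLSL_Version */
   bool in_interface_block; /* var->interface_type is set */
   bool used_a, used_b;     /* data.used on each declaration */
   bool int_a, int_b;       /* base type is integer, per declaration */
   bool float_a, float_b;   /* base type is float, per declaration */
};

/* True when the mismatch should be reported with a warning instead of an
 * error: an ES shader below version 300, outside an interface block, where
 * at most one declaration is used and both share the same integer/float
 * classification. */
bool
demo_mismatch_is_only_warning(const struct demo_mismatch *m)
{
   return m->is_es &&
          !m->in_interface_block &&
          !(m->used_a && m->used_b) &&
          m->int_a == m->int_b &&
          m->float_a == m->float_b &&
          m->glsl_version < 300;
}
```

Note that the "unused" condition is `!(used_a && used_b)`, so the relaxation still applies when exactly one of the two declarations is used.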

@@ -329,6 +329,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct gl_context *_ctx,
       ctx->Const.AllowVertexTextureBias;
    this->allow_glsl_120_subset_in_110 =
       ctx->Const.AllowGLSL120SubsetIn110;
+   this->allow_glsl_embedded_structure_declarations =
+      ctx->Const.AllowGLSLEmbeddedStructureDeclarations;
    this->allow_builtin_variable_redeclaration =
       ctx->Const.AllowGLSLBuiltinVariableRedeclaration;
    this->ignore_write_to_readonly_var =

@@ -1023,6 +1023,7 @@ struct _mesa_glsl_parse_state {
    char *alias_shader_extension;
    bool allow_vertex_texture_bias;
    bool allow_glsl_120_subset_in_110;
+   bool allow_glsl_embedded_structure_declarations;
    bool allow_builtin_variable_redeclaration;
    bool ignore_write_to_readonly_var;

@@ -676,6 +676,14 @@ glsl_type_is_e5m2(const glsl_type *t)
    return t->base_type == GLSL_TYPE_FLOAT_E5M2;
 }
+static inline bool
+glsl_type_is_nonnative_float(const glsl_type *t)
+{
+   return t->base_type == GLSL_TYPE_BFLOAT16 ||
+          t->base_type == GLSL_TYPE_FLOAT_E4M3FN ||
+          t->base_type == GLSL_TYPE_FLOAT_E5M2;
+}
 static inline bool
 glsl_type_is_int_16_32_64(const glsl_type *t)
 {
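
Reviewer note: the new `glsl_type_is_nonnative_float()` helper groups the three reduced-precision float base types behind one predicate. The sketch below shows the same shape with a stand-in enum so it compiles on its own. `demo_base_type`, `demo_type` and `demo_type_is_nonnative_float()` are hypothetical; the real enum in Mesa's GLSL type headers has many more entries and different values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GLSL base-type enum. */
enum demo_base_type {
   DEMO_TYPE_FLOAT,
   DEMO_TYPE_FLOAT16,
   DEMO_TYPE_BFLOAT16,
   DEMO_TYPE_FLOAT_E4M3FN,
   DEMO_TYPE_FLOAT_E5M2,
   DEMO_TYPE_INT,
};

/* Hypothetical stand-in for glsl_type, reduced to the one field used. */
struct demo_type {
   enum demo_base_type base_type;
};

/* True for bfloat16 and the two 8-bit float formats, false otherwise. */
bool
demo_type_is_nonnative_float(const struct demo_type *t)
{
   assert(t != NULL);

   return t->base_type == DEMO_TYPE_BFLOAT16 ||
          t->base_type == DEMO_TYPE_FLOAT_E4M3FN ||
          t->base_type == DEMO_TYPE_FLOAT_E5M2;
}
```

A single grouped predicate like this lets callers avoid repeating the three comparisons at each use site.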

Some files were not shown because too many files have changed in this diff