Commit graph

201589 commits

Author SHA1 Message Date
Eric Engestrom
45aa964eb8 pick-ui: make Backport-to: 25.0 backport to 25.0 *and more recent release branches*
It is what developers expect, so make the code match it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34580>
(cherry picked from commit c37a468a8a)
2025-04-22 18:46:38 +02:00
Eric Engestrom
35d5005925 .pick_status.json: Update to 5f3a3740dc 2025-04-22 18:46:36 +02:00
Eric Engestrom
310da5f30b docs: add sha sum for 25.0.4
2025-04-17 02:22:01 +02:00
Eric Engestrom
d0f8720019 VERSION: bump for 25.0.4 2025-04-17 02:04:03 +02:00
Eric Engestrom
bd6a277901 docs: add release notes for 25.0.4 2025-04-17 02:04:03 +02:00
Pierre-Eric Pelloux-Prayer
4437cdabf0 winsys/amdgpu: disable VM_ALWAYS_VALID
The referenced commit has been identified as the root cause of
graphic artifacts / hangs on some APUs.

For now disable AMDGPU_GEM_CREATE_VM_ALWAYS_VALID on all chips
except when user queues are used.

See https://gitlab.freedesktop.org/mesa/mesa/-/issues/12809.

Fixes: 8c91624614 ("winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34547>
(cherry picked from commit 555821ff93)
2025-04-17 01:24:17 +02:00
David Rosca
0e9f94576f radeonsi/vpe: Use float division to get scaling ratio
Fixes: e85a6b6a63 ("radeonsi/vpe: check reduction ratio")
Reviewed-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34519>
(cherry picked from commit bd6f9e8aee)
2025-04-17 01:24:17 +02:00
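The message for 0e9f94576f gives only the title, so the following is a minimal sketch of the general pitfall rather than the actual radeonsi/vpe code. It assumes the ratio is computed from two unsigned dimensions; both function names are hypothetical.

```c
/* Hypothetical helper: integer division truncates the ratio. */
unsigned scaling_ratio_truncated(unsigned src, unsigned dst)
{
   return src / dst;
}

/* Hypothetical helper: convert to float before dividing so the
 * fractional part of the ratio is preserved. */
float scaling_ratio(unsigned src, unsigned dst)
{
   return (float)src / (float)dst;
}
```

With src = 1080 and dst = 720, the truncating version returns 1 while the float version returns 1.5, so a limit check against the truncated value could accept a ratio that is actually out of range.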
Marek Olšák
ba2a1ba2e5 ac/surface: select 3D tile mode without overallocating too much for gfx6-8
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
(cherry picked from commit 78cacfd9ce)
2025-04-17 01:24:17 +02:00
Marek Olšák
48bfe6dbfd ac/surface: make gfx12_estimate_size reusable by gfx6
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12466
Fixes: c87ce78d - ac/surface: enable thick tiling for 3D textures for better perf on gfx6-8

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34432>
(cherry picked from commit 195e7b4f75)
2025-04-17 01:24:16 +02:00
Ryan Mckeever
651c53fc1f pan/format: Update format flags to follow HW spec
Fixes: 861e7dca ("panfrost: Switch formats to table")

Signed-off-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33787>
(cherry picked from commit b9a9798c46)
2025-04-16 15:52:03 +02:00
Eric Engestrom
9cbca28609 .pick_status.json: Update to 555821ff93 2025-04-16 15:50:33 +02:00
Kenneth Graunke
bb83fd7ac0 brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs()
This allows us to create temporary VGRFs that are larger than
MAX_VGRF_SIZE(devinfo), which will be split eventually.  They may not
be split on the initial pass, because we may need LOAD_PAYLOAD lowering,
copy propagation, and so on to occur first.  So we allow registers to
exceed that size initially.

The "Register allocation relies on split_virtual_grfs()" assertion in
brw_reg_allocate.cpp still asserts that all VGRFs which reach the
register allocator have been properly split.

One case where this is useful is for vectorizing convergent block loads.
We create temporaries to splat the SIMD1 values out to SIMD(N), which
can lead to some very large temporaries.  However, copy propagation and
so on ultimately eliminate these and they'll get split down to proper
sizes or elided entirely in the end.

(Note: both this and the prior commits from this merge request are
 needed to close the linked issue.)

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12324
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit eb1ec9cf8e)
2025-04-16 15:37:06 +02:00
Kenneth Graunke
7a588a5a8e brw: Use live->max_vgrf_size in pre-RA scheduling
Post-RA scheduling doesn't use liveness analysis, so we continue using
MAX_VGRF_SIZE(devinfo).  But for pre-RA scheduling, we now use
live->max_vgrf_size.

This helps get us to a place where we can emit arbitrarily large VGRFs
early on in compilation, but which will be split and cleaned up prior to
register allocation.  It may also allocate smaller arrays in practice
since MAX_VGRF_SIZE(devinfo) assumes the worst case scenario for things
we actually could need to allocate.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit a45583f078)
2025-04-16 15:37:06 +02:00
Kenneth Graunke
0d1e83ca6a brw: Use live->max_vgrf_size in register coalescing
We already require liveness, so just use the actual maximum size we saw
instead of a hardcoded pessimal size.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit 4b27b5895c)
2025-04-16 15:37:05 +02:00
Kenneth Graunke
c906f565b6 brw: Track the largest VGRF size in liveness analysis
We're already looking at this data to calculate the per-component
vars_from_vgrf[] and vgrf_from_vars[] mappings, so just record the
largest VGRF size while we're here.  This will allow passes to size
arrays based on the actual size needed, rather than hardcoding some
fixed size.  In many cases, MAX_VGRF_SIZE(devinfo) is larger than
necessary, because e.g. vec5 sparse sampling results aren't used.
Not hardcoding this means we can also temporarily handle very large
VGRFs which we know will be split eventually, without having to
increase the maximum which is ultimately used for RA classes.

Cc: mesa-stable
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34461>
(cherry picked from commit ea468412f6)
2025-04-16 15:37:05 +02:00
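The idea in c906f565b6 of recording the largest VGRF size during a walk that already happens can be sketched as below. This is an illustration only: the function name and the flat array of sizes are assumptions, not the brw liveness data structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for the liveness walk: visit every VGRF once
 * and remember the largest size seen, so later passes can size their
 * arrays from this value instead of a hardcoded worst case. */
unsigned largest_vgrf_size(const unsigned *vgrf_sizes, size_t num_vgrfs)
{
   unsigned max_size = 0;

   for (size_t i = 0; i < num_vgrfs; i++) {
      if (vgrf_sizes[i] > max_size)
         max_size = vgrf_sizes[i];
   }

   return max_size;
}
```

A pass that needs a per-register scratch array could then allocate it from the returned value rather than from a fixed maximum.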
Erik Faye-Lund
6c6c6873c4 panvk: claim official conformance on v10
It's official, PanVK is Vulkan 1.1 conformant on v10. Let's make this
clear.

Backport-to: 25.0
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34500>
(cherry picked from commit 65b7d2e865)
2025-04-16 15:37:05 +02:00
Erik Faye-Lund
238399e93a panvk: set shared_addr_format
We need to set this, otherwise we end up failing tests.

Fixes: 4e111c259c ("panvk: Lower shared memory")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34514>
(cherry picked from commit e77a815299)
2025-04-16 15:37:05 +02:00
Marek Olšák
1fe9f5d3ac radeonsi: add ACO-specific main shader parts
We can't have merged shaders where the first part is compiled using ACO
and the second part is compiled using LLVM.

Add ACO-specific main shader parts to fix that.

This happens when ACO is enabled for gfx12 streamout where GS can be paired
with a previous shader compiled by LLVM.

Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>
(cherry picked from commit 7f7d6deb18)
2025-04-16 15:37:05 +02:00
Marek Olšák
15ea052c20 radeonsi: make si_shader_selector::main_shader_part_* an iterable union
for the next commit

Fixes: 8ba718fb7d - radeonsi/gfx12: use ACO for streamout because it's faster

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34491>
(cherry picked from commit 4865ac57cc)
2025-04-16 15:37:05 +02:00
Jose Maria Casanova Crespo
9babb23138 v3dv: avoid TFU reading unmapped pages beyond the end of the buffers
The TFU unit does a readahead of 64 bytes. This causes invalid-read
MMU errors that can be observed in the nightly full Vulkan runs on
Broadcom devices.

04:13:59.969: [   85.623205] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x4869000, pte invalid
04:14:05.408: [   91.019321] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x5209000, pte invalid
04:14:05.413: [   91.031662] v3d 1002000000.v3d: MMU error from client TLB (3) at 0x7521000, pte invalid

Although the log reports the TLB, the real culprit is the TFU. A fix
to the kernel was submitted to fix the AXI ID on V3D 4.2 and 7.1.

So over-allocating 64 bytes in v3dv_AllocateMemory is the simplest
way to make these MMU errors disappear.

Running ./deqp-vk for an hour, we can see that ~40% of allocations
would need an extra page (4096 bytes) to accommodate this 64-byte
padding.

Fixes: ca330f7f04 ("v3dv: implement VK_EXT_memory_budget")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34475>
(cherry picked from commit 0bcb82048c)
2025-04-16 15:37:04 +02:00
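The page arithmetic behind the padding cost described in 9babb23138 can be sketched as follows. This is not the v3dv implementation; it assumes allocations are rounded up to 4096-byte pages, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE     UINT64_C(4096) /* page size quoted in the commit */
#define SKETCH_TFU_READAHEAD UINT64_C(64)   /* readahead quoted in the commit */

/* Round value up to the next multiple of alignment. */
uint64_t sketch_align_up(uint64_t value, uint64_t alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

/* Size actually reserved once the 64-byte padding is added. */
uint64_t padded_alloc_size(uint64_t requested)
{
   return sketch_align_up(requested + SKETCH_TFU_READAHEAD, SKETCH_PAGE_SIZE);
}

/* True when the padding pushes the allocation onto one more page. */
bool padding_needs_extra_page(uint64_t requested)
{
   return padded_alloc_size(requested) >
          sketch_align_up(requested, SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a request only needs an extra page when fewer than 64 bytes remain free in its last page, for example a request of exactly 4096 bytes.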
Mike Blumenkrantz
31e9893f64 zink: stop setting ArrayStride on image arrays
this is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651>
(cherry picked from commit b4e3535650)
2025-04-16 15:37:04 +02:00
Mike Blumenkrantz
0f3b6ba7ad zink: don't set shared block stride without KHR_workgroup_memory_explicit_layout
this is illegal

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33651>
(cherry picked from commit 1c0de360bc)
2025-04-16 15:37:04 +02:00
Eric R. Smith
5a685929d3 panfrost: fix transaction elimination crc valid calculation
The setting of the clean_pixel_write_enable flag in pan_prepare_rt
was not consistent with the crc valid calculations in pan_emit_fbd.
This caused the crc_valid flag to not be accurate, causing transaction
elimination to fail.

Fixes: eac8f1d460 ("Revert "panfrost: Disable CRC by default"")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34408>
(cherry picked from commit 69a6db4b2b)
2025-04-16 15:37:04 +02:00
Erik Faye-Lund
27342a5532 nir/lower_tex: use texture_mask instead of shifting on use
In commit 292ac71a4a ("nir/lower_tex: handle deref casts"), we avoided
using texture_index when a texture instruction contained a variable
deref. There's no good reason why this should be done to some of the
lowering, but not all.

So let's fix up code-paths that were added after this change to do the
same.

The first two patches here crossed paths with the commit that introduced
texture_mask, so it's not strange that the change was missed. The last
one seems to have just copied what was done around it, propagating the
issue.

Fixes: 880b00dc59 ("nir/lower_tex: Add support for lowering YUYV formats")
Fixes: 1358d93650 ("nir/lower_tex: Add support for lowering Y41x formats")
Fixes: 65d6f5aed2 ("nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34365>
(cherry picked from commit 41b136f674)
2025-04-16 15:37:04 +02:00
Faith Ekstrand
5d6c82000c nil: Multiply by array_stride_B instead of adding
Fixes: 5577128c83 ("nil: Rewrite the TIC code in Rust")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34495>
(cherry picked from commit fadac25b0c)
2025-04-16 15:37:04 +02:00
Faith Ekstrand
ea963009f0 nvk/nvkmd: Check the correct flag for the Kepler GART workaround
Fixes: 1db57bb414 ("nvk/nvkmd: Rework memory placement flags")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34495>
(cherry picked from commit 5c81b3546f)
2025-04-16 15:37:04 +02:00
Caio Oliveira
aedb7eb700 nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset
Otherwise this would require combining two values to produce a single
(new bit-size) channel, which vectorize_stores() doesn't handle.  The pass
can still keep trying smaller bit-sizes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946
Fixes: ce9205c03b ("nir: add a load/store vectorization pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>
(cherry picked from commit 2ed79f80ba)
2025-04-16 15:37:03 +02:00
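One way to picture the alignment rule in aedb7eb700 is shown below. This is a loose sketch, not the nir_opt_load_store_vectorize code: it assumes the check amounts to "the offset in bits must be a multiple of the candidate bit size", and the function name and candidate list are hypothetical.

```c
/* Hypothetical helper: return the largest candidate bit size whose
 * channels would start exactly at high_offset_bytes, trying smaller
 * sizes when a larger one would straddle two values. */
unsigned pick_aligned_bit_size(unsigned high_offset_bytes)
{
   static const unsigned candidates[] = {64, 32, 16, 8};
   unsigned offset_bits = high_offset_bytes * 8;

   for (unsigned i = 0; i < 4; i++) {
      /* Skip sizes that are unaligned with the offset. */
      if (offset_bits % candidates[i] == 0)
         return candidates[i];
   }

   return 8; /* unreachable: any byte offset is a multiple of 8 bits */
}
```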
David Rosca
8ffedebf1c radv/video: Fix encode session info for VCN3+
Last dword should be 0.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 7249d9548e)
2025-04-16 15:37:03 +02:00
David Rosca
15b2a440da radv/video: Fix msg header total size
It also needs to include the codec msg size.

Cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34449>
(cherry picked from commit 34031531fc)
2025-04-16 15:37:03 +02:00
Erik Faye-Lund
b839ea42bf panfrost: fixup typo in 16x sample-pattern
This is an n-queen pattern, where no two values should be on the same
row or column. But this element and the second-to-last one have the same
y component, and neither has the negative one.

Let's fix this up by setting the first value to the negative value. This
matches the D3D 16x sample pattern.

Fixes: a61fb62966 ("panfrost: Upload sample positions on device init")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33925>
(cherry picked from commit b4ebffa1aa)
2025-04-16 15:37:03 +02:00
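The property that b839ea42bf restores (no two samples sharing a row or column) is easy to check mechanically. The sketch below is an illustration with a hypothetical struct and function name; it does not use the real panfrost sample table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sample position on an integer grid. */
struct sample_pos {
   int x;
   int y;
};

/* Returns true when no two positions share an x (column) or a y (row),
 * which is the property the commit message describes. */
bool is_n_queen_pattern(const struct sample_pos *pos, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      for (size_t j = i + 1; j < count; j++) {
         if (pos[i].x == pos[j].x || pos[i].y == pos[j].y)
            return false;
      }
   }

   return true;
}
```

A typo that duplicates one y value, as described in the commit, would make this check return false.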
Lionel Landwerlin
f018626745 brw: fix Wa_22013689345 emission
Two problems:
  - not detecting null destination correctly
  - applied too late using SHADER_OPCODE_MEMORY_FENCE, when lowering
    already happened

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34319>
(cherry picked from commit 06ad9a25e5)
2025-04-16 15:37:03 +02:00
Lars-Ivar Hesselberg Simonsen
60a2b66f63 vk/sync: Fix execution only barriers
With vkCmdPipelineBarrier, it's possible to specify a barrier with
pipeline stages but without any memory barriers. These might not be
practical, but are legal Vulkan code.

Barriers like this are currently ignored in mesa, as we only convert
barriers with passed memory barriers into vkCmdPipelineBarrier2.

This commit adds handling of execution only barriers by converting them
into a memory barrier without access masks.

Fixes: 97f0a4494b ("vulkan: implement legacy entrypoints on top of VK_KHR_synchronization2")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34187>
(cherry picked from commit 20c0d169e4)
2025-04-16 15:37:03 +02:00
Tapani Pälli
f3db21ec11 mesa: various fixes for ClearTexImage/ClearTexSubImage
Fixes some upcoming CTS tests for texture clears.

* some drivers will attempt to issue clears with zero range
  and hit asserts/crashes (spec clarification for negative
  values)

* fix error thrown with negative values to match spec

* fix cases for clearing generic compressed formats

* fix negative case of using color format while having
  depth/stencil internalformat and vice versa

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>
(cherry picked from commit 30d78dc942)
2025-04-16 15:37:02 +02:00
Tapani Pälli
0824f95f92 mesa: clamp texbuf query size to MAX_TEXTURE_BUFFER_SIZE
Fixes upcoming CTS test checking for clamping.

Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34428>
(cherry picked from commit 3bc016bb6c)
2025-04-16 15:37:02 +02:00
Lionel Landwerlin
499324de9b anv: fix self dependency computation
Some upcoming changes in the runtime will make it impossible to rely
on the pipeline or runtime information to know whether a fragment
shader has input attachments.

Instead we gather that information at compile time and store it in our
shader bind_map.

At runtime we check whether the fragment shader has input attachments
and whether those map to the runtime depth/stencil input attachments
to set the 3DSTATE_PS_EXTRA::PixelShaderKillsPixel.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: d2f7b6d5a7 ("anv: implement VK_KHR_dynamic_rendering_local_read")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit e321c438dc)
2025-04-16 15:37:02 +02:00
Boris Brezillon
fc46313072 vk/pass: Add input attachment location info
For drivers using the render pass emulation provided by the
runtime, it's important to express the mapping between
depth/stencil/color attachments and input attachments using
VkRenderingInputAttachmentIndexInfoKHR, otherwise those drivers
have to special-case emulated render passes in their
CmdBeginRendering() implementation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit be2532fc00)
2025-04-16 15:37:02 +02:00
Boris Brezillon
b9d5a60d10 vulkan/state: Fix input attachment map state initialization/copy
vk_dynamic_graphics_state_copy() is not copying the input attachment
map, and color_attachment_count is not initialized in
vk_dynamic_graphics_state_init_ial().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32540>
(cherry picked from commit 38e546c202)
2025-04-16 15:37:02 +02:00
Alyssa Rosenzweig
c4ef3cb651 panfrost: do not push "true" UBOs
Panfrost supports pushing uniforms to hardware uniform registers (RMU/FAU for
Midgard/Bifrost respectively). Since OpenGL uniforms are lowered to UBO #0, it
does this with a pass that pushes UBOs. That's good!

The pass also pushes 'true' OpenGL UBOs, since they look the same in the backend
at this point. This is where the trouble comes in:

- True UBOs are allocated in GPU BOs, not CPU allocated buffers. That means it's
  write-combine memory, which we cannot read from efficiently (at least
  depending on coherency details that were never plumbed through panfrost.ko and
  unlikely to be replumbed now that panthor is the new hot stuff). So, pushing
  true UBOs reduces GPU overhead at the cost of tremendous CPU overhead. This is
  dubious... When I benchmarked this on MT8192 in early 2023, this pushing
  improved FPS in SuperTuxKart but hurt FPS in Dolphin.

- True UBOs can be written on the GPU. In OpenGL, we have batch tracking
  infrastructure to sort this mess out in theory. What this means is that
  pushing UBOs requires us to flush writers AND STALL at draw-time. If this is
  ever hit, our performance is utterly trashed. But it gets worse.

- True UBOs can be written in the same batch that reads them. For example, we
  could bind a buffer as a transform feedback buffer, do a draw with XFB, then
  rebind as a UBO and do a draw reading. This is where we collapse -- our logic
  will flush the writer, which is the same batch we were in the middle of
  enqueueing a draw to. When we try to push words, we'll crash with theatrics.
  This could be solved by smartening the batch tracking logic but it's not
  trivial by any means.

So, pushing true UBOs on the CPU is broken and can hurt performance. Stop doing
it!

Long term, the solution will be to push on the GPU instead. This avoids all of
these issues. This can be done with a compute kernel or with CSF instructions.
The Vulkan driver will likely have to do this for performance, since pushing
UBOs from the CPU is utterly broken in Vulkan for the above reasons.

I have a branch somewhere doing this on v9 but I'm doing this on NIR time to
unblock a core change that was crashing piglit due to this pile of unsoundness.
Let's fix the correctness issues first, then someone can look at recovering
performance later when we're not blocking unrelated work.

Fixes corruption in Piglit test
gles-3.0-transform-feedback-uniform-buffer-object, which writes a UBO with
transform feedback. (I suspect the test still doesn't pass for the same reason
it's broken on other tilers. But that's a better place to be than oodles of
memory corruption.)

According to CI, fixes spec@arb_uniform_buffer_object@rendering{-dsa}-offset.

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 59a3e12039)
2025-04-15 23:54:48 +02:00
Caterina Shablia
e98a912791 panfrost: update nr_uniform_buffers before dispatching XFB
Currently nr_uniform_buffers will be whatever the previous draw set
for its vertex shader, which is not what the XFB shader usually
expects.

Fixes: c246af0d ("panfrost: Only upload UBOs when needed")

Cc: mesa-stable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 2c75b6bb01)
2025-04-15 23:54:47 +02:00
Caterina Shablia
aed66adbd2 panfrost: don't overwrite push uniforms and sysvals UBO with user's UBO
ss->info.ubo_mask includes the push+sysval UBO so if there's a user
UBO bound at the same index as the push+sysval UBO, without this
change we end up writing a descriptor for the user UBO at that index.

Fixes: 3b3cd59f ("panfrost: Launch transform feedback shaders")

Cc: mesa-stable

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit 6948ab727f)
2025-04-15 23:54:46 +02:00
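The collision described in aed66adbd2 comes from one bit of the mask belonging to the driver rather than the user. A minimal sketch of excluding that bit follows; the function name and the way the index is passed are assumptions, and the actual fix may be structured differently.

```c
#include <stdint.h>

/* Hypothetical helper: drop the push+sysval UBO slot from the mask so
 * that a user UBO bound at the same index is never written over it.
 * Assumes sysval_ubo_index is below 32. */
uint32_t user_ubo_mask(uint32_t ubo_mask, unsigned sysval_ubo_index)
{
   return ubo_mask & ~(UINT32_C(1) << sysval_ubo_index);
}
```

Iterating over the returned mask instead of the full one would skip the driver-owned slot while leaving every other bound index untouched.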
Alyssa Rosenzweig
5ad25a98ef panfrost: invert and rename no_ubo_to_push flag
only the GL driver actually wants this, neither panvk nor internal shaders do.

Cc'd as a prereq to the next patch

Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34193>
(cherry picked from commit f179f6952f)
2025-04-15 23:54:45 +02:00
Eric Engestrom
4fde719367 .pick_status.json: Update to 58321cf2e5 2025-04-15 23:49:17 +02:00
Samuel Pitoiset
3c6e241f0d radv: apply the workaround for buggy HiZ/HiS on GFX12 for DGC
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit d2da54e6f3)
2025-04-15 17:24:55 +02:00
Samuel Pitoiset
3c932e7824 radv: add a workaround for buggy HiZ/HiS on GFX12
HiZ/HiS is buggy and can cause random GPU hangs when stencil is enabled.
There are basically two alternatives, but RADV follows RadeonSI and emits
a dummy RELEASE_MEM packet after every draw, which should work around the
issue and maintain performance.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12944
Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 6388db03c8)
2025-04-15 17:24:05 +02:00
Samuel Pitoiset
5449bd2eb7 radv: determine if HiZ/HiS is enabled earlier on GFX12
To lower CPU overhead of the hardware workaround.

Backport-to: 25.0
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34381>
(cherry picked from commit 11b6d2ba60)
2025-04-10 18:06:29 +02:00
Patrick Lerda
9315eb140f i915: fix draw_create_fragment_shader() related memory leak
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 16400 byte(s) in 5 object(s) allocated from:
    #0 0xb720689a in __interceptor_calloc (/usr/lib/libasan.so.6+0xb289a)
    #1 0xaf10f896 in draw_create_fragment_shader ../src/gallium/auxiliary/draw/draw_fs.c:47
    #2 0xaef64619 in i915_create_fs_state ../src/gallium/drivers/i915/i915_state.c:550
    #3 0xae16a955 in ureg_create_shader ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2194
    #4 0xae17f45f in ureg_create_shader_with_so_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:150
    #5 0xae17f45f in ureg_create_shader_and_destroy ../src/gallium/auxiliary/tgsi/tgsi_ureg.h:159
    #6 0xae17f45f in util_make_fs_blit_zs ../src/gallium/auxiliary/util/u_simple_shaders.c:365
    #7 0xaf13300e in blitter_get_fs_texfetch_depth ../src/gallium/auxiliary/util/u_blitter.c:1157
    #8 0xaf13300e in util_blitter_cache_all_shaders ../src/gallium/auxiliary/util/u_blitter.c:1322
    #9 0xaef6b738 in i915_create_context ../src/gallium/drivers/i915/i915_context.c:233
    #10 0xacb33c49 in st_api_create_context ../src/mesa/state_tracker/st_manager.c:986
    #11 0xac845740 in dri_create_context ../src/gallium/frontends/dri/dri_context.c:178
    #12 0xac854d97 in driCreateContextAttribs ../src/gallium/frontends/dri/dri_util.c:631
    #13 0xb6ce79a3 in dri2_create_context_attribs ../src/glx/dri2_glx.c:240
    #14 0xb6c9606f in dri_common_create_context ../src/glx/dri_common.c:665
    #15 0xb6ca4f00 in CreateContext ../src/glx/glxcmds.c:322
    #16 0xb6ca5c0b in glXCreateNewContext ../src/glx/glxcmds.c:1449

Fixes: 1a69b50b3b ("i915g: Fix point sprites.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit f0cfc1bbdc)
2025-04-10 17:12:25 +02:00
Patrick Lerda
737b18393b i915: fix nir_to_tgsi() related memory leak
For instance, this issue is triggered with "piglit/bin/glx-multithread-texture -auto -fbo":
Direct leak of 256 byte(s) in 1 object(s) allocated from:
    #0 0xb71eda62 in __interceptor_realloc (/usr/lib/libasan.so.6+0xb2a62)
    #1 0xadd5a32f in tokens_expand ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:239
    #2 0xadd5a32f in get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:262
    #3 0xadd62519 in copy_instructions ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2079
    #4 0xadd62519 in ureg_finalize ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2129
    #5 0xadd64bde in ureg_get_tokens ../src/gallium/auxiliary/tgsi/tgsi_ureg.c:2206
    #6 0xade377d0 in nir_to_tgsi_options ../src/gallium/auxiliary/nir/nir_to_tgsi.c:4043
    #7 0xade3da63 in nir_to_tgsi ../src/gallium/auxiliary/nir/nir_to_tgsi.c:3831
    #8 0xaeb606c9 in i915_create_vs_state ../src/gallium/drivers/i915/i915_state.c:662
    #9 0xac781a2c in st_create_common_variant ../src/mesa/state_tracker/st_program.c:720
    #10 0xac78e8a4 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:773
    #11 0xac78fc10 in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1259
    #12 0xac78fc10 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1345
    #13 0xac790b1a in st_program_string_notify ../src/mesa/state_tracker/st_program.c:1378
    #14 0xace457a9 in _mesa_get_fixed_func_vertex_program ../src/mesa/main/ffvertex_prog.c:1397
    #15 0xac5ef8db in update_program ../src/mesa/main/state.c:281
    #16 0xac5f0ece in _mesa_update_state_locked ../src/mesa/main/state.c:560
    #17 0xac5f1653 in _mesa_update_state ../src/mesa/main/state.c:593
    #18 0xacdf9fe2 in _mesa_DrawArrays ../src/mesa/main/draw.c:1403

Fixes: 487a493325 ("i915g: Add support for per-vertex point size.")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit 5af5f508b1)
2025-04-10 17:12:25 +02:00
Patrick Lerda
35ad8014cf i915: fix slab_create() related memory leaks
For instance, this issue is triggered with "piglit/bin/fcc-blit-between-clears -auto -fbo":
Direct leak of 836 byte(s) in 1 object(s) allocated from:
    #0 0xb71eb6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
    #1 0xaefadc78 in slab_add_new_page ../src/util/slab.c:179
    #2 0xaefadc78 in slab_alloc ../src/util/slab.c:221
    #3 0xaef7d461 in i915_texture_transfer_map ../src/gallium/drivers/i915/i915_resource_texture.c:789
    #4 0xac9e931e in pipe_texture_map ../src/gallium/auxiliary/util/u_inlines.h:555
    #5 0xac9e931e in _mesa_map_renderbuffer ../src/mesa/main/renderbuffer.c:494
    #6 0xad49c5e4 in readpixels_memcpy ../src/mesa/main/readpix.c:260
    #7 0xad49c5e4 in _mesa_readpixels ../src/mesa/main/readpix.c:898
    #8 0xad5d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
    #9 0xad4a0caf in read_pixels ../src/mesa/main/readpix.c:1199
    #10 0xad4a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
    #11 0xad4a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231

or "piglit/bin/fcc-read-to-pbo-after-clear -auto":
Direct leak of 772 byte(s) in 1 object(s) allocated from:
    #0 0xb726b6f2 in malloc (/usr/lib/libasan.so.6+0xb26f2)
    #1 0xaf0adc88 in slab_add_new_page ../src/util/slab.c:179
    #2 0xaf0adc88 in slab_alloc ../src/util/slab.c:221
    #3 0xaf07aad7 in i915_buffer_transfer_map ../src/gallium/drivers/i915/i915_resource_buffer.c:75
    #4 0xad10de74 in pipe_buffer_map_range ../src/gallium/auxiliary/util/u_inlines.h:398
    #5 0xad10de74 in _mesa_bufferobj_map_range ../src/mesa/main/bufferobj.c:499
    #6 0xad5677ce in _mesa_map_pbo_dest ../src/mesa/main/pbo.c:308
    #7 0xad59be3b in _mesa_readpixels ../src/mesa/main/readpix.c:894
    #8 0xad6d8cfe in st_ReadPixels ../src/mesa/state_tracker/st_cb_readpixels.c:568
    #9 0xad5a0caf in read_pixels ../src/mesa/main/readpix.c:1199
    #10 0xad5a0caf in _mesa_ReadnPixelsARB ../src/mesa/main/readpix.c:1216
    #11 0xad5a155b in _mesa_ReadPixels ../src/mesa/main/readpix.c:1231

Fixes: e7a73b75a0 ("gallium: switch drivers to the slab allocator in src/util")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27570>
(cherry picked from commit 92802ea90a)
2025-04-10 17:12:25 +02:00
Ian Romanick
3e789ce50d brw/nir: Use offset() for all uses of offs in emit_pixel_interpolater_alu_at_offset
This is necessary to appropriately uniformize the first component
access of a convergent vector. Without this, this is produced:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

This is the correct code:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+2.0<0>:F, 0.5f

Without 38b58e286f, the code generated was more incorrect, but happened
to work for this test case:

    load_payload(16) %18:D, 0d, 0d NoMask group0
    add(32) %21:F, %18+0.0<0>:F, 0.5f
    add(32) %22:F, %18+0.4<0>:F, 0.5f

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 38b58e286f ("brw/nir: Fix source handling of nir_intrinsic_load_barycentric_at_offset")
Closes: #12969
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34427>
(cherry picked from commit cb69d019cf)
2025-04-10 17:12:25 +02:00
Patrick Lerda
885b1cfd36 i915: fix i915_set_vertex_buffers() related refcnt imbalance and remove redundancies
Indeed, this resource was assigned twice and was not properly freed.

For instance, this issue is triggered with:
"piglit/bin/glsl-fs-pointcoord -auto -fbo"
while setting GALLIUM_REFCNT_LOG=refcnt.log.

Fixes: 0278d1fa32 ("gallium: add unbind_num_trailing_slots to set_vertex_buffers")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27572>
(cherry picked from commit 22c399320b)
2025-04-10 17:12:25 +02:00