Commit graph

204578 commits

Author SHA1 Message Date
Sviatoslav Peleshko
b843ba4bf1 intel/brw: Use correct instruction for value change check when coalescing
When we have partial VGRF MOVs with offsets, we will reach
`channels_remaining == 0` with `inst` that is not writing the whole VGRF.
Currently, even though we check `can_coalesce_vars()` for each offset
separately, it will always check if the dst value is not changed only
for the offset from the instruction that satisfied the
`channels_remaining == 0` condition.

Instead, we should remember and use the correct instruction for each
written offset separately.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10916
Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35062>
(cherry picked from commit 0e3e5146cf)
2025-06-04 15:52:48 +02:00
Mel Henning
eacca4b1ec nak: Don't swap f2fp sources in legalize
The order of these is important.

Fixes: e19871bd6a ("nak: Use F2FP for nir_op_pack_half_2x16_split on SM86+")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12717
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35267>
(cherry picked from commit aae67ab678)
2025-06-04 15:52:48 +02:00
Mel Henning
a77ee5440a nak: Forbid reordering labeled OpNop
Totals:
Static cycle count: 1104322907 -> 1108862573 (+0.41%)

Totals from 111376 (56.68% of 196502) affected shaders:
Static cycle count: 948085895 -> 952625561 (+0.48%)

Fixes: 79d0f8263d ("nak: Add a simple postpass instruction scheduler")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35141>
(cherry picked from commit 018f4f1c27)
2025-06-04 15:52:48 +02:00
Eric Engestrom
e6c03f1755 .pick_status.json: Mark f0dde6ca7f as denominated 2025-06-04 15:52:48 +02:00
David Rosca
05f6b0f3bb radeonsi/vcn: Use picture fence in JPEG decode
The fence needs to be passed to frontend to make vaSyncSurface work
correctly.

Cc: mesa-stable
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35258>
(cherry picked from commit 3bb9905e7f)
2025-06-04 15:52:48 +02:00
Yao Zi
a18027acec radeonsi: Fix violation of aliasing rules in radeon_ws_bo_reference
Applications using Mesa built with LLVM 20.1.4 fail to start with
strange segmentfaults/bus errors when radeonsi driver is used. The last
piece of stacktrace looks like

  - pipe_reference_described
  - pipe_reference
  - radeon_bo_reference
  - radeon_ws_bo_reference
  - radeon_lookup_or_add_real_buffer

Coredump shows the pointer dst passed to pipe_reference_described() is
either unaligned or even invalid, which is the reason of crashing. The
crash goes away when Mesa is built without optimization.

Looking through the related functions, it's found that
radeon_ws_bo_reference() contains unsafe type cast from radeon_bo to
pb_buffer_lean: though the former's first field is just the later, this
violates strict aliasing rules as pb_buffer_lean isn't compatible with
radeon_bo. Such violation ultimately results in miscompilation.

Let's take the address of pb_buffer_lean field, avoiding the unsafe
cast. It's still required to cast pb_buffer_lean back to radeon_bo since
radeon_bo_reference may update the pointer, which is safe as radeon_bo
contains a pb_buffer_lean member and C language permits access members
through a pointer in type of the container.

Fixes: 6d913a2bcc ("r300,r600,radeonsi: switch to pb_buffer_lean")
Link: https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Aliasing-Type-Rules.html
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35249>
(cherry picked from commit b1d81a7df1)
2025-06-04 15:52:48 +02:00
David Rosca
6676ae1a2d frontends/va: Fix H264 top/bottom is reference flags
All pics in the ReferenceFrames array should be references,
so there is no need to require the SHORT_TERM_REFERENCE flag
to actually treat them as references.
This fixes decoding with apps that doesn't set this flag,
eg. NoMachine remote desktop viewer (nxplayer).

See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13229
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35186>
(cherry picked from commit a9a54632af)
2025-06-04 15:52:48 +02:00
Eric R. Smith
54caa53302 panfrost, panvk: fix G31 use of SHADER_MODE_EARLY_ZS_ALWAYS
PRE_POST_FRAME_SHADER_MODE_EARLY_ZS_ALWAYS was introduced in
architecture version 7.2, not 7.0 as we assumed. Using it on
G31 (a 7.0 device) caused some CTS failures.

Cc: mesa-stable
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34744>
(cherry picked from commit 13b35a3c9c)
2025-06-04 15:52:48 +02:00
Mike Blumenkrantz
6eb10b69a4 zink: fix queue transition check in check_for_layout_update()
this only applies if the resource has active binds, otherwise it triggers crashes

Fixes: 18d206d67c ("zink: Check queue families when binding image resources")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35234>
(cherry picked from commit 44bff7eb05)
2025-06-04 15:52:48 +02:00
Mike Blumenkrantz
af3a5a15d2 zink: also check for host-visible on staging uploads
this has strange mechanics on lavapipe

Fixes: e63acdd2b7 ("zink: force cached mem for streaming uploads")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35239>
(cherry picked from commit d8d913c341)
2025-06-04 15:52:48 +02:00
Mel Henning
031c20e9d3 nvk: Call ensure_slm for nvk_cmd_dispatch_shader
Internal shaders can also use slm, so we need to allocate it correctly.

This fixes
dEQP-VK.dgc.ext.compute.misc.max_pc_range_256_full_preprocess_with_execution_set
with NAK_DEBUG=spill

Fixes: 105bdf2e36 ("nvk: Add a helper for dispatching compute shaders")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35143>
(cherry picked from commit 0e5880ebe4)
2025-06-04 15:52:48 +02:00
Faith Ekstrand
569459a804 nvk: Only allow importing mappable dma-bufs to HOST_VISIBLE types
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35213>
(cherry picked from commit 601cf33c44)
2025-06-04 15:52:48 +02:00
Olivia Lee
43b914aef4 panfrost: legalize afbc before zs and rt clears
In panfrost_clear_depth_stencil and panfrost_clear_render_target, we
start the blit context before binding the clear targets. If we don't
legalize AFBC beforehand, we get a recursive blit crash. panfrost_clear
does not need this because the resource should already be legalized in
panfrost_batch_add_surface.

Fixes the following piglit tests with pan_force_afbc_packing:
 - spec@arb_clear_texture@arb_clear_texture-base-formats
 - spec@arb_clear_texture@arb_clear_texture-simple
 - spec@arb_clear_texture@arb_clear_texture-sized-formats

Fixes: 17a62ff993 ("panfrost: legalize afbc before blitting")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34992>
(cherry picked from commit 104ea2e4cf)
2025-06-04 15:52:48 +02:00
Olivia Lee
b4d4799ce9 panfrost: fix assertion failure compiling image conversion shaders
In 59a3e12039, we changed the UBO->push optimization in panfrost to
only push UBOs that are available in a CPU buffer. We require
first_ubo_is_default_ubo, to ensure that UBO0 will be a user buffer. We
weren't setting this flag for the image conversion shaders, so got an
assertion failure compiling them. This can be triggered by the
panvk_force_afbc_packing driconf option.

The conversion shader info UBO isn't exactly a "default" UBO in the
sense of being lowered from uniforms, but it is a user buffer, so
setting the flag should be fine.

Fixes: 59a3e12039 ("panfrost: do not push "true" UBOs")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34992>
(cherry picked from commit bed54fa402)
2025-06-04 15:52:47 +02:00
Yiwei Zhang
ed16a3a87c vulkan/wsi: include missing barrier for transferring to blit dst image
Fixes: 2975a7f453 ("vulkan/wsi: Add support for image -> image blits")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35220>
(cherry picked from commit 2af2314fb2)
2025-06-04 15:52:47 +02:00
Paulo Zanoni
ba15c7a660 intel/isl: don't clamp num_elements to (1 << 27)
The BSpec page for Structure_RENDER_SURFACE_STATE says:

  "For typed buffer and structured buffer surfaces, the number of
   entries in the buffer ranges from 1 to 2^27. For raw buffer
   surfaces, the number of entries in the buffer is the number of
   bytes which can range from 1 to 2^30. After subtracting one from
   the number of entries, software must place the fields of the
   resulting 27-bit value into the Height, Width, and Depth fields as
   indicated, right-justified in each field. Unused upper bits must be
   set to zero."

According to the vkd3d-proton developers, this is what is happening
with the applications:

  "There's also the problematic case of games using typed descriptors
   but passing non-typed buffer descriptors, which is an extremely
   common app bug that works on all D3D12 drivers that we need to work
   around by creating typed views."

Previously, we had an assert() to check for "num_elements > (1 <<
27)", but that assert was preventing us from running games such as
Marvel's Spider-Man Remastered and Assassin's Creed: Valhalla in Debug
mode. So not only I removed the assert, but I also made the code clamp
num_elements to the maximum of (1 << 27) based on my incorrect
interpretation of the paragraph quoted above from BSpec.

What I did not realize was that num_elements is being used just to
calculate Structure_RENDER_SURFACE_STATE Height, Width and Depth, and
our register bit fields on SKL and newer are big enough to fit any
number of num_elements up to 2^32, not only 2^27. Clamping
num_elements results in an incorrect value for S.Depth, which
generates visual corruption in some games.

On Marvel's Spider-Man Remastered, without this patch the texture of
the asphalt in some streets (like the very first one you jump to when
the game starts) gets rendered incorrectly.

Testcase: vkd3d-proton/d3d12/test_large_texel_buffer_view
Link: https://github.com/HansKristian-Work/vkd3d-proton/issues/2071
Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12827
Fixes: f3c7e14f09 ("isl: don't assert(num_elements > (1ull << 27))")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35032>
(cherry picked from commit ecc90e1bb3)
2025-06-04 15:52:47 +02:00
Jordan Justen
b7472364c2 intel/dev: Add BMG PCI IDs 0xe220-0xe223
Ref: bspec 68090
Backport-to: 25.0, 25.1
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35139>
(cherry picked from commit 4c4d90ae49)
2025-06-04 15:52:47 +02:00
Karol Herbst
bfecaf4040 rusticl/kernel: rework validation in clSetKernelExecInfo
We should use the cl_slice code to get proper validation, which also makes
it simpler to read out data and gets rid of some UB there.

This also fixes CL_KERNEL_EXEC_INFO_SVM_PTRS with param_value being null.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32942>
(cherry picked from commit 35a9829391)
2025-06-04 15:52:47 +02:00
Karol Herbst
a1966159d9 zink: set unordered_read/write after buffer_barrier in set_global_binding
Fixes: a6e9e0f0d7 ("zink: add set_global_binding")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32942>
(cherry picked from commit a04569b2ea)
2025-06-04 15:52:47 +02:00
Yiwei Zhang
6acc812477 panvk: fix memory binding for wsi image alias
Fixes: f77fe432c1 ("panvk: support binding swapchain memory")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35197>
(cherry picked from commit 7e2fe6d1c1)
2025-06-04 15:52:47 +02:00
Mary Guillemard
6ce036e6c0 pan/genxml: Fix typo for NEXT_SB_ENTRY
"NEXT_SB_ENTR" -> "NEXT_SB_ENTRY"

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: 811525b543 ("pan/genxml: Build libpanfrost_decode for v12")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35089>
(cherry picked from commit f6f5bee080)
2025-06-04 15:52:47 +02:00
Marek Olšák
4fd8946062 glsl: fix sampler and image type checking in lower_precision
Use the param type, not the referenced variable. The referenced variable
can be a structure, which wouldn't be recognized as a sampler or image.

Fixes: 733bee57eb - glsl: lower samplers with highp coordinates correctly

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Dieter Nützel Dieter@nuetzel-hh.de on gfx8 (Polaris 20)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34959>
(cherry picked from commit bd5d623674)
2025-06-04 15:52:47 +02:00
Marek Olšák
39a8d4425a winsys/amdgpu: fix running out of 32bit address space with high FPS
Reproduced with gfxbench5 gl_tess_off.

Fixes: 4d486888ee - winsys/amdgpu: rewrite BO fence tracking by adding a new queue fence system

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34983>
(cherry picked from commit 4bf2a28334)
2025-06-04 15:52:47 +02:00
Samuel Pitoiset
e35c5d643b radv: add radv_disable_hiz_his_gfx12 and enable for Mafia Definitive Edition
This is a workaround for random GPU hangs with HiZ/HiS on GFX12
because the correct fix is complex and it will take time to be
implemented properly.

Mafia Definitive Edition is the first known game affected by this.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13222
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35182>
(cherry picked from commit 2ebfa64be7)
2025-06-04 15:52:47 +02:00
Adam Jackson
a186710269 vtn/opencl: Handle OpenCLstd_F{Min,Max}_common
Normal fmin doesn't make any promises about NaN, common additionally
doesn't make any promises about infinities. Would be nice to hook that
up to codegen but lowering them to normal works for now.

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34941>
(cherry picked from commit 4b1c824b67)
2025-06-04 15:52:47 +02:00
Adam Jackson
0ab0792c46 vtn: (Silently) handle FunctionParameterAttributeNo{Capture,Write}
Silences a few thousand warnings in sycl/test-e2e

Cc: mesa-stable
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34941>
(cherry picked from commit 92f07860a4)
2025-06-04 15:52:47 +02:00
Samuel Pitoiset
5ad7ae003f radv: fix capture/replay with sparse images and descriptor buffer
The sparse image VA needs to be returned to the application for replay.

Reported by Baldur.

VKCTS has coverage but it doesn't verify this yet.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35162>
(cherry picked from commit 63758bc093)
2025-06-04 15:52:47 +02:00
Erik Faye-Lund
c29cacb77a panfrost: do not try to use 4x4 tiles on v4 gpus
Mali V4 GPUs only ever use 16x16 tiles, so we need to set the minimum
tile-size to match.

Fixes: 329568b5eb ("panfrost: add color-attachment and msaa helpers")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
(cherry picked from commit 483ce5a1dc)
2025-06-04 15:52:47 +02:00
Erik Faye-Lund
7fb8044bcc mesa/main: remove non-existing function prototype
This function was removed about a decade ago, let's get rid of the
prototype as well!

Fixes: a347a0f53f ("mesa: Completely remove QuerySamplesForFormat from driver func table")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35184>
(cherry picked from commit 439b88c619)
2025-06-04 15:52:46 +02:00
Faith Ekstrand
7e0c8b8efd nouveau/mme: Don't install the HW tests
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35163>
(cherry picked from commit 26ba29f75b)
2025-06-04 15:52:46 +02:00
Mel Henning
4483824a0a nak/spill_values: Follow phis from src to dest
ssa_state_out has the predecessor's SSAValue, so we need look for it in
the phi_src map.

Totals:
CodeSize: 4545122720 -> 4534830176 (-0.23%); split: -0.23%, +0.00%
Number of GPRs: 10963889 -> 10963693 (-0.00%); split: -0.00%, +0.00%
SLM Size: 1855380 -> 1649308 (-11.11%); split: -11.11%, +0.01%
Static cycle count: 1104322907 -> 1093035821 (-1.02%); split: -1.02%, +0.00%
Spills to memory: 480689 -> 139107 (-71.06%)
Fills from memory: 480689 -> 139107 (-71.06%)
Spills to reg: 458804 -> 242139 (-47.22%); split: -47.23%, +0.01%
Fills from reg: 303068 -> 222030 (-26.74%); split: -26.75%, +0.01%
Max warps/SM: 7245516 -> 7245580 (+0.00%)

Totals from 9899 (5.04% of 196502) affected shaders:
CodeSize: 1056727952 -> 1046435408 (-0.97%); split: -0.98%, +0.00%
Number of GPRs: 1666652 -> 1666456 (-0.01%); split: -0.01%, +0.00%
SLM Size: 1107988 -> 901916 (-18.60%); split: -18.61%, +0.01%
Static cycle count: 254942337 -> 243655251 (-4.43%); split: -4.43%, +0.01%
Spills to memory: 480689 -> 139107 (-71.06%)
Fills from memory: 480689 -> 139107 (-71.06%)
Spills to reg: 367784 -> 151119 (-58.91%); split: -58.92%, +0.01%
Fills from reg: 222209 -> 141171 (-36.47%); split: -36.49%, +0.02%
Max warps/SM: 119188 -> 119252 (+0.05%)

Fixes: bcad2add47 ("nak: Add a spilling pass")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 6c68c2c3ba)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35244>
2025-06-04 15:52:46 +02:00
Olivia Lee
d0dd9ab2a8 panvk/csf: fix provoking vertex mode in partial secondary cmdbufs
For partial secondary cmdbufs, we emit FBDs/TDs in the primary cmdbuf
before calling the secondary. In order to set the provoking vertex mode
correctly here, we need to look at the mode set by pipelines bound in
the secondary cmdbuf.

This leaves one edge case: reemitting FBDs/TDs in a secondary cmdbuf
after a flush. If the secondary cmdbuf only contains vk_meta draws,
without ever binding a pipeline, we won't know which provoking vertex
mode to use here. This is actually okay, because in that case the
provoking vertex mode doesn't matter for any of the draws in the
secondary, and the FBDs/TDs will be reemitted on the primary with the
correct mode.

Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 65406cf500)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
fa98cf6af0 panvk/csf: fix case where vk_meta is used before PROVOKING_VERTEX_MODE_LAST
In this case, we need to emit the FBDs and TDs for the meta command
before we know what provoking vertex mode the application is going to
use. To handle this, we make a guess for which provoking vertex mode we
need. Then we use cs_maybe to leave space to flip the provoking vertex
bit if the guess was wrong.

This case is still unhandled on JM.

Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 885805560f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
f8b061c99c panvk: fix case where vk_meta is used after PROVOKING_VERTEX_MODE_LAST
Because we advertise provokingVertexModePerPipeline=false, the provoking
vertex mode must be set the same for all pipelines used in a renderpass.
vk_meta doesn't care about the provoking vertex mode, but the vulkan api
doesn't provide a way to express this, so it always sets
PROVOKING_VERTEX_MODE_FIRST (the vulkan default). This causes an
assertion failure when vk_meta is used in a renderpass where the
application sets PROVOKING_VERTEX_MODE_LAST.

There are a few different cases here, that need different handling. The
simplest is when vk_meta is used after the first application draw, in
which case we can just ignore the state passed by vk_meta and use the
existing state.

Fixes: 7a9f14d3c2 ("panvk: advertise VK_EXT_provoking_vertex")
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 4d99346477)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
4f2353e598 panvk: track whether we are in a vk_meta command
This is needed to handle the provoking vertex mode correctly. vk_meta
doesn't care which provoking vertex mode is used, but there is no way to
express this directly in the vulkan api.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 32177b99d5)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
49bdf4669b panvk/csf: set up shared register dump regions for cs functions
The tiler OOM exception handler allocated a region of memory to dump
save/restored registers. For defining more functions in the future, we
allocate a register dump region for each subqueue, that can hold the
largest number of registers needed by any functions executed on that
subqueue.

This does mean that we cannot have function calls more than one deep. If
we ever need nested function calls, we will have to consider a real
stack.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit d60c688317)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
41c8b9a461 pan/csf: rename cs_exception_handler to cs_function
The register save/restore machinery is useful for more general callable
functions, not just exception handlers.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 61e7d47270)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Olivia Lee
48e9e7ba47 pan/csf: add cs_maybe mechanism to retroactively patch cs contents
We have an edge case with VK_EXT_provoking_vertex where we may need to
emit FBDs and TDs before we know what provoking vertex mode the
application is using for the renderpass. To handle this, we want to
retroactively patch the provoking vertex bit. This commit introduces an
abstraction to do that.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Tested-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
(cherry picked from commit 83bb97796b)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35194>
2025-06-04 15:52:46 +02:00
Mike Blumenkrantz
a4f5779bb1 tc: fix detection of in-flight resource usage when sync is used
tc_sync reuses the same batch, which breaks the current disambiguation
methods by returning !busy for work which is currently executing
on the reused batch

by also tracking the completed generation, this scenario is detected
and disambuguated

Fixes: 9cc06f817c ("tc: allow unsynchronized texture_subdata calls where possible")
(cherry picked from commit b89e0fa226)

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35204>
2025-06-04 15:52:46 +02:00
Lars-Ivar Hesselberg Simonsen
c01db1fb7f panfrost: Apply direct dispatch WLS instance limit
Apply the direct dispatch WLS instance limit to panfrost as well to keep
compute jobs with large workgroup counts from running out of memory.

Fixes: 1304f4578d ("panfrost: Adapt emit_shared_memory for indirect dispatch")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
(cherry picked from commit 64ce37b2d9)
2025-06-04 15:52:46 +02:00
Lars-Ivar Hesselberg Simonsen
cc2e341a14 panvk/jm: Apply direct dispatch WLS instance limit
Apply the direct dispatch WLS instance limit to PanVK/JM as well to keep
compute jobs with large workgroup counts from hitting
VK_ERROR_OUT_OF_DEVICE_MEMORY.

Fixes: 005703e5b5 ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls"
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
(cherry picked from commit e6e406de0e)
2025-06-04 15:52:46 +02:00
Lars-Ivar Hesselberg Simonsen
6d1e51de04 panvk/v10+: Limit direct dispatch WLS allocation
During direct dispatch, we calculate the size of the WLS allocation
based on the number of WLS instances which is an unbounded calculation
on number of workgroups.

This leads to extreme allocation sizes and potentially
VK_ERROR_OUT_OF_DEVICE_MEMORY for direct dispatches with a high amount
of workgroups.

This change adds an upper bound to the number of WLS instances, using
the same value we assume for indirect dispatches.

Additionally, this commit fixes the WLS max instance calculation (which
should be per core).

Fixes: 5544d39f44 ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
(cherry picked from commit 0a47a1cb6d)
2025-06-04 15:52:46 +02:00
Lars-Ivar Hesselberg Simonsen
de8423ef2a panvk/v10+: Remove unnecessary alloc in dispatch_precomp
The CSF version of dispatch_precomp allocates TLS/WLS prior to calling
cmd_dispatch_prepare_tls, which will do the same.

This commit removes this unnecessary allocation.

Fixes: cc02c5deb4 ("panvk: Implement precomp dispatch")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34979>
(cherry picked from commit a6c7a774ab)
2025-06-04 15:52:46 +02:00
Faith Ekstrand
84e66ae44a nvk: Allocate the correct VAB size on Kepler
We were allocating 128 KiB but claimed 256 KiB.  Allocate the right size
and assert that the size matches.

Fixes: 970bd70584 ("nvk: allocate VAB memory area")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35172>
(cherry picked from commit 9fe2a21e93)
2025-06-04 15:52:46 +02:00
Patrick Lerda
80263066b9 r600: fix pop-free clipping
This update is aimed at fixing pop-free clipping and follows
the advices by Vitaliy Kuzmin: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12440

This functionality requires calculating the value of the following two
registers: PA_CL_GB_HORZ_DISC_ADJ and PA_CL_GB_VERT_DISC_ADJ. These two
registers are available on all the gpus of the r600 family.

This code is built on the backport of radeonsi updates which are relevant
to this very functionality:
57e658d041 "radeonsi: rework how guardband registers are updated to decrease overhead"
146c2b7c28 "radeonsi: adjust clip discard based on line width / point size"
4d74432dd3 "radeonsi: don't discard points and lines"
63680471f9 "radeonsi: remove si_context::{scissor_enabled,clip_halfz}"

This change was tested on rv770, barts and cayman:
deqp-gles[2-3]/functional/clipping/line/wide_line_clip_viewport_center: fail pass
deqp-gles[2-3]/functional/clipping/line/wide_line_clip_viewport_corner: fail pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip: fail pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip_viewport_center: fail pass
deqp-gles[2-3]/functional/clipping/point/wide_point_clip_viewport_corner: fail pass

Cc: mesa-stable
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35052>
(cherry picked from commit df2c774a83)
2025-06-04 15:52:45 +02:00
Qiang Yu
25604929b1 nir/opt_varyings: fix mesh shader miss promote varying to flat
We still allow mesh shader promote constant output to flat, but
mesh shader like geometry shader may store multi vertices'
varying in a single thread. So mesh shader may store different
constant values to different vertices in a single thread, we
should not promote this case to flat.

I'm not using shader_info.mesh.ms_cross_invocation_output_access
because OpenGL does not require IO to have explicit location, so
when nir_shader_gather_info is called in OpenGL GLSL compiler to
compute ms_cross_invocation_output_access, some implicit output
has -1 location which causes ms_cross_invocation_output_access
unset for it.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13134
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35081>
(cherry picked from commit 6f2a1e19da)
2025-06-04 15:52:45 +02:00
Timothy Arceri
d976a8fdf7 util: add workaround for the game Foundation
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12882
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35107>
(cherry picked from commit bf24d56862)
2025-06-04 15:52:45 +02:00
Timothy Arceri
826fe18abd mesa: extend linear_as_nearest work around
Here we allow packed stencils to skip the completeness check also.
Will be used in the following patch for a bug in the game Foundation.

Cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35107>
(cherry picked from commit 27945bbd8a)
2025-06-04 15:52:45 +02:00
Mel Henning
3dedf9bbc1 nak: Fix a perf regression in tex lowering
These lines look like they were mistakenly introduced, and cause a
significant perf hit. Eg. this fix improves the Horizon Zero Dawn
in-game benchamark by ~42% on my ampere machine (5992 pts -> 8517 pts).

Fixes: d16e75e55f ("nak: Lower texture inputs for Kepler B")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35100>
(cherry picked from commit 9d620fabd2)
2025-06-04 15:52:45 +02:00
Mike Blumenkrantz
c9ff965c22 lavapipe: handle counterOffset in vkCmdDrawIndirectByteCountEXT
fixes dEQP-VK.transform_feedback.simple.draw_indirect*counter_offset*

cc: mesa-stable

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35076>
(cherry picked from commit 42b303c7b0)
2025-06-04 15:52:45 +02:00