6592 commits

Author SHA1 Message Date
Rob Clark
0a1f56fb90 freedreno/devices: Update chicken bits
b22 should be set on all a7xx.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37411>
2025-09-16 16:21:42 +00:00
Danylo Piliaiev
1c57f88908 tu: Reset BIN_FOVEAT regs for tiling with and without HW binning
We didn't reset the regs when HW binning was disabled.

Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37373>
2025-09-15 20:11:21 +00:00
Job Noorman
563b0b347a ir3: don't create merge sets for subreg moves
There are multiple places where RA assumes merge sets are either
all-full or all-half registers. Creating merge sets for subreg moves
mixes full and half registers which may lead to RA failures.

Fix this by not creating merge sets for subreg moves anymore. Instead,
we manually try to allocate a subreg move's src for its dst when
selecting a register during RA, similar to how ALU/SFU instructions try
to reuse their srcs.

Totals:
Instrs: 363174291 -> 363175216 (+0.00%); split: -0.00%, +0.00%
CodeSize: 922975364 -> 922977230 (+0.00%); split: -0.00%, +0.00%
NOPs: 47652421 -> 47652444 (+0.00%); split: -0.00%, +0.00%
MOVs: 15652959 -> 15653065 (+0.00%); split: -0.00%, +0.00%
COVs: 4097203 -> 4097052 (-0.00%); split: -0.01%, +0.00%
(ss): 7806025 -> 7806183 (+0.00%); split: -0.00%, +0.00%
(sy): 3981862 -> 3981855 (-0.00%); split: -0.00%, +0.00%
(ss)-stall: 26612057 -> 26612789 (+0.00%); split: -0.00%, +0.00%
(sy)-stall: 111568786 -> 111568721 (-0.00%); split: -0.00%, +0.00%
STPs: 345796 -> 345792 (-0.00%)
LDPs: 191118 -> 191111 (-0.00%)
Preamble Instrs: 160491915 -> 160492355 (+0.00%); split: -0.00%, +0.00%
Last helper: 116587870 -> 116588273 (+0.00%); split: -0.00%, +0.00%
Cat0: 53288367 -> 53288384 (+0.00%); split: -0.00%, +0.00%
Cat1: 20954383 -> 20954336 (-0.00%); split: -0.00%, +0.00%
Cat2: 155294307 -> 155295252 (+0.00%); split: -0.00%, +0.00%
Cat6: 4623070 -> 4623059 (-0.00%)
Cat7: 9302363 -> 9302384 (+0.00%); split: -0.00%, +0.00%

Totals from 979 (0.07% of 1352016) affected shaders:
Instrs: 1324850 -> 1325775 (+0.07%); split: -0.07%, +0.14%
CodeSize: 2596114 -> 2597980 (+0.07%); split: -0.04%, +0.11%
NOPs: 330197 -> 330220 (+0.01%); split: -0.23%, +0.24%
MOVs: 62592 -> 62698 (+0.17%); split: -0.35%, +0.52%
COVs: 49011 -> 48860 (-0.31%); split: -0.62%, +0.31%
(ss): 35671 -> 35829 (+0.44%); split: -0.28%, +0.73%
(sy): 18936 -> 18929 (-0.04%); split: -0.13%, +0.09%
(ss)-stall: 157929 -> 158661 (+0.46%); split: -0.36%, +0.82%
(sy)-stall: 543371 -> 543306 (-0.01%); split: -0.20%, +0.19%
STPs: 2741 -> 2737 (-0.15%)
LDPs: 3022 -> 3015 (-0.23%)
Preamble Instrs: 322588 -> 323028 (+0.14%); split: -0.01%, +0.14%
Last helper: 298996 -> 299399 (+0.13%); split: -0.05%, +0.19%
Cat0: 361575 -> 361592 (+0.00%); split: -0.21%, +0.22%
Cat1: 111733 -> 111686 (-0.04%); split: -0.45%, +0.41%
Cat2: 487366 -> 488311 (+0.19%); split: -0.04%, +0.23%
Cat6: 21239 -> 21228 (-0.05%)
Cat7: 37170 -> 37191 (+0.06%); split: -0.06%, +0.12%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c757b22c5f ("ir3: add subreg move optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37368>
2025-09-15 15:07:47 +00:00
Daniel Stone
1754bfa94a ci/freedreno: Skip overly-slow trace
The Godot trace has started timing out, taking close to or over 5min to
run. It's been skipped out on zink-tu-a618 for this reason, so do it on
the native driver too.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13894
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37366>
2025-09-15 10:03:22 +00:00
Yonggang Luo
bebd167d74 glsl: Fixes warning: deprecated directive: ‘%pure-parser’, ‘%error-verbose’
../../src/compiler/glsl/glcpp/glcpp-parse.y:179.1-12: warning: deprecated directive: ‘%pure-parser’, use ‘%define api.pure’ [-Wdeprecated]
  179 | %pure-parser
      | ^~~~~~~~~~~~
      | %define api.pure
../../src/compiler/glsl/glcpp/glcpp-parse.y:180.1-14: warning: deprecated directive: ‘%error-verbose’, use ‘%define parse.error verbose’ [-Wdeprecated]
  180 | %error-verbose
      | ^~~~~~~~~~~~~~
      | %define parse.error verbose

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Collabora's Gfx CI Team
db3501ec4f Uprev Piglit to 517270ccca11a795d2f29bd723c362eb6ef9ce8f
28d1349844...517270ccca

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37273>
2025-09-12 23:09:46 -03:00
Eric Engestrom
11a7693065 turnip/ci: update test expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37302>
2025-09-11 16:02:38 +00:00
Rob Clark
0fe652971e freedreno/a6xx: Add missing format
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37279>
2025-09-11 03:08:54 +00:00
Rob Clark
250dba1dce freedreno/a6xx: Fallback to original blit in the snorm_copy path
Unlike z/s blits, where we want the fallback to use the re-written blit,
we don't want this in the handle_snorm_copy_blit() path.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37279>
2025-09-11 03:08:54 +00:00
Anna Maniscalco
011ba1842e freedreno/registers: add CP_ALWAYS_ON_CONTEXT
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37237>
2025-09-10 15:10:14 +00:00
Yonggang Luo
f3c3b99e60 clang-format: Move ForEachMacros into src/.clang-format for freedreno
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Yonggang Luo
773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The document is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The json-schema is at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Rob Clark
15ee3873aa freedreno/registers: Update GMU register xml
Synced from kernel commit f23e09a60d48 ("drm/msm: Update GMU register
xml").

Update GMU register xml with additional definitions for a7x family.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
a31dc3c5af freedreno/registers: Generate _HI/LO builders for reg64
Mesa has shifted more things to reg64 instead of separate 32b HI/LO
reg32's.  This works better with the "new-style" c++ builders that
mesa has been migrating to for a6xx+ (to better handle register
shuffling between gens), but it leaves the C builders with missing
_HI/LO builders.

So handle the special case of reg64, automatically generating the
missing _HI/LO builders.  (This is for the benefit of the kernel
which cannot use the c++ builders.)

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
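Note on a31dc3c5af: as a rough illustration of what separate _HI/_LO values of a 64-bit register amount to, the sketch below splits a 64-bit value into its two 32-bit halves. The name `split_reg64` is hypothetical; it is not part of gen_header.py or of the generated builders.

```python
def split_reg64(value):
    """Return the (lo, hi) 32-bit halves of a 64-bit register value."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value does not fit in 64 bits")
    lo = value & 0xFFFFFFFF   # bits 31:0, what a _LO builder would carry
    hi = value >> 32          # bits 63:32, what a _HI builder would carry
    return lo, hi


lo, hi = split_reg64(0x123456789ABCDEF0)
```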
Rob Clark
04e2140d8b freedreno/registers: remove python 3.9 dependency for compiling msm
Synced from kernel commit bb1953588068 ("drm/msm: remove python 3.9
dependency for compiling msm").

Since commit 5acf49119630 ("drm/msm: import gen_header.py script from Mesa"),
compilation is broken on machines having python versions older than 3.9
due to dependency on argparse.BooleanOptionalAction.

Switch to use simple bool for the validate flag to remove the dependency.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
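Note on 04e2140d8b: `argparse.BooleanOptionalAction` only exists from Python 3.9 onward, while a plain `store_true` flag works on older interpreters. A minimal sketch of the "simple bool" approach follows; the option name `--validate` and the helper `build_parser` are assumptions for illustration and may not match the actual script.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # A plain boolean flag: absent means False, present means True.
    # This avoids argparse.BooleanOptionalAction (Python 3.9+).
    parser.add_argument("--validate", action="store_true", default=False)
    return parser


args_on = build_parser().parse_args(["--validate"])
args_off = build_parser().parse_args([])
```

The trade-off is that `store_true` does not generate a matching `--no-validate` option the way `BooleanOptionalAction` does.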
Rob Clark
91ff96b513 freedreno/registers: Remove license/etc from generated headers
Since these generated files are no longer checked in, either in mesa or
in the linux kernel, simplify things by dropping the verbose generated
comment.

These were semi-nerf'd on the kernel side, in the name of build
reproducibility, by commit ba64c6737f86 ("drivers: gpu: drm: msm:
registers: improve reproducibility"), but in a way that was semi-
kernel specific.  We can just reduce the divergence between kernel
and mesa by just dropping all of this.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
a70279adc2 freedreno/registers: Re-enable validation for gen_header.py
Commit 84e93daa26 ("freedreno/registers: allow skipping the
validation") synced a change that made validation optional for
kernel builds, to avoid a lxml dependency for kernel builds.
But this inadvertently also disabled schema validation on the
mesa side.  CI (and meson "test" target) still validates the
xml against the schema, but it is easier if this is also done
as part of the normal build to avoid surprises from Marge.

Fixes: 84e93daa26 ("freedreno/registers: allow skipping the validation")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:23 +00:00
Connor Abbott
764b3d9161 tu: Implement transient attachments and lazily allocated memory
Transient attachments have been in Vulkan since 1.0, and are a way to
avoid allocating memory for attachments that can be stored entirely in
tile memory. The driver exposes a memory type with LAZILY_ALLOCATED_BIT,
and apps use this type to allocate images with TRANSIENT_ATTACHMENT
usage, which are restricted to color/depth/stencil/input attachment
usage. The driver is supposed to then delay allocating memory until it
knows that one of the images bound to the VkDeviceMemory must have
actual backing memory.

Implement this using the "lazy VMA" mechanism added earlier. We reserve
an iova range for lazy BOs, and only allocate them if we chose sysmem
rendering or there is a LOAD_OP_LOAD/STORE_OP_STORE. Because we never
split render passes and force sysmem instead, we don't have to deal with
the additional complexity of that here and just allocate everything.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
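Note on 764b3d9161: the allocation rule described in the message (allocate only for sysmem rendering or when there is a LOAD_OP_LOAD/STORE_OP_STORE) can be sketched as a small predicate. The `Attachment` type, the string values, and `needs_backing_memory` are hypothetical stand-ins, not turnip code.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    load_op: str   # e.g. "LOAD", "CLEAR", "DONT_CARE"
    store_op: str  # e.g. "STORE", "DONT_CARE"


def needs_backing_memory(use_sysmem, attachments):
    """Decide whether lazily allocated attachments need a real BO."""
    if use_sysmem:
        # Sysmem rendering writes attachments to memory directly.
        return True
    # In tiled rendering, memory is only touched by loads and stores.
    return any(a.load_op == "LOAD" or a.store_op == "STORE"
               for a in attachments)


tile_only = [Attachment("CLEAR", "DONT_CARE")]
stored = [Attachment("CLEAR", "STORE")]
```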
Connor Abbott
3b990ba210 tu: Make tu_image point to tu_device_memory instead of tu_bo
Up until now tu_device_memory (turnip's VkDeviceMemory) was a thin
wrapper around tu_bo (the GEM BO), so when binding an image to a
VkDeviceMemory we could just store the BO. But now we have to skip
allocating the BO unless we need to for lazily-allocated memory, and the
tracking for that needs to happen at the API level instead of the
kernel/GEM level, so store the tu_device_memory instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
88d001383a tu: Add support for a "lazy" sparse VMA
Add an extremely limited form of sparse where zeroing memory is not
supported and only one BO can be fully bound to the sparse VMA
immediately when it's created. This can be implemented on drm/msm even
without VM_BIND, by just reserving the iova range. However kgsl doesn't
let us control iova offsets, so we have to use "real" sparse support to
implement it. In effect this lets us reserve an iova range and then
"lazily" allocate the BO. This will be used for transient allocations in
Vulkan when we have to fallback to sysmem.

As part of this we add skeleton sparse VMA support to virtio, which is
just enough for lazy VMAs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
93a80f4bb9 tu/drm: Split out iova allocation and BO allocation
Reserve an iova range separately before allocating a BO. This reduces
the size of the critical section under the VMA lock and paves the way
for lazy BOs, where iova initialization is separated out.

While we're here, shrink the area where the VMA mutex is applied when
importing a dma-buf. AFAICT it's not useful to lock the entire function,
only the VMA lookup and zombie BO handling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
b663d8f762 freedreno: Add blit_wfi_quirk and use in turnip
When enabling
dEQP-VK.renderpass2.dedicated_allocation.attachment_allocation.grow.17,
we see a hang on a618 when a draw is immediately followed by a blit
without anything in between. The draw and clear are writing completely
different surfaces.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Emma Anholt
29fb897c0a ir3: Enable nir_opt_shrink_shrink_vec_array_vars.
The effect is surprisingly big, though it does seem to be concentrated in
just a few apps (Batman: Arkham Origins, Metro 2033 Redux, Shadow
Warrior):

Totals:
MaxWaves: 19680240 -> 19788620 (+0.55%); split: +0.55%, -0.00%
Instrs: 369291159 -> 367831500 (-0.40%); split: -0.40%, +0.01%
CodeSize: 936669580 -> 933798912 (-0.31%); split: -0.31%, +0.00%

...

Totals from 16918 (1.21% of 1402199) affected shaders:
MaxWaves: 125724 -> 234104 (+86.20%); split: +86.83%, -0.63%
Instrs: 11328230 -> 9868571 (-12.89%); split: -13.13%, +0.25%
CodeSize: 23684238 -> 20813570 (-12.12%); split: -12.24%, +0.12%
NOPs: 1633346 -> 1640119 (+0.41%); split: -2.09%, +2.50%
MOVs: 1940036 -> 510016 (-73.71%); split: -75.07%, +1.36%
COVs: 188107 -> 188546 (+0.23%); split: -0.32%, +0.56%
Full: 454239 -> 263078 (-42.08%); split: -42.80%, +0.71%
(ss): 251004 -> 231443 (-7.79%); split: -9.81%, +2.01%
(sy): 116086 -> 115153 (-0.80%); split: -2.38%, +1.58%
(ss)-stall: 738920 -> 794215 (+7.48%); split: -7.13%, +14.62%
(sy)-stall: 3321071 -> 3193717 (-3.83%); split: -5.58%, +1.74%
STPs: 101880 -> 71523 (-29.80%)
LDPs: 17406 -> 14411 (-17.21%)
Preamble Instrs: 2519390 -> 2548205 (+1.14%); split: -0.31%, +1.46%
Subgroup size: 1097472 -> 1097920 (+0.04%)

Cat0: 1833041 -> 1839613 (+0.36%); split: -1.91%, +2.27%
Cat1: 2128393 -> 698894 (-67.16%); split: -68.42%, +1.26%
Cat2: 3602449 -> 3595086 (-0.20%); split: -0.24%, +0.03%
Cat3: 2817384 -> 2815410 (-0.07%); split: -0.08%, +0.01%
Cat4: 273682 -> 273655 (-0.01%)
Cat5: 304630 -> 304398 (-0.08%)
Cat6: 207434 -> 179648 (-13.40%); split: -13.70%, +0.31%
Cat7: 161217 -> 161867 (+0.40%); split: -1.25%, +1.65%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>
2025-09-06 00:03:12 +00:00
Emma Anholt
b353f868dc ir3: Enable nir_opt_shrink_stores.
This pass strips trailing components not in the writemask of store
intrinsics, or from the trailing components that aren't part of an image's
format.

Totals from 11641 (0.83% of 1402199) affected shaders:
MaxWaves: 159402 -> 159422 (+0.01%); split: +0.08%, -0.07%
Instrs: 3073536 -> 3064117 (-0.31%); split: -0.59%, +0.28%
CodeSize: 7529906 -> 7417398 (-1.49%); split: -1.54%, +0.04%
NOPs: 286665 -> 289623 (+1.03%); split: -2.71%, +3.74%
MOVs: 85466 -> 74849 (-12.42%); split: -14.28%, +1.86%
Full: 116869 -> 116557 (-0.27%); split: -0.35%, +0.09%
(ss): 68245 -> 65758 (-3.64%); split: -5.23%, +1.59%
(sy): 31673 -> 31812 (+0.44%); split: -0.75%, +1.19%
(ss)-stall: 160473 -> 161653 (+0.74%); split: -3.63%, +4.37%
(sy)-stall: 668624 -> 668566 (-0.01%); split: -2.82%, +2.81%
Preamble Instrs: 1059243 -> 1033109 (-2.47%); split: -2.47%, +0.00%
Early Preamble: 10550 -> 10530 (-0.19%)
Subgroup size: 1172672 -> 1172416 (-0.02%); split: +0.01%, -0.03%

Cat0: 323161 -> 326364 (+0.99%); split: -2.50%, +3.49%
Cat1: 156177 -> 145280 (-6.98%); split: -7.92%, +0.95%
Cat2: 1448974 -> 1448964 (-0.00%)
Cat3: 874169 -> 874175 (+0.00%)
Cat5: 75743 -> 75742 (-0.00%)
Cat7: 38702 -> 36982 (-4.44%); split: -5.80%, +1.35%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>
2025-09-06 00:03:12 +00:00
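Note on b353f868dc: stripping trailing components that are not in a store's writemask comes down to finding the highest enabled component. The sketch below shows only that arithmetic; `shrunk_components` is a hypothetical helper, not the NIR pass itself, and it ignores the image-format case mentioned in the message.

```python
def shrunk_components(num_components, writemask):
    """Number of components left after dropping unwritten trailing ones."""
    # Ignore mask bits beyond the vector width.
    mask = writemask & ((1 << num_components) - 1)
    # The highest set bit decides how many leading components must stay;
    # unwritten components in the middle cannot be dropped.
    return mask.bit_length()


xy_only = shrunk_components(4, 0b0011)
x_and_z = shrunk_components(4, 0b0101)
```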
Faith Ekstrand
446d5ef103 vulkan: Drop the driver_internal from vk_image_view_init/create()
It always comes in through the create flags now.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36957>
2025-09-05 23:34:14 +00:00
Connor Abbott
7527ad001a tu: Lower ViewIndex to 0 when multiview is disabled
This is an optimization, but it also seems to be required because the HW
sometimes fails to set ViewIndex to 0. This fixes flakes with
dEQP-VK.renderpass2.fragment_density_map.*multiviewport where the VS for
the main renderpass is reused for the copy renderpass afterwards and it
copies ViewIndex to ViewportIndex expecting it to be 0 since multiview
is disabled for the copy renderpass.

Closes: #13534
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37206>
2025-09-05 22:17:39 +00:00
Connor Abbott
de60f2ff68 tu: Advertise shaderResourceMinLod
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
e72fed3faa ir3: Support min_lod tex source
Use the .clp modifier. In order to fix
dEQP-VK.glsl.texture_functions.textureoffsetclamp.* we need to add a
workaround for an empirically-discovered problem.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
63959bb716 ir3: Assemble and disassemble .clp modifier
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
655934eef7 tu: Expose shaderResourceResidency
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
70cf40086c ir3: Implement sparse residency check
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
120f755bdb ir3: Assemble and disassemble rck modifier
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
918e25e158 tu: Support sparse residency for images
The tricky thing here is that we have to emulate the 64k "standard"
tile sizes in terms of the native 4k macrotiles. We do this by
manipulating which 4k pages get mapped, dividing the 64k tile into 4k
macrotiles and mapping each tile in such a way that, when viewed in
terms of the final swizzled image coordinates, the 4k tiles linearly
tile the image region that's supposed to be mapped to the 64k "tile".
Supporting the standard block sizes allows emulation layers to claim D3D
Tiled Resources Tier 2, which is required for the 12.0 feature level.
It's also required for ARB_sparse_texture2.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
ae53234414 freedreno/fdl: Add sparse layout support
Compute the Vulkan "sparse miptail," add support for padding the array
stride in order to make sure that the sparse miptail is large enough as
mandated by the Vulkan spec, and add a function to compute the standard
sparse block size.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
166bda02aa freedreno/fdl: Handle layout differences for r8g8 images
We don't handle copying r8g8 tiled images yet, but at least return the
correct tile size and bank swizzle so that r8g8 sparse textures work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
7225334589 freedreno/fdl: Handle cpp=32 and cpp=64 when getting macrotile size
These can only happen with multisampled images, which aren't supported
by fdl_tiled_memcpy. However these cases can be hit by multisampled
sparse textures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
8ef64f2042 freedreno/fdl: Refactor and expose bank swizzling logic
For sparse, we will need to handle bank swizzling and macrotiles when
mapping sparse textures. However the functions for handling this were
leaking internal tiled_memcpy implementation details, like the concept
of a 256-byte "block" that doesn't really exist in the tiling (instead
everyone else deals with UBWC blocks, which may be 256 bytes or smaller,
and 4K macrotiles). Rewrite them to work in terms of macrotiles, and
take an fdl_layout.

In order to avoid having to pass an fdl_layout everywhere, pass around
the computed bank_mask and bank_swizzle everywhere. This also means that
we don't recompute several times.

Finally, expose a function to compute the macrotile size, which will
also be needed to work with bank swizzling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Connor Abbott
348ffdc996 freedreno/fdl: Expose fdl6_is_r8g8_layout() publicly
We will need to use this in other places in fdl.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32671>
2025-09-05 16:58:09 +00:00
Rob Clark
76fece61c6 freedreno/registers: Add A7XX_CX_DBGC
This was added on kernel side in commit 13ed0a1af263 ("drm/msm: Fix a7xx
debugbus read"), but mesa copy of the registers was updated from an
earlier revision of that patch which did not have A7XX_CX_DBGC.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37192>
2025-09-05 14:23:13 +00:00
Karmjit Mahil
651df8029a freedreno/registers: Fix SP_READ_SEL_LOCATION
Five possible values are defined by `enum a7xx_state_location`
so SP_READ_SEL_LOCATION must be at least 3 bits wide.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13836
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37192>
2025-09-05 14:23:13 +00:00
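Note on 651df8029a: the width requirement follows from simple arithmetic, shown here as a sketch that assumes the enum values are numbered contiguously from 0. The helper `bits_needed` is hypothetical.

```python
def bits_needed(num_values):
    """Minimum field width for values numbered 0 .. num_values - 1."""
    if num_values < 1:
        raise ValueError("need at least one value")
    # The largest value is num_values - 1; a field is at least 1 bit wide.
    return max(1, (num_values - 1).bit_length())


width_for_five = bits_needed(5)
width_for_four = bits_needed(4)
```

Four values would still fit in 2 bits, which is why a fifth enum value forces the field to grow.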
Valentine Burley
31f6235126 tu: Enable robustBufferAccessUpdateAfterBind
This is supported and must be enabled when
descriptorBinding*UpdateAfterBind is active.

Fixes the following VVL error:
Validation Error: [ VUID-VkDeviceCreateInfo-robustBufferAccess-10247 ]
vkCreateDevice(): robustBufferAccessUpdateAfterBind is false, but both
robustBufferAccess and a descriptorBinding*UpdateAfterBind feature are
enabled.

Fixes: d9fcf5de55 ("turnip: Enable nonuniform descriptor indexing")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36787>
2025-09-05 14:03:55 +00:00
Christoph Pillmayer
f81f3c85e2 nir/opt_algebraic: Convert a + b + a to b + 2a
This allows fusing into one FMA later.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37113>
2025-09-05 11:39:51 +00:00
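Note on f81f3c85e2: the rewrite turns two additions into one multiply-add shape, which is what makes later FMA fusion possible. The sketch below only checks the algebraic identity on values that are exactly representable; for arbitrary floats the two forms may round differently. Both function names are hypothetical.

```python
def original(a, b):
    # a + b + a: two dependent additions
    return (a + b) + a


def rewritten(a, b):
    # b + 2a: a multiply feeding an add, i.e. the shape fma(2, a, b)
    return b + 2.0 * a


same = original(1.5, 0.25) == rewritten(1.5, 0.25)
```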
Georg Lehmann
2725eaf9a2 nir/lower_subgroups: change filter to intrinsic callback
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37178>
2025-09-04 14:04:00 +00:00
Job Noorman
9d4ba885bb ir3/ra: make main shader reg select independent of preamble
ir3_ra allocates registers in a round-robin fashion to avoid false
dependencies. In order to do this, it keeps track of a "file start"
register for each register file and will search starting from there for
available registers.

This file start is initialized at the beginning of RA and kept across
blocks, including across the preamble. This means that a change that
only affects the preamble may cause changes in how registers are
allocated in the main shader. This may result in more or less copies,
and more or less false dependencies which changes the behavior of
postsched.

Changes in the preamble affecting the main shader make it more
difficult to analyze shader-db results, as I often find myself chasing
down a regression that is just caused by RA/postsched "bad luck" in a
main shader that didn't actually change. Prevent this by resetting the
file start at the beginning of the main shader.

Totals:
Instrs: 364710030 -> 364631384 (-0.02%); split: -0.19%, +0.17%
CodeSize: 926766046 -> 926671488 (-0.01%); split: -0.10%, +0.09%
NOPs: 47703035 -> 47653319 (-0.10%); split: -1.05%, +0.94%
MOVs: 17072354 -> 17075112 (+0.02%); split: -1.28%, +1.29%
COVs: 4098062 -> 4096784 (-0.03%); split: -0.04%, +0.01%
Full: 15164359 -> 15112404 (-0.34%); split: -0.34%, +0.00%
(ss): 7818796 -> 7819147 (+0.00%); split: -1.10%, +1.11%
(sy): 3985674 -> 3983435 (-0.06%); split: -0.72%, +0.67%
(ss)-stall: 26535279 -> 26525929 (-0.04%); split: -1.36%, +1.32%
(sy)-stall: 111983489 -> 111716382 (-0.24%); split: -1.26%, +1.02%
Last helper: 116734916 -> 116595531 (-0.12%); split: -0.62%, +0.50%
Cat0: 53338794 -> 53289450 (-0.09%); split: -0.94%, +0.85%
Cat1: 22352349 -> 22328303 (-0.11%); split: -1.28%, +1.17%
Cat2: 155348173 -> 155348012 (-0.00%); split: -0.00%, +0.00%
Cat7: 9314194 -> 9309099 (-0.05%); split: -0.88%, +0.82%

Totals from 224302 (16.59% of 1352016) affected shaders:
Instrs: 148838101 -> 148759455 (-0.05%); split: -0.47%, +0.42%
CodeSize: 404838970 -> 404744412 (-0.02%); split: -0.22%, +0.20%
NOPs: 26261983 -> 26212267 (-0.19%); split: -1.90%, +1.71%
MOVs: 8372715 -> 8375473 (+0.03%); split: -2.60%, +2.63%
COVs: 2061488 -> 2060210 (-0.06%); split: -0.09%, +0.02%
Full: 3420300 -> 3368345 (-1.52%); split: -1.52%, +0.00%
(ss): 3848423 -> 3848774 (+0.01%); split: -2.24%, +2.25%
(sy): 2021040 -> 2018801 (-0.11%); split: -1.43%, +1.32%
(ss)-stall: 13554064 -> 13544714 (-0.07%); split: -2.65%, +2.59%
(sy)-stall: 59778475 -> 59511368 (-0.45%); split: -2.36%, +1.91%
Last helper: 52847662 -> 52708277 (-0.26%); split: -1.38%, +1.12%
Cat0: 29270336 -> 29220992 (-0.17%); split: -1.72%, +1.55%
Cat1: 10820261 -> 10796215 (-0.22%); split: -2.63%, +2.41%
Cat2: 57289060 -> 57288899 (-0.00%); split: -0.00%, +0.00%
Cat7: 5686726 -> 5681631 (-0.09%); split: -1.43%, +1.34%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37003>
2025-09-04 05:58:09 +00:00
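Note on 9d4ba885bb: the effect described in the message can be reproduced with a toy round-robin allocator. This is a simplified sketch, not ir3_ra; the class `RoundRobinFile`, the 8-register file size, and `allocate_main` are all assumptions made for illustration.

```python
class RoundRobinFile:
    """Toy register file that searches from a moving "file start"."""

    def __init__(self, size):
        self.size = size
        self.start = 0
        self.in_use = set()

    def alloc(self):
        for i in range(self.size):
            reg = (self.start + i) % self.size
            if reg not in self.in_use:
                self.in_use.add(reg)
                self.start = (reg + 1) % self.size  # advance round-robin
                return reg
        raise RuntimeError("register file full")

    def free(self, reg):
        self.in_use.discard(reg)


def allocate_main(preamble_allocs, reset):
    regs = RoundRobinFile(8)
    for _ in range(preamble_allocs):
        regs.free(regs.alloc())  # preamble temporaries, dead before main
    if reset:
        regs.start = 0           # reset the file start for the main shader
    return [regs.alloc() for _ in range(3)]


without_reset = (allocate_main(1, False), allocate_main(2, False))
with_reset = (allocate_main(1, True), allocate_main(2, True))
```

Without the reset, one extra preamble allocation shifts every main-shader register; with it, the main shader gets the same registers either way.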
Yiwei Zhang
ed80e33f51 tu: properly implement VkBindMemoryStatus from maint6
Per spec: If the maintenance6 feature is enabled, this command must
attempt to perform all of the memory binding operations described by
pBindInfos, and must not early exit on the first failure.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
2025-09-04 02:29:33 +00:00
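Note on ed80e33f51: the "must not early exit" requirement means every bind is attempted and gets its own status. A rough sketch of that control flow follows; `bind_all`, the string result codes, and the choice to return the first failure as the overall result are assumptions, not the driver's implementation.

```python
def bind_all(bind_infos, bind_one):
    """Attempt every bind; return (overall_result, per_bind_statuses)."""
    overall = "VK_SUCCESS"
    statuses = []
    for info in bind_infos:
        result = bind_one(info)
        statuses.append(result)          # one status per bind info
        if result != "VK_SUCCESS" and overall == "VK_SUCCESS":
            overall = result             # remember a failure, keep going
    return overall, statuses


def fake_bind(info):
    # Hypothetical binder that fails for one specific input.
    return "VK_ERROR_OUT_OF_DEVICE_MEMORY" if info == "bad" else "VK_SUCCESS"


overall, statuses = bind_all(["a", "bad", "c"], fake_bind)
```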
Yiwei Zhang
cef48af271 tu: bind aliased wsi image at memory offset zero
The vulkan spec says that we should ignore memoryOffset when
VkBindImageMemorySwapchainInfoKHR is present. wsi common assumes that we
bind the wsi image at offset 0, so set the offset to 0. This change
aligns with common wsi, and also obeys dedicated alloc requirement.

Fixes: f887116c49 ("turnip: adopt wsi_common_get_memory")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
2025-09-04 02:29:33 +00:00
Yiwei Zhang
96ac80aed1 tu: simplify AHB image view format resolving for external format
vk_image_view_init has resolved the external format already.

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
2025-09-04 02:29:32 +00:00
Yiwei Zhang
76370c1edf tu: drop redundant Android headers
compile and cross-compile tested

Acked-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37099>
2025-09-04 02:29:31 +00:00
Faith Ekstrand
acd7cae0fa turnip: Use vk_drm_syncobj_copy_payloads
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36948>
2025-09-03 23:11:10 +00:00
Job Noorman
f46e2baeb3 ir3/spill: initialize base reg as late as possible
We currently insert the base reg at the very start of the shader. This
prevents enabling early preamble even if nothing is spilled in the
preamble.

Prevent this by keeping track of the least common ancestor of all blocks
that spill/reload and moving the base reg there.

Totals:
Instrs: 48207402 -> 48210556 (+0.01%); split: -0.00%, +0.01%
CodeSize: 101907026 -> 101909942 (+0.00%); split: -0.00%, +0.00%
NOPs: 8386320 -> 8387956 (+0.02%); split: -0.01%, +0.03%
MOVs: 1468853 -> 1469173 (+0.02%); split: -0.02%, +0.04%
COVs: 823724 -> 823852 (+0.02%); split: -0.00%, +0.02%
(ss): 1113167 -> 1113157 (-0.00%); split: -0.01%, +0.01%
(sy): 552317 -> 552306 (-0.00%); split: -0.01%, +0.00%
(ss)-stall: 4013046 -> 4013109 (+0.00%); split: -0.00%, +0.00%
(sy)-stall: 16741190 -> 16740000 (-0.01%); split: -0.02%, +0.01%
Preamble Instrs: 11506988 -> 11506257 (-0.01%); split: -0.01%, +0.00%
Early Preamble: 121339 -> 121367 (+0.02%)
Last helper: 11686328 -> 11686316 (-0.00%); split: -0.00%, +0.00%
Cat0: 9241457 -> 9243099 (+0.02%); split: -0.01%, +0.03%
Cat1: 2353411 -> 2354995 (+0.07%); split: -0.04%, +0.11%
Cat2: 17468471 -> 17468507 (+0.00%); split: -0.00%, +0.00%
Cat7: 1637795 -> 1637687 (-0.01%); split: -0.01%, +0.00%

Totals from 48 (0.03% of 164705) affected shaders:
Instrs: 347473 -> 350627 (+0.91%); split: -0.40%, +1.31%
CodeSize: 565490 -> 568406 (+0.52%); split: -0.23%, +0.74%
NOPs: 70496 -> 72132 (+2.32%); split: -1.07%, +3.39%
MOVs: 27524 -> 27844 (+1.16%); split: -1.23%, +2.39%
COVs: 6275 -> 6403 (+2.04%); split: -0.38%, +2.42%
(ss): 8850 -> 8840 (-0.11%); split: -0.76%, +0.64%
(sy): 4666 -> 4655 (-0.24%); split: -0.69%, +0.45%
(ss)-stall: 12116 -> 12179 (+0.52%); split: -0.65%, +1.17%
(sy)-stall: 266208 -> 265018 (-0.45%); split: -1.08%, +0.63%
Preamble Instrs: 20657 -> 19926 (-3.54%); split: -3.56%, +0.02%
Early Preamble: 0 -> 28 (+inf%)
Last helper: 25507 -> 25495 (-0.05%); split: -0.12%, +0.07%
Cat0: 76458 -> 78100 (+2.15%); split: -0.99%, +3.14%
Cat1: 82669 -> 84253 (+1.92%); split: -1.11%, +3.03%
Cat2: 89414 -> 89450 (+0.04%); split: -0.09%, +0.13%
Cat7: 8595 -> 8487 (-1.26%); split: -1.33%, +0.07%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36667>
2025-09-03 21:17:57 +00:00
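Note on f46e2baeb3: finding one block that precedes every spilling block is a least-common-ancestor query. The sketch below assumes the ancestor relation is a dominator tree given as an immediate-dominator map; the block names and both helper functions are hypothetical.

```python
from functools import reduce


def lca(idom, a, b):
    """Least common ancestor of blocks a and b in a tree given by idom."""
    ancestors = set()
    while a is not None:
        ancestors.add(a)
        a = idom[a]
    while b not in ancestors:
        b = idom[b]
    return b


def base_reg_block(idom, spilling_blocks):
    """Latest block that is an ancestor of every spilling block."""
    return reduce(lambda x, y: lca(idom, x, y), spilling_blocks)


# Hypothetical CFG: the root has immediate dominator None.
idom = {"entry": None, "preamble": "entry", "main": "entry",
        "then": "main", "else": "main"}
main_only = base_reg_block(idom, ["then", "else"])
with_preamble = base_reg_block(idom, ["preamble", "then"])
```

When only main-shader blocks spill, the base reg setup lands in `main` and the preamble stays untouched, which matches the motivation given in the message.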