6619 commits

Author SHA1 Message Date
Connor Abbott
c2eb768eb2 tu: Expose VK_EXT_dynamic_rendering_unused_attachments
We only use attachment formats for things used by the pipeline, so we
can trivially enable this.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37529>
2025-09-23 17:33:19 +00:00
Danylo Piliaiev
fce9dbc493 tu/perfetto: Init perfetto datasources once
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
2025-09-23 12:04:01 +00:00
Danylo Piliaiev
0621d5cd39 tu/perfetto: Make GPU clock sequence-scoped
When the CPU clock is the same as the authoritative trace clock (normally
CLOCK_BOOTTIME by default), perfetto drops the non-monotonic snapshots
to ensure validity of the global source clock in the resolution graph.
When they are different, the clocks are marked invalid and the rest of
the clock syncs will fail during trace processing.

There's no central daemon emitting consistent snapshots for
synchronization between CPU and GPU clocks on behalf of renderstages and
counters producers. The sequence-scoped clock (64 <= ID < 128) is unique
per producer + writer pair within the tracing session.
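
As a rough illustration of the ID range mentioned above, the sketch below classifies a clock ID and builds a lookup key that is qualified by producer and writer only for sequence-scoped clocks. The function names and the key layout are hypothetical and are not taken from perfetto or turnip; only the `64 <= ID < 128` range comes from the message.

```python
SEQ_SCOPED_MIN = 64    # inclusive lower bound, as stated in the commit message
SEQ_SCOPED_MAX = 128   # exclusive upper bound, as stated in the commit message


def is_sequence_scoped(clock_id):
    """Return True if the clock ID falls in the sequence-scoped range."""
    return SEQ_SCOPED_MIN <= clock_id < SEQ_SCOPED_MAX


def clock_key(clock_id, producer_id, writer_id):
    """Hypothetical lookup key for a clock.

    Sequence-scoped IDs are only unique per producer + writer pair, so the
    pair becomes part of the key. Other IDs are keyed by the ID alone.
    """
    if is_sequence_scoped(clock_id):
        return (producer_id, writer_id, clock_id)
    return (clock_id,)
```

With this keying, the same sequence-scoped ID used by two different writers maps to two distinct clocks, which is the property the message relies on.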

Turnip is a bit tricky here, since clocks may be synchronized before
`tu_perfetto_end_submit` is called (in the case of KGSL), but emission of
the perfetto event has to happen on the same thread as other renderstage events.
To solve this I save the clocks in `tu_perfetto_state` and emit them in
`stage_end` when needed.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
2025-09-23 12:04:01 +00:00
Danylo Piliaiev
09f5c9d0ad tu/perfetto: Track GPU timestamps per-device
In preparation for using sequence-scope perfetto clocks.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
2025-09-23 12:04:01 +00:00
Danylo Piliaiev
e2b63472e4 tu/perfetto: Don't check sync_gpu_ts when emitting renderstage
In short, perfetto doesn't require the initial clock snapshot to be
earlier than the timestamp to be converted. So we don't have to do
complex handling for it.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
2025-09-23 12:04:01 +00:00
Danylo Piliaiev
ea849b5557 tu: Destroy all mutexes used for device
We never destroyed most of the mutexes we used. This was likely fine on
the platforms turnip runs on, but it was still not correct.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37465>
2025-09-23 12:04:00 +00:00
Job Noorman
f536d76341 ir3/parser: don't use instr as ralloc context
Instructions are allocated using a linear context so cannot themselves
be used as a ralloc context anymore. Use the variant's ir instead.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 114e6a3104 ("ir3: Use a linear allocation context for ir3_instructions.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37523>
2025-09-23 11:45:07 +00:00
Danylo Piliaiev
518008c3b0 tu/a7xx: Update reg stomping info to fix GPU crashes when stomping
- Removed DBG/CHICKEN regs from being stomped, because they randomly
  cause issues, and there is no point in stomping them.
- *ATTR_BUF_GMEM regs are not emitted at every renderpass.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37372>
2025-09-23 10:12:30 +00:00
Connor Abbott
a7922e7188 tu/fdm: Use better bounds for LRZ overallocation with FDM offset
Use tile_max_w/h which is the HW bound for the tile width/height and is
much smaller than the theoretical maximum width/height with a lopsided
tile with just the depth attachment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37513>
2025-09-22 17:01:05 +00:00
Connor Abbott
964e84d468 tu: Fix 3d load and clear when FDM bin offsets are in use
Unlike the store/resolve that uses A2D, the FDM load path uses the 3d
pipeline and is therefore affected by the hardware FDM offset registers.
The fallback sysmem clear path also uses the 3d pipeline. Subtract off
the HW offset from the destination coordinates, similar to how it is
subtracted from viewport and scissor.

Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37496>
2025-09-22 15:17:39 +00:00
Zan Dobersek
d3cedd2fa5 tu/drm: msm's has_set_iova codepath should avoid freeing zombified tu_sparse_vma
In the msm backend's has_set_iova codepath, mapping a BO into a lazy VMA
requires moving that VMA into the zombie VMA mechanism once the BO is
destroyed. That means tu_sparse_vma destruction should avoid freeing the VMA
if a BO was mapped into it and then zombified.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37413>
2025-09-22 13:05:34 +00:00
Zan Dobersek
64fc91bb58 tu/drm: msm backend shouldn't use util_vma_heap in the !has_set_iova codepaths
For the fallback !has_set_iova codepath, util_vma_heap shouldn't be used
for freeing allocations since it's not initialized or used for allocations.

A helper tu_free_iova() function is added to complement tu_allocate_iova(),
handling the vma lock and freeing the allocation in the util_vma_heap when
appropriate.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 93a80f4bb9 ("tu/drm: Split out iova allocation and BO allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37413>
2025-09-22 13:05:34 +00:00
Zan Dobersek
07a599ff3e tu/drm: avoid has_set_iova-specific util_vma_heap freeing in tu_bo_init
After the refactoring, tu_bo_init() is not allocating iova anymore so it
should also not free the util_vma_heap allocation for the has_set_iova
case.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 93a80f4bb9 ("tu/drm: Split out iova allocation and BO allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37413>
2025-09-22 13:05:33 +00:00
Rob Clark
3a4b3322d4 freedreno/decode: checkreg handling for bitsize/stride
The initial version was not accounting for reg64 vs reg32, or array
stride.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37502>
2025-09-21 08:27:00 -07:00
Rob Clark
159d0596c4 freedreno/registers: Fix x_CONTEXT_SWITCH_GFX_PREEMPTION_SAFE_MODE
The HLSQ version only existed in a6xx.  And the SP one had the wrong
offset.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37484>
2025-09-20 16:52:22 +00:00
Rob Clark
897a47602a freedreno/registers: Remove conflicting RBBM regs
These are the same as a6xx, so just keep the declarations without the
variants attribute.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37484>
2025-09-20 16:52:21 +00:00
Rob Clark
68e5f150e3 freedreno/decode: Add test to check for conflicting regs
Add a tool to check for conflicting/overlapping register definitions.
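
A minimal sketch of what such an overlap check could look like, taking into account the reg64 vs reg32 width and the array stride that the later checkreg fix mentions. The `RegDef` type, its fields, and `find_conflicts` are hypothetical names for illustration; this is not the tool added by the commit.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class RegDef:
    name: str
    offset: int        # dword offset of the first element
    bitsize: int = 32  # 32 for reg32, 64 for reg64
    length: int = 1    # number of array elements (1 for a plain register)
    stride: int = 0    # dwords between array elements (0 means tightly packed)

    def dwords(self):
        """Return the set of dword offsets this definition occupies."""
        width = self.bitsize // 32
        stride = self.stride or width
        return {
            self.offset + i * stride + w
            for i in range(self.length)
            for w in range(width)
        }


def find_conflicts(regs):
    """Return name pairs of definitions that share at least one dword."""
    return [
        (a.name, b.name)
        for a, b in combinations(regs, 2)
        if a.dwords() & b.dwords()
    ]
```

For example, a reg64 at offset `0x10` also covers `0x11`, so a reg32 declared at `0x11` is reported as a conflict, while an array with stride 4 leaves the dwords between its elements free.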

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37484>
2025-09-20 16:52:21 +00:00
Karmjit Mahil
2c676a38ea freedreno/registers: Fix typo
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37489>
2025-09-19 20:19:41 +00:00
Mike Blumenkrantz
4b30df4462 tu: don't deref end info in tu_CmdEndRendering2EXT
this can be null

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37476>
2025-09-19 10:13:58 -04:00
Martin Roukala (né Peres)
e4668b8427 turnip/ci: switch vkcts testing to the KWS farm
This commit keeps vkcts as a nightly job, but it puts us within shooting
distance of what we've been working towards for the past 2.5 years!

We will flip the switch to making this job part of the merge pipeline
after a week of stress testing to make sure reliability issues,
especially around USB, don't come back to haunt my days and nights.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37367>
2025-09-19 11:41:54 +00:00
Martin Roukala (né Peres)
e5509237bf turnip/ci: document more flakes
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37367>
2025-09-19 11:41:54 +00:00
Danylo Piliaiev
0908694f02 freedreno/decode: Fix preamble decoding
Fixes: 46ad5a01a8 ("freedreno: Rename CP_SET_CTXSWITCH_IB to CP_SET_AMBLE")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37458>
2025-09-18 14:44:33 +00:00
Emma Anholt
114e6a3104 ir3: Use a linear allocation context for ir3_instructions.
Again, instrs don't get freed as we go, so the linear gc context saves us
5 pointers per instr.

Fossil replay time for deadspace3 on a debugoptimized build -4.85258% +/-
3.04009% (n=10)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
2025-09-17 12:02:47 -07:00
Emma Anholt
12fae29ec2 ir3: Use a linear allocation context for ir3_registers.
Since we don't free registers as we go, we can just allocate them in a
linear gc context that gets freed at ralloc destroy.  Saves 5 pointers of
memory per register for the ralloc overhead.

Fossil replay time for deadspace3 on a debugoptimized build -4.30353% +/-
1.80078% (n=10).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
2025-09-17 12:02:47 -07:00
Emma Anholt
1b4c2c1566 ir3: Use a bitset for the defs-seen table.
Fossil replay time for deadspace3 on a debugoptimized build -3.20856% +/-
1.48994% (n=15).
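
The message does not say how the table is laid out, so the following is only a generic sketch of a "seen" table stored as a bitset. It assumes defs are identified by small, dense integer indices; the class and method names are hypothetical.

```python
class SeenBitSet:
    """One bit per index, packed eight to a byte."""

    def __init__(self, size):
        self.size = size
        self.bits = bytearray((size + 7) // 8)

    def mark(self, index):
        """Record that the def with this index has been seen."""
        self.bits[index >> 3] |= 1 << (index & 7)

    def seen(self, index):
        """Return True if the def with this index was marked earlier."""
        return bool(self.bits[index >> 3] & (1 << (index & 7)))
```

Both operations are a single shift and mask, and the whole table for `n` defs takes about `n / 8` bytes.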

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37316>
2025-09-17 12:02:47 -07:00
Karmjit Mahil
9c6183604f nir, ir3: Add lower_fmulz_with_abs_min backend option
This commit adds the `lower_fmulz_with_abs_min` option, which lowers:
`fmulz` -> `min(abs(a), abs(b)) == 0.0 ? 0.0 : a * b`
`ffmaz` -> `min(abs(a), abs(b)) == 0.0 ? c : ffma(a, b, c)`

This is useful for ISAs which have `abs` for free on `min` such as
ir3.
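
The two lowered expressions can be modelled on plain floats as below. This is a sketch of the semantics only, not the NIR pass. It assumes that `min` returns the non-NaN operand when one input is NaN (the `fmin` helper is hypothetical), and it uses an unfused `a * b + c` in place of `ffma`.

```python
import math


def fmin(a, b):
    """min that returns the other operand if one is NaN (assumed behaviour)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fmulz_lowered(a, b):
    """fmulz: the result is 0.0 whenever either operand is zero."""
    return 0.0 if fmin(abs(a), abs(b)) == 0.0 else a * b


def ffmaz_lowered(a, b, c):
    """ffmaz: the product term is dropped whenever either operand is zero."""
    # Unfused multiply-add; a real ffma rounds once instead of twice.
    return c if fmin(abs(a), abs(b)) == 0.0 else a * b + c
```

The difference from a regular multiply shows up with infinities: `0.0 * math.inf` is NaN, while `fmulz_lowered(0.0, math.inf)` yields `0.0`.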

Adreno A750 Benchmark of 10 runs of 5 DX9 single frame trimmed
captures looped 2048 times using u_trace measuring
`start_render_pass` to `end_render_pass` results:

sysmem:
-1.91156%, -2.21791%, -2.02533%, -2.21666%, -2.33272%,
-2.67349%, -1.75278%, -2.05923%, -2.26892%, -2.10506%
Avg:  ~ -2.16%
ST.S: ~  0.25%

gmem:
-3.61496%, -3.66682%, -3.80901%, -3.51198%, -3.72950%,
-3.71413%, -3.64467%, -3.67092%, -3.90640%, -3.83888%
Avg:  ~ -3.71%
ST.S: ~  0.12%
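
The summary figures can be recomputed from the ten per-run values. The sketch below assumes that "ST.S" denotes the sample standard deviation; the message does not spell this out, but the rounded results are consistent with that reading.

```python
from statistics import mean, stdev

# Per-run deltas in percent, copied from the lists above.
sysmem = [-1.91156, -2.21791, -2.02533, -2.21666, -2.33272,
          -2.67349, -1.75278, -2.05923, -2.26892, -2.10506]
gmem = [-3.61496, -3.66682, -3.80901, -3.51198, -3.72950,
        -3.71413, -3.64467, -3.67092, -3.90640, -3.83888]


def summarize(samples):
    """Return (mean, sample standard deviation), both rounded to 2 places."""
    return round(mean(samples), 2), round(stdev(samples), 2)
```

`summarize(sysmem)` gives `(-2.16, 0.25)` and `summarize(gmem)` gives `(-3.71, 0.12)`, matching the reported averages.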

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31479>
2025-09-17 15:02:50 +00:00
Konstantin Seurer
ea51a67996 vulkan/bvh: Enable glsl extensions in meson
Having a list of all enabled/used extensions in meson allows us to get
rid of a lot of boilerplate in every bvh build shader.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35326>
2025-09-16 20:18:01 +00:00
Rob Clark
0a1f56fb90 freedreno/devices: Update chicken bits
b22 should be set on all a7xx.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37411>
2025-09-16 16:21:42 +00:00
Danylo Piliaiev
1c57f88908 tu: Reset BIN_FOVEAT regs for tiling with and without HW binning
We didn't reset the regs when HW binning was disabled.

Fixes: b34b089ca1 ("tu: Use GRAS bin offset registers")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37373>
2025-09-15 20:11:21 +00:00
Job Noorman
563b0b347a ir3: don't create merge sets for subreg moves
There are multiple places where RA assumes merge sets are either
all-full or all-half registers. Creating merge sets for subreg moves
mixes full and half registers which may lead to RA failures.

Fix this by not creating merge sets for subreg moves anymore. Instead,
we manually try to allocate a subreg move's src for its dst when
selecting a register during RA, similar to how ALU/SFU instructions try
to reuse their srcs.

Totals:
Instrs: 363174291 -> 363175216 (+0.00%); split: -0.00%, +0.00%
CodeSize: 922975364 -> 922977230 (+0.00%); split: -0.00%, +0.00%
NOPs: 47652421 -> 47652444 (+0.00%); split: -0.00%, +0.00%
MOVs: 15652959 -> 15653065 (+0.00%); split: -0.00%, +0.00%
COVs: 4097203 -> 4097052 (-0.00%); split: -0.01%, +0.00%
(ss): 7806025 -> 7806183 (+0.00%); split: -0.00%, +0.00%
(sy): 3981862 -> 3981855 (-0.00%); split: -0.00%, +0.00%
(ss)-stall: 26612057 -> 26612789 (+0.00%); split: -0.00%, +0.00%
(sy)-stall: 111568786 -> 111568721 (-0.00%); split: -0.00%, +0.00%
STPs: 345796 -> 345792 (-0.00%)
LDPs: 191118 -> 191111 (-0.00%)
Preamble Instrs: 160491915 -> 160492355 (+0.00%); split: -0.00%, +0.00%
Last helper: 116587870 -> 116588273 (+0.00%); split: -0.00%, +0.00%
Cat0: 53288367 -> 53288384 (+0.00%); split: -0.00%, +0.00%
Cat1: 20954383 -> 20954336 (-0.00%); split: -0.00%, +0.00%
Cat2: 155294307 -> 155295252 (+0.00%); split: -0.00%, +0.00%
Cat6: 4623070 -> 4623059 (-0.00%)
Cat7: 9302363 -> 9302384 (+0.00%); split: -0.00%, +0.00%

Totals from 979 (0.07% of 1352016) affected shaders:
Instrs: 1324850 -> 1325775 (+0.07%); split: -0.07%, +0.14%
CodeSize: 2596114 -> 2597980 (+0.07%); split: -0.04%, +0.11%
NOPs: 330197 -> 330220 (+0.01%); split: -0.23%, +0.24%
MOVs: 62592 -> 62698 (+0.17%); split: -0.35%, +0.52%
COVs: 49011 -> 48860 (-0.31%); split: -0.62%, +0.31%
(ss): 35671 -> 35829 (+0.44%); split: -0.28%, +0.73%
(sy): 18936 -> 18929 (-0.04%); split: -0.13%, +0.09%
(ss)-stall: 157929 -> 158661 (+0.46%); split: -0.36%, +0.82%
(sy)-stall: 543371 -> 543306 (-0.01%); split: -0.20%, +0.19%
STPs: 2741 -> 2737 (-0.15%)
LDPs: 3022 -> 3015 (-0.23%)
Preamble Instrs: 322588 -> 323028 (+0.14%); split: -0.01%, +0.14%
Last helper: 298996 -> 299399 (+0.13%); split: -0.05%, +0.19%
Cat0: 361575 -> 361592 (+0.00%); split: -0.21%, +0.22%
Cat1: 111733 -> 111686 (-0.04%); split: -0.45%, +0.41%
Cat2: 487366 -> 488311 (+0.19%); split: -0.04%, +0.23%
Cat6: 21239 -> 21228 (-0.05%)
Cat7: 37170 -> 37191 (+0.06%); split: -0.06%, +0.12%
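
For readers checking the numbers, the percentage after each `before -> after` pair is the relative change of the total. A small sketch with hypothetical helper names is shown below; the `split` figures are not reproduced because they need per-shader data that is not part of this message.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def format_stat(name, before, after):
    """Format one line in the style used above, without the split part."""
    return f"{name}: {before} -> {after} ({pct_change(before, after):+.2f}%)"
```

For instance, `format_stat("COVs", 49011, 48860)` produces `COVs: 49011 -> 48860 (-0.31%)`, in agreement with the affected-shaders table.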

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: c757b22c5f ("ir3: add subreg move optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37368>
2025-09-15 15:07:47 +00:00
Daniel Stone
1754bfa94a ci/freedreno: Skip overly-slow trace
The Godot trace has started timing out, taking close to or over 5min to
run. It's been skipped out on zink-tu-a618 for this reason, so do it on
the native driver too.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13894
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37366>
2025-09-15 10:03:22 +00:00
Yonggang Luo
bebd167d74 glsl: Fixes warning: deprecated directive: ‘%pure-parser’, ‘%error-verbose’
../../src/compiler/glsl/glcpp/glcpp-parse.y:179.1-12: warning: deprecated directive: ‘%pure-parser’, use ‘%define api.pure’ [-Wdeprecated]
  179 | %pure-parser
      | ^~~~~~~~~~~~
      | %define api.pure
../../src/compiler/glsl/glcpp/glcpp-parse.y:180.1-14: warning: deprecated directive: ‘%error-verbose’, use ‘%define parse.error verbose’ [-Wdeprecated]
  180 | %error-verbose
      | ^~~~~~~~~~~~~~
      | %define parse.error verbose

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37289>
2025-09-13 08:23:07 +00:00
Collabora's Gfx CI Team
db3501ec4f Uprev Piglit to 517270ccca11a795d2f29bd723c362eb6ef9ce8f
28d1349844...517270ccca

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37273>
2025-09-12 23:09:46 -03:00
Eric Engestrom
11a7693065 turnip/ci: update test expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37302>
2025-09-11 16:02:38 +00:00
Rob Clark
0fe652971e freedreno/a6xx: Add missing format
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37279>
2025-09-11 03:08:54 +00:00
Rob Clark
250dba1dce freedreno/a6xx: Fallback to original blit in the snorm_copy path
Unlike z/s blits, where we want the fallback to use the re-written blit,
we don't want this in the handle_snorm_copy_blit() path.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37279>
2025-09-11 03:08:54 +00:00
Anna Maniscalco
011ba1842e freedreno/registers: add CP_ALWAYS_ON_CONTEXT
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37237>
2025-09-10 15:10:14 +00:00
Yonggang Luo
f3c3b99e60 clang-format: Move ForEachMacros into src/.clang-format for freedreno
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Yonggang Luo
773a7f347a clang-format: Update the .clang-format files to conformance clang-format json-schema
The documentation is at
https://clang.llvm.org/docs/ClangFormatStyleOptions.html

The json-schema is at
https://www.schemastore.org/clang-format.json

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37235>
2025-09-09 07:04:55 +00:00
Rob Clark
15ee3873aa freedreno/registers: Update GMU register xml
Synced from kernel commit f23e09a60d48 ("drm/msm: Update GMU register
xml").

Update GMU register xml with additional definitions for a7x family.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
a31dc3c5af freedreno/registers: Generate _HI/LO builders for reg64
Mesa has shifted more things to reg64 instead of separate 32b HI/LO
reg32's.  This works better with the "new-style" c++ builders that
mesa has been migrating to for a6xx+ (to better handle register
shuffling between gens), but it leaves the C builders with missing
_HI/LO builders.

So handle the special case of reg64, automatically generating the
missing _HI/LO builders.  (This is for the benefit of the kernel
which cannot use the c++ builders.)
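
Conceptually, a `_LO`/`_HI` pair just exposes the two 32-bit halves of a 64-bit register value. The sketch below shows that split and its inverse; the function names are hypothetical and do not match the generated builder names.

```python
MASK32 = 0xFFFFFFFF


def reg64_lo(value):
    """Low 32 bits of a 64-bit register value."""
    return value & MASK32


def reg64_hi(value):
    """High 32 bits of a 64-bit register value."""
    return (value >> 32) & MASK32


def reg64_join(lo, hi):
    """Recombine two 32-bit halves into the 64-bit value."""
    return ((hi & MASK32) << 32) | (lo & MASK32)
```

A 64-bit address such as `0x1234ABCDEF` splits into a low half of `0x34ABCDEF` and a high half of `0x12`.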

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
04e2140d8b freedreno/registers: remove python 3.9 dependency for compiling msm
Synced from kernel commit bb1953588068 ("drm/msm: remove python 3.9
dependency for compiling msm").

Since commit 5acf49119630 ("drm/msm: import gen_header.py script from Mesa"),
compilation is broken on machines having python versions older than 3.9
due to dependency on argparse.BooleanOptionalAction.

Switch to use simple bool for the validate flag to remove the dependency.
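
A minimal sketch of the kind of change described, assuming the option is spelled `--validate` (the exact option name and default are assumptions here). `argparse.BooleanOptionalAction` was added in Python 3.9, whereas `store_true` is available on older versions.

```python
import argparse


def build_parser():
    """Build a parser with a simple boolean flag that works before 3.9."""
    parser = argparse.ArgumentParser()
    # Before (needs Python >= 3.9):
    #   parser.add_argument("--validate", action=argparse.BooleanOptionalAction)
    # After: a plain store_true flag, False unless the option is given.
    parser.add_argument("--validate", action="store_true", default=False)
    return parser
```

One trade-off of this form is that the automatically generated `--no-validate` spelling provided by `BooleanOptionalAction` is no longer available.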

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
91ff96b513 freedreno/registers: Remove license/etc from generated headers
Since these generated files are no longer checked in, either in mesa or
in the linux kernel, simplify things by dropping the verbose generated
comment.

These were semi-nerf'd on the kernel side, in the name of build
reproducibility, by commit ba64c6737f86 ("drivers: gpu: drm: msm:
registers: improve reproducibility"), but in a way that was semi-
kernel specific.  We can just reduce the divergence between kernel
and mesa by just dropping all of this.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:24 +00:00
Rob Clark
a70279adc2 freedreno/registers: Re-enable validation for gen_header.py
Commit 84e93daa26 ("freedreno/registers: allow skipping the
validation") synced a change that made validation optional for
kernel builds, to avoid a lxml dependency for kernel builds.
But this inadvertently also disabled schema validation on the
mesa side.  CI (and meson "test" target) still validates the
xml against the schema, but it is easier if this is also done
as part of the normal build to avoid surprises from Marge.

Fixes: 84e93daa26 ("freedreno/registers: allow skipping the validation")
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37216>
2025-09-08 18:28:23 +00:00
Connor Abbott
764b3d9161 tu: Implement transient attachments and lazily allocated memory
Transient attachments have been in Vulkan since 1.0, and are a way to
avoid allocating memory for attachments that can be stored entirely in
tile memory. The driver exposes a memory type with LAZILY_ALLOCATED_BIT,
and apps use this type to allocate images with TRANSIENT_ATTACHMENT
usage, which are restricted to color/depth/stencil/input attachment
usage. The driver is supposed to then delay allocating memory until it
knows that one of the images bound to the VkDeviceMemory must have
actual backing memory.

Implement this using the "lazy VMA" mechanism added earlier. We reserve
an iova range for lazy BOs, and only allocate them if we choose sysmem
rendering or there is a LOAD_OP_LOAD/STORE_OP_STORE. Because we never
split render passes and force sysmem instead, we don't have to deal with
the additional complexity of that here and just allocate everything.
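
The allocation rule described above can be summarized as a single predicate. This is a sketch of the decision only; the `Attachment` type, the string op names, and `needs_backing_memory` are hypothetical and do not correspond to turnip's data structures.

```python
from dataclasses import dataclass


@dataclass
class Attachment:
    load_op: str   # for example "LOAD", "CLEAR" or "DONT_CARE"
    store_op: str  # for example "STORE" or "DONT_CARE"


def needs_backing_memory(use_sysmem, attachments):
    """Return True if the lazy BOs must actually be allocated.

    Memory is needed when rendering goes to sysmem, or when any attachment
    is loaded from or stored to memory. Otherwise the contents can live
    entirely in tile memory.
    """
    if use_sysmem:
        return True
    return any(
        a.load_op == "LOAD" or a.store_op == "STORE" for a in attachments
    )
```

When the predicate holds, the message indicates that everything is allocated rather than deciding per image, which keeps the logic simple.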

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
3b990ba210 tu: Make tu_image point to tu_device_memory instead of tu_bo
Up until now tu_device_memory (turnip's VkDeviceMemory) was a thin
wrapper around tu_bo (the GEM BO), so when binding an image to a
VkDeviceMemory we could just store the BO. But now we have to skip
allocating the BO unless we need to for lazily-allocated memory, and the
tracking for that needs to happen at the API level instead of the
kernel/GEM level, so store the tu_device_memory instead.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
88d001383a tu: Add support for a "lazy" sparse VMA
Add an extremely limited form of sparse where zeroing memory is not
supported and only one BO can be fully bound to the sparse VMA
immediately when it's created. This can be implemented on drm/msm even
without VM_BIND, by just reserving the iova range. However kgsl doesn't
let us control iova offsets, so we have to use "real" sparse support to
implement it. In effect this lets us reserve an iova range and then
"lazily" allocate the BO. This will be used for transient allocations in
Vulkan when we have to fall back to sysmem.

As part of this we add skeleton sparse VMA support to virtio, which is
just enough for lazy VMAs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
93a80f4bb9 tu/drm: Split out iova allocation and BO allocation
Reserve an iova range separately before allocating a BO. This reduces
the size of the critical section under the VMA lock and paves the way
for lazy BOs, where iova initialization is separated out.

While we're here, shrink the area where the VMA mutex is applied when
importing a dma-buf. AFAICT it's not useful to lock the entire function,
only the VMA lookup and zombie BO handling.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Connor Abbott
b663d8f762 freedreno: Add blit_wfi_quirk and use in turnip
When enabling
dEQP-VK.renderpass2.dedicated_allocation.attachment_allocation.grow.17,
we see a hang on a618 when a draw is immediately followed by a blit
without anything in between. The draw and clear are writing completely
different surfaces.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37151>
2025-09-08 15:22:17 +00:00
Emma Anholt
29fb897c0a ir3: Enable nir_opt_shrink_shrink_vec_array_vars.
The effect is surprisingly big, though it does seem to be concentrated in
just a few apps (Batman: Arkham Origins, Metro 2033 Redux, Shadow
Warrior):

Totals:
MaxWaves: 19680240 -> 19788620 (+0.55%); split: +0.55%, -0.00%
Instrs: 369291159 -> 367831500 (-0.40%); split: -0.40%, +0.01%
CodeSize: 936669580 -> 933798912 (-0.31%); split: -0.31%, +0.00%

...

Totals from 16918 (1.21% of 1402199) affected shaders:
MaxWaves: 125724 -> 234104 (+86.20%); split: +86.83%, -0.63%
Instrs: 11328230 -> 9868571 (-12.89%); split: -13.13%, +0.25%
CodeSize: 23684238 -> 20813570 (-12.12%); split: -12.24%, +0.12%
NOPs: 1633346 -> 1640119 (+0.41%); split: -2.09%, +2.50%
MOVs: 1940036 -> 510016 (-73.71%); split: -75.07%, +1.36%
COVs: 188107 -> 188546 (+0.23%); split: -0.32%, +0.56%
Full: 454239 -> 263078 (-42.08%); split: -42.80%, +0.71%
(ss): 251004 -> 231443 (-7.79%); split: -9.81%, +2.01%
(sy): 116086 -> 115153 (-0.80%); split: -2.38%, +1.58%
(ss)-stall: 738920 -> 794215 (+7.48%); split: -7.13%, +14.62%
(sy)-stall: 3321071 -> 3193717 (-3.83%); split: -5.58%, +1.74%
STPs: 101880 -> 71523 (-29.80%)
LDPs: 17406 -> 14411 (-17.21%)
Preamble Instrs: 2519390 -> 2548205 (+1.14%); split: -0.31%, +1.46%
Subgroup size: 1097472 -> 1097920 (+0.04%)

Cat0: 1833041 -> 1839613 (+0.36%); split: -1.91%, +2.27%
Cat1: 2128393 -> 698894 (-67.16%); split: -68.42%, +1.26%
Cat2: 3602449 -> 3595086 (-0.20%); split: -0.24%, +0.03%
Cat3: 2817384 -> 2815410 (-0.07%); split: -0.08%, +0.01%
Cat4: 273682 -> 273655 (-0.01%)
Cat5: 304630 -> 304398 (-0.08%)
Cat6: 207434 -> 179648 (-13.40%); split: -13.70%, +0.31%
Cat7: 161217 -> 161867 (+0.40%); split: -1.25%, +1.65%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37191>
2025-09-06 00:03:12 +00:00