Commit graph

6512 commits

Author SHA1 Message Date
Danylo Piliaiev
8faf76a754 tu/a6xx: Fix unaligned buffer_to_image on close to (1 << 14) width
I'm not sure exactly why it didn't work, because
TPL1_A2D_SRC_TEXTURE_SIZE seemingly has a (1 << 15) width
limit. However, tests have shown that it doesn't work.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36939>
2025-08-26 10:56:13 +00:00
Danylo Piliaiev
a288b77403 tu: Fix unaligned image_to_buffer on close to (1 << 14) width
The bottom right corner of the copy exceeded the maximum allowed
value in GRAS_A2D_DEST_BR.x

In order to fix this, we have to do a second copy per line of
the last texels.

Fixes asserts in:
 dEQP-GLES31.functional.copy_image.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36939>
2025-08-26 10:56:13 +00:00
Job Noorman
51fa8ad748 freedreno/drm-shim: disable VM_BIND
Turnip crashes under drm-shim when enabling VM_BIND. We don't care about
VM_BIND for shader compilation so just disable it.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 4efbfa1441 ("tu/drm: Enable VM_BIND")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37000>
2025-08-26 09:09:48 +00:00
Mark Collins
098521559d freedreno/drm: Only initialize memory data source when Perfetto is active
FdMemoryDataSource was being registered as a Perfetto data source
unconditionally, so anything calling fd_device_new(...) attempted the
registration even when Perfetto might not have been initialized.
Initialization is done as a part of util_perfetto_init; without it,
trying to register the event causes a SEGFAULT.

Fixes: c7045e3e63 ("perfetto: unify init")

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36993>
2025-08-25 22:04:45 +00:00
Connor Abbott
7d925dbc52 freedreno/ci: Update a750 expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:58 +00:00
Martin Roukala (né Peres)
44dbf8756e freedreno/ci: uprev the kernel for the a750
We are still in the process of moving our kernels to gfx-ci/linux, but
we got the request to uprev the kernel a month ago when I started my
holiday, so let's not delay it more. Anyway, it is better to change
only one variable at a time so no harm done.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:58 +00:00
Connor Abbott
d921225af1 freedreno/ci: Update kernel with VM_BIND fixes
Pull in msm-fixes plus a few extra fixes we've accumulated from the
list.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:58 +00:00
Connor Abbott
96513b5e8e freedreno/ci: Skip dEQP-VK.memory.mapping.*.full.variable.*
These use too much memory with VM_BIND and aren't super useful. We have
to skip them even with the full jobs to avoid taking them out.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:58 +00:00
Connor Abbott
938ac2b67d freedreno/ci: Add sparse-related a618 skips
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:58 +00:00
Connor Abbott
4d2c14847f tu: Support sparseResidencyAliased
UCHE and CCU use virtual-tagged addresses, so whenever an alias may have
changed we have to always flush and invalidate everything. We detect
this through the sparse memory aliasing flag on the buffer/image, or for
plain memory barriers whether the feature is enabled.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
8feed47fce tu: Initial support for sparse binding
Plumb through support for a sparse queue and enable sparse binding using
the kernel interfaces we added earlier. We also support sparse residency
for buffers, which is straightforward, but sparse residency for images
is much more complicated so it will be enabled later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
71ef46717c tu/kgsl: Add support for sparse binding
Use the "virtual BO" interface.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
797c74452f tu/drm: Add support for sparse binding
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
f9daddf5d5 tu/knl: Add an API for sparse binding
Add a "sparse VMA" abstraction, and functions creating them, destroying
them, and submitting commands to map and unmap BOs into them. This
mirrors the Vulkan API, but with image offsets resolved to page offsets.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
4efbfa1441 tu/drm: Enable VM_BIND
Use a new driver-internal VM_BIND submit queue for mapping and unmapping
"normal" BOs. This will be required for sparse, because we can't mix
the old and new interface, but it should also allow us to stop using
"zombie" VMAs and the bo list.

Also use MSM_BO_NO_SHARE, which we assume is available when VM_BIND is.
This should significantly reduce kernel submit overhead, in parallel to
the userspace submit overhead cut by using VM_BIND.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
460ed35916 tu: Fix CmdBindTransformFeedbackBuffersEXT size handling
According to the spec and as implemented by other drivers, this should
use the size of the buffer instead of the size of the VkDeviceMemory
it's bound to when VK_WHOLE_SIZE is specified or pSizes is NULL. The
current behavior doesn't make sense at all for sparse buffers which are
not bound to a single VkDeviceMemory. Just use the common helper that
already does the right thing, copied from anv.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
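The size rule described in the commit above can be sketched as follows. This is a minimal illustration, not the turnip or common Mesa code: `example_buffer`, `example_buffer_range`, and `example_xfb_size` are hypothetical names, and `EXAMPLE_WHOLE_SIZE` only mirrors the value of `VK_WHOLE_SIZE`. The point is that the effective size comes from the buffer itself, never from the memory object it happens to be bound to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_WHOLE_SIZE (~0ULL) /* same value as VK_WHOLE_SIZE */

/* Hypothetical stand-in for a buffer object; only its own size matters. */
struct example_buffer {
   uint64_t size;
};

/* Resolve a (offset, range) pair against the buffer size. A range of
 * EXAMPLE_WHOLE_SIZE means "from offset to the end of the buffer". */
static inline uint64_t
example_buffer_range(const struct example_buffer *buf, uint64_t offset,
                     uint64_t range)
{
   assert(offset <= buf->size);
   if (range == EXAMPLE_WHOLE_SIZE)
      return buf->size - offset;
   assert(range <= buf->size - offset);
   return range;
}

/* A NULL sizes array is treated like EXAMPLE_WHOLE_SIZE for every binding. */
static inline uint64_t
example_xfb_size(const struct example_buffer *buf, uint64_t offset,
                 const uint64_t *sizes, unsigned i)
{
   uint64_t range = sizes != NULL ? sizes[i] : EXAMPLE_WHOLE_SIZE;
   return example_buffer_range(buf, offset, range);
}
```

Because nothing here looks at a backing memory object, the same rule applies unchanged to sparse buffers, which have no single backing allocation.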
Connor Abbott
aa392e1ec2 tu: Align BO size to page size
The kernel was rounding the size up for us, but it doesn't like a
non-aligned map size, so just sanitize the size here.

tu_cs was relying on the size not being rounded to keep the maximum size
2^20-1 or less, so fix that by using the initial unrounded size.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
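As a rough illustration of the commit above, here is a sketch of rounding a BO size up to the page size while checking a limit against the unrounded size. The helper names and `EXAMPLE_MAX_CS_SIZE` are assumptions made for this example (the commit message only states the 2^20-1 bound, not its units or how tu_cs checks it), and the page size is passed in rather than queried.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round size up to a multiple of page_size,
 * which must be a non-zero power of two. */
static inline uint64_t
example_align_to_page(uint64_t size, uint64_t page_size)
{
   assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
   return (size + page_size - 1) & ~(page_size - 1);
}

/* Assumed limit for illustration only: 2^20 - 1, as in the commit message. */
#define EXAMPLE_MAX_CS_SIZE ((1u << 20) - 1)

/* The limit is checked against the size that was originally requested,
 * because the page-aligned size may exceed it. */
static inline int
example_fits_cs_limit(uint64_t requested_size)
{
   return requested_size <= EXAMPLE_MAX_CS_SIZE;
}
```

With a 4096-byte page, a request of 2^20-1 rounds up to 2^20, which is why a limit check on the rounded size would reject a request that the unrounded check accepts.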
Connor Abbott
d8d0e73899 freedreno/drm: Import new UABI for VM_BIND
Imported from kernel commit 203dcde88156
("Merge tag 'drm-msm-next-2025-07-05' of https://gitlab.freedesktop.org/drm/msm into drm-next").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Connor Abbott
51a7aebc86 tu: Refactor BO deletion
For VM_BIND, BO deletion will have to be implemented differently in
native drm and virtio. We already have a somewhat awkward situation with
native-specific code in the common BO deletion helper, which we only get
away with because it's for kernels without SET_IOVA in which case virtio
isn't supported. Add a few common helpers for some of the guts, and move
the guts into backend-specific functions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
2025-08-25 20:11:57 +00:00
Eric Engestrom
fa74e939bf ci/piglit: automatically use LAVA proxy
This avoids having to hardcode the proxy in the traces `download-url` or
jobs setting `PIGLIT_REPLAY_EXTRA_ARGS` and accidentally overriding the
default args when the author meant to append.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36955>
2025-08-25 14:52:38 +00:00
Valentine Burley
2595b029fa tu: Advertise VK_EXT_shader_atomic_float
We pass the tests for exchange, load, and store on R32_SFLOAT, including
shared memory (which the proprietary driver does not advertise). The blob
does not support add operations either.

Passes:
dEQP-VK.glsl.atomic_operations.exchange_float*
dEQP-VK.image.atomic_operations.exchange*r32f*

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36907>
2025-08-23 20:13:44 +00:00
Valentine Burley
59f5f239f6 freedreno/ci: Add missing caching proxy for traces
The currently active a618 trace jobs haven’t been using the caching
proxy.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36929>
2025-08-23 08:02:11 +00:00
Collabora's Gfx CI Team
62f39fca25 Uprev Piglit to 28d1349844eacda869f0f82f551bcd4ac0c4edfe
c3a3e29d59...28d1349844

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36908>
2025-08-22 07:35:15 +00:00
Collabora's Gfx CI Team
640e2eddea Uprev ANGLE to 995c4c4d89ed6a5c28b210e9c0f83eb4f8b6e2f5
6a04a50f98...995c4c4d89

- Skip tests failing on all drivers due to a CTS bug
- Disable clang options not supported by the 'unbundled' toolchain

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36908>
2025-08-22 07:35:15 +00:00
Job Noorman
24cdb0b636 ir3: emit descriptor prefetch in block dominated by its sources
Descriptor prefetches may be generated for instructions in control flow.
This means we cannot simply emit prefetches at the end of the preamble
because that may not be dominated by all their sources. This commit uses
the helpers introduced by e7ac1094f6 ("ir3: rematerialize preamble defs
in block dominated by sources") to find the correct block to insert
prefetches.

Fixes NIR validation errors in Dying Light 2.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 4e2a0a5ad0 ("ir3: Add descriptor prefetching optimization on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36885>
2025-08-22 05:21:25 +00:00
Emma Anholt
1c0c3a2375 ir3: Don't try to use indirect access in the alias table.
Fixes validation failures about missing instr->address in gravitymark
ultra.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36896>
2025-08-22 01:04:23 +00:00
Connor Abbott
03388baa6d tu, freedreno: Document GRAS shading rate LUT
Name the register, which is actually an array, and initialize it
programmatically using the same table as the per-primitive case. This
should produce the same value as the old hardcoded constant.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36892>
2025-08-21 22:48:32 +00:00
Connor Abbott
1dff4dcb0b ir3: Use common shading rate lookup table
This should be identical to the old one.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36892>
2025-08-21 22:48:32 +00:00
Connor Abbott
8d9e3bda44 freedreno: Add common VRS helpers
These will be used by ir3 and turnip.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36892>
2025-08-21 22:48:32 +00:00
Connor Abbott
658fe94241 ir3: Simplify and rationalize shading rate LUT
We had an extra 16 entries in the VK-to-HW table that were clearly
unnecessary because Vulkan does not allow values greater than 16 for the
primitive shading rate. This appears to be an extra debug/test thing
added by the blob. Similarly there were unused entries in the HW-to-VK
table that shouldn't be necessary. Delete them.

The HW-to-VK table was also inconsistent about whether invalid values
should be 0 or 11, fix that too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36892>
2025-08-21 22:48:32 +00:00
Zan Dobersek
1f61867b48 tu: prevent tu_bo unmapping during destruction while being dumped
A tu_bo object can be in the process of being dumped during queue submit
while also being destroyed on a separate thread. During destruction, tu_bo
should be removed from the device's dump_bo_list before unmapping, this
way the mapping of any given tu_bo won't disappear while it's being dumped.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Reviewed-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36904>
2025-08-21 18:16:59 +00:00
Connor Abbott
2797069e9a tu: Enable LRZ with FDM
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:19 +00:00
Connor Abbott
b34b089ca1 tu: Use GRAS bin offset registers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:19 +00:00
Connor Abbott
10e7f63734 tu: Add documentation for VK_EXT_fragment_density_map
This has gotten complicated enough that we need somewhere outside of the
driver itself to give an overall flow of how the feature is implemented.

This includes a few things that are enabled in the subsequent commits,
specifically the LRZ parts.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:18 +00:00
Connor Abbott
cf7a52d2a6 freedreno: Add HW bin scaling feature
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:18 +00:00
Connor Abbott
09a80e04d6 freedreno: Document GRAS_SC_BIN_CNTL::FORCE_LRZ_DIS
This will be necessary for disabling LRZ in cases we can't handle it.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:17 +00:00
Connor Abbott
bfb6d09e95 freedreno: Add bin scaling registers
These let us avoid manually patching the viewport as we had to do on
a6xx. However they do not affect blits, so we still have to manually
scale there. They exist from a740.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36475>
2025-08-21 16:42:17 +00:00
Yiwei Zhang
bc46a32e9b turnip: advertise present_id/wait behind TU_USE_WSI_PLATFORM
wsi_common_vk_instance_supports_present_wait returns true for all
supported wsi platforms here, so we can unconditionally advertise them
behind TU_USE_WSI_PLATFORM like the other wsi extensions (also to not
tangle with Android).

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36835>
2025-08-21 07:53:15 +00:00
Marek Olšák
68b80e4d25 nir/instr_set: don't ralloc the set
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Job Noorman
b53682f41b ir3: don't vectorize nir_op_sdot_4x8_iadd[_sat]
They don't support being repeated.

Fixes a compiler crash in Hogwarts Legacy.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 58d18bc7a8 ("ir3: lower vectorized NIR instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36886>
2025-08-21 05:11:51 +00:00
Job Noorman
7752cc26c4 ir3: use offset_shift for SSBO intrinsics
Our SSBO access instructions expect offsets in units of the accessed
type's size. However, we were ingesting SSBO intrinsics that use byte
addresses. We were fixing this up in ir3_nir_lower_io_offsets by
inserting a ushr or, if possible, propagating this shift into another
shift that's part of the address calculation.

Having to insert a ushr is unfortunate because, for most accesses, it
should be possible to extract this shift directly from the access chain,
as the array strides and struct offsets would be properly aligned. It also
prevents nir_opt_offsets from finding constant additions to extract, as
they would be hidden behind a ushr that often cannot be optimized away.

57ea689273 ("ir3: optimize SSBO offset shifts for nir_opt_offsets")
tried to overcome the latter problem somewhat by pushing a ushr into
additions. This turned out to be unsound because even though SSBO
offsets are unsigned, intermediate results in the offset calculation
might be negative values which means we should use ishr in those cases.
Unfortunately, we cannot know when to use ushr or ishr.

This commit switches ir3 to the newly introduced offset_shift index for
SSBO intrinsics. This allows the shift to be extracted when lowering
derefs in nir_lower_explicit_io. In some cases, we still might have to add an
extra shift to make sure the offset uses the correct units. It turns out
that this is very rare and using offset_shift greatly improves the
shader stats:

Totals from 33267 (20.20% of 164705) affected shaders:
MaxWaves: 440368 -> 455258 (+3.38%); split: +3.40%, -0.01%
Instrs: 22974358 -> 21844188 (-4.92%); split: -4.98%, +0.06%
CodeSize: 45456418 -> 43099334 (-5.19%); split: -5.22%, +0.03%
NOPs: 4612549 -> 4524353 (-1.91%); split: -2.97%, +1.05%
MOVs: 802018 -> 817547 (+1.94%); split: -3.29%, +5.23%
COVs: 381987 -> 382061 (+0.02%); split: -0.03%, +0.05%
Full: 514078 -> 477339 (-7.15%); split: -7.18%, +0.04%
(ss): 544419 -> 502332 (-7.73%); split: -9.12%, +1.39%
(sy): 292099 -> 304697 (+4.31%); split: -3.19%, +7.50%
(ss)-stall: 2106134 -> 2104011 (-0.10%); split: -1.82%, +1.71%
(sy)-stall: 9704720 -> 10324864 (+6.39%); split: -4.64%, +11.03%
STPs: 11301 -> 10074 (-10.86%)
LDPs: 18654 -> 17202 (-7.78%)
Preamble Instrs: 4652214 -> 4580289 (-1.55%); split: -1.59%, +0.04%
Early Preamble: 13977 -> 13978 (+0.01%)
Constlen: 1881764 -> 1881304 (-0.02%); split: -0.03%, +0.01%
Last helper: 5157587 -> 5074042 (-1.62%); split: -1.86%, +0.24%
Subgroup size: 2262976 -> 2263232 (+0.01%)
Cat0: 5065452 -> 4976324 (-1.76%); split: -2.73%, +0.97%
Cat1: 1241085 -> 1251974 (+0.88%); split: -2.52%, +3.40%
Cat2: 8462897 -> 7723367 (-8.74%); split: -8.74%, +0.01%
Cat3: 5738382 -> 5735312 (-0.05%); split: -0.06%, +0.00%
Cat5: 761945 -> 763017 (+0.14%); split: -0.00%, +0.14%
Cat6: 199819 -> 197766 (-1.03%); split: -1.34%, +0.31%
Cat7: 890192 -> 581842 (-34.64%); split: -35.20%, +0.57%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
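The unsoundness that the commit above attributes to pushing a ushr into additions can be reproduced with plain 32-bit arithmetic. The sketch below is not NIR or ir3 code; all function names are hypothetical, and the operands are assumed to be multiples of the access size so that distributing the shift would otherwise be exact.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: shift the complete byte offset once, after the addition. */
static inline uint32_t
example_shift_after_add(uint32_t a, uint32_t b, unsigned shift)
{
   return (a + b) >> shift;
}

/* Logical shift (ushr) distributed over the addition. */
static inline uint32_t
example_ushr_pushed(uint32_t a, uint32_t b, unsigned shift)
{
   return (a >> shift) + (b >> shift);
}

/* Portable arithmetic shift (ishr) on a 32-bit pattern. */
static inline uint32_t
example_asr32(uint32_t v, unsigned shift)
{
   return (v & 0x80000000u) ? ~(~v >> shift) : v >> shift;
}

/* Arithmetic shift (ishr) distributed over the addition. */
static inline uint32_t
example_ishr_pushed(uint32_t a, uint32_t b, unsigned shift)
{
   return example_asr32(a, shift) + example_asr32(b, shift);
}
```

With `a = 16` and `b = (uint32_t)-8`, the real offset is 8 bytes, so the reference yields 2; the ushr variant yields 0x40000002, while the ishr variant yields 2. With `a = 0x80000000` and `b = 0` the situation flips: ushr matches the reference and ishr does not. Neither shift is right for every operand, which matches the commit's conclusion that it cannot know which one to use.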
Job Noorman
65d559fcf6 tu: pass SSBO/UBO min alignment to SPIR-V frontend
Values are taken from minStorageBufferOffsetAlignment and
minUniformBufferOffsetAlignment.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:29 +00:00
Job Noorman
2a8c5ebc77 ir3: enable scalar predicates
Enable the use of scalar predicates by marking predicate dsts as uniform
when possible during instruction emission and in opt_predicates.

Totals:
Instrs: 48207402 -> 47967272 (-0.50%); split: -0.54%, +0.05%
CodeSize: 101907026 -> 101768626 (-0.14%); split: -0.15%, +0.01%
NOPs: 8386320 -> 8165410 (-2.63%); split: -2.88%, +0.25%
MOVs: 1468853 -> 1470546 (+0.12%); split: -0.17%, +0.28%
COVs: 823724 -> 823746 (+0.00%); split: -0.01%, +0.01%
Full: 1716708 -> 1716767 (+0.00%); split: -0.00%, +0.01%
(ss): 1113167 -> 1168194 (+4.94%); split: -0.15%, +5.09%
(sy): 552317 -> 552288 (-0.01%); split: -0.10%, +0.09%
(ss)-stall: 4013046 -> 4261336 (+6.19%); split: -0.11%, +6.30%
(sy)-stall: 16741190 -> 16748983 (+0.05%); split: -0.17%, +0.22%
STPs: 18895 -> 18901 (+0.03%); split: -0.02%, +0.05%
LDPs: 23853 -> 23762 (-0.38%); split: -0.39%, +0.01%
Preamble Instrs: 11506988 -> 11493425 (-0.12%); split: -0.12%, +0.01%
Early Preamble: 121339 -> 121695 (+0.29%)
Last helper: 11686328 -> 11628618 (-0.49%); split: -0.72%, +0.23%
Cat0: 9241457 -> 9020508 (-2.39%); split: -2.62%, +0.22%
Cat1: 2353411 -> 2354860 (+0.06%); split: -0.17%, +0.23%
Cat2: 17468471 -> 17447932 (-0.12%); split: -0.12%, +0.00%
Cat6: 515728 -> 515643 (-0.02%); split: -0.02%, +0.00%
Cat7: 1637795 -> 1637789 (-0.00%); split: -0.05%, +0.05%

Totals from 33275 (20.20% of 164705) affected shaders:
Instrs: 30329487 -> 30089357 (-0.79%); split: -0.86%, +0.07%
CodeSize: 59715922 -> 59577522 (-0.23%); split: -0.26%, +0.03%
NOPs: 6265422 -> 6044512 (-3.53%); split: -3.86%, +0.33%
MOVs: 1058197 -> 1059890 (+0.16%); split: -0.23%, +0.39%
COVs: 427513 -> 427535 (+0.01%); split: -0.02%, +0.03%
Full: 548495 -> 548554 (+0.01%); split: -0.01%, +0.02%
(ss): 769340 -> 824367 (+7.15%); split: -0.21%, +7.36%
(sy): 368276 -> 368247 (-0.01%); split: -0.14%, +0.13%
(ss)-stall: 3076333 -> 3324623 (+8.07%); split: -0.15%, +8.22%
(sy)-stall: 10740547 -> 10748340 (+0.07%); split: -0.27%, +0.34%
STPs: 12872 -> 12878 (+0.05%); split: -0.02%, +0.07%
LDPs: 20808 -> 20717 (-0.44%); split: -0.45%, +0.01%
Preamble Instrs: 6354490 -> 6340927 (-0.21%); split: -0.22%, +0.01%
Early Preamble: 15233 -> 15589 (+2.34%)
Last helper: 8106631 -> 8048921 (-0.71%); split: -1.04%, +0.32%
Cat0: 6888653 -> 6667704 (-3.21%); split: -3.51%, +0.30%
Cat1: 1541452 -> 1542901 (+0.09%); split: -0.25%, +0.35%
Cat2: 10963398 -> 10942859 (-0.19%); split: -0.19%, +0.00%
Cat6: 265945 -> 265860 (-0.03%); split: -0.03%, +0.00%
Cat7: 1164800 -> 1164794 (-0.00%); split: -0.07%, +0.07%

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
cccb3ecc6a ir3/opt_predicates: move some helpers up
We'll need them earlier in the next commit.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
0223ab01b7 ir3/isa: add encoding for scalar predicates
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.

This commit follows the blob and disassembles scalar predicates as
up0.c. The "u" presumably stands for "uniform".

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
25ab37ae5b ir3: make backend aware of scalar predicates
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.

This commit makes the ir3 backend aware of scalar predicates. A new
register flag (IR3_REG_UNIFORM) is added that can be used to mark
predicate dsts as being written by the scalar ALU. For such dsts, the
same synchronization rules apply as for shared registers written by the
scalar ALU (e.g., (ss) is needed to read them from the vector ALU).
Scalar predicates can be used in the early preamble, which makes control
flow available there.

In many ways, the backend treats IR3_REG_UNIFORM the same as
IR3_REG_SHARED. A new flag was added because IR3_REG_SHARED is mainly
used to denote a separate register file, not as a flag to indicate usage
by the scalar ALU. Scalar predicates still use the normal predicate
register file but allow it to be written from the scalar ALU.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
bd28a40bd4 ir3/legalize: don't special-case early-preamble a1 reads
We can just generically read from the regmask.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
8760c36579 ir3: use shared srcs for demote/kill condition
No reason to force vector srcs.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00
Job Noorman
dbfed965ae ir3: use ir3_get_predicate for demote/kill
Instead of duplicating its functionality.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
2025-08-20 06:14:02 +00:00