Commit graph

221791 commits

Author SHA1 Message Date
Danylo Piliaiev
5fcde4d65d freedreno: Fix CP_CCHE_INVALIDATE not being applied at the right point
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Apparently CP_CCHE_INVALIDATE is just a plain register write underneath,
so it needs WFI before it, in order to invalidate at the right point.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41266>
2026-04-29 16:25:16 +00:00
Danylo Piliaiev
c2e78f1b22 tu: Fix CP_CCHE_INVALIDATE not being applied at the right point
Apparently CP_CCHE_INVALIDATE is just a plain register write underneath,
so it needs WFI before it, in order to invalidate at the right point.

```
CP_CCHE_INVALIDATE:
mov $addr, 0x9881
mov $data, 0x1
waitin
mov $01, $data
```

Fixes misrendering in Doom Eternal on A750.

Fixes: fb1c3f7f5d ("tu: Implement CCHE invalidation")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41266>
2026-04-29 16:25:16 +00:00
Caio Oliveira
e1745e0bd9 brw: Fix max_dispatch_width collection for CS with variable size
The intention of the original commit was to make all the shaders report
the same max_dispatch_width.  When CS has multiple variants, this was
not happening as expected.

Fixes: 2acc2f18ea ("intel/compiler: report max dispatch width statistic")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41209>
2026-04-29 15:52:04 +00:00
Roman Stratiienko
bdbf4ed739 v3dv/android: Add deferred ANB allocation support
Fixes:

dEQP-VK.wsi.android.maintenance1.deferred_alloc.mailbox#basic
dEQP-VK.wsi.android.maintenance1.deferred_alloc.mailbox#bind_image
dEQP-VK.wsi.android.maintenance1.deferred_alloc.fifo#basic
dEQP-VK.wsi.android.maintenance1.deferred_alloc.fifo#bind_image

Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41235>
2026-04-29 15:31:28 +00:00
Mike Blumenkrantz
d4d7055aee radv: add RADV_QUEUE_DISABLE env var for selectively disabling queues
it is sometimes useful to test radv without certain queues disabled in order
to exercise alternative codepaths. this exposes that capability

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41086>
2026-04-29 15:08:28 +00:00
Job Noorman
aaf4d77f43 ir3/shared_ra: fix live-out reload after src reload
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When reloading live-out values along loop back-edges, we make sure to
reuse the original register. However, we failed to detect cases where
the spilled value got reloaded earlier for a src in a different
register. Fix this by reloading the value again in the original
register.

Fixes a RA validation failure in Windrose.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41262>
2026-04-29 14:40:48 +00:00
Jose Maria Casanova Crespo
3a8d5aeaa1 v3dv: Expose hardware-accelerated integer dot products on V3D 7.1+
Expose VK_KHR_shader_integer_dot_product 4x8-bit packed dot
products using native HW instructions v8dot and setnnmode.

QPU instruction count for sdot_4x8_iadd compute shader:

  Before (scalar decomposition):  18 ALU cycles
  After (setnnmode + v8dot):       3 ALU cycles  (6x)

We advertise integerDotProduct4x8BitPacked*Accelerated for V3D 7.1+

Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
2026-04-29 13:21:08 +00:00
Jose Maria Casanova Crespo
8f06961bf5 broadcom/compiler: Eliminate redundant setnnmode instructions
This new VIR optimization pass tracks the current NN signedness
mode per block and removes duplicate setnnmode instructions.

When consecutive dot products use the same signedness mode, the backend
emits one setnnmode per dot product. This pass removes the redundant
ones, keeping only the first.

Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
2026-04-29 13:21:08 +00:00
Jose Maria Casanova Crespo
24ecc9cbcc broadcom/compiler: Add v8dot and setnnmode scheduler dependencies.
As nnmode register is read by v8dot instruction we need to add dependencies
between setnnmode instructions and v8dot via the nnmode register, so they
are scheduled correcty using last_nn_mode virtual register..

Add a last_nn_mode virtual register to the scheduler state and create:
- Write dependencies for all SETNNMODE variants
- Read dependencies for V8DOT.

This follows the same pattern as the existing MULTOP/UMUL24 rtop tracking.

Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
2026-04-29 13:21:08 +00:00
Jose Maria Casanova Crespo
33a700be91 broadcom/compiler: hardware-accelerated 4x8-bit dot products on V3D 7.1+
VIR instructions and nir_to_vir implementation of 4x8-bit dot products
using native HW accelerated ALU instructions.

setnnmode instructions are marked as having side effects.

Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
2026-04-29 13:21:08 +00:00
Jose Maria Casanova Crespo
afe4e321e1 broadcom/compiler: Add V3D 7.1 v8dot dot product QPU instructions
Add QPU instruction definitions, metadata, and encoding for V3D 7.1
v8dot product instruction and the setnnmode instruction that allows
defining the signedness (UU/SU/US/SS) of the v8dot operation.

Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
2026-04-29 13:21:07 +00:00
Benjamin Cheng
656b3814c2 radv/wsi: Re-use transfer queue if it exists
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This avoids writing past the end of pdev->vk_queue_to_radv if all the
queue families are available.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/14834
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41155>
2026-04-29 12:38:02 +00:00
Icenowy Zheng
2bffc653ec isaspec: decode: manually print the sign when printing NaN float values
The IEEE754-2019 standard declaring the preceding sign "optional" when
converting NaN values to strings because the standard tries to not
regulate how sign bits in NaNs are interpreted.

In the real world, when using printf-series function to print a number
with type `float` on RISC-V, the sign of NaNs is wiped during the
conversion from `float` to `double` (defined as part of the default
argument promotions rule for variable arguments in the C spec).

Change the code to stop relying on isa_print() to print the negative
sign, instead parse it from the highest bit of value and manually print
it before "nan" string.

This fixes the `etnaviv_isa_disasm` unit test on RISC-V.

Suggested-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40887>
2026-04-29 11:39:12 +00:00
Luigi Santivetti
120bd20e49 pvr: add missing multi-arch support for pipeline exec and stats
Entry points must be wrapped in the PVR_PER_ARCH macro else there
will be multiple definitions of the same symbol.

Fixes: dfddb3fe ("pvr: Add support for VK_KHR_pipeline_executable_properties")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41238>
2026-04-29 11:22:25 +00:00
Luigi Santivetti
20b42f4466 pvr: de-dup strncmp in pvrsrvkm winsys
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41236>
2026-04-29 10:47:26 +00:00
Brendan King
443c259630 pvr: move some asserts in pvr_srv_alloc_display_pmr
Don't assert if the pvr_srv_physmem_import_dmabuf bridge call fails.

Signed-off-by: Brendan King <brendan.king@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41236>
2026-04-29 10:47:26 +00:00
Icenowy Zheng
01ba4867fa pvr: skip emitting query program when copy result / reset with 0 queries
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, currently a query compute program with workgroup size
0*1*1 will be emited, which is ridiculous and will be rejected by some
assertion in pvr_compute_generate_control_stream() .

As the operation should be noop when queryCount is 0, the functions can
and should just return in such cases.

Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40911>
2026-04-29 10:24:59 +00:00
Samuel Pitoiset
e0b5724e85 meson: bump required libdrm to 2.4.133 for AMDGPU
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41207>
2026-04-29 09:30:13 +00:00
Samuel Pitoiset
0ec9b0d5c5 ci: bump libdrm to 2.4.133
libdrm needs to be build for alpine because the version provided by
the image is too old.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41207>
2026-04-29 09:30:13 +00:00
Rhys Perry
aac8787fda radv: remove radv_device_cache_key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:12 +00:00
Rhys Perry
44b09b8396 radv: remove radv_physical_device_cache_key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:12 +00:00
Rhys Perry
27815719aa radv: remove most fields from radv_physical_device_cache_key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:12 +00:00
Rhys Perry
9ad0cd7e38 radv: hash radv_compiler_info::key into the cache key
This will replace both the pdev and device cache keys.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:12 +00:00
Rhys Perry
1ac306c11e radv: remove radv_compiler_info::cache_key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:11 +00:00
Rhys Perry
c6c4f523af radv: add fields to radv_compiler_info from radv_physical_device_cache_key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:11 +00:00
Rhys Perry
3aa6903883 radv: move fields to radv_compiler_info::key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:10 +00:00
Rhys Perry
7c93a6e91c radv: move use_llvm to radv_compiler_info::key
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:10 +00:00
Rhys Perry
5f3b73b2f0 radv: move load_grid_size_from_user_sgpr to radv_physical_device
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:10 +00:00
Rhys Perry
48645f21b5 radv: initialize nir_shader_compiler_options directly in compiler info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:09 +00:00
Rhys Perry
0249fcfbb6 radv: assert there is no padding in cache keys
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Rhys Perry
5ee0935861 ac: move has_cs_regalloc_hang_bug to ac_compiler_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Rhys Perry
e40457b136 ac: move lds_size_per_workgroup to ac_compiler_info
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41022>
2026-04-29 08:10:08 +00:00
Peyton Lee
9b06b0f219 radeonsi/vpe: add VPE 2.0 support
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add YUV semi-planar format and YUV packed format support.
Add multi-layer blending support.
Add 3DLut fast loading support.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41190>
2026-04-29 07:46:28 +00:00
Peyton Lee
ab878cc1ea amd/gmlib: add tm_generate_formatted_3DLut
Adds a utility to format a 3D LUT into the required memory layout and write it into a given buffer.

Signed-off-by: Peyton Lee <peytolee@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41190>
2026-04-29 07:46:28 +00:00
Valentine Burley
ca92f8697e panfrost/ci: Update kernel to pick up ZSTD support for ZRAM
No other changes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15342
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41217>
2026-04-29 07:24:18 +00:00
Olivia Lee
72e0eda260 pan/bi: fix memory access alignment
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Memory accesses need to be aligned up to the next power of two of the
full access size. Component count and bit-size don't matter to the
hardware, only the total size.

shader-db results are pretty much what you would expect, there are a few
shaders that have increased LS instructions as a result of splitting
accesses to satisfy alignment requirements that were previously ignored.
The one surprising thing is that there are several shaders that have
reduced uniform usage. Looking at some of these individually, what
happened is that splitting UBO loads early allowed the compiler to
eliminate loads from unused ranges of the access.

total instrs in shared programs: 719166 -> 719174 (<.01%)
instrs in affected programs: 2355 -> 2363 (0.34%)
helped: 4
HURT: 6
helped stats (abs) min: 1.0 max: 9.0 x̄: 3.00 x̃: 1
helped stats (rel) min: 0.36% max: 6.52% x̄: 1.99% x̃: 0.54%
HURT stats (abs)   min: 1.0 max: 4.0 x̄: 3.33 x̃: 4
HURT stats (rel)   min: 0.65% max: 2.13% x̄: 1.38% x̃: 1.48%
95% mean confidence interval for instrs value: -2.14 3.74
95% mean confidence interval for instrs %-change: -1.76% 1.82%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 30210.83 -> 30218.81 (0.03%)
cycles in affected programs: 50 -> 57.99 (15.97%)
helped: 2
HURT: 6
helped stats (abs) min: 0.0078129999999999589 max: 0.070312000000000041 x̄: 0.04 x̃: 0
helped stats (rel) min: 1.10% max: 10.23% x̄: 5.66% x̃: 5.66%
HURT stats (abs)   min: 0.03125 max: 5.0 x̄: 1.34 x̃: 1
HURT stats (rel)   min: 2.38% max: 25.00% x̄: 13.05% x̃: 14.26%
95% mean confidence interval for cycles value: -0.42 2.41
95% mean confidence interval for cycles %-change: -1.74% 18.49%
Inconclusive result (value mean confidence interval includes 0).

total cvt in shared programs: 2385.91 -> 2385.91 (<.01%)
cvt in affected programs: 11.14 -> 11.14 (<.01%)
helped: 5
HURT: 4
helped stats (abs) min: 0.0078119999999999301 max: 0.070312000000000041 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.27% max: 10.23% x̄: 2.61% x̃: 0.82%
HURT stats (abs)   min: 0.01562600000000014 max: 0.03125 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 1.31% max: 2.75% x̄: 2.21% x̃: 2.40%
95% mean confidence interval for cvt value: -0.02 0.02
95% mean confidence interval for cvt %-change: -3.51% 2.58%
Inconclusive result (value mean confidence interval includes 0).

total ls in shared programs: 25871 -> 25879 (0.03%)
ls in affected programs: 46 -> 54 (17.39%)
helped: 0
HURT: 4
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 2.00 x̃: 1
HURT stats (rel)   min: 10.00% max: 25.00% x̄: 18.38% x̃: 19.26%
95% mean confidence interval for ls value: -1.18 5.18
95% mean confidence interval for ls %-change: 8.46% 28.30%
Inconclusive result (value mean confidence interval includes 0).

total code size in shared programs: 6302848 -> 6302976 (<.01%)
code size in affected programs: 1536 -> 1664 (8.33%)
helped: 0
HURT: 1

total registers used in shared programs: 117324 -> 117329 (<.01%)
registers used in affected programs: 45 -> 50 (11.11%)
helped: 1
HURT: 2
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 6.25% max: 6.25% x̄: 6.25% x̃: 6.25%
HURT stats (abs)   min: 2.0 max: 4.0 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 12.50% max: 30.77% x̄: 21.63% x̃: 21.63%

total uniforms used in shared programs: 78538 -> 78274 (-0.34%)
uniforms used in affected programs: 2688 -> 2424 (-9.82%)
helped: 104
HURT: 4
helped stats (abs) min: 1.0 max: 18.0 x̄: 2.65 x̃: 2
helped stats (rel) min: 1.96% max: 54.55% x̄: 12.78% x̃: 11.11%
HURT stats (abs)   min: 1.0 max: 5.0 x̄: 3.00 x̃: 3
HURT stats (rel)   min: 3.70% max: 16.13% x̄: 9.92% x̃: 9.92%
95% mean confidence interval for uniforms used value: -3.01 -1.88
95% mean confidence interval for uniforms used %-change: -14.15% -9.74%
Uniforms used are helped.


Total CPU time (seconds): 73.26 -> 74.48 (1.67%)

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 2f2738dc90 (pan/bi: Use nir_lower_mem_access_bit_sizes)
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41033>
2026-04-29 01:26:47 +00:00
Xinju Li
0319be8b02 nir: resolve functions: only resolve functions that are reachable from main
Use DFS traversal from main to resolve reachable functions. Avoid spurious
"unresolved reference" linker errors for dead helper functions.
It avoid reporting linking error for following shader test. The shader test used
to pass before merge_requests/31137:

[require]
GLSL >= 1.50

[vertex shader]
/* declared but not defined */
vec4 transform_color(vec3 color, float alpha);

/* calls transform_color — but this function is never called from main */
vec4 apply_transform(vec3 color, float alpha)
{
    return transform_color(color, alpha);
}

[vertex shader]
in vec4 piglit_vertex;

void main()
{
    /* apply_transform is never called here */
    gl_Position = piglit_vertex;
}

Signed-off-by: Xinju Li <xinju.li@broadcom.com>

use pass_flags to mark function as reachable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41065>
2026-04-28 23:35:17 +00:00
Alyssa Rosenzweig
a78634ccb0 jay/to_binary: rename grf -> phys_reg
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
since it covers accumulators to

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
ab87a035c9 jay: drop a bunch of stale TODO and XXX
These are either done, or never going to be done, or otherwise stale or
silly or unnecessary. Drop a bunch.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
70d09d97ef jay: predicate NoMask instructions in uniform IF's
Totals:
Instrs: 4742391 -> 4742257 (-0.00%)
CodeSize: 70245120 -> 70243520 (-0.00%); split: -0.00%, +0.00%

Totals from 81 (3.06% of 2647) affected shaders:
Instrs: 337727 -> 337593 (-0.04%)
CodeSize: 4992992 -> 4991392 (-0.03%); split: -0.03%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
f199f00564 jay: adjust flag replication
Now instructions still read/write UFLAG, which preserves the information about
lane 0 we need for proper predication etc.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
930d36b54a jay: smarten predication pass
Merge the empty else optimization, the then-block predication, and the
break-while fusion into a unified "try to predicate each side of an if, peephole
optimizing control flow" optimization. This is simpler and more general.

Totals:
Instrs: 4783809 -> 4775647 (-0.17%)
CodeSize: 70766656 -> 70674064 (-0.13%); split: -0.13%, +0.00%

Totals from 1109 (41.90% of 2647) affected shaders:
Instrs: 4130644 -> 4122482 (-0.20%)
CodeSize: 61180848 -> 61088256 (-0.15%); split: -0.15%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
80081ef7b2 jay: check for inverse-ballots in jay_uses_flag
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
86f19bc983 jay: propagate inverse-ballots only locally
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
d7283a25d7 jay: do not copyprop ballots globally
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
5828b66b65 jay: convert to LCSSA
for correctness with loops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
fed6b7bea0 jay: drop UGPR->UMEM spilling path
This is totally broken now that we have a physical CFG for UGPRs. And of course,
UGPRs generally were totally broken without the physical CFG. So I conclude
this code basically never worked. Which is good because it was also basically
always dead too. Just delete it and replace with a clear error message, instead
of pretending it works and either randomly splatting validation or just straight
up miscompiling silently or whatever.

We might need an alternative UGPR->GPR spill path some day but that day is not
today.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
ad040f2fbb jay: introduce a physical control flow graph
Consider:

   u0 = foo()

   if (divergent) {
      u0 = bar()
      r0 = baz(u0)
   } else {
      r0 = quux(u0)
   }

Logically, this is fine, there is no interference between bar() and u0. But
physically, both sides of the if execute so the bar() write to u0 overwrites the
variable the else reads. So this is a miscompile.

The solution is to model the extra edges in the physical control flow graph,
which lives next to the existing logical control flow graph. Liveness for UGPRs
now follows the physical CFG, while liveness for GPRs continues to follow the
logical CFG. That models the interference properly, while still allowing phis to
work as before (since phis writing UGPRs follow uniform bits of control flow
that are necessarily critical edge free for the same reason the logical CFG is).

Because our RA copies shuffled registers back at block ends (following
Colombet), there's no issue with live range splits here (unlike aco which
inserts phis for this case and then needs to worry about critical edges around
those phis).

There might still be an extremely-challenging-to-hit bug here with UGPR spilling
which I need to think more about. It might be fine as-is? Not convinced though.
But this is big enough and strictly less broken than what we have right now and
the full solution will build on this, so here we are.

Fixes artefating in SuperTuxKart and Celestia knows what else.

Totals:
Instrs: 2770938 -> 2771269 (+0.01%); split: -0.00%, +0.02%
CodeSize: 40133712 -> 40138480 (+0.01%); split: -0.01%, +0.02%

Totals from 158 (5.97% of 2647) affected shaders:
Instrs: 514523 -> 514854 (+0.06%); split: -0.02%, +0.09%
CodeSize: 7603040 -> 7607808 (+0.06%); split: -0.03%, +0.09%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
fadb826515 jay/opt_propagate: disable f64 opts for now
could be done but would need more work.

No stats change.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00
Alyssa Rosenzweig
8e4145948f jay/opt_propagate: fold uflag copies
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41215>
2026-04-28 23:13:50 +00:00