Commit graph

221878 commits

Author SHA1 Message Date
Lorenzo Rossi
312603b2fa pan/compiler: Rename bifrost_optimize_nir
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:12 +00:00
Lorenzo Rossi
6f05b27b9a panvk: Remove pan_optimize_nir call
The shader will be optimized a few passes later in preprocess; this way
we can have the same pipeline as in Gallium.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:12 +00:00
Lorenzo Rossi
39f54ddea2 panvk,panfrost: Pass inputs and info to postprocess
This is needed if we want postprocess to decide IDVS and layout later in
the series

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:12 +00:00
Lorenzo Rossi
01e6a0555c pan/compiler: Rework scratch memory strategy
Before this commit, all scratch memory was allocated in 16-byte chunks
and indirect references were always lowered into if-else trees.  This
patch tries to clean this up a little bit, by using a more compact layout
that is still TLS friendly, allowing indirect accesses and only lowering
them for optimizations and using the newer nir_lower_explicit_io.

The patches should improve performance on some shaders, but lift a lot
of dust off the compiler uncovering some new bugs.  They have been kept
at bay by disabling local memory vectorization.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:11 +00:00
Lorenzo Rossi
f0d2ad9840 panvk/jm: Fix tls_size overwrite in indirect draws
This only causes problems when the VS/FS has more TLS than our internal
shaders.  That doesn't usually happen, but it will cause bugs when we
start to compress local memory.

Fixes: 005703e5b5 ("panvk: Move TLS preparation logic to cmd_dispatch_prepare_tls")
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:11 +00:00
Lorenzo Rossi
768d7cb149 pan/compiler: Sort preprocess
Reorders the preprocess passes to be more in line with modern compilers.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Co-authored-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:11 +00:00
Lorenzo Rossi
cf6ce1bb8c panfrost: Constant-fold io locations after lowering
This makes panfrost IO lowering similar to panvk and avoids subtle
bugs later in the series.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:11 +00:00
Lorenzo Rossi
63aceb07ff nir/opt_sink: Add pan-specific load_input
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
30d8f9c554 nir/lower_point_size: Handle 16-bit point sizes
panfrost has a float16 point size; handling that precision too allows the
compiler to call lower_point_size later in the compilation pipeline.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
dd96a1514b pan/compiler: Handle ssbo_atomics in lower_vs_atomics
This way the pass does not depend on lower_ssbo anymore

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
408d03291d pan/compiler: Lower unaligned scratch memory accesses
Using OpenCL size/alignment requirements we might get some types
with a size bigger than their alignment.  This breaks the current TLS
load/stores that expect 16-byte alignment for 16-byte load/stores. This
problem probably hasn't surfaced yet because we reassigned OpenCL scratch
in 16-byte slots, but will break if we compact the layout.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
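To illustrate the kind of lowering 408d03291d describes, here is a minimal C sketch that splits a scratch access into pieces that are each naturally aligned. It is not the pass from the commit: the names `next_chunk_size` and `count_chunks`, the 16-byte cap, and the greedy strategy are all assumptions made for illustration.

```c
#include <assert.h>

#define MAX_CHUNK 16u /* assumed largest single TLS access, in bytes */

/* Largest power-of-two size that is aligned at `offset` and fits in the
 * `remaining` bytes.  Hypothetical helper, for illustration only. */
unsigned next_chunk_size(unsigned offset, unsigned remaining)
{
    assert(remaining > 0);
    unsigned size = MAX_CHUNK;
    while (size > 1 && (offset % size != 0 || size > remaining))
        size /= 2;
    return size;
}

/* Count the aligned accesses needed to cover [offset, offset + bytes). */
unsigned count_chunks(unsigned offset, unsigned bytes)
{
    unsigned n = 0;
    while (bytes > 0) {
        unsigned size = next_chunk_size(offset, bytes);
        offset += size;
        bytes -= size;
        n++;
    }
    return n;
}
```

Under these assumptions, a 16-byte value at offset 0 stays a single access, while the same value at offset 12 is split into a 4-byte, an 8-byte and a 4-byte access.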
Lorenzo Rossi
ac23e3c6c5 pan/compiler: Fix WaRaR hazard in pressure scheduler
A common memory swap operation might be compiled as:
%v1 = LOAD %a1  # L1
%v2 = LOAD %a2  # L2
STORE %v2, %a1  # S1
STORE %v1, %a2  # S2

The current pressure scheduler just records the last load/store
operation for dependencies, thus the dependency chain becomes L2 -> S1
-> S2.  The compiler might thus reorder them as L2, S1, L1, S2, i.e.:
                #    L1:
%v2 = LOAD %a2  # L2 |
STORE %v2, %a1  # S1 |
%v1 = LOAD %a1  # L1<-
STORE %v1, %a2  # S2

This is incorrect as S1 depends on L1 too.  The fix makes all loads also
depend on each other, restricting load reordering.  The proper fix that
NAK has is to track all loads and make each store depend on every load,
building a more correct DAG.  This doesn't matter as much in panfrost
since all loads are serialized by the scoreboard.  We might still want
to implement it for register pressure in the future.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:09 +00:00
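The hazard described in ac23e3c6c5 can be reproduced with a toy model of "remember only the last memory operation" dependency tracking. The sketch below is an assumption-laden illustration, not the panfrost scheduler: `build_mem_deps`, the `chain_loads` flag and the `dep` array are hypothetical names, and real schedulers track far more than one predecessor per instruction.

```c
#include <assert.h>
#include <stdbool.h>

enum mem_kind { MEM_LOAD, MEM_STORE };

/* For each instruction i, dep[i] receives the index of the earlier memory
 * instruction it must stay after, or -1 if it is unconstrained.
 * chain_loads = false models the old behaviour from the commit message;
 * chain_loads = true models the fix (loads also depend on each other). */
void build_mem_deps(const enum mem_kind *kinds, int count,
                    bool chain_loads, int *dep)
{
    assert(count >= 0);
    int last = -1;              /* most recent memory instruction seen */
    bool last_is_store = false;

    for (int i = 0; i < count; i++) {
        bool is_store = kinds[i] == MEM_STORE;

        /* Stores always order after the last memory op.  Loads order
         * after a previous store, and after a previous load only when
         * chain_loads is set. */
        if (last >= 0 && (is_store || last_is_store || chain_loads))
            dep[i] = last;
        else
            dep[i] = -1;

        last = i;
        last_is_store = is_store;
    }
}
```

For the sequence L1, L2, S1, S2, the old model leaves L1 with no successor among the memory operations, so nothing stops it from sinking below S1. With `chain_loads` set, the chain becomes L1 -> L2 -> S1 -> S2 and S1 transitively stays after L1.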
Lorenzo Rossi
abde403a7c pan/compiler: Allow 16-bit alpha for atest_pan
We just need to handle it while translating NIR to BIR; the hardware can
automatically widen to 32 bits.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Lorenzo Rossi
2a7d817591 nir/opt_algebraic: optimize fadd/fmul with 16-bit source and constant
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Lorenzo Rossi
89436db611 nir: Extract float_is_half tests in common code
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Samuel Pitoiset
e092e945a7 radv: fix printing image format with RADV_DEBUG=img
It should print the Vulkan format, not the pipe format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41265>
2026-04-30 17:09:00 +00:00
Caio Oliveira
1ebc14bcb9 brw: Stop tracking inline parameter usage in prog_key/prog_data
Since inline parameter is the last field of the thread payload, the
backend can always assume they may exist.  They won't affect the
position of other payload fields and the register allocator will
reuse any unused space.

In Anv, also update EmitInlineParameter for Task/Mesh/CS to reflect
previous changes in inline parameter setup.  Remove/Update some stale
comments since we are here.

Finally, remove the prog_key/prog_data bits that tracked whether inline
data or a push address was needed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41230>
2026-04-30 16:39:22 +00:00
David Rosca
3d16845e9a frontends/va: Add missing NULL check for additional output surface
Fixes: efc6d27fd4 ("frontends/va: Add support for decode/encode processing")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41024>
2026-04-30 16:19:20 +00:00
David Rosca
d3dc812eb1 frontends/va: Fix dereference before NULL check in postproc
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41024>
2026-04-30 16:19:20 +00:00
David Rosca
69db546936 frontends/va: Fix setting output color properties from color standard
Fixes: 6e8a8d8ee7 ("frontends/va: Stop using vpp colors standard")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41024>
2026-04-30 16:19:20 +00:00
Karol Herbst
4e67582ddf nir: add fmul_rtz optimizations
NVK is only going to use it for `fmul_rtz(frcp(ipa), ipa)` patterns, so
don't try too hard to optimize this.

Totals from 10 (0.00% of 1212873) affected shaders:
CodeSize: 34480 -> 34288 (-0.56%); split: -0.60%, +0.05%
Static cycle count: 6225 -> 6132 (-1.49%); split: -1.57%, +0.08%

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
f5e92e5493 nak: use fmul_rtz for NAK_INTERP_MODE_PERSPECTIVE
Fixes rendering artifacts in The Surge 2 and Shadow of the Tomb Raider.
And it's what nvidia's driver is doing.

Totals from 170446 (14.05% of 1212873) affected shaders:
CodeSize: 2019019440 -> 2026071952 (+0.35%); split: -0.07%, +0.41%
Number of GPRs: 8158110 -> 8098382 (-0.73%); split: -0.80%, +0.07%
SLM Size: 106448 -> 106440 (-0.01%)
Static cycle count: 1398452243 -> 1400038117 (+0.11%); split: -0.17%, +0.28%
Spills to memory: 546 -> 520 (-4.76%)
Fills from memory: 546 -> 520 (-4.76%)
Spills to reg: 22585 -> 22670 (+0.38%); split: -0.31%, +0.68%
Fills from reg: 18243 -> 18331 (+0.48%); split: -0.34%, +0.82%
Max warps/SM: 6797472 -> 6822196 (+0.36%); split: +0.38%, -0.02%

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11447
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/11706
Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
5d9225388c nak: handle nir_op_fmul_rtz
Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
2e09b4ac68 nir: handle fmul_rtz in a couple of places
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
4e520f671c nir: add fmul_rtz
It's needed in NVK for correctness with interpolation.

Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
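As a rough illustration of what a round-toward-zero float multiply computes, the C sketch below emulates one on the CPU. It is not how NIR or NAK implement `fmul_rtz`; the function name `fmul_rtz_sketch` is hypothetical, and the code assumes IEEE 754 binary32/binary64 types with the default round-to-nearest conversion mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");

static double abs_f64(double x) { return x < 0.0 ? -x : x; }

/* Multiply two floats and round the result toward zero. */
float fmul_rtz_sketch(float a, float b)
{
    /* Two 24-bit significands multiply to at most 48 bits, so the
     * product is exact in a double (53-bit significand). */
    double exact = (double)a * (double)b;
    float rounded = (float)exact; /* rounds to nearest by default */

    /* If rounding moved the value away from zero, step back by one ULP.
     * Decrementing the bit pattern shrinks the magnitude and leaves the
     * sign bit untouched; NaN never enters this branch. */
    if (abs_f64((double)rounded) > abs_f64(exact)) {
        uint32_t bits;
        memcpy(&bits, &rounded, sizeof bits);
        bits -= 1;
        memcpy(&rounded, &bits, sizeof bits);
    }
    return rounded;
}
```

With `a = 1 + 2^-12 + 2^-23`, the exact square has more than half an ULP below the float precision, so a round-to-nearest multiply rounds up while this sketch returns the truncated value `1 + 2^-11 + 2^-22`.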
Samuel Pitoiset
f2ce2868c5 ci: uprev vkd3d
This contains new tests for DGC+multiview which are valid in DX12
but invalid in Vulkan, unless RADV allows support for it. Important
to have coverage for us because it's used for Crimson Desert.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41193>
2026-04-30 15:00:02 +00:00
Raviraj Uppal
3359de8247 driconf: disable allow_rgb16_configs for SPECviewperf
Commit f2aaa9ce00 added 16 bpc unorm display formats gated behind the
allow_rgb16_configs driconf option, defaulting to true.

This causes SPECviewperf's maya_06 viewset to fail. Disable
allow_rgb16_configs for SPECviewperf alongside the existing
allow_rgb10_configs workaround.

Fixes: f2aaa9ce00 ("dri,gallium: Add support for RGB[A]16_UNORM display formats.")

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41285>
2026-04-30 14:26:39 +00:00
Jose Maria Casanova Crespo
d95076e581 v3dv: lower oversized compute workgroups to 256 invocations
V3D advertises maxComputeWorkGroupInvocations = 256, but ggml-vulkan
in many cases ignores this limit and creates compute pipelines that
exceed it. Although this is a bug in the application, we can
take advantage of nir_lower_workgroup_size and make the application
work.

This issue was causing an assertion failure at nir_to_vir.c:

  assert(c->local_invocation_index_bits <= 8);

The solution is lowering the oversized workgroups to a 256-invocation
workgroup loop, like radv and radeonsi are doing on GFX7, by running
nir_lower_workgroup_size(256) for this scenario.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41257>
2026-04-30 13:59:19 +00:00
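The idea behind d95076e581 of covering an oversized logical workgroup with a 256-invocation loop can be sketched in plain C. This is only a conceptual model, not what nir_lower_workgroup_size emits: the helper names and the strided mapping from hardware invocation to logical index are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define HW_WG_SIZE 256u /* hardware limit named in the commit message */

/* Number of loop iterations a hardware invocation must run so that all
 * `logical_size` invocations of the application's workgroup are covered.
 * Hypothetical helper, assuming a strided assignment. */
uint32_t iterations_for(uint32_t hw_index, uint32_t logical_size)
{
    assert(hw_index < HW_WG_SIZE);
    if (hw_index >= logical_size)
        return 0;
    return (logical_size - hw_index + HW_WG_SIZE - 1) / HW_WG_SIZE;
}

/* Logical local invocation index handled on a given loop iteration. */
uint32_t logical_index(uint32_t hw_index, uint32_t iteration)
{
    assert(hw_index < HW_WG_SIZE);
    return hw_index + iteration * HW_WG_SIZE;
}
```

In this model the hardware index always stays below 256, so it fits in the 8 bits that the assertion quoted in the commit message checks for, while a 1024-invocation workgroup is covered by four iterations per hardware invocation.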
Jose Maria Casanova Crespo
c3ba5effe2 v3d/v3dv: Use new V3D_MAX_CSD_WG_SIZE = 256
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41257>
2026-04-30 13:59:18 +00:00
Jose Maria Casanova Crespo
e378a7d773 v3dv: bump maxComputeSharedMemorySize to 32 KB
Currently local shared memory is backed by a BO that is read/written
using the TMU.

ggml-vulkan probes the size of maxComputeSharedMemorySize and rejects
V3DV (falling back to CPU) when the value is below what its larger
compute pipelines request, although in the end the shaders ollama
runs don't actually use shared memory.

32 KB is what ggml-vulkan demands; the value can grow further with no
real per-op cost since shared memory currently goes through the TMU
like any other BO.

The V3D OpenGL driver also has 32 KB for SharedMemory.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41257>
2026-04-30 13:59:18 +00:00
Karmjit Mahil
ee5daf590b gbm: Replace VER_MIN with common MIN2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41290>
2026-04-30 13:00:03 +00:00
Karmjit Mahil
dab24c34dd gbm: Remove unused ARRAY_SIZE macro
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41290>
2026-04-30 13:00:03 +00:00
Eric R. Smith
8d1fba686b panfrost: add some sanity checks
`__builtin_ctz` is not well defined for `0`, so provide a default value
of `0` for this case. The other sensible choice would be `64`, but that
does not fit in the 5 bit `divisor_r` field (which is how I noticed this,
we were triggering a run time assert in a debug build).

We should skip `launch_draw` if there are no vertices to draw.
This avoids a crash in some indirect rendering tests on Bifrost.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41117>
2026-04-30 12:39:18 +00:00
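The `__builtin_ctz` guard from 8d1fba686b can be sketched as a small C helper. The name `ctz_or_zero` and the 32-bit operand width are assumptions for illustration; the actual panfrost code may differ. `__builtin_ctz` is the GCC/Clang builtin, whose result is undefined for an input of 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count trailing zeros, returning 0 for an input
 * of 0 instead of calling __builtin_ctz(0). */
unsigned ctz_or_zero(uint32_t value)
{
    unsigned shift = value ? (unsigned)__builtin_ctz(value) : 0u;
    assert(shift < 32); /* every result fits in a 5-bit field */
    return shift;
}
```

Checking the zero case before calling the builtin keeps the result well defined on every target and guarantees it stays within the range a 5-bit field can hold.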
Pohsiang (John) Hsu
41965d4082 mediafoundation: code clean up
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41279>
2026-04-30 11:51:33 +00:00
Pohsiang (John) Hsu
ff1c171bae mediafoundation: periodic clang-format
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41279>
2026-04-30 11:51:33 +00:00
Lionel Landwerlin
b795a1a20c intel/tools: add eu stall viewer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
d595529475 imgui: update copy and port all tools using it
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
0a965c0bce anv: add a shader-dump debug option
Will use this with EU stall monitor.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:45 +00:00
Lionel Landwerlin
3951a00d86 anv: reorder debug options
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:43 +00:00
Lionel Landwerlin
b3c1cba483 anv/docs: update environment variable docs
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41244>
2026-04-30 10:59:43 +00:00
Jose Maria Casanova Crespo
2cd51a6efc broadcom/compiler: move nir_lower_undef_to_zero out of optimization loop
The combination of nir_opt_if and nir_lower_undef_to_zero running inside
the optimization loop could prevent it from converging.

This was exercised by ollama running gemma3 compute shaders.

Removing the pass from the optimization loop results in no changes in
shader-db.

Assisted-by: Claude Opus 4.6
Fixes: cbe24a0e9c ("broadcom/compiler: use nir_lower_undef_to_zero")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41256>
2026-04-30 12:30:34 +02:00
Samuel Pitoiset
b4591f4b30 radv/ci: update list of skipped tests
They are fixed now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
a0d39a29da radv: set RADEON_FLAG_EMULATE_SPARSE_RESIDENCY for sparse SSBO/UBO buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
41fa965386 radv/amdgpu: emulate sparse residency for the SMEM loads with NULL PRT workaround
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
0be39ce4ad radv: use the "LOW" address space for UBOs
Read-only and no sparse feedback support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
3237666fc4 radv: run the pass to fixup SMEM loads with NULL PRT pages
Right after global accesses are lowered.

fossils-db (NAVI21):
Totals from 37734 (33.68% of 112041) affected shaders:
MaxWaves: 898021 -> 897961 (-0.01%); split: +0.00%, -0.01%
Instrs: 34145252 -> 34267942 (+0.36%); split: -0.00%, +0.36%
CodeSize: 182360344 -> 182943952 (+0.32%); split: -0.00%, +0.32%
VGPRs: 1796672 -> 1796816 (+0.01%); split: -0.00%, +0.01%
SpillSGPRs: 13708 -> 13964 (+1.87%); split: -0.28%, +2.15%
Latency: 442451029 -> 442827188 (+0.09%); split: -0.02%, +0.10%
InvThroughput: 105259490 -> 105287803 (+0.03%); split: -0.01%, +0.03%
VClause: 672269 -> 672252 (-0.00%); split: -0.12%, +0.12%
SClause: 847133 -> 847677 (+0.06%); split: -0.35%, +0.41%
Copies: 2974422 -> 2979443 (+0.17%); split: -0.35%, +0.52%
Branches: 860896 -> 861639 (+0.09%); split: -0.00%, +0.09%
PreSGPRs: 1677701 -> 1682387 (+0.28%); split: -0.01%, +0.29%
VALU: 22386780 -> 22386984 (+0.00%); split: -0.01%, +0.01%
SALU: 5282218 -> 5406460 (+2.35%); split: -0.01%, +2.36%

fossils-db (POLARIS10):
Totals from 15054 (21.74% of 69255) affected shaders:
MaxWaves: 87688 -> 87689 (+0.00%); split: +0.01%, -0.00%
Instrs: 12542117 -> 12596734 (+0.44%); split: -0.00%, +0.44%
CodeSize: 65209280 -> 65458732 (+0.38%); split: -0.00%, +0.39%
SGPRs: 1149639 -> 1149975 (+0.03%); split: -0.24%, +0.27%
VGPRs: 749928 -> 749956 (+0.00%); split: -0.02%, +0.02%
SpillSGPRs: 11139 -> 11413 (+2.46%); split: -0.29%, +2.75%
Latency: 169204114 -> 169533989 (+0.19%); split: -0.01%, +0.21%
InvThroughput: 88091947 -> 88185872 (+0.11%); split: -0.01%, +0.11%
VClause: 280519 -> 280318 (-0.07%); split: -0.18%, +0.10%
SClause: 343474 -> 344686 (+0.35%); split: -0.32%, +0.67%
Copies: 1529440 -> 1530545 (+0.07%); split: -0.30%, +0.38%
Branches: 286849 -> 286856 (+0.00%); split: -0.01%, +0.01%
PreSGPRs: 661815 -> 663239 (+0.22%); split: -0.02%, +0.23%
VALU: 8758472 -> 8759214 (+0.01%); split: -0.01%, +0.01%
SALU: 1775129 -> 1829513 (+3.06%); split: -0.02%, +3.08%

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
a4668733e5 ac/nir: add a pass to fixup SMEM loads with NULL PRT pages
Only global/SSBO SMEM loads are considered because for UBOs the "LOW"
VA will be set in descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
60b406e233 ac/gpu_info: query the PRT workaround control bit from libdrm
libdrm splits the HIGH address space in two equal parts for GPUs that
are affected by the bug in SMEM loads with NULL PRT pages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
978605fd06 ac/gpu_info: add has_smem_with_null_prt_bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00
Samuel Pitoiset
ecfda339ca ac/gpu_info: store more addr space info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38698>
2026-04-30 09:29:43 +00:00