Commit graph

221242 commits

Author SHA1 Message Date
Dhruv Mark Collins
3d6fab0404 fd/pps: Allocate performance counters from high-to-low
The UMD will be switching to allocating counters from low-to-high,
so to avoid the chances of conflict with this new policy the PPS
driver now allocates the other way around. Additionally, this will
future proof it for the MSM-DRM uAPI for performance counters which
will similarly allocate from high-to-low.

Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
(cherry picked from commit 24849eef9f)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Dhruv Mark Collins
920a848027 tu/autotune: Fail gracefully when CP counters are unavailable
When preemption optimization is supported then the necessary CP
counters being missing causes a device initialization error which
is unnecessary as support can simply be disabled instead to allow
for a more graceful fail. This also fixes A8XX which doesn't have
performance counters hooked up yet.

Cc: mesa-stable
Signed-off-by: Dhruv Mark Collins <mark@igalia.com>
Assisted-by: OpenAI Codex (GPT-5.4)
(cherry picked from commit a5ec9b7892)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Zan Dobersek
c63da3365d tu: only support userspace-managed perfcounters on a7xx and earlier
Future kernel API for perfcounter management will likely be required for
a8xx and onwards. For a7xx and earlier, cmdstream-based selector and
counter register management is still supported.

Cc: mesa-stable
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
(cherry picked from commit c2708afbc7)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Erik Faye-Lund
0d6e04debe panvk: drop out-of-date TODO
We already did this, so let's drop this TODO.

Fixes: d36e6af329 ("panvk: Bump the max image size on v11+")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
(cherry picked from commit f137207108)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Michel Dänzer
705dabbc26 winsys/amdgpu: Use render node only as fallback
If ac_drm_device_initialize returns -EACCES for the fd passed in.

A render node file description can't have DRM master status, which means
AMDGPU_CTX_PRIORITY_HIGH can't work without CAP_SYS_NICE (which
generally only the root user has).

Fixes: 8f30e90fc1 ("winsys/amdgpu: Prefer render node FD for ac_drm_device_initialize")
(cherry picked from commit 5cc3264b53)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Samuel Pitoiset
3740d70fc5 radv: lower SHADER_RECORD_INDEX to non-uniform
This fixes an issue with RADV and NVIDIA-RTX/Donut-Samples with heap
support in vkd3d-proton.

Backport-to: 26.1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 477c44ba93)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Samuel Pitoiset
010072b5bc vulkan: add an option to lower SHADER_RECORD_INDEX to non-uniform
Applications are required to set NonUniform if the resource is arrayed,
but with VK_DESCRIPTOR_MAPPING_SOURCE_HEAP_WITH_SHADER_RECORD_INDEX_EXT,
the resource is non-arrayed in the shader. So, it's technically not
required to set it. Although, the offset can vary per-lane and
NonUniform is implicit.

Backport-to: 26.1
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 8e2869fa41)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Connor Abbott
eb7c92b6a2 ir3: Use correct immediate size for constlen calculation
"size" is the allocated size of the array, not the number of immediates
actually used. We could wind up returning a too-large constlen, larger
than 512, and since the binning variant uses the non-binning variant's
constlen as it's max_const we could make binning variants use c512.x and
crash when encoding.

Fixes: 86f3c0c4c2 ("ir3: simplify constlen calculation")
(cherry picked from commit 49d29d4f10)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Connor Abbott
c15245f814 ir3: Don't reset immediate count to 0 after lowering
We need to know the immediate count even after lowering, to compute the
overall const size. Previously we were using the capacity field, but
that's unreliable and won't be available once we switch to a real
dynamic array container instead of (poorly) reinventing one.

Fixes: 86f3c0c4c2 ("ir3: simplify constlen calculation")
(cherry picked from commit 280c64d720)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Karol Herbst
76f4e2078a nak: the MS location comes last in TLD, same spot as depth compare in TEX
Some Max Payne 3 shaders are impacted by this and probably will fix some
issue there. The VK CTS isn't testing this, but it was verified to fix a
real problem by inserting 0 offsets into the instruction and having CTS
tests fail with the old ordering.

Totals from 3 (0.00% of 1163204) affected shaders:
CodeSize: 2496 -> 2736 (+9.62%)
Static cycle count: 732 -> 741 (+1.23%)

Fixes: ad01fbdda0 ("nak: Add a NIR texture lowering pass")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
(cherry picked from commit e09045e26c)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Eric Engestrom
ddb44422f7 .pick_status.json: Update to 806fcc6193
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41104>
2026-04-22 14:34:48 +02:00
Eric Engestrom
0108eba5ed VERSION: bump for 26.1.0-rc1 2026-04-15 15:42:35 +02:00
Alyssa Rosenzweig
3a9ef908ea intel: fuse off Jay in Mesa 26.1
Jay is under heavy development and is not considered released. It is
available in upstream Mesa for developers to hack on but is not part of
the 26.1 release. Add a comment acting like a chicken bit to fuse off the
compiler while minimizing conflicts with backports (which is why we don't remove
Jay wholesale from the release).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
2026-04-15 15:39:49 +02:00
Benjamin Cheng
9182da14a7 radv: Relax linear requirement to VCN1 and prior
With the previous commit ("ac/surface: Filter swizzle modes for VCN"),
only video-compatible swizzle modes will be picked, so we can enable
tiling for VCN2+.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>
2026-04-15 12:48:57 +00:00
Benjamin Cheng
fcaab2b921 ac/surface: Filter swizzle modes for VCN
This will allow compatible swizzle modes to be picked for RADV (radeonsi
filters modifiers when creating video surfaces).

This mirrors the logic from ac_modifier_supports_video, and in
addition ensures that XOR swizzle modes are disabled for image arrays
because VCN does not support slice indices.

Reviewed-by: David Rosca <david.rosca@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40948>
2026-04-15 12:48:57 +00:00
Christian Gmeiner
713cecb1df panvk: Advertise VK_EXT_rgba10x6_formats
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Map X6R10X6G10X6B10X6A10_UNORM to the native R10X6G10X6B10X6A10X6_UNORM
HW format on PAN_ARCH >= 11 where it is supported.

Enable the extension with formatRgba10x6WithoutYCbCrSampler in the
physical device, allowing VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16
to be used as a regular color format without YCbCr sampler conversion.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>
2026-04-15 12:16:53 +00:00
Christian Gmeiner
9f172ba4da util/format, vulkan: Add PIPE_FORMAT_X6R10X6G10X6B10X6A10_UNORM
The format has 4 x 16-bit words with 10-bit unorm values in bits [15:6]
and 6 padding bits in [5:0]. Since this requires 8 channel slots but the
format system only supports 4, use layout "other" with hand-written
pack/unpack conversion functions.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>
2026-04-15 12:16:53 +00:00
Christian Gmeiner
81b8113a9f radv: Don't advertise any features for R10X6G10X6B10X6A10X6_UNORM_4PACK16
The recent addition of PIPE_FORMAT_X6R10X6G10X6B10X6A10_UNORM caused
vk_format_to_pipe_format() to map VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16
to a real pipe format, which made radv_physical_device_get_format_properties()
advertise BLIT_SRC/SAMPLED_IMAGE for it. The hardware samples the data as plain
R16G16B16A16 UNORM, which doesn't match the 10-bit UNORM semantics the spec
(and CTS) require, so dEQP-VK.api.copy_and_blit.core.blit_image.* tests with
r10x6g10x6b10x6a10x6_unorm_4pack16 as the source started failing on gfx1201.

Override the mapping to PIPE_FORMAT_NONE so RADV reports zero format features,
matching the behavior prior to the new pipe format being added. Proper support
can be restored once VK_EXT_rgba10x6_formats is implemented.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40653>
2026-04-15 12:16:53 +00:00
Samuel Pitoiset
dc0d6100f9 radv/ci: document a descriptor heap failure
Test bug.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>
2026-04-15 11:22:10 +00:00
Samuel Pitoiset
6462055e38 radv/ci: fix setting RADV_EXPERIMENTAL=heap
It's overwritten if manually set per jobs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>
2026-04-15 11:22:10 +00:00
Samuel Pitoiset
282bb0d11b radv/ci: update flakes of VKCTS jobs
Collected after 25 runs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>
2026-04-15 11:22:10 +00:00
Samuel Pitoiset
3af1f8dc0a radv/ci: remove a hack for the number of deqp instances with RENOIR
Latest VKCTS main uses way less memory than before, and increasing the
number of deqp instances to 16 seems to work just fine now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>
2026-04-15 11:22:10 +00:00
Samuel Pitoiset
3777f7fe3b ci: uprev VKCTS main to 634a3fc62d82c34de68c3b1add25e6b7f5777524
RADV is the only driver using VKCTS main.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40918>
2026-04-15 11:22:10 +00:00
Icenowy Zheng
b8c5e47949 pvr: propagate get_vis_results flag from secondary cmdbuf gfx jobs
When recording secondary command buffers with occlusion queries, the
get_vis_results flag could be set for some graphics sub_cmd's job.

Propagate this flag from secondary command buffer graphics sub_cmds to
primary command buffer sub_cmds to ensure occlusion queries in secondary
command buffers being correctly executed.

Fixes: 5c34be4340 ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
2026-04-15 11:03:06 +00:00
Icenowy Zheng
87f4122e11 pvr: fix the code copying query_indices to sub_query_indices
There's a dynarray field inside gfx sub_cmd called sub_query_indices,
which will contain pending query indices for gfx sub_cmds inside a
secondary command buffer. It's expected that when finishing such gfx
sub_cmds, the content of query_indices is going to be moved there.
However the `util_dynarray_append_dynarray()` call is called with wrong
parameter order, thus it's copying sub_query_indices to query_indices
and then immediately wiping query_indices, forgetting all query indices
in such case.

Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries
in secondary command buffers.

Fixes: 8c506c4b03 ("pvr: Use util_dynarray_append_dynarray()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
2026-04-15 11:03:06 +00:00
Icenowy Zheng
36f34a72c1 pvr: finalize query_indices array after ending last sub_cmd
The last sub_cmd in the command buffer could be a graphics one, and when
ending a graphics sub_cmd, the query_indices array will be checked to
know whether a occlusion query starts during this graphics sub_cmd.

Finalize the query_indices array after ending the last sub_cmd,
otherwise the check for query initiation may have a false negative
result.

Fixes the `dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.
random.seed6` test case.

Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
2026-04-15 11:03:05 +00:00
Lorenzo Rossi
7ccca9f972 pan/compiler: Document compilation pipeline expectations
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:19 +00:00
Lorenzo Rossi
43ba475d4c panfrost,panvk: Move lower_texture_early inside preproc
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:19 +00:00
Lorenzo Rossi
e24228e327 panfrost,panvk: Move lower_texture_late inside postproc
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:19 +00:00
Lorenzo Rossi
d096a8e962 panfrost: Move lower_res_indices before postproc
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:18 +00:00
Lorenzo Rossi
eafc822dbd panfrost,panvk: Move postprocess near shader_compile
Ideally there should be only sysval lowering in the middle.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:18 +00:00
Lorenzo Rossi
83fd45aa5a pan/compiler: Fix noperspective int varyings
Ints and floats do not need to match between VS and FS, some crazy
shaders might write an uint from the VS and read a noperspective float
from the FS.  There will be new tests in the conformance tests that
check that too shortly.

Is this a performance regression? yes.
Can we fix this easily? No, we'll need dynamic prolog/epilog linking.

Since maybe_noperspective is almost useless after this fix, the whole
logic has been removed

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:18 +00:00
Lorenzo Rossi
6e67f2a996 pan/compiler: Don't crash nopersp if pos is undefined
VS does not need to write the position, it can also leave it as
undefined.  We agree that there isn't much sense in noperspective
varyings with undefined perspective, but we still do not want to crash.
This does lead to some real crashes if we mistake some int varying to
noperspective (see next commit).

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40844>
2026-04-15 10:32:18 +00:00
Georg Lehmann
5607417f57 radv: remove radv_nir_lower_viewport_to_zero
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
nir_opt_varyings does this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:17:01 +00:00
Georg Lehmann
c842186e39 radv: remove lower array vars to elem
No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:17:01 +00:00
Georg Lehmann
c7a953809a radv: don't lower io vars to scalar
Done later on lowered IO.

Foz-DB Navi48:
Totals from 4 (0.00% of 205045) affected shaders:
Instrs: 1434 -> 1418 (-1.12%)
CodeSize: 7912 -> 7848 (-0.81%)
Latency: 5688 -> 5646 (-0.74%)
InvThroughput: 642 -> 646 (+0.62%)
PreVGPRs: 104 -> 100 (-3.85%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:17:01 +00:00
Georg Lehmann
b2e59d80b0 radv: do not shrink vectors when lowering IO vars to scalar
I wanted to move this later, but looking at the stats, this pass
actually hurts here because it shrinks smem loads that would be better
vectorized. So just remove it.

Foz-DB Navi48:
Totals from 2268 (1.11% of 205045) affected shaders:
Instrs: 1573491 -> 1569535 (-0.25%); split: -0.35%, +0.10%
CodeSize: 8399092 -> 8378632 (-0.24%); split: -0.39%, +0.14%
SpillSGPRs: 312 -> 355 (+13.78%)
Latency: 12223349 -> 12225239 (+0.02%); split: -0.20%, +0.21%
InvThroughput: 2235646 -> 2236174 (+0.02%); split: -0.15%, +0.17%
VClause: 26526 -> 26549 (+0.09%); split: -0.02%, +0.11%
SClause: 34974 -> 34053 (-2.63%); split: -3.01%, +0.37%
Copies: 114417 -> 115513 (+0.96%); split: -0.33%, +1.28%
Branches: 28085 -> 26899 (-4.22%); split: -4.24%, +0.02%
PreSGPRs: 98109 -> 99024 (+0.93%); split: -0.10%, +1.03%
PreVGPRs: 78224 -> 78226 (+0.00%)
VALU: 929067 -> 928588 (-0.05%); split: -0.08%, +0.03%
SALU: 204756 -> 206936 (+1.06%); split: -0.19%, +1.26%
SMEM: 67181 -> 64687 (-3.71%); split: -3.83%, +0.11%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:17:00 +00:00
Georg Lehmann
f001afad23 radv: remove some unneeded passes from radv_nir_lower_io_vars_to_scalar
No Foz-DB changes on Navi48.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:16:59 +00:00
Georg Lehmann
e0883d107a radv: do not remove dead variables
No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:16:59 +00:00
Georg Lehmann
8c98ed9e85 radv: do not vectorize io variables
No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:16:59 +00:00
Georg Lehmann
d14fc27f44 radv: do not vectorize fs out variables
This is scalarized later anyway.

No Foz-DB changes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:16:58 +00:00
Georg Lehmann
5d8c817fd7 radv: lower lowered io to scalar
We already did this for everything except
fragment shader outputs with epilogs.

If we move it a bit earlier, we can stop lowering IO
variables to scalar.

Foz-DB Navi48:
Totals from 1001 (0.49% of 205045) affected shaders:
MaxWaves: 31252 -> 31256 (+0.01%)
Instrs: 372258 -> 372036 (-0.06%); split: -0.14%, +0.08%
CodeSize: 1999064 -> 1997836 (-0.06%); split: -0.13%, +0.06%
VGPRs: 39096 -> 39072 (-0.06%)
Latency: 1235558 -> 1235435 (-0.01%); split: -0.08%, +0.07%
InvThroughput: 213845 -> 213875 (+0.01%); split: -0.06%, +0.07%
VClause: 5840 -> 5838 (-0.03%)
SClause: 10964 -> 10969 (+0.05%); split: -0.03%, +0.07%
Copies: 21469 -> 21545 (+0.35%); split: -0.42%, +0.78%
Branches: 5326 -> 5324 (-0.04%)
PreSGPRs: 34214 -> 34206 (-0.02%); split: -0.03%, +0.01%
PreVGPRs: 21931 -> 22001 (+0.32%); split: -0.06%, +0.38%
VALU: 212386 -> 212418 (+0.02%); split: -0.07%, +0.09%
SALU: 50409 -> 50378 (-0.06%); split: -0.07%, +0.01%
VMEM: 8352 -> 8331 (-0.25%)
SMEM: 17966 -> 17963 (-0.02%)

This is mostly RA noise in GPL FS shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40955>
2026-04-15 09:16:58 +00:00
Natalie Vock
1f998b38f4 radv: Run nir_opt_deref after first optimization loop
Only at this point are loads from uninitialized variables lowered to
undef and copy-propagated so that nir_opt_deref's cast-of-undef
optimization works properly.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40799>
2026-04-15 08:42:12 +00:00
Natalie Vock
57f796752d nir/deref: Elide loads/stores from deref cast of undef
These can never be meaningful. DOOM: The Dark Ages also relies on this.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40799>
2026-04-15 08:42:12 +00:00
Job Noorman
118b975ce7 ir3: use ldg.k load size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
ldg.k can copy up to 256 vec4s at once but we currently emit one ldg.k
per vec4. Fix this by using the load size field of ldg.k.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>
2026-04-15 07:58:01 +00:00
Job Noorman
a1272cabe0 ir3/isa: fix load size encoding for ldg.k
The load size field starts at b23 instead of b24 and is 8 bits in size.
b23 makes the blob disassembler select between interpreting the load
size as an immediate or a GPR. However, using a GPR doesn't work as the
HW still seems to interpret the field as an immediate. We copy the
blob's behavior here for consistency.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>
2026-04-15 07:58:01 +00:00
Job Noorman
e6529b54c0 ir3: add support for the ldg.k a1.x addressing mode
We assumed a1.x addressing doesn't work. However, it turns out it
actually does work but instead of taking the offset's hight bits from
a1.x and adding an immediate to the low bits, the full offset is stored
in a1.x and the offset is ignored.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>
2026-04-15 07:58:01 +00:00
Job Noorman
bf167ca73b ir3: allow shared address src for ldg.k
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40947>
2026-04-15 07:58:00 +00:00
Christoph Pillmayer
3427b20b71 pan/bi: Fix MEMMOV size calculation
Doing stores first, loads second doesn't work because there can be
chains of store, load, store... .
Use a fixed point approach instead to calculate sizes for all
destinations.

Fixes: 2fd5b8a391 ("pan/bi: Account for MEMMOV in bi_record_sizes")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40915>
2026-04-15 07:22:35 +00:00
Job Noorman
ce810bb19b ir3/parser: add @constlen header
Constlen cannot always be derived from the usage of @const et al. For
example when using ldc.k/ldg.k. Add a @constlen header to explicitly set
it.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40940>
2026-04-15 06:46:10 +00:00