Commit graph

3694 commits

Author SHA1 Message Date
Samuel Pitoiset
755cb6cb75 radv: fix independent sets with dynamic buffers and GPL
If a set layout is missing the driver can't compute the dynamic buffer
start offsets correctly. The only solution is to load these offsets from
an user SGPR.

To avoid adding more complexity, these offsets are re-emitted every
time dynamic buffers are dirty. That shouldn't matter because the
combination of dynamic buffers and independent sets is just super rare.

This fixes new VKCTS coverage
dEQP-VK.pipeline.pipeline_library.graphics_library.independent_sets_random.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39988>
2026-02-24 11:12:14 +00:00
David Rosca
0d7117f0d7 ac/vcn_dec: Fix tier2 dpb array size
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
In some cases, this would incorrectly set higher dpbArraySize
when overwriting already existing dpb slot.
This didn't seem to cause any issues, but the extra slot would
have zero va which was wrong.
Get the actual ref count from codec param, instead of using
cmd->num_refs which always includes current slot. Also add sanity
check that the ref surface was found.

Fixes: 79af03556c ("ac: Add VCN ac_video_dec implementation")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39877>
2026-02-19 12:24:29 +00:00
Priya Hosur
0bfad39f15 ac/nir/ngg: re-enable use of known compile-time GS connectivity
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38075>
2026-02-18 01:29:37 +00:00
Marek Olšák
a2309edb6b ac/nir/meta: properly align sparse buffer clears with 12-byte clear values
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
2026-02-17 14:47:41 +00:00
Marek Olšák
62cce3abcd ac/nir/meta: use the clear/copy compute shader if CP DMA doesn't support sparse
ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and
the driver obeys that.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
2026-02-17 14:47:41 +00:00
Marek Olšák
bbcfab9f4f ac/nir/meta: don't scalarize sparse loads if the address is aligned to load size
This should make copying sparse faster if we get aligned buffer bounds.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841>
2026-02-17 14:47:41 +00:00
Rhys Perry
e4b8ade092 ac/nir,radv,radeonsi: flip branches to avoid waitcnts
fossil-db (navi31):
Totals from 5123 (6.42% of 79825) affected shaders:
Instrs: 12712435 -> 12703672 (-0.07%); split: -0.12%, +0.05%
CodeSize: 67068852 -> 67033244 (-0.05%); split: -0.10%, +0.05%
VGPRs: 363896 -> 363956 (+0.02%)
SpillSGPRs: 5035 -> 5074 (+0.77%); split: -0.83%, +1.61%
Latency: 115048972 -> 111944013 (-2.70%); split: -2.89%, +0.19%
InvThroughput: 19102126 -> 18696069 (-2.13%); split: -2.34%, +0.22%
VClause: 258693 -> 258770 (+0.03%); split: -0.01%, +0.04%
SClause: 346271 -> 346225 (-0.01%); split: -0.02%, +0.00%
Copies: 1040815 -> 1042017 (+0.12%); split: -0.23%, +0.34%
Branches: 332467 -> 332565 (+0.03%); split: -0.04%, +0.07%
PreSGPRs: 304888 -> 304699 (-0.06%); split: -0.10%, +0.04%
PreVGPRs: 296652 -> 296654 (+0.00%)
VALU: 7591803 -> 7594601 (+0.04%); split: -0.01%, +0.05%
SALU: 1454420 -> 1455764 (+0.09%); split: -0.24%, +0.33%
VOPD: 1826 -> 1810 (-0.88%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262>
2026-02-16 19:39:43 +00:00
Marek Olšák
a9df891bc6 nir: allow get_ssbo_size to return a 64-bit result
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743>
2026-02-16 12:59:36 +00:00
Marek Olšák
d1e6a5c1c8 ac: lower load_num_workgroups in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Marek Olšák
1e11e83d1c ac/nir: add ac_nir_lower_intrinsics_to_args_options structure
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Marek Olšák
a9e47751d2 ac: lower load_subgroup_id for ACO in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Marek Olšák
0a9bdcac79 ac: lower load_workgroup_ids for ACO in NIR
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638>
2026-02-13 15:33:19 +00:00
Samuel Pitoiset
1be4ffdff9 ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.

is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39800>
2026-02-13 07:52:29 +01:00
David Rosca
24c74f522c ac/vcn_dec: Make the helper functions static
They are only used in ac_vcn_dec.c now.

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
4d06fb9acd ac: Add UVD ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
9608abb26b ac: Add VCN JPEG ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
79af03556c ac: Add VCN ac_video_dec implementation
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:26 +00:00
David Rosca
b5028e84c8 ac: Add video decode interface
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39627>
2026-02-12 15:38:25 +00:00
Pierre-Eric Pelloux-Prayer
8f7f7a90b7 radeonsi/sqtt: use pipe_aligned_buffer_create to allocate bo
pipe_aligned_buffer_create can allow allocate 4GB but that's large enough
for now.
PIPE_USAGE_STREAM is used for now to keep the 2 BOs in GTT.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39194>
2026-02-12 10:08:43 +00:00
Samuel Pitoiset
f2d7d998a2 radv: track redundant PA_SC_VRS_OVERRIDE_CNTL register writes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39675>
2026-02-06 07:15:10 +00:00
Samuel Pitoiset
cbf0a38fa4 ac,radv,radeonsi: shorten some emit macro names
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
config -> cfg
uconfig -> ucfg
context -> ctx

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39680>
2026-02-04 13:27:49 +00:00
Marek Olšák
edffb2d76d ac: add FMASK codes
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39631>
2026-02-03 17:10:32 +00:00
Marek Olšák
6f36a2be2e ac: unify HTILE codes and encoding
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39631>
2026-02-03 17:10:32 +00:00
Marek Olšák
e0c7c642f4 ac: unify and demystify CMASK clear codes
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39631>
2026-02-03 17:10:32 +00:00
Marek Olšák
6af6197136 ac: unify DCC clear code definitions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39631>
2026-02-03 17:10:30 +00:00
Marek Olšák
85916c8af0 ac/nir: lower buffer image_load to load_buffer_amd in NIR
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:54 +00:00
Marek Olšák
ef3d43085a ac/nir: lower buffer txf to load_buffer_amd in NIR
This also:
- removes the sparse flag (TFE) if it has no uses
- removes trailing unused components (if not sparse) or all contiguous unused
  components before the sparse flag (if sparse)
- lowers 64-bit formatted buffer loads to 32 bits

Everything here could also be used by 64-bit non-buffer image loads
and txf if needed.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:54 +00:00
Marek Olšák
30ee7044bc ac/nir: rename ac_nir_lower_tex -> ac_nir_lower_image_tex
It will lower txf and buffer image loads to load_buffer_amd.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:54 +00:00
Marek Olšák
61bfc298ba ac: set missing dest_type for image_deref_load
required for lowering to load_buffer_amd

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:53 +00:00
Marek Olšák
fbfac92738 ac,radeonsi: add AC_NIR_TEX_BACKEND_FLAG_IS_IMAGE
image_load lowered to tex will use this (descriptor loads only for now)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:53 +00:00
Marek Olšák
44bc1e6bf4 nir: add dest_type to load_buffer_amd
for lowering the result to 16 bits

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39474>
2026-02-02 17:56:52 +00:00
Wang Ruitang
e11c04c0cc amd/common/virtio: use device fd to init sync provider
Use fd after dup instead of the one before dup to avoid
drm_syncobj_find failed in guest kernel when dev is found in
dev_list.

When dev is not found in dev_list, it uses device fd which is
duplicated, to init sync provider. And when it's found, the same
device fd should be used. Otherwise, it would caused inconsistency
and failures like in the Android domU CTS test where the guest
kernel attempts to locate a syncobj. This occurs because
vdrm_device_connect and VIRTGPU_EXECBUFFER ioctl use fd after dup
while util_sync_provider_drm uses the one before dup.

The fix has been validated with the CtsSdkSandboxWebkitTestCases in
Android domU, and the previously failing test cases no longer occur.

Signed-off-by: Ruitang.Wang@amd.com
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39520>
2026-01-27 08:24:35 +00:00
David Rosca
62f07b8c63 radeonsi/vcn: Add low latency decode debug option
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to the low latency option for encode, this reduces latency
of decoding at the cost of increased power usage.

Can be enabled with AMD_DEBUG=lowlatencydec

Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39450>
2026-01-26 15:00:06 +00:00
Georg Lehmann
809fb0fba3 ac/nir/lower_ps_late: emit scalar f2f16_rtz for when one half of a packed export is undef
Foz-DB Navi48:
Totals from 7200 (8.74% of 82405) affected shaders:
Instrs: 9056391 -> 9048177 (-0.09%); split: -0.09%, +0.00%
CodeSize: 48681288 -> 48640684 (-0.08%); split: -0.09%, +0.00%
VGPRs: 413088 -> 413784 (+0.17%)
Latency: 76340711 -> 76320080 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 12692959 -> 12684618 (-0.07%); split: -0.07%, +0.00%
VClause: 148823 -> 148821 (-0.00%)
Copies: 601739 -> 601874 (+0.02%); split: -0.01%, +0.03%
VALU: 5213356 -> 5207253 (-0.12%); split: -0.12%, +0.00%
SALU: 1160815 -> 1160817 (+0.00%); split: -0.00%, +0.00%
VOPD: 79520 -> 79444 (-0.10%); split: +0.09%, -0.18%

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:23 +00:00
Georg Lehmann
8c895c5c61 ac/nir/lower_ps_late: CSE partial packed exports
Foz-DB Navi48:
Totals from 425 (0.52% of 82405) affected shaders:
Instrs: 1110029 -> 1109658 (-0.03%); split: -0.03%, +0.00%
CodeSize: 6135272 -> 6133848 (-0.02%); split: -0.02%, +0.00%
VGPRs: 29856 -> 29844 (-0.04%)
Latency: 10258411 -> 10258043 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1898177 -> 1897661 (-0.03%)
Copies: 88221 -> 88173 (-0.05%)
VALU: 575276 -> 574894 (-0.07%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39412>
2026-01-26 10:54:22 +00:00
Samuel Pitoiset
c91ed27582 radv: use the SQTT enable bit for PKT3_DISPATCH_TASKMESH_INDIRECT_MULTI_ACE
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>
2026-01-26 08:10:53 +00:00
Samuel Pitoiset
c7da19e2bf radv: use the SQTT enable bit for PKT3_DRAW_{INDEX}_INDIRECT_MULTI
This reports more info in RGP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39425>
2026-01-26 08:10:52 +00:00
Marek Olšák
ebeb904c95 ac,radeonsi: set optimal COMPUTE_DISPATCH_INTERLEAVE for buffer clears/copies
Small buffer clears are a bit faster now.

The numbers were tuned specifically for this compute shader.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>
2026-01-22 22:28:39 +00:00
Marek Olšák
a5e1d31dad ac/nir/meta: tune 12B clear buffer performance for gfx12
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>
2026-01-22 22:28:39 +00:00
Marek Olšák
9257cf04a1 ac/nir/meta: tune image clear & copy performance for gfx12
Compute shaders are the fastest for all copies and some clears.
Note that this is a very different compute shader than the one in RADV.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39290>
2026-01-22 22:28:38 +00:00
jaap aarts
8f7941f92d radv/sqtt: Prevent concurrent submit when sqtt is enabled
cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39090>
2026-01-21 18:55:56 +00:00
Timur Kristóf
87a8d19b51 ac/gpu_info: Remove FIXME from regalloc hang description
This is now implemented.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
2026-01-21 17:24:57 +00:00
Samuel Pitoiset
de64c7238a ac/nir: fix computing cube derivatives when the major axis is negative
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This corresponds to the face 1.0, 3.0 or 5.0.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39303>
2026-01-21 07:12:34 +00:00
Natalie Vock
a03e9287c3 radv/rt: Compile ahit/isec shaders to asm
We can express any-hit/intersection shaders as functions, too.
Any-hit/Intersection shaders need the usual parameters like launch
IDs/descriptor data/ray properties, origin, direction/etc., but also
some special parameters related to traversal state. Any-hit/intersection
shaders need to return whether the hit was accepted and/or traversal
should be terminated, as well as the intersection T value (for
intersection shaders). Both any-hit and intersection shaders also need
to be passed hit attributes via parameters. Closest-Hit shaders need
those too, but we pass them out-of-band via LDS. LDS is used for the
traversal stack when any-hit/intersection shaders, so we need to pass
them via parameters.

Hit attributes are similar to ray payloads in the sense that they're
dynamically sized depending on how much space the application uses.
However, unlike ray payloads, hit attribute sizes have a strict upper
bound of 8 dwords. To make managing parameters easier, we put all hit
attributes in a single vector parameter with 0-8 components. This
prevents having a function with two sets of arbitrary numbers of
parameters.

This commit sets up ahit/isec function signatures and implements
lowering for ahit/isec-specific intrinsics in the context of these
functions. Subsequent commits will merely have to call into these
functions to execute a separate-compiled any-hit/intersection shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39314>
2026-01-20 21:49:55 +00:00
Georg Lehmann
711598982a ac/nir,radv: remove ac_nir_opt_pack_half
Foz-DB Navi21:
Totals from 2937 (3.01% of 97591) affected shaders:
Instrs: 1908695 -> 1908291 (-0.02%); split: -0.02%, +0.00%
CodeSize: 10232148 -> 10229224 (-0.03%); split: -0.03%, +0.01%
VGPRs: 142168 -> 142080 (-0.06%)
Latency: 8052895 -> 8052622 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 2550330 -> 2549602 (-0.03%); split: -0.03%, +0.01%
VClause: 32601 -> 32603 (+0.01%); split: -0.01%, +0.02%
Copies: 118570 -> 118587 (+0.01%); split: -0.04%, +0.05%
PreVGPRs: 110090 -> 110082 (-0.01%)
VALU: 1468422 -> 1468043 (-0.03%); split: -0.03%, +0.00%
SALU: 173858 -> 173828 (-0.02%)

Foz-DB Navi48:
Totals from 4196 (4.30% of 97637) affected shaders:
MaxWaves: 118678 -> 118680 (+0.00%); split: +0.01%, -0.01%
Instrs: 3627604 -> 3624093 (-0.10%); split: -0.10%, +0.00%
CodeSize: 18956684 -> 18939824 (-0.09%); split: -0.09%, +0.01%
VGPRs: 225624 -> 225060 (-0.25%); split: -0.26%, +0.01%
Latency: 11856204 -> 11857280 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 2388584 -> 2389178 (+0.02%); split: -0.01%, +0.03%
VClause: 50409 -> 50410 (+0.00%)
SClause: 64701 -> 64699 (-0.00%)
Copies: 208353 -> 207522 (-0.40%); split: -0.43%, +0.03%
PreVGPRs: 161314 -> 161306 (-0.00%)
VALU: 2345604 -> 2345172 (-0.02%); split: -0.02%, +0.00%
SALU: 391466 -> 388723 (-0.70%)
VOPD: 1788 -> 1806 (+1.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
Marek Olšák
482c410f41 ac: remove never enabled gfx12 HiS
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39260>
2026-01-19 16:58:17 +00:00
Samuel Pitoiset
43eff522e9 ac/debug: add a function that dumps texture descriptors
In a human readable way. Useful for debugging.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39285>
2026-01-16 11:35:34 +00:00
Emma Anholt
ed8676dc28 nir: Rename the unit_test_*_amd intrinics to be un-vendored.
We'll reuse these from the nir_opt_algebraic_pattern_test.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:37 +00:00
Samuel Pitoiset
ae34627e54 ac/cmdbuf: disable ENABLE_PING_PONG_BIN_ORDER on GFX11.5
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Might be a hardware bug.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14240
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39315>
2026-01-15 17:55:26 +00:00
Aitor Camacho
fcf53988c4 nir/opt_varyings: Support implementations that cannot compact 16-bits
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Add nir_io_compact_to_higher_16 flag so that the pass knows if it can
compact 16-bit varyings into the higher 16 bits of a 32-bit varying.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38994>
2026-01-14 20:44:41 +00:00