David Rosca
0d7117f0d7
ac/vcn_dec: Fix tier2 dpb array size
...
In some cases, this would incorrectly set a higher dpbArraySize
when overwriting an already existing dpb slot.
This didn't seem to cause any issues, but the extra slot would
have a zero va, which was wrong.
Get the actual ref count from the codec param instead of using
cmd->num_refs, which always includes the current slot. Also add a
sanity check that the ref surface was found.
Fixes: 79af03556c ("ac: Add VCN ac_video_dec implementation")
Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39877 >
2026-02-19 12:24:29 +00:00
Samuel Pitoiset
8b5296b01c
radv: simplify buffer-to-image and image-to-image operations for 96-bit formats
...
It's possible to use the existing shaders with a small tweak. This
removes a bunch of code in meta.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39935 >
2026-02-19 07:12:47 +00:00
Natalie Vock
47e4a68a83
radv: Initialize nir_lower_io_to_scalar progress variable
...
The NIR_PASS macro only overwrites this when the pass actually makes
progress. If the pass doesn't make progress, the variable stays
uninitialized.
Clang correctly spots this and warns about it.
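The pattern described above can be sketched in isolation. This is a minimal illustration, not the Mesa code: RUN_PASS is a hypothetical stand-in that, like the behaviour described for NIR_PASS, writes the progress variable only when the pass reports a change, and lower_if_odd is an invented pass.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a pass-runner macro (an assumption, not the
 * real NIR_PASS): it sets `progress` only when the pass reports a change
 * and never clears it. */
#define RUN_PASS(progress, pass, ...) \
   do {                               \
      if (pass(__VA_ARGS__))          \
         (progress) = true;           \
   } while (0)

/* Invented pass: reports progress only for odd inputs. */
static bool
lower_if_odd(int value)
{
   return (value & 1) != 0;
}

static bool
run_lowering(int value)
{
   /* Without the "= false" initializer, this variable would be read
    * uninitialized whenever the pass makes no progress. */
   bool progress = false;
   RUN_PASS(progress, lower_if_odd, value);
   return progress;
}
```

With the initializer in place, run_lowering() returns a well-defined false when the pass does nothing.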
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39968 >
2026-02-18 21:44:49 +00:00
Natalie Vock
59a397793e
radv/rt: Only use ds_bvh_stack_rtn if the stack base is possible to encode
...
The hardware only provides 13 bits for encoding the stack base (in
dwords). That translates to the stack base being required to be below
8192 dwords, or 32kB. It's possible to exceed this - LDS is 64kB after
all. Add an explicit check to make sure we don't end up with offsets
that overflow the hw's address fields. This fixes Metro Exodus Enhanced
Edition, which was using ray queries in a 1024-thread sized workgroup,
resulting in exactly 64kB of LDS being required for the stack.
This check isn't required for RT pipelines as we always use 32 or 64
wide workgroups with no other LDS used, so it's impossible to reach this
stack base limit.
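The arithmetic behind the limit can be written out as a small check. This is a sketch under the numbers stated above (13 bits, dword units); the function and macro names are hypothetical and do not come from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* 13 bits are available for the stack base, expressed in dwords. */
#define STACK_BASE_BITS 13u
#define STACK_BASE_LIMIT_DWORDS (1u << STACK_BASE_BITS) /* 8192 dwords = 32kB */

/* Hypothetical helper: true if a stack base given in bytes still fits
 * into the 13-bit dword field. */
static bool
stack_base_is_encodable(uint32_t stack_base_bytes)
{
   uint32_t stack_base_dwords = stack_base_bytes / 4;
   return stack_base_dwords < STACK_BASE_LIMIT_DWORDS;
}
```

A base of 32764 bytes (8191 dwords) is the largest dword-aligned value that passes; 32768 bytes already overflows the field.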
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39691 >
2026-02-18 19:12:18 +00:00
Konstantin Seurer
ae84d41d48
radv/meta: Rework saving/restoring state
...
The current approach of explicitly saving/restoring some states is
unnecessarily complicated and inefficient. For example, some meta OPs
that use memory fills/copies will have nested save/restores. This patch
is the first step towards avoiding unnecessary state re-emits around
meta OPs.
The changes are:
- Move radv_meta_saved_state to radv_cmd_buffer::state
- Add radv_meta_begin/end helpers that initialize radv_meta_saved_state
and restore states used by the meta OP
- Remove all explicit saves/restores, use the new helpers
radv_meta_begin/end is called inside the entrypoint and not some nested
helper function which means that state is only restored once per meta
OP.
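The idea of restoring once per meta OP can be illustrated with a toy model. Everything below is hypothetical (the toy_* names, the state bits, the counter); it only mirrors the structure described above, where begin/end sit in the entrypoint and nested helpers merely record which states they touch.

```c
#include <stdint.h>

/* Invented state bits a meta operation may clobber. */
enum {
   TOY_STATE_PIPELINE = 1u << 0,
   TOY_STATE_PUSH_CONSTANTS = 1u << 1,
};

/* Toy command buffer: tracks what the current meta op touched and how
 * often state had to be restored. */
struct toy_cmd_buffer {
   uint32_t touched_by_meta;
   unsigned restore_count;
};

static void
toy_meta_begin(struct toy_cmd_buffer *cmd)
{
   cmd->touched_by_meta = 0;
}

static void
toy_meta_set_state(struct toy_cmd_buffer *cmd, uint32_t bits)
{
   cmd->touched_by_meta |= bits;
}

static void
toy_meta_end(struct toy_cmd_buffer *cmd)
{
   /* Restore only if something was actually touched. */
   if (cmd->touched_by_meta)
      cmd->restore_count++;
   cmd->touched_by_meta = 0;
}

/* Nested helper: sets state but does no save/restore of its own. */
static void
toy_fill(struct toy_cmd_buffer *cmd)
{
   toy_meta_set_state(cmd, TOY_STATE_PIPELINE | TOY_STATE_PUSH_CONSTANTS);
}

/* Entrypoint: the only place that calls begin/end. */
static void
toy_copy_entrypoint(struct toy_cmd_buffer *cmd)
{
   toy_meta_begin(cmd);
   toy_fill(cmd);
   toy_fill(cmd);
   toy_meta_end(cmd);
}
```

Calling toy_copy_entrypoint() on a zero-initialized toy_cmd_buffer leaves restore_count at 1 even though the helper ran twice.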
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774 >
2026-02-18 09:37:55 +01:00
Konstantin Seurer
d3cb2978b8
radv/meta: Add and use helpers for setting state
...
It's less code and allows the next commit to track which states a meta
command uses.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774 >
2026-02-18 09:37:30 +01:00
Samuel Pitoiset
090b67a163
vulkan/runtime: add support for ETC2 emulation with copy_memory_indirect
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39908 >
2026-02-18 07:04:43 +00:00
Priya Hosur
0bfad39f15
ac/nir/ngg: re-enable use of known compile-time GS connectivity
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38075 >
2026-02-18 01:29:37 +00:00
Marek Olšák
a2309edb6b
ac/nir/meta: properly align sparse buffer clears with 12-byte clear values
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Marek Olšák
62cce3abcd
ac/nir/meta: use the clear/copy compute shader if CP DMA doesn't support sparse
...
ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and
the driver obeys that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Marek Olšák
bbcfab9f4f
ac/nir/meta: don't scalarize sparse loads if the address is aligned to load size
...
This should make copying sparse faster if we get aligned buffer bounds.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Samuel Pitoiset
95c4d8d726
radv/meta: rework get_image_stride_for_96bit() and make it non-static
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
c1a507bf42
radv/meta: rename r32g32b32 to 96bit
...
It's shorter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
29ce18cb6f
radv/meta: rename some variables for btoi 96-bit shader
...
To match push constants.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
9c90622c94
radv: remove a redundant check in radv_image_is_renderable()
...
RADEON_SURF_NO_RENDER_TARGET is already set for such an image.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
61b20e726f
radv/ci: mark more WSI tests as flakes on NAVI21
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
7fceeff970
radv/ci: mark more WSI flakes for NAVI21
...
Fixes: c332ee5dd6 ("ci/radv: Add some flakes I hit while testing WSI.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39929 >
2026-02-17 08:04:03 +01:00
Rhys Perry
e4b8ade092
ac/nir,radv,radeonsi: flip branches to avoid waitcnts
...
fossil-db (navi31):
Totals from 5123 (6.42% of 79825) affected shaders:
Instrs: 12712435 -> 12703672 (-0.07%); split: -0.12%, +0.05%
CodeSize: 67068852 -> 67033244 (-0.05%); split: -0.10%, +0.05%
VGPRs: 363896 -> 363956 (+0.02%)
SpillSGPRs: 5035 -> 5074 (+0.77%); split: -0.83%, +1.61%
Latency: 115048972 -> 111944013 (-2.70%); split: -2.89%, +0.19%
InvThroughput: 19102126 -> 18696069 (-2.13%); split: -2.34%, +0.22%
VClause: 258693 -> 258770 (+0.03%); split: -0.01%, +0.04%
SClause: 346271 -> 346225 (-0.01%); split: -0.02%, +0.00%
Copies: 1040815 -> 1042017 (+0.12%); split: -0.23%, +0.34%
Branches: 332467 -> 332565 (+0.03%); split: -0.04%, +0.07%
PreSGPRs: 304888 -> 304699 (-0.06%); split: -0.10%, +0.04%
PreVGPRs: 296652 -> 296654 (+0.00%)
VALU: 7591803 -> 7594601 (+0.04%); split: -0.01%, +0.05%
SALU: 1454420 -> 1455764 (+0.09%); split: -0.24%, +0.33%
VOPD: 1826 -> 1810 (-0.88%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
f81aaee7f1
aco/ra: create vectors for affinities of split definitions
...
For example:
   a = ...
   b = ...
   if {
      c, d = split
   }
   phi(a, c)
   phi(b, d)
This patch will allocate 'a' and 'b' as a vector.
fossil-db (navi31):
Totals from 2556 (3.20% of 79825) affected shaders:
MaxWaves: 59957 -> 59955 (-0.00%)
Instrs: 9170941 -> 9154954 (-0.17%); split: -0.19%, +0.02%
CodeSize: 48245956 -> 48182620 (-0.13%); split: -0.15%, +0.02%
VGPRs: 189372 -> 189900 (+0.28%); split: -0.04%, +0.32%
Latency: 85469322 -> 85262360 (-0.24%); split: -0.32%, +0.08%
InvThroughput: 14515911 -> 14486970 (-0.20%); split: -0.27%, +0.07%
VClause: 197980 -> 197959 (-0.01%); split: -0.02%, +0.01%
Copies: 787838 -> 774288 (-1.72%); split: -1.91%, +0.19%
Branches: 271810 -> 271799 (-0.00%); split: -0.01%, +0.01%
VALU: 5331813 -> 5318566 (-0.25%); split: -0.28%, +0.03%
SALU: 1133559 -> 1133054 (-0.04%); split: -0.05%, +0.01%
VOPD: 2435 -> 2418 (-0.70%); split: +0.12%, -0.82%
fossil-db (navi21):
Totals from 37513 (46.99% of 79825) affected shaders:
Instrs: 26734825 -> 26681225 (-0.20%); split: -0.23%, +0.03%
CodeSize: 141353284 -> 141144360 (-0.15%); split: -0.17%, +0.02%
VGPRs: 1556760 -> 1556384 (-0.02%); split: -0.21%, +0.18%
Latency: 146201548 -> 146156473 (-0.03%); split: -0.20%, +0.17%
InvThroughput: 33921803 -> 33867398 (-0.16%); split: -0.23%, +0.07%
VClause: 502263 -> 502209 (-0.01%); split: -0.27%, +0.26%
SClause: 593142 -> 593155 (+0.00%); split: -0.00%, +0.00%
Copies: 2600995 -> 2551257 (-1.91%); split: -2.16%, +0.25%
Branches: 857910 -> 857787 (-0.01%); split: -0.03%, +0.02%
VALU: 15674532 -> 15625013 (-0.32%); split: -0.35%, +0.04%
SALU: 4635548 -> 4634680 (-0.02%); split: -0.04%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
86f0195f5c
aco/ra: prefer phi operands which don't create waitcnt
...
fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 343443 -> 343384 (-0.02%); split: -0.10%, +0.09%
CodeSize: 1792948 -> 1792668 (-0.02%); split: -0.10%, +0.08%
Latency: 2656294 -> 2656490 (+0.01%); split: -0.02%, +0.02%
InvThroughput: 517696 -> 517691 (-0.00%); split: -0.01%, +0.01%
SClause: 9213 -> 9215 (+0.02%); split: -0.01%, +0.03%
Copies: 39138 -> 39089 (-0.13%); split: -0.84%, +0.71%
Branches: 10863 -> 10872 (+0.08%); split: -0.05%, +0.13%
SALU: 49185 -> 49136 (-0.10%); split: -0.67%, +0.57%
fossil-db (navi21):
Totals from 34490 (43.21% of 79825) affected shaders:
Instrs: 23005853 -> 22956529 (-0.21%); split: -0.25%, +0.04%
CodeSize: 120532004 -> 120341412 (-0.16%); split: -0.19%, +0.03%
VGPRs: 1396928 -> 1397520 (+0.04%); split: -0.07%, +0.11%
Latency: 108740068 -> 108499644 (-0.22%); split: -0.53%, +0.30%
InvThroughput: 25286526 -> 25358695 (+0.29%); split: -0.11%, +0.39%
VClause: 421179 -> 421132 (-0.01%); split: -0.29%, +0.27%
SClause: 446414 -> 446423 (+0.00%); split: -0.00%, +0.00%
Copies: 2242236 -> 2243168 (+0.04%); split: -0.42%, +0.46%
Branches: 724556 -> 724903 (+0.05%); split: -0.02%, +0.07%
VALU: 13321078 -> 13321940 (+0.01%); split: -0.07%, +0.08%
SALU: 4069929 -> 4070580 (+0.02%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
310f588f92
aco/ra: move variables from affinity register to avoid waitcnt
...
If we don't use this affinity register, we're likely to end up moving the
temporary later. If it's a memory instruction destination, that's probably
more expensive than just copying the blocking variables.
fossil-db (navi31):
Totals from 504 (0.63% of 79825) affected shaders:
Instrs: 4108284 -> 4109026 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21226764 -> 21229764 (+0.01%); split: -0.01%, +0.02%
Latency: 26931635 -> 26806989 (-0.46%); split: -0.47%, +0.00%
InvThroughput: 8443520 -> 8439235 (-0.05%); split: -0.06%, +0.01%
VClause: 99209 -> 99314 (+0.11%); split: -0.00%, +0.11%
SClause: 85089 -> 85085 (-0.00%)
Copies: 340323 -> 340993 (+0.20%); split: -0.06%, +0.26%
Branches: 117225 -> 117209 (-0.01%); split: -0.02%, +0.00%
VALU: 2421859 -> 2422529 (+0.03%); split: -0.01%, +0.04%
SALU: 503465 -> 503470 (+0.00%); split: -0.00%, +0.00%
fossil-db (navi21):
Totals from 582 (0.73% of 79825) affected shaders:
Instrs: 3714908 -> 3714990 (+0.00%); split: -0.02%, +0.02%
CodeSize: 19977880 -> 19973076 (-0.02%); split: -0.04%, +0.01%
VGPRs: 40480 -> 40496 (+0.04%)
Latency: 26028895 -> 25772711 (-0.98%); split: -0.99%, +0.00%
InvThroughput: 9827389 -> 9818194 (-0.09%); split: -0.10%, +0.01%
VClause: 103702 -> 103815 (+0.11%); split: -0.02%, +0.13%
SClause: 90861 -> 90857 (-0.00%)
Copies: 335276 -> 335992 (+0.21%); split: -0.09%, +0.30%
Branches: 123912 -> 123897 (-0.01%); split: -0.02%, +0.00%
VALU: 2466032 -> 2466748 (+0.03%); split: -0.01%, +0.04%
SALU: 533658 -> 533667 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
681ec4cba7
aco/ra: track cost of moving variables
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
69bc4efa37
aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW
...
This improves scheduling with one side of a divergent branch writing to a
VGPR using VMEM/DS, and the other writing using VALU. At the merge block,
it will properly consider that the VGPR was written by a VMEM/DS.
fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%
fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
88b6b6db17
aco: only consider cost of memory loads at waitcnt
...
We don't run this code before waitcnt insertion, so this isn't necessary.
This change improves accuracy in these two situations, because the waitcnt
insertion pass is more aware of divergent control flow:
   v0 = valu
   if (divergent) {
      v0 = vmem
   } else {
      use(v0)
   }

   v0 = vmem
   if (divergent) {
      wait vmcnt(0)
   } else {
      wait vmcnt(0)
   }
   use(v0)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
6963c8dd80
radv,aco/gfx11: preserve s2 when NGG_WAVE_ID_EN=1
...
According to the ISA doc, this is needed for hang recovery.
This works by just avoiding putting temporaries in s0-3 unless they're
precolored there.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39720 >
2026-02-16 14:33:58 +00:00
Rhys Perry
f9c11a8e15
radv: add ngg_wave_id_en to radv_shader_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39720 >
2026-02-16 14:33:57 +00:00
Marek Olšák
61a96be494
nir/lower_non_uniform_access: add an option not to lower tex & image queries
...
AMD can do non-uniform queries. The RADV change will be in a separate commit.
NFC for drivers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
a9df891bc6
nir: allow get_ssbo_size to return a 64-bit result
...
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Samuel Pitoiset
47841c1142
radv/meta: remove useless DCC decompressions for image<->buffer
...
It's not needed to decompress DCC when the formats are compatible with
each other; this basically removes all decompressions on GFX11-GFX11.5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39888 >
2026-02-16 07:40:13 +00:00
Emma Anholt
db532eaf00
ci/radv: Enable WSI testing.
...
This gets us coverage of present_timing for KHR_display, which we don't
have on the older CTS used by the other drivers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39701 >
2026-02-13 23:57:14 +00:00
Emma Anholt
c332ee5dd6
ci/radv: Add some flakes I hit while testing WSI.
...
I upgraded some clearly flaky groups of tests in zink to regexes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39701 >
2026-02-13 23:57:14 +00:00
Rhys Perry
b60bff0429
aco: consider 64-bit transcendental normal valu for s_delay_alu
...
https://github.com/llvm/llvm-project/pull/180940
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39851 >
2026-02-13 17:03:34 +00:00
Marek Olšák
9237ca7e46
ac/llvm: remove unused functions
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
d1e6a5c1c8
ac: lower load_num_workgroups in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
1e11e83d1c
ac/nir: add ac_nir_lower_intrinsics_to_args_options structure
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
a9e47751d2
ac: lower load_subgroup_id for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
0a9bdcac79
ac: lower load_workgroup_ids for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Daniel Schürmann
97f095f6e0
aco/lower_branches: Add try_rotate_latch_block() optimization
...
This optimization looks for unconditional back-edges and aims
to rotate the loop in a way that the final block is emitted
before the loop header, essentially turning
   BB1:
      if (cond)
         goto BB3;
   BB2:
      <loop body>
      goto BB1;
   BB3:
      ...
into
      goto BB1;
   BB2:
      <loop body>
   BB1:
      if (!cond)
         goto BB2;
   BB3:
      ...
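The effect of the rotation can be reproduced in plain C with gotos. This is only an illustration of the two block layouts, not ACO code: the functions, the summing loop body and the branch counter are all invented for the example, and "cond" corresponds to the exit test i >= n.

```c
/* Original layout: exit test in the header, unconditional back-edge at
 * the end of the body. *branches counts executed branch instructions. */
static int
sum_original(int n, int *branches)
{
   int i = 0, sum = 0;
bb1:
   (*branches)++; /* conditional branch in the loop header */
   if (i >= n)
      goto bb3;
   sum += i; /* loop body */
   i++;
   (*branches)++; /* unconditional back-edge */
   goto bb1;
bb3:
   return sum;
}

/* Rotated layout: one jump to the latch up front, then a single
 * conditional branch per iteration. */
static int
sum_rotated(int n, int *branches)
{
   int i = 0, sum = 0;
   (*branches)++; /* one-time jump to the latch block */
   goto bb1;
bb2:
   sum += i; /* loop body */
   i++;
bb1:
   (*branches)++; /* conditional branch, inverted condition */
   if (!(i >= n))
      goto bb2;
   return sum;
}
```

Both functions compute the same sum; for n iterations the original layout executes 2n+1 branches and the rotated one n+2.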
Totals from 4969 (5.89% of 84383) affected shaders: (Navi48)
Instrs: 15253038 -> 15254019 (+0.01%); split: -0.00%, +0.01%
CodeSize: 81225300 -> 81227696 (+0.00%); split: -0.02%, +0.02%
Latency: 320796283 -> 320693480 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 51395922 -> 51376156 (-0.04%); split: -0.04%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ade5e300ab
aco/insert_delay_alu: handle loop latch block before loop body
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
102aca9843
aco/assembler: emit block_kind_loop_latch before the loop header
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
da1594f8bb
aco: introduce notion of block_kind_loop_latch
...
A block annotated with block_kind_loop_latch denotes the re-entry
point for a loop back-edge. It is emitted after the loop preheader
and (potentially) before the loop header.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
9887ce6709
aco/print_asm: Sort block markers by block offset
...
We are going to emit blocks in a different order.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
800a4957bb
aco/lower_branches: Consider branch target of nested conditional branches
...
Totals from 1470 (1.74% of 84383) affected shaders: (Navi48)
Instrs: 5128451 -> 5126842 (-0.03%)
CodeSize: 29359832 -> 29353656 (-0.02%); split: -0.02%, +0.00%
Latency: 41047203 -> 41040786 (-0.02%)
InvThroughput: 6040459 -> 6039619 (-0.01%); split: -0.01%, +0.00%
Branches: 146219 -> 144648 (-1.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
fbf2083b8f
aco/isel: Don't emit ELSE side of divergent branches which jump
...
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)
Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ba32219cf8
aco/isel: Don't emit ELSE side of uniform branches which jump
...
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)
Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
96a639918c
aco: don't emit p_logical_start / p_logical_end after divergent branches
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
3743230252
aco/isel: Do IF-simplification if that didn't happen during NIR optimizations
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:43 +00:00
Daniel Schürmann
50b093ec90
aco/builder: Fix v_add_co_u32 carry-out to VCC if post_ra
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:43 +00:00
Eric Engestrom
fb1cb00a96
radv/ci: add vulkan fluster job on navi48
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39861 >
2026-02-13 13:48:03 +00:00
Samuel Pitoiset
1be4ffdff9
ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
...
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.
is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39800 >
2026-02-13 07:52:29 +01:00