Konstantin Seurer
ae84d41d48
radv/meta: Rework saving/restoring state
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
The current approach of explicitly saving/restoring some states is
unnecessarily complicated and inefficient. For example, some meta OPs
that use memory fills/copies will have nested save/restores. This patch
is the first step towards avoiding unnecessary state re-emits around
meta OPs.
The changes are:
- Move radv_meta_saved_state to radv_cmd_buffer::state
- Add radv_meta_begin/end helpers that initialize radv_meta_saved_state
and restore states used by the meta OP
- Remove all explicit saves/restores, use the new helpers
radv_meta_begin/end is called inside the entrypoint and not some nested
helper function which means that state is only restored once per meta
OP.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774 >
2026-02-18 09:37:55 +01:00
Konstantin Seurer
d3cb2978b8
radv/meta: Add and use helpers for setting state
...
It's less code and allows the next commit to track which states a meta
command uses.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39774 >
2026-02-18 09:37:30 +01:00
Samuel Pitoiset
090b67a163
vulkan/runtime: add support for ETC2 emulation with copy_memory_indirect
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39908 >
2026-02-18 07:04:43 +00:00
Priya Hosur
0bfad39f15
ac/nir/ngg: re-enable use of known compile-time GS connectivity
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38075 >
2026-02-18 01:29:37 +00:00
Marek Olšák
a2309edb6b
ac/nir/meta: properly align sparse buffer clears with 12-byte clear values
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Marek Olšák
62cce3abcd
ac/nir/meta: use the clear/copy compute shader if CP DMA doesn't support sparse
...
ac_prepare_cs_clear_copy_buffer determines whether to use CP DMA, and
the driver obeys that.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Marek Olšák
bbcfab9f4f
ac/nir/meta: don't scalarize sparse loads if the address is aligned to load size
...
This should make copying sparse faster if we get aligned buffer bounds.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39841 >
2026-02-17 14:47:41 +00:00
Samuel Pitoiset
95c4d8d726
radv/meta: rework get_image_stride_for_96bit() and make it non-static
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
c1a507bf42
radv/meta: rename r32g32b32 to 96bit
...
Tt's shorter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
29ce18cb6f
radv/meta: rename some variables for btoi 96-bit shader
...
To match push constants.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
9c90622c94
radv: remove a redundant check in radv_image_is_renderable()
...
RADEON_SURF_NO_RENDER_TARGET is already sets for such an image.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
61b20e726f
radv/ci: mark more WSI tests as flakes on NAVI21
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39909 >
2026-02-17 10:39:01 +00:00
Samuel Pitoiset
7fceeff970
radv/ci: mark more WSI flakes for NAVI21
...
Fixes: c332ee5dd6 ("ci/radv: Add some flakes I hit while testing WSI.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39929 >
2026-02-17 08:04:03 +01:00
Rhys Perry
e4b8ade092
ac/nir,radv,radeonsi: flip branches to avoid waitcnts
...
fossil-db (navi31):
Totals from 5123 (6.42% of 79825) affected shaders:
Instrs: 12712435 -> 12703672 (-0.07%); split: -0.12%, +0.05%
CodeSize: 67068852 -> 67033244 (-0.05%); split: -0.10%, +0.05%
VGPRs: 363896 -> 363956 (+0.02%)
SpillSGPRs: 5035 -> 5074 (+0.77%); split: -0.83%, +1.61%
Latency: 115048972 -> 111944013 (-2.70%); split: -2.89%, +0.19%
InvThroughput: 19102126 -> 18696069 (-2.13%); split: -2.34%, +0.22%
VClause: 258693 -> 258770 (+0.03%); split: -0.01%, +0.04%
SClause: 346271 -> 346225 (-0.01%); split: -0.02%, +0.00%
Copies: 1040815 -> 1042017 (+0.12%); split: -0.23%, +0.34%
Branches: 332467 -> 332565 (+0.03%); split: -0.04%, +0.07%
PreSGPRs: 304888 -> 304699 (-0.06%); split: -0.10%, +0.04%
PreVGPRs: 296652 -> 296654 (+0.00%)
VALU: 7591803 -> 7594601 (+0.04%); split: -0.01%, +0.05%
SALU: 1454420 -> 1455764 (+0.09%); split: -0.24%, +0.33%
VOPD: 1826 -> 1810 (-0.88%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
f81aaee7f1
aco/ra: create vectors for affinities of split definitions
...
For example:
a = ...
b = ...
if {
c, d = split
}
phi(a, c)
phi(b, d)
This patch will allocate 'a' and 'b' as a vector.
fossil-db (navi31):
Totals from 2556 (3.20% of 79825) affected shaders:
MaxWaves: 59957 -> 59955 (-0.00%)
Instrs: 9170941 -> 9154954 (-0.17%); split: -0.19%, +0.02%
CodeSize: 48245956 -> 48182620 (-0.13%); split: -0.15%, +0.02%
VGPRs: 189372 -> 189900 (+0.28%); split: -0.04%, +0.32%
Latency: 85469322 -> 85262360 (-0.24%); split: -0.32%, +0.08%
InvThroughput: 14515911 -> 14486970 (-0.20%); split: -0.27%, +0.07%
VClause: 197980 -> 197959 (-0.01%); split: -0.02%, +0.01%
Copies: 787838 -> 774288 (-1.72%); split: -1.91%, +0.19%
Branches: 271810 -> 271799 (-0.00%); split: -0.01%, +0.01%
VALU: 5331813 -> 5318566 (-0.25%); split: -0.28%, +0.03%
SALU: 1133559 -> 1133054 (-0.04%); split: -0.05%, +0.01%
VOPD: 2435 -> 2418 (-0.70%); split: +0.12%, -0.82%
fossil-db (navi21):
Totals from 37513 (46.99% of 79825) affected shaders:
Instrs: 26734825 -> 26681225 (-0.20%); split: -0.23%, +0.03%
CodeSize: 141353284 -> 141144360 (-0.15%); split: -0.17%, +0.02%
VGPRs: 1556760 -> 1556384 (-0.02%); split: -0.21%, +0.18%
Latency: 146201548 -> 146156473 (-0.03%); split: -0.20%, +0.17%
InvThroughput: 33921803 -> 33867398 (-0.16%); split: -0.23%, +0.07%
VClause: 502263 -> 502209 (-0.01%); split: -0.27%, +0.26%
SClause: 593142 -> 593155 (+0.00%); split: -0.00%, +0.00%
Copies: 2600995 -> 2551257 (-1.91%); split: -2.16%, +0.25%
Branches: 857910 -> 857787 (-0.01%); split: -0.03%, +0.02%
VALU: 15674532 -> 15625013 (-0.32%); split: -0.35%, +0.04%
SALU: 4635548 -> 4634680 (-0.02%); split: -0.04%, +0.02%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
86f0195f5c
aco/ra: prefer phi operands which don't create waitcnt
...
fossil-db (navi31):
Totals from 89 (0.11% of 79825) affected shaders:
Instrs: 343443 -> 343384 (-0.02%); split: -0.10%, +0.09%
CodeSize: 1792948 -> 1792668 (-0.02%); split: -0.10%, +0.08%
Latency: 2656294 -> 2656490 (+0.01%); split: -0.02%, +0.02%
InvThroughput: 517696 -> 517691 (-0.00%); split: -0.01%, +0.01%
SClause: 9213 -> 9215 (+0.02%); split: -0.01%, +0.03%
Copies: 39138 -> 39089 (-0.13%); split: -0.84%, +0.71%
Branches: 10863 -> 10872 (+0.08%); split: -0.05%, +0.13%
SALU: 49185 -> 49136 (-0.10%); split: -0.67%, +0.57%
fossil-db (navi21):
Totals from 34490 (43.21% of 79825) affected shaders:
Instrs: 23005853 -> 22956529 (-0.21%); split: -0.25%, +0.04%
CodeSize: 120532004 -> 120341412 (-0.16%); split: -0.19%, +0.03%
VGPRs: 1396928 -> 1397520 (+0.04%); split: -0.07%, +0.11%
Latency: 108740068 -> 108499644 (-0.22%); split: -0.53%, +0.30%
InvThroughput: 25286526 -> 25358695 (+0.29%); split: -0.11%, +0.39%
VClause: 421179 -> 421132 (-0.01%); split: -0.29%, +0.27%
SClause: 446414 -> 446423 (+0.00%); split: -0.00%, +0.00%
Copies: 2242236 -> 2243168 (+0.04%); split: -0.42%, +0.46%
Branches: 724556 -> 724903 (+0.05%); split: -0.02%, +0.07%
VALU: 13321078 -> 13321940 (+0.01%); split: -0.07%, +0.08%
SALU: 4069929 -> 4070580 (+0.02%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
310f588f92
aco/ra: move variables from affinity register to avoid waitcnt
...
If we don't use this affinity register, we're likely to end up moving the
temporary later. If it's a memory instruction destination, that's probably
more expensive than just copying the blocking variables.
fossil-db (navi31):
Totals from 504 (0.63% of 79825) affected shaders:
Instrs: 4108284 -> 4109026 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21226764 -> 21229764 (+0.01%); split: -0.01%, +0.02%
Latency: 26931635 -> 26806989 (-0.46%); split: -0.47%, +0.00%
InvThroughput: 8443520 -> 8439235 (-0.05%); split: -0.06%, +0.01%
VClause: 99209 -> 99314 (+0.11%); split: -0.00%, +0.11%
SClause: 85089 -> 85085 (-0.00%)
Copies: 340323 -> 340993 (+0.20%); split: -0.06%, +0.26%
Branches: 117225 -> 117209 (-0.01%); split: -0.02%, +0.00%
VALU: 2421859 -> 2422529 (+0.03%); split: -0.01%, +0.04%
SALU: 503465 -> 503470 (+0.00%); split: -0.00%, +0.00%
fossil-db (navi21):
Totals from 582 (0.73% of 79825) affected shaders:
Instrs: 3714908 -> 3714990 (+0.00%); split: -0.02%, +0.02%
CodeSize: 19977880 -> 19973076 (-0.02%); split: -0.04%, +0.01%
VGPRs: 40480 -> 40496 (+0.04%)
Latency: 26028895 -> 25772711 (-0.98%); split: -0.99%, +0.00%
InvThroughput: 9827389 -> 9818194 (-0.09%); split: -0.10%, +0.01%
VClause: 103702 -> 103815 (+0.11%); split: -0.02%, +0.13%
SClause: 90861 -> 90857 (-0.00%)
Copies: 335276 -> 335992 (+0.21%); split: -0.09%, +0.30%
Branches: 123912 -> 123897 (-0.01%); split: -0.02%, +0.00%
VALU: 2466032 -> 2466748 (+0.03%); split: -0.01%, +0.04%
SALU: 533658 -> 533667 (+0.00%); split: -0.00%, +0.00%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
681ec4cba7
aco/ra: track cost of moving variables
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
69bc4efa37
aco/sched_ilp: improve scheduling with VMEM/DS->VALU WaW
...
This improves scheduling with one side of a divergent branch writing to a
VGPR using VMEM/DS, and the other writing using VALU. At the merge block,
it will properly consider that the VGPR was written by a VMEM/DS.
fossil-db (navi31):
Totals from 1224 (1.53% of 79825) affected shaders:
Instrs: 5264815 -> 5267604 (+0.05%); split: -0.00%, +0.06%
CodeSize: 27406404 -> 27422132 (+0.06%); split: -0.00%, +0.06%
Latency: 48325204 -> 48293975 (-0.06%); split: -0.09%, +0.03%
InvThroughput: 8923880 -> 8919191 (-0.05%); split: -0.07%, +0.02%
fossil-db (navi21):
Totals from 1267 (1.59% of 79825) affected shaders:
Instrs: 4628583 -> 4629190 (+0.01%); split: -0.00%, +0.01%
CodeSize: 24974672 -> 24977188 (+0.01%); split: -0.00%, +0.01%
Latency: 45080476 -> 44998120 (-0.18%); split: -0.20%, +0.02%
InvThroughput: 12288202 -> 12269634 (-0.15%); split: -0.16%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
88b6b6db17
aco: only consider cost of memory loads at waitcnt
...
We don't run this code before waitcnt insertion, so this isn't necessary.
This change improves accuracy in these two situations, because the waitcnt
insertion pass is more aware of divergent control flow:
v0 = valu
if (divergent) {
v0 = vmem
} else {
use(v0)
}
v0 = vmem
if (divergent) {
wait vmcnt(0)
} else {
wait vmcnt(0)
}
use(v0)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38262 >
2026-02-16 19:39:43 +00:00
Rhys Perry
6963c8dd80
radv,aco/gfx11: preserve s2 when NGG_WAVE_ID_EN=1
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
According to the ISA doc, this is needed for hang recovery.
This works by just avoiding putting temporaries in s0-3 unless they're
precolored there.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39720 >
2026-02-16 14:33:58 +00:00
Rhys Perry
f9c11a8e15
radv: add ngg_wave_id_en to radv_shader_info
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39720 >
2026-02-16 14:33:57 +00:00
Marek Olšák
61a96be494
nir/lower_non_uniform_access: add an option not to lower tex & image queries
...
AMD can do non-uniform queries. The RADV change will be in a separate commit.
NFC for drivers.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Marek Olšák
a9df891bc6
nir: allow get_ssbo_size to return a 64-bit result
...
to match get_ubo_size, and to support HW where SSBOs can have a 64-bit size.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39743 >
2026-02-16 12:59:36 +00:00
Samuel Pitoiset
47841c1142
radv/meta: remove useless DCC decompressions for image<->buffer
...
It's not needed to decompress DCC when formats are compatible each
other, this basically removes all decompressions on GFX11-GFX11.5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39888 >
2026-02-16 07:40:13 +00:00
Emma Anholt
db532eaf00
ci/radv: Enable WSI testing.
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This gets us coverage of present_timing for KHR_display, which we don't
have on the older CTS used by the other drivers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39701 >
2026-02-13 23:57:14 +00:00
Emma Anholt
c332ee5dd6
ci/radv: Add some flakes I hit while testing WSI.
...
I upgraded some clearly flaky groups of tests in zink to regexes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39701 >
2026-02-13 23:57:14 +00:00
Rhys Perry
b60bff0429
aco: consider 64-bit transcendental normal valu for s_delay_alu
...
https://github.com/llvm/llvm-project/pull/180940
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39851 >
2026-02-13 17:03:34 +00:00
Marek Olšák
9237ca7e46
ac/llvm: remove unused functions
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
d1e6a5c1c8
ac: lower load_num_workgroups in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
1e11e83d1c
ac/nir: add ac_nir_lower_intrinsics_to_args_options structure
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
a9e47751d2
ac: lower load_subgroup_id for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Marek Olšák
0a9bdcac79
ac: lower load_workgroup_ids for ACO in NIR
...
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39638 >
2026-02-13 15:33:19 +00:00
Daniel Schürmann
97f095f6e0
aco/lower_branches: Add try_rotate_latch_block() optimization
...
This optimization looks for unconditional back-edges and aims
to rotate the loop in a way that the final block is emitted
before the loop header, essentially turning
BB1:
if ()
goto BB3;
BB2:
<loop body>
goto BB1;
BB3:
...
into
goto BB1;
BB2:
<loop body>
BB1:
if(!cond)
goto BB2;
BB3:
...
Totals from 4969 (5.89% of 84383) affected shaders: (Navi48)
Instrs: 15253038 -> 15254019 (+0.01%); split: -0.00%, +0.01%
CodeSize: 81225300 -> 81227696 (+0.00%); split: -0.02%, +0.02%
Latency: 320796283 -> 320693480 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 51395922 -> 51376156 (-0.04%); split: -0.04%, +0.00%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ade5e300ab
aco/insert_delay_alu: handle loop latch block before loop body
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
102aca9843
aco/assembler: emit block_kind_loop_latch before the loop header
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
da1594f8bb
aco: introduce notion of block_kind_loop_latch
...
A block annotated with block_kind_loop_latch denotes a block
the re-entry point for a loop back-edge. It is emitted after
the loop preheader and (potentially) before the loop header.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
9887ce6709
aco/print_asm: Sort block markers by block offset
...
We are going to emit blocks in a different order.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
800a4957bb
aco/lower_branches: Consider branch target of nested conditional branches
...
Totals from 1470 (1.74% of 84383) affected shaders: (Navi48)
Instrs: 5128451 -> 5126842 (-0.03%)
CodeSize: 29359832 -> 29353656 (-0.02%); split: -0.02%, +0.00%
Latency: 41047203 -> 41040786 (-0.02%)
InvThroughput: 6040459 -> 6039619 (-0.01%); split: -0.01%, +0.00%
Branches: 146219 -> 144648 (-1.07%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
fbf2083b8f
aco/isel: Don't emit ELSE side of divergent branches which jump
...
Totals from 50 (0.06% of 84383) affected shaders: (Navi48)
Instrs: 402490 -> 402444 (-0.01%); split: -0.01%, +0.00%
CodeSize: 2239024 -> 2238864 (-0.01%); split: -0.01%, +0.00%
SpillSGPRs: 1493 -> 1496 (+0.20%)
Latency: 5836785 -> 5836747 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 1120893 -> 1120909 (+0.00%); split: -0.00%, +0.00%
Copies: 46128 -> 46082 (-0.10%)
VALU: 222708 -> 222715 (+0.00%); split: -0.00%, +0.00%
SALU: 53039 -> 52993 (-0.09%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
ba32219cf8
aco/isel: Don't emit ELSE side of uniform branches which jump
...
Totals from 4 (0.00% of 84383) affected shaders: (Navi48)
Instrs: 16473 -> 16468 (-0.03%)
CodeSize: 85276 -> 85300 (+0.03%)
SpillSGPRs: 175 -> 176 (+0.57%)
Latency: 267907 -> 267885 (-0.01%)
InvThroughput: 36302 -> 36298 (-0.01%)
Copies: 1353 -> 1345 (-0.59%)
VALU: 9025 -> 9029 (+0.04%)
SALU: 2635 -> 2627 (-0.30%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
96a639918c
aco: don't emit p_logical_start / p_logical_end after divergent branches
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:44 +00:00
Daniel Schürmann
3743230252
aco/isel: Do IF-simplification if that didn't happen during NIR optimizations
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:43 +00:00
Daniel Schürmann
50b093ec90
aco/builder: Fix v_add_co_u32 carry-out to VCC if post_ra
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39519 >
2026-02-13 14:49:43 +00:00
Eric Engestrom
fb1cb00a96
radv/ci: add vulkan fluster job on navi48
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39861 >
2026-02-13 13:48:03 +00:00
Samuel Pitoiset
1be4ffdff9
ac,radv,radeonsi: use correct swizzle/pitch for depth-only images with SDMA
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This fixes new VKCTS coverage
dEQP-VK.api.copy_and_blit.core.use_after_copy.*.
is_stencil isn't set for RadeonSI because it doesn't do SDMA copies
with Z/S.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39800 >
2026-02-13 07:52:29 +01:00
Samuel Pitoiset
4ec3840184
radv/meta: move the barrier for depth/stencil compute resolves outside
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This barrier is only needed for rendering resolves (ie. not for
vkCmdResolveImage()). It's similar to color compute resolves now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805 >
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
3e41b04de9
radv/meta: optimize a barrier with depth/stencil compute resolves
...
The compute resolve doesn't use HTILE of the destination image, so the
potential HTILE clear can run in parallel.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805 >
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
85a3f7816d
radv/meta: add HTILE support to radv_fixup_resolve_dst_metadata()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805 >
2026-02-12 20:17:20 +00:00
Samuel Pitoiset
6a454dabda
radv/meta: stop fixing up HTILE after a partial resolve using compute
...
The decompression pass already resets HTILE to its uncompressed state,
so this is just redundant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39805 >
2026-02-12 20:17:19 +00:00