Pierre-Eric Pelloux-Prayer
8f7f7a90b7
radeonsi/sqtt: use pipe_aligned_buffer_create to allocate bo
...
pipe_aligned_buffer_create can only allocate up to 4GB, but that's large
enough for now.
PIPE_USAGE_STREAM is used for now to keep the 2 BOs in GTT.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39194 >
2026-02-12 10:08:43 +00:00
Samuel Pitoiset
9a6ec08960
radv: enable trimming FS color exports for internal shaders
...
This should be safe now, and potentially more optimal.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39786 >
2026-02-12 07:33:58 +00:00
Samuel Pitoiset
dbad9144f2
radv/meta: use R32G32 formats for R64 slow color clears
...
This is required because CB doesn't support 64-bit formats.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39786 >
2026-02-12 07:33:58 +00:00
Samuel Pitoiset
db89f94441
radv/meta: stop trying to reduce the number of format variants
...
Now that we have a solid logic for caching meta objects, trying to
reduce the number of format variants isn't super useful. In practice,
the shaders would be cached on disk, so this would only allocate a few
more bytes for the meta objects.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39786 >
2026-02-12 07:33:58 +00:00
Samuel Pitoiset
e58ef1b3bc
radv: do not set the resume rendering flag for custom resolves
...
It's not a resume operation, it's a completely new rendering pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:56 +00:00
Samuel Pitoiset
cbf981e99a
radv: do not resolve when rendering is suspended
...
The Vulkan spec says:
"Store and resolve operations are only performed at the end of a
render pass instance that does not specify the
VK_RENDERING_SUSPENDING_BIT_KHR flag."
VK_RENDERING_SUSPENDING_BIT is also illegal with custom resolves.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:56 +00:00
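
The rule quoted in cbf981e99a can be sketched as a small predicate. This is only an illustration under stated assumptions, not RADV's code: the helper name and the `SKETCH_` constants are hypothetical, and the constants mirror the `VkRenderingFlagBits` values from the Vulkan headers so the snippet compiles without them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for VK_RENDERING_SUSPENDING_BIT and
 * VK_RENDERING_RESUMING_BIT (assumption: same numeric values as the
 * Vulkan headers), so that <vulkan/vulkan.h> is not required. */
enum {
   SKETCH_RENDERING_SUSPENDING_BIT = 0x00000002,
   SKETCH_RENDERING_RESUMING_BIT = 0x00000004,
};

/* Hypothetical helper: store and resolve operations only happen at the
 * end of a render pass instance that is not being suspended. */
static bool
sketch_should_resolve_at_end(uint32_t rendering_flags)
{
   return !(rendering_flags & SKETCH_RENDERING_SUSPENDING_BIT);
}
```

A resuming-only instance still resolves at its end; only the suspending flag defers the resolve to a later instance.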
Samuel Pitoiset
c1c031ca91
radv: make sure rendering isn't already active in CmdBeginRendering()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:56 +00:00
Samuel Pitoiset
99344bdfe5
radv: clear rendering state before performing resolves
...
This is mostly for not calling CmdBeginRendering() while rendering
is already active in order to catch potential driver issues. This
requires a small refactoring of how the rendering info is passed for
resolves though.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:55 +00:00
Samuel Pitoiset
4c18a36765
radv: pass VkSampleLocationsInfoEXT for depth/stencil expand
...
Instead of using an intermediate structure.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:55 +00:00
Samuel Pitoiset
6f279445e7
radv/meta: stop using custom sample locations for color resolves
...
Only needed for depth/stencil resolves.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39782 >
2026-02-12 07:12:54 +00:00
Georg Lehmann
d7814bcad0
aco: remove redundant can_use_DPP declaration
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801 >
2026-02-11 11:34:29 +00:00
Georg Lehmann
fc7b5d7eed
aco/opt_postRA: don't optimize across calls
...
Could do better by checking which registers are clobbered/preserved,
but that's unlikely to be useful anyway.
Backport-to: 26.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801 >
2026-02-11 11:34:29 +00:00
Georg Lehmann
10b12a6ee2
aco: handle all SALU that modifies PC in needs_exec_mask
...
Calls use swappc.
Backport-to: 26.0
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801 >
2026-02-11 11:34:29 +00:00
Georg Lehmann
421a4dacf0
aco/lower_branches: consider jump target of conditional branches based on vcc
...
Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39801 >
2026-02-11 11:34:29 +00:00
Georg Lehmann
77d05ac1ba
aco/optimizer: stop checking precise for med3
...
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:03 +00:00
Georg Lehmann
a87cdfc6b7
radv/nir/rt: preserve inf/nan for emulated RT intersect
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Georg Lehmann
e873b8764a
aco/optimizer: use nan preserve flag to prevent incorrect med3
...
No Foz-DB changes.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39641 >
2026-02-10 18:42:02 +00:00
Samuel Pitoiset
2cd9693a31
radv/meta: remove a useless barrier when fixing up HTILE for copies on compute
...
The copy operation doesn't use HTILE of the destination image, so the
clear can run in parallel.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
5663ebffc4
radv/meta: skip some HTILE operations when it's decompressed on image stores
...
Only GFX11-GFX11.5 are affected.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
0996b4c527
radv/meta: do not disable compression for depth/stencil expand on compute
...
This doesn't make sense for the destination image and this would
prevent COMPRESSION_EN=1 from working correctly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:22 +00:00
Samuel Pitoiset
452304897f
radv: set COMPRESSION_EN=1 for depth or stencil storage images when supported
...
On GFX10+, the hardware can write decompressed DWORDS to HTILE when
COMPRESSION_EN=1, which means some HTILE decompression/initialization
operations can be avoided because it automatically marks the tiles that
are touched as uncompressed.
Though according to PAL, there are issues with that on GFX10-10.3, so
it's only enabled on GFX11-11.5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:22 +00:00
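
The generation gating described in 452304897f might look roughly like the sketch below. Everything here is hypothetical: the enum is a simplified stand-in rather than Mesa's own generation enum, and the function name is invented for illustration. It only encodes what the message states, namely that the feature stays off on GFX10-10.3 and is enabled on GFX11-11.5.

```c
#include <stdbool.h>

/* Hypothetical, simplified generation enum for illustration only. */
enum sketch_gfx_level {
   SKETCH_GFX9,
   SKETCH_GFX10,
   SKETCH_GFX10_3,
   SKETCH_GFX11,
   SKETCH_GFX11_5,
   SKETCH_GFX12,
};

/* Hypothetical predicate: the hardware feature exists on GFX10+, but per
 * the commit message it is only enabled on GFX11-11.5 because of the
 * issues PAL reports on GFX10-10.3. */
static bool
sketch_use_compression_en_for_ds_storage(enum sketch_gfx_level level)
{
   return level >= SKETCH_GFX11 && level <= SKETCH_GFX11_5;
}
```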
Samuel Pitoiset
6f2b048f84
radv/meta: stop fixing up HTILE after a partial copy
...
The decompression pass already resets HTILE to its uncompressed state,
so this is just redundant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:21 +00:00
Samuel Pitoiset
4f41818194
radv/meta: add a function to fixup HTILE metadata for copies on compute queue
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:21 +00:00
Samuel Pitoiset
9f5a20abde
radv/meta: fix CmdCopyBufferToImage2() on compute queue with compressed HTILE
...
Only for partial copies because image stores don't decompress on writes
(i.e. HTILE isn't updated by image stores).
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39656 >
2026-02-10 10:42:21 +00:00
Samuel Pitoiset
17bbd45d59
radv: emit the framebuffer state when rendering begins
...
Much better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:02 +00:00
Samuel Pitoiset
e178382fb8
radv: add a new dirty bit for the GFX12 HiZ workaround
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:02 +00:00
Samuel Pitoiset
a010c2694a
radv: move {depth,stencil}_compress_disable to the image view extra info
...
Doesn't have to be a pipeline parameter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:01 +00:00
Samuel Pitoiset
9abe6d4dc2
radv: remove declared but unused radv_get_dcc_max_uncompressed_block_size()
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:01 +00:00
Samuel Pitoiset
8d9fb0744e
radv: move color/depth-stencil init surface helpers to radv_image_view.c/h
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:01 +00:00
Samuel Pitoiset
39719c6c44
radv/meta: remove dead code in the gfx depth/stencil clear path
...
The driver either does a fast-clear using compute or a slow clear
using graphics, so the "fast" clear using graphics isn't used at all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39731 >
2026-02-09 09:43:00 +00:00
Samuel Pitoiset
e488085942
radv/meta: remove unused saving/restoring rendering state logic
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39729 >
2026-02-09 08:41:07 +00:00
Samuel Pitoiset
98186aba36
radv/meta: stop saving/restoring rendering state for color/depth decompressions
...
These should always happen outside of rendering.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39729 >
2026-02-09 08:41:07 +00:00
Samuel Pitoiset
04d5077b00
radv: emit late decompressions for fbfetch slightly earlier
...
Right after "normal" layout transitions and just before the rendering
state is set, mostly because it doesn't need to be saved/restored
either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39729 >
2026-02-09 08:41:07 +00:00
Samuel Pitoiset
04f6bfae51
radv: only pass custom sample locations when relevant
...
Custom sample locations are only needed for depth decompression.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39729 >
2026-02-09 08:41:07 +00:00
Samuel Pitoiset
ce3539b54f
radv: fix late decompressions for fbfetch with more corner cases
...
With layers, or custom sample locations for depth.
Found this by inspection.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39729 >
2026-02-09 08:41:06 +00:00
Reilly Brogan
ece5f671b3
amd,compiler: fix const errors found with C23 glibc support
...
In glibc 2.43 the strstr function now propagates const to the output, triggering -Wincompatible-pointer-types-discards-qualifiers
under clang/gcc with -Werror.
Fix two of these cases by adding the const qualifier.
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39767 >
2026-02-08 23:18:15 +00:00
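
The kind of fix described in ece5f671b3 can be sketched as follows. This is not the actual Mesa change: the helper name and the strings are hypothetical. The point is that when the haystack is `const char *`, the result of `strstr` should be stored in a `const char *` as well, which compiles cleanly with both the classic prototype and a const-propagating one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the part of `name` starting at `marker`,
 * or NULL if `marker` does not occur in `name`. */
static const char *
sketch_find_marker(const char *name, const char *marker)
{
   /* Before the fix, code of this shape would typically read
    *    char *hit = strstr(name, marker);
    * which discards the const qualifier once strstr returns
    * `const char *` for a const input. Keeping `hit` const is valid
    * either way. */
   const char *hit = strstr(name, marker);
   return hit;
}
```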
Samuel Pitoiset
c817ef30ee
radv/meta: remove dead DCC clear code about E5B9B9R9_UFLOAT_PACK32
...
Only GFX10.3+ supports COLOR_ATTACHMENT/STORAGE with this format, so older
gens can't have DCC either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
181bb1fc93
radv/meta: remove dead code for VK_FORMAT_R4G4_UNORM_PACK8
...
This isn't supported at all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
cd54224a73
radv/meta: remove useless check in radv_CmdClearAttachments()
...
Rendering must be active.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
ad7151f4bf
radv/meta: fix the key for DCC decompress on compute
...
This could return the graphics DCC pipeline if it was created before,
and crash or potentially hang the GPU.
Found this while working on in-progress VKCTS coverage.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
18317460bc
radv/meta: stop saving/restoring rendering state for FS/HW resolves
...
This isn't needed because resolves are at the end of the rendering.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
30db01ed05
radv/meta: make radv_decompress_resolve_src() static
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
7ea6b311d9
radv/meta: decompress resolve src outside of depth/stencil resolves
...
For consistency with color resolves.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:39 +00:00
Georg Lehmann
0c46053c05
aco/optimizer: apply extract with any uses
...
Foz-DB Navi48:
Totals from 362 (0.44% of 82405) affected shaders:
MaxWaves: 5052 -> 5066 (+0.28%)
Instrs: 5297858 -> 5294009 (-0.07%); split: -0.09%, +0.01%
CodeSize: 30187188 -> 30177592 (-0.03%); split: -0.05%, +0.02%
VGPRs: 44280 -> 44172 (-0.24%)
Latency: 35632812 -> 35619796 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 7050206 -> 7041058 (-0.13%); split: -0.14%, +0.01%
VClause: 137780 -> 137794 (+0.01%); split: -0.01%, +0.02%
SClause: 114821 -> 114781 (-0.03%)
Copies: 466018 -> 465150 (-0.19%); split: -0.24%, +0.05%
Branches: 171990 -> 171988 (-0.00%)
PreVGPRs: 39268 -> 39084 (-0.47%)
VALU: 2557456 -> 2554297 (-0.12%); split: -0.15%, +0.02%
SALU: 893170 -> 893192 (+0.00%); split: -0.00%, +0.01%
VOPD: 393760 -> 394427 (+0.17%); split: +0.39%, -0.22%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:40 +00:00
Georg Lehmann
85c62f1515
aco/optimizer: only copy propagate p_split_vector if it can be eliminated
...
Foz-DB Navi48:
Totals from 402 (0.49% of 82405) affected shaders:
Instrs: 3078116 -> 3070117 (-0.26%); split: -0.28%, +0.02%
CodeSize: 17329444 -> 17240360 (-0.51%); split: -0.53%, +0.01%
VGPRs: 48960 -> 48924 (-0.07%); split: -0.12%, +0.05%
SpillVGPRs: 1683 -> 1687 (+0.24%)
Latency: 27758978 -> 27728451 (-0.11%); split: -0.17%, +0.06%
InvThroughput: 5748513 -> 5741761 (-0.12%); split: -0.18%, +0.06%
VClause: 69557 -> 69575 (+0.03%); split: -0.01%, +0.03%
SClause: 74850 -> 74866 (+0.02%)
Copies: 338241 -> 329400 (-2.61%); split: -2.71%, +0.10%
Branches: 118443 -> 118431 (-0.01%)
PreVGPRs: 44561 -> 44598 (+0.08%)
VALU: 1463081 -> 1455438 (-0.52%); split: -0.56%, +0.04%
SALU: 574113 -> 574013 (-0.02%); split: -0.03%, +0.01%
VMEM: 105789 -> 105797 (+0.01%)
VOPD: 140203 -> 139009 (-0.85%); split: +0.44%, -1.29%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
5ecc800edd
aco/optimizer: add second copy prop for pseudo instructions
...
Foz-DB Navi48:
Totals from 28 (0.03% of 82405) affected shaders:
Instrs: 144993 -> 144645 (-0.24%); split: -0.26%, +0.02%
CodeSize: 784668 -> 783604 (-0.14%); split: -0.19%, +0.05%
SpillVGPRs: 215 -> 209 (-2.79%)
Latency: 2529900 -> 2526895 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 775379 -> 773859 (-0.20%); split: -0.20%, +0.00%
VClause: 2815 -> 2803 (-0.43%)
Copies: 23474 -> 23170 (-1.30%); split: -1.38%, +0.09%
Branches: 4638 -> 4632 (-0.13%)
VALU: 81924 -> 81620 (-0.37%); split: -0.40%, +0.03%
SALU: 23986 -> 23995 (+0.04%); split: -0.03%, +0.07%
VMEM: 3726 -> 3714 (-0.32%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
269007faf3
aco/optimizer: apply byte p_split_vector as extract
...
Foz-DB Navi48:
Totals from 80 (0.10% of 82405) affected shaders:
Instrs: 3022374 -> 3024178 (+0.06%); split: -0.00%, +0.06%
CodeSize: 17396984 -> 17403108 (+0.04%); split: -0.00%, +0.04%
Latency: 17685547 -> 17687073 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 3622683 -> 3622618 (-0.00%); split: -0.02%, +0.02%
VClause: 83840 -> 83841 (+0.00%)
Copies: 242072 -> 242528 (+0.19%); split: -0.01%, +0.20%
Branches: 81582 -> 81578 (-0.00%)
PreVGPRs: 7536 -> 7527 (-0.12%)
VALU: 1520822 -> 1521762 (+0.06%); split: -0.01%, +0.07%
VOPD: 294392 -> 293908 (-0.16%); split: +0.03%, -0.20%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
b21b36b6ab
aco/optimizer: apply further extracts to v_cvt_f32_ubyte
...
Foz-DB Navi48:
Totals from 21 (0.03% of 82405) affected shaders:
Instrs: 2818255 -> 2817482 (-0.03%)
CodeSize: 16282360 -> 16273080 (-0.06%)
Latency: 14172672 -> 14172405 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2728551 -> 2728493 (-0.00%); split: -0.00%, +0.00%
Copies: 213703 -> 212973 (-0.34%)
VALU: 1407351 -> 1406585 (-0.05%)
VOPD: 291185 -> 291221 (+0.01%); split: +0.04%, -0.03%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
08f9bad0b5
aco/isel: avoid extracts for continuous alu src components
...
Helps fp8 FSR4, hurts parallel_rdp.
Foz-DB Navi48:
Totals from 23 (0.03% of 82405) affected shaders:
MaxWaves: 380 -> 383 (+0.79%)
Instrs: 71228 -> 71487 (+0.36%); split: -0.26%, +0.62%
CodeSize: 411500 -> 415004 (+0.85%); split: -0.21%, +1.06%
VGPRs: 2856 -> 2784 (-2.52%)
Latency: 1654160 -> 1665555 (+0.69%); split: -0.14%, +0.83%
InvThroughput: 354145 -> 361122 (+1.97%); split: -0.10%, +2.07%
VClause: 1557 -> 1541 (-1.03%); split: -1.41%, +0.39%
Copies: 9857 -> 10059 (+2.05%); split: -1.76%, +3.80%
PreVGPRs: 2285 -> 2182 (-4.51%); split: -4.73%, +0.22%
VALU: 38873 -> 39066 (+0.50%); split: -0.47%, +0.96%
VOPD: 1237 -> 1246 (+0.73%); split: +1.13%, -0.40%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
a0c663378c
aco/isel: split vector into dwords/words first
...
Foz-DB Navi48:
Totals from 361 (0.44% of 82405) affected shaders:
MaxWaves: 5806 -> 5832 (+0.45%)
Instrs: 2343746 -> 2343762 (+0.00%); split: -0.04%, +0.04%
CodeSize: 13270504 -> 13267116 (-0.03%); split: -0.10%, +0.08%
VGPRs: 42008 -> 41708 (-0.71%)
SpillVGPRs: 308 -> 303 (-1.62%)
Scratch: 1574656 -> 1574400 (-0.02%)
Latency: 26571385 -> 22602486 (-14.94%); split: -14.95%, +0.01%
InvThroughput: 5474157 -> 4614777 (-15.70%); split: -15.70%, +0.00%
VClause: 57512 -> 57515 (+0.01%); split: -0.03%, +0.03%
SClause: 56313 -> 56319 (+0.01%)
Copies: 251626 -> 248707 (-1.16%); split: -1.24%, +0.08%
Branches: 89620 -> 89614 (-0.01%)
PreVGPRs: 37361 -> 36910 (-1.21%); split: -1.21%, +0.01%
VALU: 1111534 -> 1108507 (-0.27%); split: -0.29%, +0.02%
SALU: 443684 -> 443687 (+0.00%); split: -0.00%, +0.00%
VMEM: 85287 -> 85277 (-0.01%)
VOPD: 97987 -> 98091 (+0.11%); split: +0.30%, -0.20%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
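
The headline percentages in the Foz-DB totals above appear to be plain relative changes between the before and after totals. A minimal sketch, assuming that formula (the helper name is hypothetical, and the "split" columns are not reproduced here):

```c
/* Hypothetical helper: relative change in percent between the "before"
 * and "after" totals of one statistic. Negative means the total shrank. */
static double
sketch_percent_change(double before, double after)
{
   return (after - before) / before * 100.0;
}
```

For a0c663378c, the Latency totals 26571385 -> 22602486 give roughly -14.94% under this formula, and the InvThroughput totals 5474157 -> 4614777 give roughly -15.70%, matching the reported figures.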