Commit graph

201805 commits

Author SHA1 Message Date
Caio Oliveira
b91c576ae7 intel/mda: add difflog command
Compares versions of two objects one by one.  Useful to compare two
shader compilations and find the first pass that changed.

This could already be done by using something like
`diff <(mda log ...) <(mda log ...)` but it is useful enough to become
a builtin.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39420>
2026-01-28 18:00:45 +00:00
Faith Ekstrand
797198e7a6 nak: Use .xx swizzles for f2f.32.16
This is a no-op from a codegen PoV since both SrcSwizzle::Xx and
SrcSwizzle::None will result in .high not being set.  However, it allows
other parts of the compiler to more easily reason about the fact that it
only reads the bottom 16 bits.

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>
2026-01-28 17:15:33 +00:00
Faith Ekstrand
15d6637282 nak: Make OpF2F take a F16v2 source
Instead of depending on a global "high" bit that affects both source and
destination, this models f2f.32.16 as an F16v2 op which ignores one of
the two components.  This makes encoding the op a tiny bit more complex
(though that's easy enough to shove in a helper) in exchange for letting
copy-prop propagate OpPrmt and swizzles into it.

Shader-db stats:

    Totals:
    CodeSize: 24304240 -> 24298928 (-0.02%)
    Static cycle count: 274812403 -> 274809320 (-0.00%)

    Totals from 39 (0.57% of 6891) affected shaders:
    CodeSize: 266672 -> 261360 (-1.99%)
    Static cycle count: 138321 -> 135238 (-2.23%)

    PERCENTAGE DELTAS                            Shaders  CodeSize Static cycle count
    google-meet-clvk/BgBlur                      49        -0.49%        -0.44%
    google-meet-clvk/Relight                     81        -0.55%        -0.18%
    q2rtx/q2rtx-rt-pipeline                      42        -0.31%        -0.10%

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39572>
2026-01-28 17:15:33 +00:00
Rhys Perry
0b0e124a73 aco: use lv1.resize() pattern
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:30 +00:00
Rhys Perry
5f5032bb6a aco: use lv1/lv2 instead of v1/v2.as_linear()
This is just a search+replace then clang-format.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:30 +00:00
Rhys Perry
c98204c963 aco: add lv1/lv2 as alias for v1/v2.as_linear()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39537>
2026-01-28 16:46:29 +00:00
Tomeu Vizoso
a5daecafd3 dril: don't build a rocket_dri.so
As Rocket has no graphics capability.

Fixes: 5b829658f7 ("rocket: Initial commit of a driver for Rockchip's NPU")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38532>
2026-01-28 16:06:42 +00:00
Samuel Pitoiset
50a3699552 radv: advertise VK_KHR_internally_synchronized_queues
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39489>
2026-01-28 15:32:58 +00:00
Samuel Pitoiset
d8ef386f98 vulkan: add support for VK_KHR_internally_synchronized_queues
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39489>
2026-01-28 15:32:57 +00:00
Aitor Camacho
8a4a369795 kk: Move nir_opt_shrink_stores after nir_opt_remove_phis for correct shrink
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39522>
2026-01-28 15:12:39 +00:00
Icenowy Zheng
bed1576b14 pvr: preliminary EXT_image_drm_format_modifier support
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Adds a trivial EXT_image_drm_format_modifier support that only handles
LINEAR modifier.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Acked-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38991>
2026-01-28 14:49:24 +00:00
Mike Blumenkrantz
cf68dc570b ntv: stop tracking ubo variables
this is broken for the case where conflicting variables exist but aren't
accessed

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39582>
2026-01-28 14:24:18 +00:00
Mike Blumenkrantz
36d9f5a4bf ntv: add a simple pass to convert vulkan descriptor access to direct derefs
this should yield more transparent passthrough

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39582>
2026-01-28 14:24:18 +00:00
Ella Stanforth
aad9a26de3 pvr: enable sampler ycbcr conversion
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:28 +00:00
Ella Stanforth
5eeac21181 pvr: add ycbcr formats
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:28 +00:00
Ella Stanforth
0a01f7aeeb pvr: workaround hardware clamping for YCBCR_IDENTITY conversion
The TPU clamps to 0..1 so we have to workaround in software on any hardware
that does not have XR clamp support.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:28 +00:00
Ella Stanforth
3204e8b1a2 pvr: implement chroma swap
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:28 +00:00
Ella Stanforth
3495831d72 pvr: setup csc tables
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:28 +00:00
Ella Stanforth
c856d34056 pvr: handle plane addresses for ycbcr images.
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:27 +00:00
Ella Stanforth
f8e3e893b9 pvr: handle ycbcr swizzle
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:27 +00:00
Ella Stanforth
4baf6d3043 pvr: handle packing texstate for ycbcr images
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Ella Stanforth
fa6704a523 pvr: add multiplanar format support
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Ella Stanforth
7be87ca82a pvr/csbgen: fix packing multiple addresses
Cc: mesa-stable
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Simon Perretta
60c1a0cf86 pvr: add initial yuv tex/smp state words
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Ella Stanforth
71ecc9430c pvr: fix transfer double stride
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Ella Stanforth
abaa4a80ad vulkan/runtime: use nir_shader_tex_pass for ycbcr lowering
Acked-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:26 +00:00
Ella Stanforth
b4457dd5d0 vulkan: add plane aspect format helper
Acked-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39231>
2026-01-28 13:41:25 +00:00
Lionel Landwerlin
a05fc97bc9 anv/iris: add drirc to enable sampler state & compute surface state prefetch
I noticed we disable the prefetch only on Gfx12.5. But surely that
recommendation carries on on later platforms.

It seems other drivers just disable it all the time and only have an
option to force the prefetch. So implementing the same thing here.

Blorp path is left untouched.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39424>
2026-01-28 13:13:40 +00:00
David Rosca
2f0d18f6af radv/video: Use coded size from session params instead of codedExtent
cef8eff74d ("radv/video: Override H265 SPS unaligned resolutions")
fixes the case where app specifies resolution with lower than required
alignment. But in case of higher alignment, the stream is still not
going to be correctly decodable.
Use size from session params to set the coded size, instead of using
codedExtent of input image.
Only use codedExtent to calculate padding.

Fixes dEQP-VK.video.encode.h265.quantization_map_delta*

Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39529>
2026-01-28 12:46:29 +00:00
Samuel Pitoiset
83fabf7d41 radv: rework app workarounds implemented using internal layers
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Just override the needed entrypoints.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39549>
2026-01-28 11:46:25 +00:00
Samuel Pitoiset
875b6ab951 radv/sqtt: reduce the number of timed cmdbufs
Use the same for post/pre GPU timestamps when possible.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39174>
2026-01-28 11:11:24 +00:00
Samuel Pitoiset
4508518f8e radv/sqtt: rework acquiring timed cmdbufs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39174>
2026-01-28 11:11:24 +00:00
Samuel Pitoiset
553179ab73 radv/sqtt: rework acquiring GPU timestamps
To acquire all GPU timestamp objects at the same time.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39174>
2026-01-28 11:11:24 +00:00
Aitor Camacho
a8fac76ea6 kk: Enable vertexPipelineStoresAndAtomics
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38880>
2026-01-28 10:30:26 +00:00
Nick Hamilton
079377c767 pco: Fix for atomic operations on an image buffer
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Within the driver buffers are treated as 2D as sampling them as 1D
will run into HW restrictions on max size.

The compiler does the same however for atomic image ops the address
is manually calculated and doing this via the 2D path leads to
incorrect offsets.

The fix is to treat buffers as 1D for atomic ops which calculates
the correct offsets for the operations.

Fix deqp:
dEQP-VK.image.atomic_operations.add.buffer.*
dEQP-VK.image.atomic_operations.and.buffer.*
dEQP-VK.image.atomic_operations.compare_exchange.buffer.*
dEQP-VK.image.atomic_operations.dec.buffer.*
dEQP-VK.image.atomic_operations.exchange.buffer.*
dEQP-VK.image.atomic_operations.inc.buffer.*
dEQP-VK.image.atomic_operations.max.buffer.*
dEQP-VK.image.atomic_operations.min.buffer.*
dEQP-VK.image.atomic_operations.or.buffer.*
dEQP-VK.image.atomic_operations.sub.buffer.*
dEQP-VK.image.atomic_operations.xor.buffer.*

Fixes: 6dc5e1e109 ("pco: fully support Vulkan 1.2 image atomics")

Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39521>
2026-01-28 08:54:28 +00:00
Olivia Lee
4959f45e99 Revert "panvk: advertise VK_EXT_primitives_generated_query on v10+"
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This reverts commit 6eadcaa851.

VK_EXT_primitives_generated_query has a dependency on
VK_EXT_transform_feedback, which we do not implement yet. This is
breaking the android CTS. It will be reenabled once transform feedback
is in.

Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39547>
2026-01-27 23:34:19 +00:00
Georg Lehmann
1240444e63 spirv: assert fp_math_ctrl was reset after use
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:44 +00:00
Georg Lehmann
3deb57b654 spirv: remove vtn_builder::exact
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:44 +00:00
Georg Lehmann
51d30d0f96 spirv: consider both source and dest type for fast math
This matters for conversions and and comparisons.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:44 +00:00
Georg Lehmann
46a617884e spirv: use base type instead of bit size to determine fp_math_ctrl
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:44 +00:00
Georg Lehmann
565f37b98c spirv: handle fast_math for opencl opcodes
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:42 +00:00
Georg Lehmann
836efa8c3c spirv: move NoContraction handling into vtn_handle_fp_fast_math
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39460>
2026-01-27 23:01:42 +00:00
Iván Briano
5b48805b42 brw: fix local_invocation_index with quad derivaties on mesh/task shaders
For mesh/task shaders, the thread payload provides a local invocation
index, but it's always linear so it doesn't give the correct value when
quad derivatives are in use.
The lowering pass where all of this is done correctly for compute
shaders assumes load_local_invocation_index will be lowered in the
backend for mesh/task, calculates the values for the quads correctly but
then avoid replacing the original intrinsic and we remain with the wrong
results.

Add an intel specific intrinsic and always lower the generic one to that
(or whatever else was calculated) to avoid ambiguities and fix the value
for quad derivatives.

Fixes future CTS tests using mesh/task shaders under:
dEQP-VK.spirv_assembly.instruction.compute.compute_shader_derivatives.*

Fixes: d89bfb1ff7 ("intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39276>
2026-01-27 22:28:19 +00:00
Emma Anholt
eb990cd81e nir: Bump test timeouts.
nir_opt_algebraic_tests has been pushing our qemu-ed tests over the line.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39563>
2026-01-27 21:31:14 +00:00
Christian Gmeiner
d19460ffc4 etnaviv: Emit alpha_to_coverage dither table only when needed
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Instead of unconditionally emitting the dither table during GPU state
reset, only emit it when alpha_to_coverage is actually enabled in
the blend state. A tracking flag avoids redundant re-emission until the
next GPU state reset.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39557>
2026-01-27 21:14:33 +00:00
Georg Lehmann
2d38da94d4 aco: allow v_cmpx with DPP
The wording in the RDNA3 ISA doc was since clarified, v_cmpx with DPP
behaves exactly like one would expect:
FI controls whether the source value can be read from inactive lanes,
but inactive lanes always write a 0 bit. The same applies to v_cmp with DPP.

Foz-DB Navi48:
Totals from 987 (1.20% of 82405) affected shaders:
Instrs: 517003 -> 516445 (-0.11%); split: -0.11%, +0.00%
CodeSize: 2782688 -> 2780508 (-0.08%); split: -0.08%, +0.00%
Latency: 2059169 -> 2056327 (-0.14%); split: -0.14%, +0.00%
InvThroughput: 365374 -> 365328 (-0.01%); split: -0.03%, +0.01%
Copies: 64669 -> 65616 (+1.46%)
SALU: 70693 -> 70652 (-0.06%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
1c1bd9d090 aco: only apply DPP with 3 or less uses
Creating many new DPP instructions increases code size and decreases throughput.

Foz-DB Navi48:
Totals from 2196 (2.67% of 82179) affected shaders:
MaxWaves: 59930 -> 59960 (+0.05%); split: +0.08%, -0.03%
Instrs: 3718514 -> 3718298 (-0.01%); split: -0.08%, +0.07%
CodeSize: 20593544 -> 20507660 (-0.42%); split: -0.43%, +0.02%
VGPRs: 135924 -> 135744 (-0.13%); split: -0.17%, +0.04%
Latency: 33174704 -> 33163001 (-0.04%); split: -0.07%, +0.04%
InvThroughput: 6500723 -> 6491382 (-0.14%); split: -0.15%, +0.01%
VClause: 72348 -> 72343 (-0.01%); split: -0.06%, +0.05%
SClause: 83160 -> 83165 (+0.01%); split: -0.03%, +0.04%
Copies: 286592 -> 285575 (-0.35%); split: -0.45%, +0.09%
Branches: 99970 -> 99971 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 103280 -> 103279 (-0.00%)
PreVGPRs: 95590 -> 95440 (-0.16%); split: -0.30%, +0.14%
VALU: 1931369 -> 1931725 (+0.02%); split: -0.08%, +0.09%
SALU: 637663 -> 636780 (-0.14%); split: -0.15%, +0.01%
VOPD: 65236 -> 65589 (+0.54%); split: +0.91%, -0.37%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
bb6a3e2891 aco/optimizer: rework how dpp is applied
Using the common helpers means we can use VINTERP instead of DPP,
which has higher throughput and smaller CodeSize.

Foz-DB Navi48:
Totals from 986 (1.20% of 82405) affected shaders:
Instrs: 1985282 -> 1985545 (+0.01%); split: -0.01%, +0.02%
CodeSize: 11179700 -> 11151780 (-0.25%); split: -0.26%, +0.01%
Latency: 19899190 -> 19897694 (-0.01%); split: -0.01%, +0.01%
InvThroughput: 4110650 -> 4104911 (-0.14%)
VClause: 44143 -> 44139 (-0.01%); split: -0.03%, +0.02%
Copies: 164340 -> 164344 (+0.00%); split: -0.02%, +0.02%
VALU: 1061904 -> 1061908 (+0.00%); split: -0.00%, +0.00%
SALU: 305980 -> 305974 (-0.00%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
228cb29dae aco/optimizer: allow DPP with scalar src1 in alu_opt_info_is_valid
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00
Georg Lehmann
d4c0318f48 aco: apply DPP with scalar src1 on gfx11.5+
Foz-DB Navi48:
Totals from 6261 (7.62% of 82179) affected shaders:
MaxWaves: 176284 -> 176236 (-0.03%); split: +0.01%, -0.03%
Instrs: 5850185 -> 5828451 (-0.37%); split: -0.41%, +0.04%
CodeSize: 31363324 -> 31419904 (+0.18%); split: -0.08%, +0.26%
VGPRs: 328284 -> 328200 (-0.03%); split: -0.07%, +0.05%
SpillSGPRs: 2268 -> 2256 (-0.53%)
Latency: 50235516 -> 50218816 (-0.03%); split: -0.06%, +0.03%
InvThroughput: 8256243 -> 8242036 (-0.17%); split: -0.22%, +0.05%
VClause: 81000 -> 80975 (-0.03%); split: -0.11%, +0.08%
SClause: 136376 -> 136387 (+0.01%); split: -0.11%, +0.11%
Copies: 414021 -> 417894 (+0.94%); split: -0.13%, +1.07%
Branches: 105301 -> 105298 (-0.00%); split: -0.00%, +0.00%
PreSGPRs: 291360 -> 291432 (+0.02%)
PreVGPRs: 238593 -> 238729 (+0.06%); split: -0.02%, +0.08%
VALU: 3425446 -> 3403463 (-0.64%); split: -0.65%, +0.01%
SALU: 815505 -> 819372 (+0.47%); split: -0.02%, +0.50%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39516>
2026-01-27 20:42:51 +00:00