Commit graph

218316 commits

Author SHA1 Message Date
Icenowy Zheng
9dfcd141cf util/cpu: add a number of RISC-V extensions
Add a few RISC-V extensions that could be detected by the riscv_hwprobe
interface of Linux v6.5+, and add caps for FD/C extensions.

The real probe code will come in the following commit, only a stub that
still assumes GC is added.

Adding these bits also changed the size of non-cache-related CPU
information from 4 dwords to 5, so the code hashing it for shader cache
in llvmpipe is also updated.

Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154>
2026-02-07 17:34:02 +00:00
Brian Paul
0880c8d564 util,loader: silence asprintf() unused result warnings
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Silence warning about result of asprintf() calls not being used.
Seen with gcc 11.4 on Ubuntu 22.04

Signed-off-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38775>
2026-02-07 10:07:22 +00:00
Brian Paul
35c7cad18b gallivm: fix undefined CALLOC_STRUCT build error
Seen on Ubuntu 22.04 w/ LLVM 14.0.0

Signed-off-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38775>
2026-02-07 10:07:22 +00:00
Yiwei Zhang
bb22e441b9 ci/android: revive some previously skipped tests
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39750>
2026-02-07 06:58:33 +00:00
Yiwei Zhang
091c4f43ff venus: remove obsolete asserts for ANB image creation
Those have long been supported by vn_image_deferred_info_init because of
AHB support. For non-aliased ANB image, those are directly passed from
the platform swapchain create info as well. So we just need to drop the
obsolete asserts to make newer Android platform and ANGLE happy.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39750>
2026-02-07 06:58:33 +00:00
Kenneth Graunke
c5859b2d40 intel: Rename wm_prog_key to fs_prog_key
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This is the shader key for the fragment shader.  Nobody even knows
what the windowizer/masker unit is or does anymore.  Even on Gen4-6,
"fs" is still clearer.  This makes the codebase easier to read.

This is only about 15 years overdue.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
2026-02-06 20:52:01 -08:00
Kenneth Graunke
56e638be81 intel: Rename wm_prog_data to fs_prog_data
This is the program data for the fragment shader.  Nobody even knows
what the windowizer/masker unit is or does anymore.  Even on Gen4-6,
"fs" is still clearer.  This makes the codebase easier to read.

This is only about 15 years overdue.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
2026-02-06 20:51:59 -08:00
Kenneth Graunke
beb4b78fe7 intel: Rename intel_msaa_flags to intel_fs_config
This started out as dynamic configuration for MSAA related state, but
has since expanded to cover many dynamic fragment shader options.

We rename it to intel_fs_config, similar to intel_tess_config, to
better indicate its purpose.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748>
2026-02-06 20:51:43 -08:00
Emma Anholt
9aa93039d9 ci/zink: Skip ext-no-config-context for now, due to taking out the X server.
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This, plus the single-threading of piglit, means that
arb_timer_query-timestamp-get seems to usually pass now, rather than
usually fail.  Still listed as a flake because I haven't stress tested.

Oh, and add in a test that flake-timeouted (3 minutes long) twice in a row
for me.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39747>
2026-02-06 23:22:59 +00:00
Dylan Baker
8e65c6a118 docs: update calendar for 25.3.5
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746>
2026-02-06 13:54:42 -08:00
Dylan Baker
69e4163b9e docs: Add 25.3.5 SHA sums
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746>
2026-02-06 13:54:35 -08:00
Dylan Baker
98a8d8d88c docs: add release notes for 25.3.5
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746>
2026-02-06 13:54:34 -08:00
Nanley Chery
efb5ab1e4b intel/blorp: Fix the redescribed fast-clear qpitch
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Assign a new QPitch when fast-clearing the unaligned top rows on a
redescribed surface. Fixes the following piglit test on gfx12.5:

$ test_folder=generated_tests/spec/EXT_shader_framebuffer_fetch/execution/gles3/
$ ./bin/shader_runner_gles3 $test_folder/single-slice-2darray.shader_test -auto -fbo

Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 3e331e4fe9 ("intel/blorp: Optimize non-zero-layer fast-clears")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39722>
2026-02-06 19:09:12 +00:00
Daniel Schürmann
f71a38e9de nir/opt_load_store_vectorize: don't use shared2 vectorization across blocks
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Besides the undesireable combinations this can produce,
it would also require to update the last_entry in every
previous block.

Totals from 99 (0.12% of 84383) affected shaders: (Navi48)
Instrs: 288989 -> 289727 (+0.26%); split: -0.02%, +0.28%
CodeSize: 1542572 -> 1546616 (+0.26%); split: -0.02%, +0.28%
SpillSGPRs: 17 -> 16 (-5.88%)
Latency: 2104020 -> 2103286 (-0.03%); split: -0.17%, +0.13%
InvThroughput: 472380 -> 472265 (-0.02%); split: -0.08%, +0.05%
VClause: 9778 -> 9779 (+0.01%)
Copies: 24937 -> 25173 (+0.95%); split: -0.05%, +0.99%
Branches: 10124 -> 10156 (+0.32%); split: -0.01%, +0.33%
PreSGPRs: 6112 -> 6091 (-0.34%)
PreVGPRs: 4079 -> 4069 (-0.25%); split: -0.39%, +0.15%
VALU: 120208 -> 120421 (+0.18%); split: -0.03%, +0.21%
SALU: 56338 -> 56312 (-0.05%); split: -0.09%, +0.04%
VOPD: 34 -> 37 (+8.82%)

Fixes: 4ca7ee7bd7 ('nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39733>
2026-02-06 16:34:15 +00:00
Job Noorman
210993a13b ir3/parser: make bison fail on warnings
This should hopefully prevent shift/reduce issues in the future. To help
debugging, also make bison always generate counterexamples.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735>
2026-02-06 16:03:12 +00:00
Job Noorman
ca4e48d743 ir3/isa: fix shift/reduce conflict for mova.r
Trying to parse mova(.u)?.r causes a shift/reduce conflicts with the
rule for regular mova(.u). Work around this by (dis)assembling it as
mova.r(.u)?.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735>
2026-02-06 16:03:12 +00:00
Job Noorman
e9268efab0 ir3/isa: attach (sat) to dst
This is consistent with the blob disassembler.

It also fixes a shift/reduce conflict on (sat) since mova did already
attach it to dst instead of using iflags like other instructions.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735>
2026-02-06 16:03:11 +00:00
Aleksi Sapon
ecf6ac2537 lavapipe: update fails
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
9fcbaf5d89 llvmpipe: fix incorrect image 64bit fetch return value type
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
d1421e3a1c llvmpipe: update traces
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
f4f1ba1fb2 llvmpipe: add stride argument to lp_build_swizzle_aos_n
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
3bd8a5e3f1 llvmpipe: add GALLIVM_PERF=no_lod_ellipse
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
22436cb0f0 llvmpipe: implement per-fragment anisotropic rho
Return correct results when using explicit derivatives.

Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
4baf977faa llvmpipe: elliptical derivative transform for anisotropic filtering
This improves a lot the quality of anisotropic filtering on surfaces
with a high angle of incidence.

This can also be used for non-anisotropic filtering, but the effect isn't
as pronounced, and might not be worth the cost. In fact in my testing,
it didn't seem to be used on Apple hardware.

Based on this excellent article:
https://pema.dev/2025/05/09/mipmaps-too-much-detail/

Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Aleksi Sapon
5699772386 llvmpipe: pass explicit derivatives to sampling codegen
Don't use the lower sampling call in NIR since it doesn't implement
the right algorithm for LOD computation. It's overly simplified.

Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316>
2026-02-06 15:34:33 +00:00
Utku Iseri
cf48da58a7 zink: ignore msrtss support on panvk
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39736>
2026-02-06 14:31:18 +00:00
Utku Iseri
d5ce03fa21 zink: add arm and panvk to invalid<->linear
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39736>
2026-02-06 14:31:18 +00:00
Samuel Pitoiset
c817ef30ee radv/meta: remove dead DCC clear code about E5B9B9R9_UFLOAT_PACK32
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Only GFX10.3+ supports COLOR_ATTACHMENT/STORAGE with this format, so older
gens can't have DCC either.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689>
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
181bb1fc93 radv/meta: remove dead code for VK_FORMAT_R4G4_UNORM_PACK8
This isn't supported at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689>
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
cd54224a73 radv/meta: remove useless check in radv_CmdClearAttachments()
Rendering must be active.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689>
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
ad7151f4bf radv/meta: fix the key for DCC decompress on compute
This could return the graphics DCC pipeline if it was created before,
and crash or potentially hang the GPU.

Found this while working on in-progress VKCTS coverage.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689>
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
18317460bc radv/meta: stop saving/restoring rendering state for FS/HW resolves
This isn't needed because resolves are at the end of the rendering.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688>
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
30db01ed05 radv/meta: make radv_decompress_resolve_src() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688>
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
7ea6b311d9 radv/meta: decompress resolve src outside of depth/stencil resolves
For consistency with color resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688>
2026-02-06 12:29:39 +00:00
Georg Lehmann
0c46053c05 aco/optimzer: apply extract with any uses
Foz-DB Navi48:
Totals from 362 (0.44% of 82405) affected shaders:
MaxWaves: 5052 -> 5066 (+0.28%)
Instrs: 5297858 -> 5294009 (-0.07%); split: -0.09%, +0.01%
CodeSize: 30187188 -> 30177592 (-0.03%); split: -0.05%, +0.02%
VGPRs: 44280 -> 44172 (-0.24%)
Latency: 35632812 -> 35619796 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 7050206 -> 7041058 (-0.13%); split: -0.14%, +0.01%
VClause: 137780 -> 137794 (+0.01%); split: -0.01%, +0.02%
SClause: 114821 -> 114781 (-0.03%)
Copies: 466018 -> 465150 (-0.19%); split: -0.24%, +0.05%
Branches: 171990 -> 171988 (-0.00%)
PreVGPRs: 39268 -> 39084 (-0.47%)
VALU: 2557456 -> 2554297 (-0.12%); split: -0.15%, +0.02%
SALU: 893170 -> 893192 (+0.00%); split: -0.00%, +0.01%
VOPD: 393760 -> 394427 (+0.17%); split: +0.39%, -0.22%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:40 +00:00
Georg Lehmann
85c62f1515 aco/optimizer: only copy propagate p_split_vector if it can be eliminated
Foz-DB Navi48:
Totals from 402 (0.49% of 82405) affected shaders:
Instrs: 3078116 -> 3070117 (-0.26%); split: -0.28%, +0.02%
CodeSize: 17329444 -> 17240360 (-0.51%); split: -0.53%, +0.01%
VGPRs: 48960 -> 48924 (-0.07%); split: -0.12%, +0.05%
SpillVGPRs: 1683 -> 1687 (+0.24%)
Latency: 27758978 -> 27728451 (-0.11%); split: -0.17%, +0.06%
InvThroughput: 5748513 -> 5741761 (-0.12%); split: -0.18%, +0.06%
VClause: 69557 -> 69575 (+0.03%); split: -0.01%, +0.03%
SClause: 74850 -> 74866 (+0.02%)
Copies: 338241 -> 329400 (-2.61%); split: -2.71%, +0.10%
Branches: 118443 -> 118431 (-0.01%)
PreVGPRs: 44561 -> 44598 (+0.08%)
VALU: 1463081 -> 1455438 (-0.52%); split: -0.56%, +0.04%
SALU: 574113 -> 574013 (-0.02%); split: -0.03%, +0.01%
VMEM: 105789 -> 105797 (+0.01%)
VOPD: 140203 -> 139009 (-0.85%); split: +0.44%, -1.29%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
5ecc800edd aco/optimizer: add second copy prop for pseudo instructions
Foz-DB Navi48:
Totals from 28 (0.03% of 82405) affected shaders:
Instrs: 144993 -> 144645 (-0.24%); split: -0.26%, +0.02%
CodeSize: 784668 -> 783604 (-0.14%); split: -0.19%, +0.05%
SpillVGPRs: 215 -> 209 (-2.79%)
Latency: 2529900 -> 2526895 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 775379 -> 773859 (-0.20%); split: -0.20%, +0.00%
VClause: 2815 -> 2803 (-0.43%)
Copies: 23474 -> 23170 (-1.30%); split: -1.38%, +0.09%
Branches: 4638 -> 4632 (-0.13%)
VALU: 81924 -> 81620 (-0.37%); split: -0.40%, +0.03%
SALU: 23986 -> 23995 (+0.04%); split: -0.03%, +0.07%
VMEM: 3726 -> 3714 (-0.32%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
269007faf3 aco/optimizer: apply byte p_split_vector as extract
Foz-DB Navi48:
Totals from 80 (0.10% of 82405) affected shaders:
Instrs: 3022374 -> 3024178 (+0.06%); split: -0.00%, +0.06%
CodeSize: 17396984 -> 17403108 (+0.04%); split: -0.00%, +0.04%
Latency: 17685547 -> 17687073 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 3622683 -> 3622618 (-0.00%); split: -0.02%, +0.02%
VClause: 83840 -> 83841 (+0.00%)
Copies: 242072 -> 242528 (+0.19%); split: -0.01%, +0.20%
Branches: 81582 -> 81578 (-0.00%)
PreVGPRs: 7536 -> 7527 (-0.12%)
VALU: 1520822 -> 1521762 (+0.06%); split: -0.01%, +0.07%
VOPD: 294392 -> 293908 (-0.16%); split: +0.03%, -0.20%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
b21b36b6ab aco/optimizer: apply further extracts to v_cvt_f32_ubyte
Foz-DB Navi48:
Totals from 21 (0.03% of 82405) affected shaders:
Instrs: 2818255 -> 2817482 (-0.03%)
CodeSize: 16282360 -> 16273080 (-0.06%)
Latency: 14172672 -> 14172405 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2728551 -> 2728493 (-0.00%); split: -0.00%, +0.00%
Copies: 213703 -> 212973 (-0.34%)
VALU: 1407351 -> 1406585 (-0.05%)
VOPD: 291185 -> 291221 (+0.01%); split: +0.04%, -0.03%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
08f9bad0b5 aco/isel: avoid extracts for continuous alu src components
Helps fp8 FSR4, hurts parallel_rdp.

Foz-DB Navi48:
Totals from 23 (0.03% of 82405) affected shaders:
MaxWaves: 380 -> 383 (+0.79%)
Instrs: 71228 -> 71487 (+0.36%); split: -0.26%, +0.62%
CodeSize: 411500 -> 415004 (+0.85%); split: -0.21%, +1.06%
VGPRs: 2856 -> 2784 (-2.52%)
Latency: 1654160 -> 1665555 (+0.69%); split: -0.14%, +0.83%
InvThroughput: 354145 -> 361122 (+1.97%); split: -0.10%, +2.07%
VClause: 1557 -> 1541 (-1.03%); split: -1.41%, +0.39%
Copies: 9857 -> 10059 (+2.05%); split: -1.76%, +3.80%
PreVGPRs: 2285 -> 2182 (-4.51%); split: -4.73%, +0.22%
VALU: 38873 -> 39066 (+0.50%); split: -0.47%, +0.96%
VOPD: 1237 -> 1246 (+0.73%); split: +1.13%, -0.40%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
a0c663378c aco/isel: split vector into dwords/words first
Foz-DB Navi48:
Totals from 361 (0.44% of 82405) affected shaders:
MaxWaves: 5806 -> 5832 (+0.45%)
Instrs: 2343746 -> 2343762 (+0.00%); split: -0.04%, +0.04%
CodeSize: 13270504 -> 13267116 (-0.03%); split: -0.10%, +0.08%
VGPRs: 42008 -> 41708 (-0.71%)
SpillVGPRs: 308 -> 303 (-1.62%)
Scratch: 1574656 -> 1574400 (-0.02%)
Latency: 26571385 -> 22602486 (-14.94%); split: -14.95%, +0.01%
InvThroughput: 5474157 -> 4614777 (-15.70%); split: -15.70%, +0.00%
VClause: 57512 -> 57515 (+0.01%); split: -0.03%, +0.03%
SClause: 56313 -> 56319 (+0.01%)
Copies: 251626 -> 248707 (-1.16%); split: -1.24%, +0.08%
Branches: 89620 -> 89614 (-0.01%)
PreVGPRs: 37361 -> 36910 (-1.21%); split: -1.21%, +0.01%
VALU: 1111534 -> 1108507 (-0.27%); split: -0.29%, +0.02%
SALU: 443684 -> 443687 (+0.00%); split: -0.00%, +0.00%
VMEM: 85287 -> 85277 (-0.01%)
VOPD: 97987 -> 98091 (+0.11%); split: +0.30%, -0.20%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1a3e627223 aco: improve emit_extract_vector for vector of vecs
No Foz-DB changes, but nessecary for dword first splits.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
1b491cc51a aco/optimizer: don't remove label_extract for splits
No Foz-DB changes, but will become nessecary with dword first splits.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
66f2a35954 aco/optimizer: repeat vector of split opt
Foz-DB Navi48:
Totals from 13 (0.02% of 82405) affected shaders:
Instrs: 12071 -> 12119 (+0.40%); split: -0.07%, +0.46%
CodeSize: 86908 -> 86960 (+0.06%); split: -0.29%, +0.35%
Latency: 104959 -> 105385 (+0.41%); split: -0.60%, +1.00%
InvThroughput: 46518 -> 46598 (+0.17%); split: -0.03%, +0.20%
VClause: 515 -> 506 (-1.75%); split: -3.11%, +1.36%
SClause: 32 -> 30 (-6.25%)
Copies: 973 -> 1038 (+6.68%); split: -0.82%, +7.50%
PreVGPRs: 1185 -> 1191 (+0.51%)
VALU: 7126 -> 7166 (+0.56%); split: -0.08%, +0.65%
SALU: 1127 -> 1129 (+0.18%)
VOPD: 1516 -> 1539 (+1.52%); split: +1.78%, -0.26%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Georg Lehmann
6951ddc43b aco: clean up emit_extract_vector a bit
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532>
2026-02-06 11:29:39 +00:00
Daniel Schürmann
5e86cfac8e nir/opt_load_store_vectorize: Vectorize speculatable instructions across blocks
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This should always be safe.

Totals from 446 (0.53% of 84383) affected shaders: (Navi48)
Instrs: 995942 -> 994416 (-0.15%); split: -0.17%, +0.02%
CodeSize: 5500372 -> 5489900 (-0.19%); split: -0.20%, +0.01%
SpillSGPRs: 197 -> 195 (-1.02%)
Latency: 14872922 -> 14851646 (-0.14%); split: -0.15%, +0.00%
InvThroughput: 2395050 -> 2391537 (-0.15%); split: -0.15%, +0.00%
VClause: 20207 -> 20195 (-0.06%); split: -0.07%, +0.01%
SClause: 27090 -> 26427 (-2.45%); split: -2.51%, +0.07%
Copies: 84182 -> 84228 (+0.05%); split: -0.08%, +0.13%
Branches: 22927 -> 22928 (+0.00%)
PreSGPRs: 27275 -> 27524 (+0.91%); split: -0.02%, +0.93%
PreVGPRs: 29116 -> 29131 (+0.05%)
VALU: 545565 -> 545549 (-0.00%); split: -0.01%, +0.00%
SALU: 124275 -> 124329 (+0.04%); split: -0.05%, +0.09%
VMEM: 39044 -> 39030 (-0.04%)
SMEM: 44052 -> 43205 (-1.92%)
VOPD: 32354 -> 32337 (-0.05%); split: +0.02%, -0.07%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
52a5b29497 radeonsi: vectorize UBO, SSBO and shared across blocks
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
b859aa7dce radv: vectorize UBO, SSBO and shared across blocks
Totals from 10898 (12.91% of 84383) affected shaders: (Navi48)
MaxWaves: 279340 -> 279332 (-0.00%); split: +0.00%, -0.00%
Instrs: 21764388 -> 21710270 (-0.25%); split: -0.27%, +0.02%
CodeSize: 116069624 -> 115722304 (-0.30%); split: -0.31%, +0.02%
VGPRs: 693736 -> 693796 (+0.01%); split: -0.00%, +0.01%
SpillSGPRs: 7225 -> 7339 (+1.58%); split: -0.55%, +2.13%
Latency: 338393228 -> 325192648 (-3.90%); split: -3.92%, +0.02%
InvThroughput: 51966571 -> 50173171 (-3.45%); split: -3.46%, +0.01%
VClause: 350568 -> 350287 (-0.08%); split: -0.13%, +0.05%
SClause: 632838 -> 614290 (-2.93%); split: -2.96%, +0.03%
Copies: 1479044 -> 1475429 (-0.24%); split: -0.45%, +0.20%
Branches: 514433 -> 512921 (-0.29%); split: -0.31%, +0.01%
PreSGPRs: 618454 -> 624707 (+1.01%); split: -0.07%, +1.08%
PreVGPRs: 564593 -> 564725 (+0.02%); split: -0.00%, +0.02%
VALU: 11557516 -> 11557859 (+0.00%); split: -0.01%, +0.01%
SALU: 3488945 -> 3486685 (-0.06%); split: -0.16%, +0.10%
VMEM: 615523 -> 614751 (-0.13%)
SMEM: 966514 -> 936739 (-3.08%); split: -3.09%, +0.01%
VOPD: 329318 -> 329406 (+0.03%); split: +0.04%, -0.02%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
4ca7ee7bd7 nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks
The idea is to initialize the vectorization table with one
entry from the previous blocks if it's the same for all predecessors.
In order to not speculatively load out-of-bounds, backends need to
set a new bounds_checked_modes option indicating variable modes
for which per-component bounds checks are supported.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00
Daniel Schürmann
0a07ea20e6 nir/opt_load_store_vectorize: create add_entry_to_hash_table() helper
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373>
2026-02-06 10:16:50 +00:00