Vinson Lee
85fd63068e
compiler/clc: Fix const correctness in libclc_add_generic_variants
...
Fix compiler error:
../src/compiler/clc/nir_load_libclc.c:266:13: error: initializing
'char *' with an expression of type 'const char *' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
266 | char *U3AS1 = strstr(func->name, "U3AS1");
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
glibc now provides C23-style type-generic string functions. strstr
returns const char * when passed a const char * argument. Update U3AS1
declaration to const since it's only used for offset calculation.
Fixes: 4a08ee7ecf ("spirv/libclc: Add generic versions of arithmetic functions")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39761 >
2026-02-08 22:48:13 +00:00
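The const-correctness fix in 85fd63068e can be illustrated with a minimal sketch. The helper name `find_as1_offset` and the sample mangled names are hypothetical; only `strstr` and the `"U3AS1"` tag come from the commit message. Because the result is used purely for offset arithmetic, a `const char *` is sufficient and compiles with both the classic and the C23-style type-generic `strstr`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the byte offset of the "U3AS1" address-space
 * tag inside a mangled name, or -1 if the tag is absent.
 * `name` stands in for func->name from the commit message. */
static ptrdiff_t
find_as1_offset(const char *name)
{
   /* Declared const: a type-generic strstr returns const char * for a
    * const char * haystack, and the pointer is never written through. */
   const char *tag = strstr(name, "U3AS1");

   return tag ? tag - name : -1;
}
```

For a name such as `"_Z3fooPU3AS1i"` the helper yields the index of the `U`, and for a name without the tag it yields -1.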
Christian Gmeiner
4938ad435e
pan/compiler: Fix progress reporting in pan_nir_lower_store_component
...
lower_store_component() always returns false even though it modifies
NIR instructions (rewrites sources, creates new SSA defs, removes
previous stores). This triggers the "NIR changed but no progress
reported" assertion in nir_shader_intrinsics_pass.
Return true when a store_output or store_per_view_output intrinsic is
processed, since the function always modifies the shader in that case.
Closes: https://gitlab.freedesktop.org/panfrost/mesa/-/issues/274
Cc: mesa-stable
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39757 >
2026-02-08 22:09:08 +00:00
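The progress-reporting contract behind 4938ad435e can be sketched without the real NIR API. Everything below (`toy_op`, `toy_intrin`, `toy_lower_store`, `toy_run_pass`) is a hypothetical stand-in, not Mesa code; it only shows the rule that a per-instruction callback must return true whenever it modifies the instruction, so the driver loop can report progress accurately.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for NIR intrinsics; these are not the real types. */
enum toy_op { TOY_STORE_OUTPUT, TOY_STORE_PER_VIEW_OUTPUT, TOY_LOAD_INPUT };

struct toy_intrin {
   enum toy_op op;
   unsigned write_mask;
};

/* Per-instruction callback: returns true iff it changed the instruction. */
static bool
toy_lower_store(struct toy_intrin *intr)
{
   if (intr->op != TOY_STORE_OUTPUT && intr->op != TOY_STORE_PER_VIEW_OUTPUT)
      return false;         /* left untouched, so no progress */

   intr->write_mask = 0xf;  /* placeholder for the real rewrite */
   return true;             /* modified, so progress must be reported */
}

/* Driver loop: accumulates progress over all instructions. */
static bool
toy_run_pass(struct toy_intrin *instrs, size_t count)
{
   bool progress = false;

   for (size_t i = 0; i < count; i++)
      progress |= toy_lower_store(&instrs[i]);

   return progress;
}
```

Returning false from the first function after mutating `write_mask` would reproduce the kind of mismatch the assertion in the commit message guards against.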
Emma Anholt
2c3d3d23d8
freedreno/a5xx: Convert a bunch of LO/HI regs to 64-bit regs.
...
Much prettier cffdec output.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39764 >
2026-02-08 20:54:35 +00:00
Icenowy Zheng
51d4803a6f
util/cpu: support detecting RISC-V FD/C/V/Zb[abs] with riscv_hwprobe
...
Linux v6.4+ provides a syscall called riscv_hwprobe that can detect
multiple characteristics of the running CPU on RISC-V platforms.
Implement a real check_os_riscv_support() with it and support the
extensions it can detect on Linux v6.5.
When the toolchain has no riscv_hwprobe definition or the kernel at
runtime does not support it, the fallback code still assumes GC.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154 >
2026-02-07 17:34:03 +00:00
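A rough sketch of the probing approach described in 51d4803a6f follows. It assumes the Linux UAPI header `<asm/hwprobe.h>` and the `RISCV_HWPROBE_*` constants it defines; the `CAP_*` bits, `CAP_GC_FALLBACK`, and `detect_riscv_caps` are hypothetical names for this illustration and are not Mesa's. Treating "GC" as FD plus C for the fallback is also a simplification made here. On non-RISC-V hosts, or when the header or syscall number is missing, only the fallback path is compiled.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__riscv) && defined(__has_include)
#  if __has_include(<asm/hwprobe.h>)
#    include <asm/hwprobe.h>
#    include <sys/syscall.h>
#    include <unistd.h>
#    define SKETCH_HAVE_HWPROBE 1
#  endif
#endif

/* Hypothetical capability bits for this sketch (not Mesa's names). */
enum {
   CAP_FD  = 1u << 0,
   CAP_C   = 1u << 1,
   CAP_V   = 1u << 2,
   CAP_ZBA = 1u << 3,
   CAP_ZBB = 1u << 4,
   CAP_ZBS = 1u << 5,
};

/* Assumed here: "GC" implies at least the FD and C extensions. */
#define CAP_GC_FALLBACK (CAP_FD | CAP_C)

static uint32_t
detect_riscv_caps(void)
{
#if defined(SKETCH_HAVE_HWPROBE) && defined(__NR_riscv_hwprobe)
   struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0 };

   /* cpusetsize 0 with a NULL cpu set asks about all online CPUs. */
   if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) == 0 &&
       pair.key == RISCV_HWPROBE_KEY_IMA_EXT_0) {
      uint32_t caps = 0;

      if (pair.value & RISCV_HWPROBE_IMA_FD) caps |= CAP_FD;
      if (pair.value & RISCV_HWPROBE_IMA_C)  caps |= CAP_C;
      /* Newer bits may be absent from older kernel headers. */
#  ifdef RISCV_HWPROBE_IMA_V
      if (pair.value & RISCV_HWPROBE_IMA_V)   caps |= CAP_V;
#  endif
#  ifdef RISCV_HWPROBE_EXT_ZBA
      if (pair.value & RISCV_HWPROBE_EXT_ZBA) caps |= CAP_ZBA;
#  endif
#  ifdef RISCV_HWPROBE_EXT_ZBB
      if (pair.value & RISCV_HWPROBE_EXT_ZBB) caps |= CAP_ZBB;
#  endif
#  ifdef RISCV_HWPROBE_EXT_ZBS
      if (pair.value & RISCV_HWPROBE_EXT_ZBS) caps |= CAP_ZBS;
#  endif
      return caps;
   }
#endif
   /* No toolchain definition or no runtime support: assume GC. */
   return CAP_GC_FALLBACK;
}
```

The per-bit `#ifdef` guards mirror the situation the commit message mentions, where the toolchain headers may predate some of the extension bits.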
Icenowy Zheng
9dfcd141cf
util/cpu: add a number of RISC-V extensions
...
Add a few RISC-V extensions that could be detected by the riscv_hwprobe
interface of Linux v6.5+, and add caps for FD/C extensions.
The real probe code will come in the following commit; for now only a
stub that still assumes GC is added.
Adding these bits also changed the size of non-cache-related CPU
information from 4 dwords to 5, so the code hashing it for shader cache
in llvmpipe is also updated.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39154 >
2026-02-07 17:34:02 +00:00
Brian Paul
0880c8d564
util,loader: silence asprintf() unused result warnings
...
Silence warning about result of asprintf() calls not being used.
Seen with gcc 11.4 on Ubuntu 22.04
Signed-off-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38775 >
2026-02-07 10:07:22 +00:00
Brian Paul
35c7cad18b
gallivm: fix undefined CALLOC_STRUCT build error
...
Seen on Ubuntu 22.04 w/ LLVM 14.0.0
Signed-off-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38775 >
2026-02-07 10:07:22 +00:00
Yiwei Zhang
bb22e441b9
ci/android: revive some previously skipped tests
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39750 >
2026-02-07 06:58:33 +00:00
Yiwei Zhang
091c4f43ff
venus: remove obsolete asserts for ANB image creation
...
Those have long been supported by vn_image_deferred_info_init because of
AHB support. For non-aliased ANB images, those are directly passed from
the platform swapchain create info as well. So we just need to drop the
obsolete asserts to make newer Android platforms and ANGLE happy.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39750 >
2026-02-07 06:58:33 +00:00
Kenneth Graunke
c5859b2d40
intel: Rename wm_prog_key to fs_prog_key
...
This is the shader key for the fragment shader. Nobody even knows
what the windowizer/masker unit is or does anymore. Even on Gen4-6,
"fs" is still clearer. This makes the codebase easier to read.
This is only about 15 years overdue.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748 >
2026-02-06 20:52:01 -08:00
Kenneth Graunke
56e638be81
intel: Rename wm_prog_data to fs_prog_data
...
This is the program data for the fragment shader. Nobody even knows
what the windowizer/masker unit is or does anymore. Even on Gen4-6,
"fs" is still clearer. This makes the codebase easier to read.
This is only about 15 years overdue.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748 >
2026-02-06 20:51:59 -08:00
Kenneth Graunke
beb4b78fe7
intel: Rename intel_msaa_flags to intel_fs_config
...
This started out as dynamic configuration for MSAA related state, but
has since expanded to cover many dynamic fragment shader options.
We rename it to intel_fs_config, similar to intel_tess_config, to
better indicate its purpose.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39748 >
2026-02-06 20:51:43 -08:00
Emma Anholt
9aa93039d9
ci/zink: Skip ext-no-config-context for now, due to taking out the X server.
...
This, plus the single-threading of piglit, means that
arb_timer_query-timestamp-get seems to usually pass now, rather than
usually fail. Still listed as a flake because I haven't stress tested.
Oh, and add in a test that flake-timeouted (3 minutes long) twice in a row
for me.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39747 >
2026-02-06 23:22:59 +00:00
Dylan Baker
8e65c6a118
docs: update calendar for 25.3.5
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746 >
2026-02-06 13:54:42 -08:00
Dylan Baker
69e4163b9e
docs: Add 25.3.5 SHA sums
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746 >
2026-02-06 13:54:35 -08:00
Dylan Baker
98a8d8d88c
docs: add release notes for 25.3.5
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39746 >
2026-02-06 13:54:34 -08:00
Nanley Chery
efb5ab1e4b
intel/blorp: Fix the redescribed fast-clear qpitch
...
Assign a new QPitch when fast-clearing the unaligned top rows on a
redescribed surface. Fixes the following piglit test on gfx12.5:
$ test_folder=generated_tests/spec/EXT_shader_framebuffer_fetch/execution/gles3/
$ ./bin/shader_runner_gles3 $test_folder/single-slice-2darray.shader_test -auto -fbo
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 3e331e4fe9 ("intel/blorp: Optimize non-zero-layer fast-clears")
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39722 >
2026-02-06 19:09:12 +00:00
Daniel Schürmann
f71a38e9de
nir/opt_load_store_vectorize: don't use shared2 vectorization across blocks
...
Besides the undesirable combinations this can produce,
it would also require updating the last_entry in every
previous block.
Totals from 99 (0.12% of 84383) affected shaders: (Navi48)
Instrs: 288989 -> 289727 (+0.26%); split: -0.02%, +0.28%
CodeSize: 1542572 -> 1546616 (+0.26%); split: -0.02%, +0.28%
SpillSGPRs: 17 -> 16 (-5.88%)
Latency: 2104020 -> 2103286 (-0.03%); split: -0.17%, +0.13%
InvThroughput: 472380 -> 472265 (-0.02%); split: -0.08%, +0.05%
VClause: 9778 -> 9779 (+0.01%)
Copies: 24937 -> 25173 (+0.95%); split: -0.05%, +0.99%
Branches: 10124 -> 10156 (+0.32%); split: -0.01%, +0.33%
PreSGPRs: 6112 -> 6091 (-0.34%)
PreVGPRs: 4079 -> 4069 (-0.25%); split: -0.39%, +0.15%
VALU: 120208 -> 120421 (+0.18%); split: -0.03%, +0.21%
SALU: 56338 -> 56312 (-0.05%); split: -0.09%, +0.04%
VOPD: 34 -> 37 (+8.82%)
Fixes: 4ca7ee7bd7 ('nir/opt_load_store_vectorize: Allow to vectorize at most one entry of each type across blocks')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39733 >
2026-02-06 16:34:15 +00:00
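The "before -> after (percent)" figures in the statistics above follow ordinary relative-change arithmetic. The helper below is a hypothetical illustration, not part of any Mesa tooling, and it only reproduces the headline percentage; it makes no attempt to reproduce the per-shader "split" breakdown.

```c
/* Hypothetical helper: relative change from `before` to `after`, in percent.
 * A positive result means the metric grew. */
static double
percent_change(long before, long after)
{
   return 100.0 * (double)(after - before) / (double)before;
}
```

Applied to the Instrs line of f71a38e9de, `percent_change(288989, 289727)` is roughly 0.255, which rounds to the reported +0.26%. For SpillSGPRs, `percent_change(17, 16)` is roughly -5.88.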
Job Noorman
210993a13b
ir3/parser: make bison fail on warnings
...
This should hopefully prevent shift/reduce issues in the future. To help
debugging, also make bison always generate counterexamples.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735 >
2026-02-06 16:03:12 +00:00
Job Noorman
ca4e48d743
ir3/isa: fix shift/reduce conflict for mova.r
...
Trying to parse mova(.u)?.r causes a shift/reduce conflict with the
rule for regular mova(.u). Work around this by (dis)assembling it as
mova.r(.u)?.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735 >
2026-02-06 16:03:12 +00:00
Job Noorman
e9268efab0
ir3/isa: attach (sat) to dst
...
This is consistent with the blob disassembler.
It also fixes a shift/reduce conflict on (sat), since mova already
attached it to dst instead of using iflags like other instructions.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39735 >
2026-02-06 16:03:11 +00:00
Aleksi Sapon
ecf6ac2537
lavapipe: update fails
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
9fcbaf5d89
llvmpipe: fix incorrect image 64bit fetch return value type
...
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
d1421e3a1c
llvmpipe: update traces
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
f4f1ba1fb2
llvmpipe: add stride argument to lp_build_swizzle_aos_n
...
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
3bd8a5e3f1
llvmpipe: add GALLIVM_PERF=no_lod_ellipse
...
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
22436cb0f0
llvmpipe: implement per-fragment anisotropic rho
...
Return correct results when using explicit derivatives.
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
4baf977faa
llvmpipe: elliptical derivative transform for anisotropic filtering
...
This greatly improves the quality of anisotropic filtering on surfaces
with a high angle of incidence.
This can also be used for non-anisotropic filtering, but the effect isn't
as pronounced, and might not be worth the cost. In fact in my testing,
it didn't seem to be used on Apple hardware.
Based on this excellent article:
https://pema.dev/2025/05/09/mipmaps-too-much-detail/
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Aleksi Sapon
5699772386
llvmpipe: pass explicit derivatives to sampling codegen
...
Don't use the lower sampling call in NIR since it doesn't implement
the right algorithm for LOD computation. It's overly simplified.
Acked-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38316 >
2026-02-06 15:34:33 +00:00
Utku Iseri
cf48da58a7
zink: ignore msrtss support on panvk
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39736 >
2026-02-06 14:31:18 +00:00
Utku Iseri
d5ce03fa21
zink: add arm and panvk to invalid<->linear
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39736 >
2026-02-06 14:31:18 +00:00
Samuel Pitoiset
c817ef30ee
radv/meta: remove dead DCC clear code about E5B9B9R9_UFLOAT_PACK32
...
Only GFX10.3+ supports COLOR_ATTACHMENT/STORAGE with this format, so older
gens can't have DCC either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
181bb1fc93
radv/meta: remove dead code for VK_FORMAT_R4G4_UNORM_PACK8
...
This isn't supported at all.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
cd54224a73
radv/meta: remove useless check in radv_CmdClearAttachments()
...
Rendering must be active.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
Samuel Pitoiset
ad7151f4bf
radv/meta: fix the key for DCC decompress on compute
...
This could return the graphics DCC pipeline if it was created before,
and crash or potentially hang the GPU.
Found this while working on in-progress VKCTS coverage.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39689 >
2026-02-06 12:49:36 +00:00
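The failure mode described in ad7151f4bf, where a lookup returns a pipeline of the wrong kind because two operations share a key, can be sketched with a toy cache. All names below (`toy_key_type`, `toy_key`, `toy_cache`, `toy_lookup`, `toy_insert`) are hypothetical and do not reflect radv's actual meta key layout; the sketch only shows why the key must distinguish the graphics and compute variants.

```c
#include <stddef.h>

/* Hypothetical key kinds: distinct values for graphics and compute. */
enum toy_key_type { TOY_KEY_DCC_DECOMPRESS_GFX, TOY_KEY_DCC_DECOMPRESS_CS };

struct toy_key { enum toy_key_type type; };

struct toy_entry { struct toy_key key; const char *pipeline; };

struct toy_cache { struct toy_entry entries[8]; size_t count; };

/* Return the cached pipeline for `key`, or NULL on a miss. */
static const char *
toy_lookup(const struct toy_cache *cache, struct toy_key key)
{
   for (size_t i = 0; i < cache->count; i++) {
      if (cache->entries[i].key.type == key.type)
         return cache->entries[i].pipeline;
   }
   return NULL;
}

/* Add a pipeline under `key`; returns 0 on success, -1 if the cache is full. */
static int
toy_insert(struct toy_cache *cache, struct toy_key key, const char *pipeline)
{
   if (cache->count == sizeof(cache->entries) / sizeof(cache->entries[0]))
      return -1;

   cache->entries[cache->count].key = key;
   cache->entries[cache->count].pipeline = pipeline;
   cache->count++;
   return 0;
}
```

If the compute path were to build its key with `TOY_KEY_DCC_DECOMPRESS_GFX`, a lookup after the graphics pipeline had been created would hit the graphics entry instead of missing, which is the kind of mix-up the commit message describes.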
Samuel Pitoiset
18317460bc
radv/meta: stop saving/restoring rendering state for FS/HW resolves
...
This isn't needed because resolves are at the end of the rendering.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
30db01ed05
radv/meta: make radv_decompress_resolve_src() static
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:40 +00:00
Samuel Pitoiset
7ea6b311d9
radv/meta: decompress resolve src outside of depth/stencil resolves
...
For consistency with color resolves.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39688 >
2026-02-06 12:29:39 +00:00
Georg Lehmann
0c46053c05
aco/optimzer: apply extract with any uses
...
Foz-DB Navi48:
Totals from 362 (0.44% of 82405) affected shaders:
MaxWaves: 5052 -> 5066 (+0.28%)
Instrs: 5297858 -> 5294009 (-0.07%); split: -0.09%, +0.01%
CodeSize: 30187188 -> 30177592 (-0.03%); split: -0.05%, +0.02%
VGPRs: 44280 -> 44172 (-0.24%)
Latency: 35632812 -> 35619796 (-0.04%); split: -0.05%, +0.01%
InvThroughput: 7050206 -> 7041058 (-0.13%); split: -0.14%, +0.01%
VClause: 137780 -> 137794 (+0.01%); split: -0.01%, +0.02%
SClause: 114821 -> 114781 (-0.03%)
Copies: 466018 -> 465150 (-0.19%); split: -0.24%, +0.05%
Branches: 171990 -> 171988 (-0.00%)
PreVGPRs: 39268 -> 39084 (-0.47%)
VALU: 2557456 -> 2554297 (-0.12%); split: -0.15%, +0.02%
SALU: 893170 -> 893192 (+0.00%); split: -0.00%, +0.01%
VOPD: 393760 -> 394427 (+0.17%); split: +0.39%, -0.22%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:40 +00:00
Georg Lehmann
85c62f1515
aco/optimizer: only copy propagate p_split_vector if it can be eliminated
...
Foz-DB Navi48:
Totals from 402 (0.49% of 82405) affected shaders:
Instrs: 3078116 -> 3070117 (-0.26%); split: -0.28%, +0.02%
CodeSize: 17329444 -> 17240360 (-0.51%); split: -0.53%, +0.01%
VGPRs: 48960 -> 48924 (-0.07%); split: -0.12%, +0.05%
SpillVGPRs: 1683 -> 1687 (+0.24%)
Latency: 27758978 -> 27728451 (-0.11%); split: -0.17%, +0.06%
InvThroughput: 5748513 -> 5741761 (-0.12%); split: -0.18%, +0.06%
VClause: 69557 -> 69575 (+0.03%); split: -0.01%, +0.03%
SClause: 74850 -> 74866 (+0.02%)
Copies: 338241 -> 329400 (-2.61%); split: -2.71%, +0.10%
Branches: 118443 -> 118431 (-0.01%)
PreVGPRs: 44561 -> 44598 (+0.08%)
VALU: 1463081 -> 1455438 (-0.52%); split: -0.56%, +0.04%
SALU: 574113 -> 574013 (-0.02%); split: -0.03%, +0.01%
VMEM: 105789 -> 105797 (+0.01%)
VOPD: 140203 -> 139009 (-0.85%); split: +0.44%, -1.29%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
5ecc800edd
aco/optimizer: add second copy prop for pseudo instructions
...
Foz-DB Navi48:
Totals from 28 (0.03% of 82405) affected shaders:
Instrs: 144993 -> 144645 (-0.24%); split: -0.26%, +0.02%
CodeSize: 784668 -> 783604 (-0.14%); split: -0.19%, +0.05%
SpillVGPRs: 215 -> 209 (-2.79%)
Latency: 2529900 -> 2526895 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 775379 -> 773859 (-0.20%); split: -0.20%, +0.00%
VClause: 2815 -> 2803 (-0.43%)
Copies: 23474 -> 23170 (-1.30%); split: -1.38%, +0.09%
Branches: 4638 -> 4632 (-0.13%)
VALU: 81924 -> 81620 (-0.37%); split: -0.40%, +0.03%
SALU: 23986 -> 23995 (+0.04%); split: -0.03%, +0.07%
VMEM: 3726 -> 3714 (-0.32%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
269007faf3
aco/optimizer: apply byte p_split_vector as extract
...
Foz-DB Navi48:
Totals from 80 (0.10% of 82405) affected shaders:
Instrs: 3022374 -> 3024178 (+0.06%); split: -0.00%, +0.06%
CodeSize: 17396984 -> 17403108 (+0.04%); split: -0.00%, +0.04%
Latency: 17685547 -> 17687073 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 3622683 -> 3622618 (-0.00%); split: -0.02%, +0.02%
VClause: 83840 -> 83841 (+0.00%)
Copies: 242072 -> 242528 (+0.19%); split: -0.01%, +0.20%
Branches: 81582 -> 81578 (-0.00%)
PreVGPRs: 7536 -> 7527 (-0.12%)
VALU: 1520822 -> 1521762 (+0.06%); split: -0.01%, +0.07%
VOPD: 294392 -> 293908 (-0.16%); split: +0.03%, -0.20%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
b21b36b6ab
aco/optimizer: apply further extracts to v_cvt_f32_ubyte
...
Foz-DB Navi48:
Totals from 21 (0.03% of 82405) affected shaders:
Instrs: 2818255 -> 2817482 (-0.03%)
CodeSize: 16282360 -> 16273080 (-0.06%)
Latency: 14172672 -> 14172405 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 2728551 -> 2728493 (-0.00%); split: -0.00%, +0.00%
Copies: 213703 -> 212973 (-0.34%)
VALU: 1407351 -> 1406585 (-0.05%)
VOPD: 291185 -> 291221 (+0.01%); split: +0.04%, -0.03%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
08f9bad0b5
aco/isel: avoid extracts for continuous alu src components
...
Helps fp8 FSR4, hurts parallel_rdp.
Foz-DB Navi48:
Totals from 23 (0.03% of 82405) affected shaders:
MaxWaves: 380 -> 383 (+0.79%)
Instrs: 71228 -> 71487 (+0.36%); split: -0.26%, +0.62%
CodeSize: 411500 -> 415004 (+0.85%); split: -0.21%, +1.06%
VGPRs: 2856 -> 2784 (-2.52%)
Latency: 1654160 -> 1665555 (+0.69%); split: -0.14%, +0.83%
InvThroughput: 354145 -> 361122 (+1.97%); split: -0.10%, +2.07%
VClause: 1557 -> 1541 (-1.03%); split: -1.41%, +0.39%
Copies: 9857 -> 10059 (+2.05%); split: -1.76%, +3.80%
PreVGPRs: 2285 -> 2182 (-4.51%); split: -4.73%, +0.22%
VALU: 38873 -> 39066 (+0.50%); split: -0.47%, +0.96%
VOPD: 1237 -> 1246 (+0.73%); split: +1.13%, -0.40%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
a0c663378c
aco/isel: split vector into dwords/words first
...
Foz-DB Navi48:
Totals from 361 (0.44% of 82405) affected shaders:
MaxWaves: 5806 -> 5832 (+0.45%)
Instrs: 2343746 -> 2343762 (+0.00%); split: -0.04%, +0.04%
CodeSize: 13270504 -> 13267116 (-0.03%); split: -0.10%, +0.08%
VGPRs: 42008 -> 41708 (-0.71%)
SpillVGPRs: 308 -> 303 (-1.62%)
Scratch: 1574656 -> 1574400 (-0.02%)
Latency: 26571385 -> 22602486 (-14.94%); split: -14.95%, +0.01%
InvThroughput: 5474157 -> 4614777 (-15.70%); split: -15.70%, +0.00%
VClause: 57512 -> 57515 (+0.01%); split: -0.03%, +0.03%
SClause: 56313 -> 56319 (+0.01%)
Copies: 251626 -> 248707 (-1.16%); split: -1.24%, +0.08%
Branches: 89620 -> 89614 (-0.01%)
PreVGPRs: 37361 -> 36910 (-1.21%); split: -1.21%, +0.01%
VALU: 1111534 -> 1108507 (-0.27%); split: -0.29%, +0.02%
SALU: 443684 -> 443687 (+0.00%); split: -0.00%, +0.00%
VMEM: 85287 -> 85277 (-0.01%)
VOPD: 97987 -> 98091 (+0.11%); split: +0.30%, -0.20%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
1a3e627223
aco: improve emit_extract_vector for vector of vecs
...
No Foz-DB changes, but necessary for dword first splits.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
1b491cc51a
aco/optimizer: don't remove label_extract for splits
...
No Foz-DB changes, but will become necessary with dword first splits.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
66f2a35954
aco/optimizer: repeat vector of split opt
...
Foz-DB Navi48:
Totals from 13 (0.02% of 82405) affected shaders:
Instrs: 12071 -> 12119 (+0.40%); split: -0.07%, +0.46%
CodeSize: 86908 -> 86960 (+0.06%); split: -0.29%, +0.35%
Latency: 104959 -> 105385 (+0.41%); split: -0.60%, +1.00%
InvThroughput: 46518 -> 46598 (+0.17%); split: -0.03%, +0.20%
VClause: 515 -> 506 (-1.75%); split: -3.11%, +1.36%
SClause: 32 -> 30 (-6.25%)
Copies: 973 -> 1038 (+6.68%); split: -0.82%, +7.50%
PreVGPRs: 1185 -> 1191 (+0.51%)
VALU: 7126 -> 7166 (+0.56%); split: -0.08%, +0.65%
SALU: 1127 -> 1129 (+0.18%)
VOPD: 1516 -> 1539 (+1.52%); split: +1.78%, -0.26%
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Georg Lehmann
6951ddc43b
aco: clean up emit_extract_vector a bit
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39532 >
2026-02-06 11:29:39 +00:00
Daniel Schürmann
5e86cfac8e
nir/opt_load_store_vectorize: Vectorize speculatable instructions across blocks
...
This should always be safe.
Totals from 446 (0.53% of 84383) affected shaders: (Navi48)
Instrs: 995942 -> 994416 (-0.15%); split: -0.17%, +0.02%
CodeSize: 5500372 -> 5489900 (-0.19%); split: -0.20%, +0.01%
SpillSGPRs: 197 -> 195 (-1.02%)
Latency: 14872922 -> 14851646 (-0.14%); split: -0.15%, +0.00%
InvThroughput: 2395050 -> 2391537 (-0.15%); split: -0.15%, +0.00%
VClause: 20207 -> 20195 (-0.06%); split: -0.07%, +0.01%
SClause: 27090 -> 26427 (-2.45%); split: -2.51%, +0.07%
Copies: 84182 -> 84228 (+0.05%); split: -0.08%, +0.13%
Branches: 22927 -> 22928 (+0.00%)
PreSGPRs: 27275 -> 27524 (+0.91%); split: -0.02%, +0.93%
PreVGPRs: 29116 -> 29131 (+0.05%)
VALU: 545565 -> 545549 (-0.00%); split: -0.01%, +0.00%
SALU: 124275 -> 124329 (+0.04%); split: -0.05%, +0.09%
VMEM: 39044 -> 39030 (-0.04%)
SMEM: 44052 -> 43205 (-1.92%)
VOPD: 32354 -> 32337 (-0.05%); split: +0.02%, -0.07%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39373 >
2026-02-06 10:16:50 +00:00