12095 commits

Author SHA1 Message Date
Faith Ekstrand
84bbfaa7e5 pan/bi: Delete the old texel buffer intrinsics
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
7d5cb2884c pan/bi: Allow setting the table on lea_attr_pan
Also allow us to set AUTO32 while we're at it.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
2369808cd1 pan,nir: Add Bifrost texturing intrinsics
These are funky enough that they make more sense as intrinsics than
texture opcodes.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
0d549f5bde nir: Add a new nir_op_f2u32_rtne
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
58cba7887a nir: Add a new nir_texop_gradient_pan
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
e0fffabda7 nir/builder: Allow backend1/2 in nir_build_tex()
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:16 +00:00
Faith Ekstrand
337aaa0ab9 pan,nir: Add cube face intrinsics
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41036>
2026-05-05 01:27:15 +00:00
Rhys Perry
081feabf9c nir/search: fix nir_algebraic_automaton after constant folding op(bcsel)
Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/jobs/98917704

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: f4812dc11d ("nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41343>
2026-05-04 17:27:38 +00:00
Daniel Schürmann
f4812dc11d nir/opt_constant_folding: constant-fold op(bcsel(), #c) -> bcsel(.., #c1, #c2)
for all ALU instructions except fneg instead of using nir_opt_algebraic
for a small subset.

Totals from 17711 (8.49% of 208640) affected shaders: (Navi48)
MaxWaves: 364391 -> 364397 (+0.00%); split: +0.01%, -0.01%
Instrs: 33873994 -> 33780398 (-0.28%); split: -0.31%, +0.03%
CodeSize: 198627596 -> 198259724 (-0.19%); split: -0.23%, +0.05%
VGPRs: 1435516 -> 1435144 (-0.03%); split: -0.04%, +0.02%
SpillSGPRs: 652827 -> 654577 (+0.27%); split: -0.00%, +0.27%
SpillVGPRs: 594840 -> 593598 (-0.21%); split: -0.28%, +0.07%
Scratch: 31791360 -> 31543552 (-0.78%)
Latency: 417824569 -> 415881858 (-0.46%); split: -0.48%, +0.02%
InvThroughput: 80376232 -> 80307996 (-0.08%); split: -0.10%, +0.01%
VClause: 557238 -> 554770 (-0.44%); split: -0.50%, +0.06%
SClause: 688297 -> 688125 (-0.02%); split: -0.04%, +0.02%
Copies: 3571756 -> 3566704 (-0.14%); split: -0.44%, +0.29%
Branches: 628710 -> 628576 (-0.02%); split: -0.07%, +0.05%
PreSGPRs: 1100316 -> 1103478 (+0.29%); split: -0.02%, +0.30%
PreVGPRs: 1132139 -> 1128765 (-0.30%); split: -0.30%, +0.00%
VALU: 18944830 -> 18912030 (-0.17%); split: -0.20%, +0.03%
SALU: 4363054 -> 4342748 (-0.47%); split: -0.57%, +0.10%
VMEM: 1894420 -> 1891754 (-0.14%); split: -0.19%, +0.05%
SMEM: 1073860 -> 1073741 (-0.01%); split: -0.01%, +0.00%
VOPD: 1734659 -> 1735718 (+0.06%); split: +0.20%, -0.14%

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
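The rewrite in f4812dc11d rests on a simple identity: when both select arms and the other operand are constants, op(bcsel(cond, a, b), k) equals bcsel(cond, op(a, k), op(b, k)), so both arms can be folded ahead of time. The following is a minimal Python sketch of that identity, not Mesa code; `bcsel` and `fold_op_of_bcsel` are hypothetical helper names chosen for illustration.

```python
import operator


def bcsel(cond, a, b):
    """Model of a select: return a when cond is true, otherwise b."""
    return a if cond else b


def fold_op_of_bcsel(op, a, b, k):
    """Fold op into both constant arms of a select.

    Returns (c1, c2) such that
    op(bcsel(cond, a, b), k) == bcsel(cond, c1, c2) for any cond.
    """
    return op(a, k), op(b, k)


# Check the identity for both values of the condition.
c1, c2 = fold_op_of_bcsel(operator.add, 3, 7, 10)
for cond in (True, False):
    assert operator.add(bcsel(cond, 3, 7), 10) == bcsel(cond, c1, c2)
```

After folding, only the select remains at run time; the ALU operation has been evaluated on constants.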
Daniel Schürmann
8b1c60add4 nir/opt_constant_folding: create const_value_for_alu() helper
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
Georg Lehmann
52b195b4e8 nir/opt_algebraic: add more fmulz pattern
Totals from 3 (0.00% of 202440) affected shaders: (Navi48)
Instrs: 5684 -> 5641 (-0.76%); split: -0.77%, +0.02%
CodeSize: 30952 -> 30708 (-0.79%); split: -0.80%, +0.01%
Latency: 9236 -> 9199 (-0.40%); split: -0.42%, +0.02%
InvThroughput: 2287 -> 2273 (-0.61%)
VALU: 3900 -> 3884 (-0.41%)
SALU: 305 -> 289 (-5.25%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40848>
2026-05-04 09:42:59 +00:00
Georg Lehmann
38e691fc0a nir/opt_varyings: do no_signed_zero linking even for non removable stores
E.g. position in VS.

Foz-DB Navi48:
Totals from 948 (0.79% of 120695) affected shaders:
MaxWaves: 26816 -> 26828 (+0.04%)
Instrs: 799692 -> 796993 (-0.34%); split: -0.34%, +0.01%
CodeSize: 3855744 -> 3846816 (-0.23%); split: -0.24%, +0.01%
VGPRs: 50256 -> 50220 (-0.07%)
Latency: 2209359 -> 2207667 (-0.08%); split: -0.09%, +0.01%
InvThroughput: 305260 -> 303519 (-0.57%); split: -0.57%, +0.00%
VClause: 11640 -> 11643 (+0.03%); split: -0.01%, +0.03%
SClause: 21152 -> 21149 (-0.01%)
Copies: 51658 -> 51675 (+0.03%); split: -0.11%, +0.14%
Branches: 18656 -> 18655 (-0.01%)
PreVGPRs: 37999 -> 37984 (-0.04%)
VALU: 469752 -> 467406 (-0.50%); split: -0.50%, +0.00%
SALU: 105433 -> 105323 (-0.10%); split: -0.11%, +0.00%

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Georg Lehmann
fac4edbcba nir/opt_varyings: back propagate signed zero information to outputs
Foz-DB Navi48:
Totals from 809 (0.67% of 120695) affected shaders:
MaxWaves: 21804 -> 21808 (+0.02%)
Instrs: 863131 -> 861310 (-0.21%); split: -0.22%, +0.01%
CodeSize: 4535500 -> 4523232 (-0.27%); split: -0.30%, +0.03%
VGPRs: 47304 -> 47280 (-0.05%)
SpillSGPRs: 170 -> 82 (-51.76%)
Latency: 6791484 -> 6786880 (-0.07%); split: -0.07%, +0.00%
InvThroughput: 906281 -> 905301 (-0.11%); split: -0.11%, +0.00%
VClause: 16910 -> 16917 (+0.04%); split: -0.01%, +0.05%
SClause: 21856 -> 21827 (-0.13%); split: -0.14%, +0.01%
Copies: 61890 -> 61436 (-0.73%); split: -0.80%, +0.06%
Branches: 19725 -> 19640 (-0.43%)
PreSGPRs: 38011 -> 37851 (-0.42%)
PreVGPRs: 36482 -> 36454 (-0.08%)
VALU: 465316 -> 464323 (-0.21%); split: -0.22%, +0.00%
SALU: 143757 -> 143395 (-0.25%); split: -0.33%, +0.08%
VMEM: 36827 -> 36806 (-0.06%)
SMEM: 37769 -> 37768 (-0.00%)

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Georg Lehmann
b2bc57551a nir/instr_set: allow cse with fp_math_ctrl mismatches for intrinsics
Just like for ALU.

No Foz-DB changes.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41292>
2026-05-03 19:55:10 +00:00
Marek Olšák
f583f6e717 nir: use nir_build_frag_coord everywhere
nir_build_frag_coord generates the correct sysval loads based on NIR
options. nir_load_frag_coord shouldn't be used directly because drivers
don't have to support it.

v2: RADV can't use it because nir->options isn't set, so use load_pixel_coord.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:01 +00:00
Marek Olšák
b63a9a8b39 nir: add direct lowered frag_coord building to replace lowering passes
Instead of lowering frag_coord 4 times during compilation,
just use this.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
9c5ad16819 nir/opt_frag_coord_to_pixel_coord: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
076b0aaf1d nir/lower_wpos_ytransform: handle frag_coord_xy
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Marek Olšák
e49f29f25e nir: add frag_coord_xy
to strengthen and simplify pixel_coord lowering

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41227>
2026-05-03 13:03:00 +00:00
Daniel Schürmann
012d72f2b0 nir/opt_algebraic: add some imul24_relaxed pattern
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>
2026-05-01 10:07:26 +00:00
Daniel Schürmann
708093d830 nir/opt_algebraic: use imul24_relaxed for lowered dot4x8_add
Totals from 28 (0.04% of 72819) affected shaders: (Navi10)

MaxWaves: 181 -> 186 (+2.76%)
Instrs: 406735 -> 338360 (-16.81%)
CodeSize: 2913588 -> 2469712 (-15.23%)
VGPRs: 5520 -> 5468 (-0.94%)
SpillVGPRs: 32 -> 0 (-inf%)
LDS: 64512 -> 62464 (-3.17%)
Scratch: 10240 -> 0 (-inf%)
Latency: 11028252 -> 4357120 (-60.49%)
InvThroughput: 11004126 -> 4079018 (-62.93%)
VClause: 1686 -> 2055 (+21.89%); split: -0.89%, +22.78%
SClause: 890 -> 852 (-4.27%)
Copies: 4516 -> 2644 (-41.45%); split: -41.59%, +0.13%
PreSGPRs: 982 -> 974 (-0.81%)
PreVGPRs: 5356 -> 4284 (-20.01%)
VALU: 370529 -> 330201 (-10.88%)
SALU: 28850 -> 1170 (-95.94%)
VMEM: 2616 -> 2560 (-2.14%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41178>
2026-05-01 10:07:25 +00:00
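To illustrate why a 24-bit multiply is sufficient for a lowered 8-bit dot product, here is a rough Python sketch. It assumes a signed variant of the operation and models the 24-bit multiply as one that is only valid when both operands fit in 24 signed bits; the exact semantics of `imul24_relaxed` are not spelled out in the commit message, and `unpack_i8x4`, `mul24` and `dot4x8_add_signed` are hypothetical names.

```python
def unpack_i8x4(word):
    """Split a 32-bit word into four signed 8-bit lanes, lowest byte first."""
    lanes = []
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        lanes.append(byte - 256 if byte >= 128 else byte)
    return lanes


def mul24(a, b):
    """Assumed model of a 24-bit multiply: operands must fit in 24 signed bits."""
    limit = 1 << 23
    assert -limit <= a < limit and -limit <= b < limit
    return a * b


def dot4x8_add_signed(a, b, acc):
    """Sum of lane-wise products plus an accumulator, wrapped to signed 32 bits."""
    total = acc
    for x, y in zip(unpack_i8x4(a), unpack_i8x4(b)):
        total += mul24(x, y)  # 8-bit lanes always satisfy the 24-bit precondition
    total &= 0xFFFFFFFF
    return total - (1 << 32) if total >= (1 << 31) else total
```

Because every lane is at most 8 bits wide, the precondition of the narrower multiply always holds, which is what makes the cheaper opcode a safe substitute in this sketch.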
Marek Olšák
5db0493a04 glsl,gallium: add pipe_caps::glsl_bindless_handles_are_32bit
to lower bindless handles to 32 bits before nir_opt_varyings, so that
the high 32 bits of (input) loads of bindless handles are eliminated early.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41170>
2026-05-01 03:00:17 +00:00
Lorenzo Rossi
63aceb07ff nir/opt_sink: Add pan-specific load_input
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
30d8f9c554 nir/lower_point_size: Handle 16-bit point sizes
panfrost has a float16 point size. Handling that precision too allows the
compiler to call lower_point_size later in the compilation pipeline.

Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40924>
2026-04-30 18:26:10 +00:00
Lorenzo Rossi
2a7d817591 nir/opt_algebraic: optimize fadd/fmul with 16-bit source and constant
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
Lorenzo Rossi
89436db611 nir: Extract float_is_half tests in common code
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41096>
2026-04-30 17:33:09 +00:00
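The idea behind a "float is half" test can be sketched as a round trip: a value qualifies if converting it to IEEE 754 binary16 and back leaves it unchanged. The Python below is an illustration under that assumption only; it borrows the name `float_is_half` from the commit title but is not the Mesa implementation.

```python
import struct


def float_is_half(value):
    """Return True if value survives a round trip through binary16 unchanged."""
    try:
        packed = struct.pack("<e", value)
    except OverflowError:
        # Magnitude is too large to be stored in binary16.
        return False
    # NaN compares unequal to itself, so NaN inputs report False here.
    return struct.unpack("<e", packed)[0] == value
```

A constant that passes such a test can be stored as a 16-bit immediate without changing the result, which is presumably what makes it useful when the other source is already 16-bit.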
Karol Herbst
4e67582ddf nir: add fmul_rtz optimizations
NVK is only going to use it for `fmul_rtz(frcp(ipa), ipa)` patterns, so
don't try too hard to optimize this.

Totals from 10 (0.00% of 1212873) affected shaders:
CodeSize: 34480 -> 34288 (-0.56%); split: -0.60%, +0.05%
Static cycle count: 6225 -> 6132 (-1.49%); split: -1.57%, +0.08%

Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
2e09b4ac68 nir: handle fmul_rtz in a couple of places
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
Karol Herbst
4e520f671c nir: add fmul_rtz
It's needed in NVK for correctness with interpolation.

Backport-to: 26.1
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41179>
2026-04-30 15:42:40 +00:00
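For readers unfamiliar with round-toward-zero multiplication, the sketch below emulates a float32 multiply with that rounding mode in Python. It is an illustration only: it handles finite, normal results, ignores NaN, infinity and overflow, and the helper names `to_f32` and `fmul_rtz` are chosen here rather than taken from Mesa.

```python
import struct


def to_f32(x):
    """Round a Python float to the nearest float32 (ties to even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fmul_rtz(a, b):
    """Multiply two float32 values and round the product toward zero."""
    # The product of two 24-bit significands fits in a double exactly.
    exact = to_f32(a) * to_f32(b)
    nearest = to_f32(exact)
    if abs(nearest) > abs(exact):
        # Nearest rounding moved away from zero: step back by one ULP.
        bits = struct.unpack("<I", struct.pack("<f", nearest))[0]
        nearest = struct.unpack("<f", struct.pack("<I", bits - 1))[0]
    return nearest


a = 1.0 + 2.0 ** -23
nearest_result = to_f32(a * 1.5)   # 1.5 + 2**-22 (tie rounds to even)
rtz_result = fmul_rtz(a, 1.5)      # 1.5 + 2**-23 (truncated)
```

The two results differ by one ULP, which is the kind of difference that matters when the product must never exceed the exact value in magnitude.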
Marek Olšák
a3e3bf0ac2 nir/opt_dce: add shader_info::assert_inputs_not_dead
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>
2026-04-30 07:07:32 +00:00
Marek Olšák
7bd5856cc6 nir/opt_dce: factor out dead instruction removal into a helper
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41166>
2026-04-30 07:07:32 +00:00
Icenowy Zheng
2bffc653ec isaspec: decode: manually print the sign when printing NaN float values
The IEEE 754-2019 standard declares the preceding sign "optional" when
converting NaN values to strings, because the standard tries not to
regulate how sign bits in NaNs are interpreted.

In the real world, when using a printf-family function to print a number
with type `float` on RISC-V, the sign of NaNs is wiped during the
conversion from `float` to `double` (defined as part of the default
argument promotions rule for variable arguments in the C spec).

Change the code to stop relying on isa_print() to print the negative
sign. Instead, parse it from the highest bit of the value and print it
manually before the "nan" string.

This fixes the `etnaviv_isa_disasm` unit test on RISC-V.

Suggested-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40887>
2026-04-29 11:39:12 +00:00
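The approach described in 2bffc653ec, reading the sign from the top bit of the raw value instead of trusting the formatter, can be sketched as follows. This is a Python illustration of the idea, not the isaspec code; `format_f32_bits` is a hypothetical name.

```python
import math
import struct


def format_f32_bits(bits):
    """Format a raw 32-bit float pattern, taking the NaN sign from bit 31."""
    value = struct.unpack("<f", struct.pack("<I", bits))[0]
    if math.isnan(value):
        # Do not rely on the formatter for the sign of a NaN.
        sign = "-" if bits & 0x80000000 else ""
        return sign + "nan"
    return repr(value)
```

For example, `format_f32_bits(0xFFC00000)` yields `-nan` regardless of how the platform's float-to-double conversion treats the NaN sign bit.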
Xinju Li
0319be8b02 nir: resolve functions: only resolve functions that are reachable from main
Use DFS traversal from main to resolve reachable functions. Avoid spurious
"unresolved reference" linker errors for dead helper functions.
This avoids reporting a link error for the following shader test, which used
to pass before merge_requests/31137:

[require]
GLSL >= 1.50

[vertex shader]
/* declared but not defined */
vec4 transform_color(vec3 color, float alpha);

/* calls transform_color — but this function is never called from main */
vec4 apply_transform(vec3 color, float alpha)
{
    return transform_color(color, alpha);
}

[vertex shader]
in vec4 piglit_vertex;

void main()
{
    /* apply_transform is never called here */
    gl_Position = piglit_vertex;
}

Signed-off-by: Xinju Li <xinju.li@broadcom.com>

use pass_flags to mark function as reachable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41065>
2026-04-28 23:35:17 +00:00
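The reachability rule from 0319be8b02 can be modelled with a small call graph. The sketch below is a Python illustration, assuming a call graph stored as a dictionary; `reachable_from` and `unresolved_references` are hypothetical names, and the real pass marks NIR functions instead of building sets.

```python
def reachable_from(entry, calls):
    """Return the set of functions reachable from entry using an iterative DFS.

    calls maps each function name to the list of names it calls.
    """
    seen = set()
    stack = [entry]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(calls.get(name, ()))
    return seen


def unresolved_references(entry, calls, defined):
    """Report only the undefined functions that entry can actually reach."""
    return sorted(reachable_from(entry, calls) - defined)


# Call graph of the shader test above: main never calls apply_transform.
calls = {"main": [], "apply_transform": ["transform_color"]}
defined = {"main", "apply_transform"}
errors = unresolved_references("main", calls, defined)
```

Here `errors` is empty because `transform_color` is only referenced from dead code; if `main` called `apply_transform`, the missing definition would be reported.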
Alyssa Rosenzweig
0c49738211 nir/opt_reassociate: fix exactness bug
For an inexact-associative operation (fadd or fmul), can_reassociate ensures the
root of the chain is inexact to allow reassociating. However, build_chain just
checks for opcodes to match up after, although we do sum up exactness across the
chain. Although an Effort Was Made, it still seems incorrect to reassociate

   %3 = fadd! %0, %1
   %4 = fadd %3, %2

to instead be (ex.)

   %3 = fadd! %0, %2
   %4 = fadd! %3, %1

Closes: #14418
Fixes: e0b0f7e73c ("nir: add ALU reassocation pass")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41162>
2026-04-28 21:14:56 +00:00
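The reason an exact fadd must not be reordered is that floating-point addition is not associative. The short Python demonstration below uses double precision and values picked for illustration; it mirrors the shape of the example above, (%0 + %1) + %2 versus (%0 + %2) + %1, and is not derived from the pass itself.

```python
def add_in_order(x, y, z):
    """Evaluate (x + y) + z with ordinary floating-point rounding."""
    return (x + y) + z


v0, v1, v2 = 1e17, 1.0, -1e17

original = add_in_order(v0, v1, v2)      # (v0 + v1) + v2: the 1.0 is absorbed
reassociated = add_in_order(v0, v2, v1)  # (v0 + v2) + v1: the 1.0 survives
```

The two orders give 0.0 and 1.0 respectively, so reassociating through an instruction marked exact can change the result.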
Georg Lehmann
599a52174b nir: disable fp class analysis for 64bit transcendentals
Some backends have terrible precision for these fp64 opcodes, so don't try to
do anything clever.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/work_items/15334
Fixes: 5a298f3560 ("nir: rewrite fp range analysis as a fp class analysis")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41206>
2026-04-28 13:26:42 +00:00
Simon Perretta
57791c4a99 pco: track how many tg4/raw sample comps are needed
Rather than always emitting and swizzling 16 components for raw samples,
scale it by the number actually needed as defined by the selected tg4
channel/components.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
2026-04-28 12:04:03 +01:00
Marek Olšák
3dcba87ca3 nir/opt_licm: hoist instructions across multiple levels of nested loops
radv gfx12:

Totals:
Instrs: 42861311 -> 42861476 (+0.00%); split: -0.00%, +0.00%
CodeSize: 227917476 -> 227918160 (+0.00%); split: -0.00%, +0.00%
Latency: 265381068 -> 265373506 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 42954018 -> 42952350 (-0.00%)
VClause: 819026 -> 819024 (-0.00%)
SClause: 1210348 -> 1210293 (-0.00%)
Copies: 2919525 -> 2919597 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 2889432 -> 2889406 (-0.00%)
VALU: 23757371 -> 23757377 (+0.00%); split: -0.00%, +0.00%
SALU: 5981417 -> 5981485 (+0.00%); split: -0.00%, +0.00%
VOPD: 8966 -> 8964 (-0.02%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:21 +00:00
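What hoisting across multiple loop levels means can be shown with a source-level analogy. In the Python sketch below, a value that depends on neither loop counter is moved out of both loops at once; the functions `before_licm` and `after_licm` are hypothetical and only illustrate the transformation, not the NIR pass.

```python
def before_licm(rows, cols, scale):
    """The invariant expression is recomputed on every inner iteration."""
    out = []
    for _ in range(rows):
        for _ in range(cols):
            k = scale * scale + 1  # depends on neither loop
            out.append(k)
    return out


def after_licm(rows, cols, scale):
    """The invariant expression is hoisted past both loop levels."""
    k = scale * scale + 1
    out = []
    for _ in range(rows):
        for _ in range(cols):
            out.append(k)
    return out
```

Both functions return the same list; the second evaluates the expression once instead of `rows * cols` times.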
Marek Olšák
8e036fcaec nir/opt_licm: use nir_metadata_control_flow
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:21 +00:00
Marek Olšák
e0112be522 nir/opt_licm: add a private state structure for the pass
The structure will grow in later commits.

The major change is that the preheader and exit blocks are replaced
by tracking just the innermost optimized nir_loop * and getting the
predecessor and successor blocks out of it.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41220>
2026-04-27 23:58:20 +00:00
Timothy Arceri
a42c55da46 amd/radeonsi: dont clamp packed user varyings
ac_nir_optimize_outputs() might pack user varyings into the color
built-ins. If this happens we skip adding clamping to the
components that contain the user varying.

This change also fixes a second bug where a color built-in can be
packed into a non-color slot and was no longer being clamped.

Fixes: 3777a5d7 ("radeonsi: assign param export indices before compilation")
Closes: #14443

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40594>
2026-04-27 22:59:58 +00:00
Simon Perretta
af1669d9e2 pco: reserve additional outputs for trilinear sampled coeffs
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.

Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>
2026-04-27 11:32:29 +00:00
squidbus
a41f0e62bb asahi,nir: Move asahi dynamic clipz pass to common.
Acked-by: Alyssa Rosenzweig <alyssa@rosenz.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41088>
2026-04-27 11:00:59 +00:00
Rhys Perry
91d555c2cb radv: lower indirect derefs after linking
Scratch access isn't very optimizable, so more stores are optimized away
if we lower indirect derefs after both linking and radv_optimize_nir.

fossil-db (navi21):
Totals from 1264 (0.62% of 202427) affected shaders:
Instrs: 1504703 -> 1504708 (+0.00%); split: -0.02%, +0.02%
CodeSize: 8031388 -> 8031020 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 1865 -> 1869 (+0.21%)
Latency: 12106362 -> 12106464 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 4056269 -> 4056044 (-0.01%); split: -0.01%, +0.00%
VClause: 13927 -> 13940 (+0.09%)
SClause: 32382 -> 32396 (+0.04%); split: -0.03%, +0.08%
Copies: 188004 -> 187897 (-0.06%); split: -0.17%, +0.11%
Branches: 39045 -> 39052 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 79885 -> 79814 (-0.09%); split: -0.11%, +0.02%
VALU: 1072639 -> 1072532 (-0.01%); split: -0.01%, +0.00%
SALU: 187317 -> 187375 (+0.03%); split: -0.11%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31265>
2026-04-24 11:01:03 +00:00
Eric Guo
ba92143ef2 compiler: Add missing MESA_SHADER_KERNEL case for SPIR-V dump
Fixes assertion failure when MESA_SPIRV_DUMP_PATH is set for OpenCL
programs.

Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41097>
2026-04-23 20:36:55 +00:00
Alyssa Rosenzweig
6a43e6c9e0 nir/opt_algebraic: add redundant u2u32/unpack_64_2x32_split_x patterns
reduces hello world kernel 57 -> 44 inst on jay. why do we have two opcodes that
do literally the same thing? :/

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41085>
2026-04-23 19:54:21 +00:00
Samuel Pitoiset
34b8ce948a spirv: add support for SPV_KHR_constant_data
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40722>
2026-04-23 11:12:06 +00:00
Emma Anholt
f7d1f59948 spirv: Demote the SPIRV 1.6 OpTypeSampledImage on Buffer failure to a warning.
The hangover DXVK builds we want to use for arm64 CI hit this path, and we
have a perfectly reasonable fallback for handling this case (ignore the
sampler, as glslang should have done).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40959>
2026-04-22 17:39:30 +00:00
Daniel Schürmann
806fcc6193 nir/opt_loop: always try to peel initial break from loops with unrolling hint
This allows unrolling these loops even if loop analysis is unable
to calculate the iteration count.
As always with loops, the throughput stats are meaningless.

Totals from 6 (0.00% of 202440) affected shaders: (Navi48)
Instrs: 7825 -> 6201 (-20.75%)
CodeSize: 37056 -> 30412 (-17.93%)
Latency: 21563 -> 16934 (-21.47%)
InvThroughput: 144649 -> 77962 (-46.10%)
SClause: 139 -> 133 (-4.32%)
Copies: 536 -> 388 (-27.61%)
Branches: 156 -> 84 (-46.15%)
PreVGPRs: 298 -> 296 (-0.67%); split: -1.01%, +0.34%
VALU: 2493 -> 2378 (-4.61%); split: -4.65%, +0.04%
SALU: 3263 -> 2199 (-32.61%)
SMEM: 188 -> 183 (-2.66%)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
738cc6a7db nir/opt_loop: stop recursion at loop header phi in can_constant_fold()
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
Daniel Schürmann
1f9a0490c6 nir/opt_loop: Don't peel initial break from do-while loops
As the main purpose of this optimization is to transform
while loops into do-while loops, don't apply it to loops which are
already in do-while form. Also set nir_loop::do_while after
this transformation, so that it is only applied once.

Totals from 576 (0.28% of 202440) affected shaders: (Navi48)
Instrs: 1337529 -> 1253438 (-6.29%); split: -6.36%, +0.07%
CodeSize: 8390852 -> 7837328 (-6.60%); split: -6.61%, +0.01%
VGPRs: 50856 -> 50844 (-0.02%)
SpillSGPRs: 42198 -> 35395 (-16.12%); split: -16.13%, +0.01%
SpillVGPRs: 47608 -> 44620 (-6.28%)
Latency: 31043828 -> 44143753 (+42.20%); split: -0.06%, +42.26%
InvThroughput: 6973433 -> 10079000 (+44.53%); split: -0.08%, +44.61%
VClause: 26839 -> 24718 (-7.90%); split: -7.91%, +0.00%
SClause: 21831 -> 21583 (-1.14%); split: -1.52%, +0.38%
Copies: 183503 -> 150040 (-18.24%); split: -18.84%, +0.61%
Branches: 27738 -> 26848 (-3.21%); split: -5.12%, +1.91%
PreSGPRs: 40233 -> 39083 (-2.86%); split: -2.88%, +0.02%
PreVGPRs: 38745 -> 38903 (+0.41%); split: -0.02%, +0.43%
VALU: 688396 -> 645948 (-6.17%); split: -6.17%, +0.01%
SALU: 189792 -> 177642 (-6.40%); split: -6.97%, +0.57%
VMEM: 121500 -> 112748 (-7.20%)
SMEM: 38765 -> 37767 (-2.57%); split: -2.58%, +0.00%
VOPD: 102488 -> 89071 (-13.09%); split: +0.24%, -13.33%

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40349>
2026-04-22 10:34:58 +00:00
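Peeling the initial break, as discussed in 1f9a0490c6, turns a loop that tests its condition at the top into one that tests it at the bottom, guarded by a single copy of the test. The Python sketch below models both forms as an infinite loop with a break; the function names are hypothetical and the code is an analogy, not the NIR transformation.

```python
def while_form(n):
    """Loop with the exit test (the initial break) at the top."""
    i, trips = 0, 0
    while True:
        if not (i < n):  # initial break
            break
        trips += 1
        i += 1
    return trips


def peeled_form(n):
    """Same loop after peeling: one guard, then a do-while body."""
    i, trips = 0, 0
    if i < n:  # peeled copy of the initial break
        while True:
            trips += 1
            i += 1
            if not (i < n):  # exit test now sits at the bottom
                break
    return trips
```

Applying the same peel again to `peeled_form` would gain nothing, which matches the commit's point about skipping loops that are already in do-while form.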