Commit graph

6373 commits

Author SHA1 Message Date
Rhys Perry
8fd5266b69 nir/divergence: ignore boolean phis for ignore_undef_if_phi_srcs
The only user of this option (ACO) doesn't support this for boolean phis.

fossil-db (navi21):
Totals from 1208 (1.51% of 79825) affected shaders:
Instrs: 826592 -> 823201 (-0.41%); split: -0.41%, +0.00%
CodeSize: 4228296 -> 4224280 (-0.09%); split: -0.11%, +0.01%
Latency: 3030803 -> 3028410 (-0.08%); split: -0.08%, +0.01%
InvThroughput: 578588 -> 578693 (+0.02%); split: -0.00%, +0.02%
VClause: 19500 -> 19494 (-0.03%)
Copies: 60914 -> 57589 (-5.46%); split: -5.47%, +0.01%
PreVGPRs: 50759 -> 50774 (+0.03%)
VALU: 528582 -> 528671 (+0.02%); split: -0.00%, +0.02%
SALU: 121134 -> 117646 (-2.88%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Backport-to: 25.1
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13455
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13509
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36005>
2025-07-21 08:27:01 +00:00
Konstantin Seurer
df44b353ad radv: Optimize ray tracing position fetch
Gets rid of a lot of indirection when fetching triangle positions.
Storing the primitive address increases register pressure by a bit but
the traversal shader which should have the highest register demand
should not be affected when position fetch is not used.

Totals:
Instrs: 4021686 -> 4022435 (+0.02%); split: -0.01%, +0.03%
CodeSize: 21235812 -> 21235832 (+0.00%); split: -0.02%, +0.02%
Latency: 23402275 -> 23412110 (+0.04%); split: -0.04%, +0.09%
InvThroughput: 4352818 -> 4352206 (-0.01%); split: -0.04%, +0.02%
VClause: 101906 -> 102058 (+0.15%); split: -0.03%, +0.18%
Copies: 342210 -> 342368 (+0.05%); split: -0.09%, +0.14%
Branches: 114988 -> 114993 (+0.00%)
PreVGPRs: 26551 -> 27111 (+2.11%)
VALU: 2249366 -> 2249524 (+0.01%); split: -0.01%, +0.02%
SALU: 529828 -> 529808 (-0.00%); split: -0.01%, +0.00%

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35533>
2025-07-19 16:07:59 +00:00
Faith Ekstrand
9fbb57e0a4 nir,nak: Add a nir_texop_sample_pos_nv and plumb it through
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36207>
2025-07-18 22:21:46 +00:00
Faith Ekstrand
557ac588e4 nir/instr_set: Rework tex instr hash/compare
We were missing a couple bits from hash and a bunch of stuff from the
comparison.  This puts most of nir_tex_instr into a single pack_tex
helper that's used by both and grabs everything we were missing.

Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36234>
2025-07-18 17:10:20 -04:00
Alyssa Rosenzweig
2308960bed treewide: use nir_mov_scalar
Via Coccinelle patch:

    @@
    expression builder, scalar;
    @@

    -nir_channel(builder, scalar.def, scalar.comp)
    +nir_mov_scalar(builder, scalar)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>
2025-07-16 18:59:16 +00:00
Alyssa Rosenzweig
186db0ebfe nir: add nir_mov_scalar helper
I keep reaching for this helper but it doesn't exist. So I fixed that.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>
2025-07-16 18:59:16 +00:00
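As a rough illustration of the nir_channel -> nir_mov_scalar rewrite in the two commits above, here is a minimal sketch using toy types. Everything prefixed with toy_ is hypothetical and is not Mesa's API; the real helpers also take a builder and return an SSA def rather than an int. The sketch only shows that the new helper takes the (def, comp) pair as one argument instead of having the caller unpack it.

```c
#include <assert.h>

/* Toy stand-in for an SSA vector definition (hypothetical, not nir_def). */
struct toy_def {
    unsigned num_components;
    int values[4];
};

/* Toy stand-in for a scalar reference: a vector plus a component index. */
struct toy_scalar {
    const struct toy_def *def;
    unsigned comp;
};

/* Old spelling: the caller passes the def and the component separately. */
static int toy_channel(const struct toy_def *def, unsigned comp)
{
    assert(comp < def->num_components);
    return def->values[comp];
}

/* New spelling: one helper accepts the (def, comp) pair directly. */
static int toy_mov_scalar(struct toy_scalar s)
{
    return toy_channel(s.def, s.comp);
}
```

With a toy_scalar pointing at component 2 of a four-component def, toy_mov_scalar(s) and toy_channel(s.def, s.comp) return the same value, which is the equivalence the Coccinelle patch relies on.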
Alyssa Rosenzweig
98aad84d73 hk: push descriptor set addresses
saves an indirection and sets us up for more goodness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:18 +00:00
Alyssa Rosenzweig
24c708564f nir: add bindless_sampler_agx intrinsic
to facilitate pushing on AGX.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:17 +00:00
Alyssa Rosenzweig
58cc66238a nir/opt_preamble: add sampler class
AGX will use.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:17 +00:00
Georg Lehmann
d672737372 nir,aco: add byte_perm_amd
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Mary Guillemard
90438bae51 nir: Add NVIDIA-specific muladd intrinsics
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32777>
2025-07-15 23:34:31 +00:00
Natalie Vock
9707b30965 nir,aco: Add ds_bvh_stack_rtn
This is a ds instruction that also overwrites its first input, so
introduce a new ds format with two outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35269>
2025-07-15 21:34:39 +00:00
Dave Airlie
2273b6c46a nak: add divergent attribute and wrapper for nir_load_sysval_nv
This wraps the sysval load in a builder where we can add
proper divergence for ctaid later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36105>
2025-07-15 19:07:11 +00:00
Marek Olšák
6286c1c66f nir/opt_vectorize_io: optionally vectorize loads with holes
e.g. load X; load W; ==> load XYZW. Verified with a shader test.

This will be used by AMD drivers. See the code comments.

Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36098>
2025-07-15 16:29:30 +00:00
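The "load X; load W ==> load XYZW" example in the commit above can be pictured with component masks. The sketch below is an assumption about the idea, not the pass's actual code: it takes the union of the components that scalar loads touch and widens it to the contiguous range from the lowest to the highest component. The function name and the mask encoding are hypothetical.

```c
#include <assert.h>

/* Bit i set means component i is loaded (X=0, Y=1, Z=2, W=3). */
static unsigned fill_component_holes(unsigned mask)
{
    assert(mask != 0 && mask <= 0xFu);

    unsigned lo = 0, hi = 3;
    while (!(mask & (1u << lo)))
        lo++;                       /* lowest loaded component */
    while (!(mask & (1u << hi)))
        hi--;                       /* highest loaded component */

    unsigned width = hi - lo + 1;   /* components in the merged load */
    return ((1u << width) - 1u) << lo;
}
```

For X and W (mask 0x9) this returns 0xF, i.e. a single XYZW load; a mask with no holes comes back unchanged.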
Romaric Jodin
b4977a1605 nir/lower_bit_size: Avoid round-trip conversion when possible
When we detect that the source is a conversion generated by the pass,
try to get the real source instead of doing a round-trip conversion.

Make sure that the nir_alu_type and the bit_size is the same between what
we need and what's before the detected conversion.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35744>
2025-07-15 15:32:58 +00:00
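A minimal sketch of the round-trip check described in the commit above, using toy types. The struct layout, the pass_generated_conv flag and the function name are all hypothetical; the sketch only mirrors the stated condition that both the type and the bit size in front of the detected conversion must match what is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_type { TOY_INT, TOY_UINT, TOY_FLOAT };

struct toy_value {
    enum toy_type type;
    unsigned bit_size;
    bool pass_generated_conv;     /* true if the pass itself emitted this conversion */
    const struct toy_value *src;  /* operand of the conversion, NULL otherwise */
};

/* Pick the value to use when (want_type, want_bits) is required.
 * Returns the original source if v is a pass-generated conversion whose
 * operand already has the wanted type and bit size; otherwise returns v,
 * meaning the caller still has to emit a conversion. */
static const struct toy_value *
toy_skip_round_trip(const struct toy_value *v,
                    enum toy_type want_type, unsigned want_bits)
{
    if (v->pass_generated_conv && v->src != NULL &&
        v->src->type == want_type && v->src->bit_size == want_bits)
        return v->src;
    return v;
}
```

In this model, a 16-bit unsigned value widened to 32 bits by the pass and then needed again as 16-bit unsigned resolves straight back to the original value, while a request for a different type keeps the conversion.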
Marek Olšák
0fdd6de65f nir/lower_io: validate locations more accurately
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Marek Olšák
b0494f9485 nir/opt_varyings: optimize the consumer after constant propagation and dedupli.
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.

This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Marek Olšák
9607852c30 nir/opt_varyings: use nir_scalar
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Christian Gmeiner
ec9a2aa2e4 nir: Unvendor sampler_lod_parameters(_pan)
Will be used by etnaviv too.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35753>
2025-07-12 10:48:03 +00:00
Qiang Yu
25897f0692 nir/recompute_io_bases: fix for per primitive IO
It does not handle per-primitive outputs or count
per-primitive inputs.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00
Qiang Yu
35e3f4ee92 nir: fix PRIMITIVE_INDICES mistreated as varying
It's a sysval in mesh shaders, but it shares the same
slot number with VARYING_SLOT_TESS_LEVEL_INNER.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35931>
2025-07-11 02:25:51 +00:00
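The aliasing described in the PRIMITIVE_INDICES commit above means a slot number alone is ambiguous and has to be read together with the shader stage. A small sketch of that idea follows; the enum names and the numeric value 27 are placeholders chosen for illustration, and only the fact that the two names share one number comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_stage { TOY_STAGE_TESS_CTRL, TOY_STAGE_MESH };

/* Placeholder numbering; only the aliasing of the last two entries matters. */
enum toy_slot {
    TOY_SLOT_POS = 0,
    TOY_SLOT_TESS_LEVEL_INNER = 27,
    TOY_SLOT_PRIMITIVE_INDICES = TOY_SLOT_TESS_LEVEL_INNER
};

/* The same number means "primitive indices" only in a mesh shader. */
static bool toy_slot_is_primitive_indices(enum toy_stage stage, enum toy_slot slot)
{
    return stage == TOY_STAGE_MESH && slot == TOY_SLOT_PRIMITIVE_INDICES;
}
```

Code that checks only the slot number would classify both cases the same way, which is presumably the kind of confusion the fix addresses.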
Alyssa Rosenzweig
329413992e nir/lower_tex: revert "optimize LOD bias lower for txl"
This reverts commit f853d285ef.

Failing a GL CTS test
https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5866 .. apparently I ran
VK CTS but not GL CTS on that MR. Oops.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 15:00:28 -04:00
Alyssa Rosenzweig
ee26938faf nir,agx: switch to bindless_image_agx intrinsic
this is more explicit than vec2's and hence has fewer footguns. in particular
it's easier to handle with preambles in a sane way.

modelled on what ir3 does.

there's probably room for more clean up but for now this unblocks what I want to
do.

stats don't seem concerning.

Totals from 692 (1.29% of 53701) affected shaders:
MaxWaves: 441920 -> 442112 (+0.04%)
Instrs: 1588748 -> 1589304 (+0.03%); split: -0.05%, +0.08%
CodeSize: 11487976 -> 11491620 (+0.03%); split: -0.04%, +0.07%
ALU: 1234867 -> 1235407 (+0.04%); split: -0.06%, +0.10%
FSCIB: 1234707 -> 1235249 (+0.04%); split: -0.06%, +0.10%
IC: 380514 -> 380518 (+0.00%)
GPRs: 117292 -> 117332 (+0.03%); split: -0.08%, +0.11%
Preamble instrs: 314064 -> 313948 (-0.04%); split: -0.05%, +0.01%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig
78f4c7c6a4 nir: fix AGX intrinsic flag
by inspection.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig
f10e96586f nir/rewrite_image_intrinsic: handle non-derefs
it is sometimes useful to turn lowered bindless intrinsics into bound or vice
versa, and it is annoying to do so without this helper, so generalize the
helper.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig
569046d95e nir/rewrite_image_intrinsic: handle explicit coord
for agx.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
Alyssa Rosenzweig
d55bdb4ec5 nir/opt_preamble: add "register class" concept
Class represents an indexed "ideal" register class, where non-general classes
only allow defs that choose that class in the def_size callback.
nir_opt_preamble will try to assign specialized classes where possible, falling
back to the general class once the special-purpose classes are exhausted.

AGX will use this mechanism to promote bindless texture handles to bound texture
registers where possible, falling back to pushing the handle as a uniform where
not possible. Supporting multiple classes in nir_opt_preamble allows this
multi-level hoisting to work in a single nir_opt_preamble call with proper
global behaviour.

Add this concept to nir_opt_preamble so we can use it in AGX later in this MR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35949>
2025-07-10 14:55:17 -04:00
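The "register class" fallback described in the commit above can be sketched as a greedy assignment: each def asks for a class, gets it while capacity remains, and otherwise falls back to the general class. This is a toy model under stated assumptions; the names, the in-order greedy policy and the capacity array are hypothetical, and the real pass presumably orders candidates by benefit rather than by position.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_CLASS_GENERAL = 0, TOY_CLASS_COUNT = 2 };

struct toy_def_req {
    unsigned preferred_class;  /* class this def would like to live in */
    unsigned size;             /* units of capacity it needs */
};

/* Assign a class to each def. remaining[] holds the capacity left per class
 * and is updated in place. out_class[i] is the chosen class, or -1 if the
 * def could not be placed anywhere. Returns the number of defs placed. */
static size_t toy_assign_classes(const struct toy_def_req *defs, size_t n,
                                 unsigned remaining[TOY_CLASS_COUNT],
                                 int out_class[])
{
    size_t placed = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned c = defs[i].preferred_class;
        assert(c < TOY_CLASS_COUNT);

        if (remaining[c] < defs[i].size)
            c = TOY_CLASS_GENERAL;      /* special class exhausted: fall back */

        if (remaining[c] >= defs[i].size) {
            remaining[c] -= defs[i].size;
            out_class[i] = (int)c;
            placed++;
        } else {
            out_class[i] = -1;          /* no room in any class */
        }
    }
    return placed;
}
```

With one slot in each class and three defs that all prefer the special class, the first def gets the special class, the second falls back to the general class, and the third is left unplaced. Doing this in one pass over one shared budget is what lets a single call make a globally consistent choice.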
Marek Olšák
2ba2a61101 nir: switch indirect IO load lowering to nir_lower_io_indirect_loads for GLSL
This reduces GLSL compile times with the gallium noop driver by 0.6%.

This might decrease register usage and do less code reordering because
nir_lower_io_vars_to_temporaries is no longer called for inputs, which
moved most input loads to the top.

radeonsi+ACO shader-db results are noise.
More uniforms are identified as inlinable.

TOTALS FROM ALL SHADERS (58138):
  VGPRs: 2152680 -> 2158032 (0.25 %)
  Code Size: 71008908 -> 71064812 (0.08 %) bytes
  Max Waves: 916943 -> 916924 (-0.00 %)
  Inline Uniforms: 6395 -> 6414 (0.30 %)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>
2025-07-10 16:37:45 +00:00
Marek Olšák
a4e522f8b0 nir: add new pass nir_opt_move_to_top
This can be used to move input loads to top after we stop using
nir_lower_io_vars_to_temporaries that does it unconditionally.

It's more flexible than what nir_lower_io_vars_to_temporaries was doing,
and can be extended to handle any instructions.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>
2025-07-10 16:37:44 +00:00
Marek Olšák
3dd9a9782b nir: add new pass nir_lower_io_indirect_loads
This is a partial replacement for nir_lower_io_vars_to_temporaries.
It supports all input and output loads. It doesn't handle stores.
The motivation is to improve compile times.

The main differences compared to nir_lower_io_vars_to_temporaries are:
- it only lowers indirect loads to temps and doesn't touch direct loads
  which improves compile times and removes the need for nir_lower_vars_to_ssa
  afterward because indirect temp access can't be lowered to SSA
- it doesn't move all input loads to the top; it only moves those input
  loads to the top whose indirect loads are lowered (which improves
  register usage because direct loads are not moved)
- it doesn't have to deal with complexities of variables

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36018>
2025-07-10 16:37:44 +00:00
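One way to picture "lowers indirect loads to temps" from the commit above: an input read with a dynamic index is replaced by direct loads of every slot into a temporary array, followed by an indexed read of that array. The sketch below is plain C over a float array, not NIR, and the function names and the fixed input count are assumptions made for illustration.

```c
#include <assert.h>

#define TOY_NUM_INPUTS 4

/* Stand-in for a direct input load: the slot is known at the call site. */
static float toy_load_input(const float inputs[TOY_NUM_INPUTS], unsigned slot)
{
    assert(slot < TOY_NUM_INPUTS);
    return inputs[slot];
}

/* Lowered form of "inputs[index]" where index is only known at run time:
 * direct loads fill a temp array, and only the temp is indexed dynamically. */
static float toy_lowered_indirect_load(const float inputs[TOY_NUM_INPUTS],
                                       unsigned index)
{
    float temp[TOY_NUM_INPUTS];
    for (unsigned slot = 0; slot < TOY_NUM_INPUTS; slot++)
        temp[slot] = toy_load_input(inputs, slot);

    assert(index < TOY_NUM_INPUTS);
    return temp[index];
}
```

Loads whose slot is already constant never go through the temp array in this model, matching the commit's point that direct loads are left untouched.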
Mel Henning
94f4fc12ea nir/divergence_analysis: Add NV_shader_sm_builtins
Fixes crucible func.nv.shader-sm-builtins.q0

Fixes: a3839dbb90 ("nak: Change divergence analysis pass order")
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36011>
2025-07-09 16:47:28 +00:00
Simon Perretta
f89fb76671 nir/lower_io_to_scalar: add case for lowering push constants
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>
2025-07-09 12:58:29 +00:00
Simon Perretta
d3e3e0e3d2 nir/builder: add nir_ibitfield_extract_imm helper
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>
2025-07-09 12:58:29 +00:00
Simon Perretta
1a4e22b01a nir/builder: add nir_bitfield_insert_imm helper
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>
2025-07-09 12:58:29 +00:00
Simon Perretta
e2ece5ef25 nir/serialize: increase the op limit to 10 bits/1024
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36000>
2025-07-09 12:58:29 +00:00
Simon Perretta
1f1b3cc200 nir/precompiled: add shader stage option to nir_precompiled_build_variant
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36001>
2025-07-09 13:14:41 +01:00
Simon Perretta
5b29daf7bc nir/precompiled: add helper to emit an enum map for multiple targets
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36001>
2025-07-09 13:14:41 +01:00
Alyssa Rosenzweig
fc95397957 nir/lower_alu: optimize min/max signed zeros
we don't usually need a multi-instruction lowering.

with the agx change in the next commit, honeykrisp results:

   Totals from 3589 (6.64% of 54019) affected shaders:
   MaxWaves: 3598144 -> 3598400 (+0.01%); split: +0.02%, -0.01%
   Instrs: 1445830 -> 1332394 (-7.85%)
   CodeSize: 10696356 -> 9742130 (-8.92%)
   Fills: 721 -> 723 (+0.28%); split: -0.14%, +0.42%
   Scratch: 3980 -> 3968 (-0.30%)
   ALU: 1156426 -> 1043198 (-9.79%)
   FSCIB: 1156426 -> 1043196 (-9.79%)
   IC: 267202 -> 267166 (-0.01%)
   GPRs: 208765 -> 208712 (-0.03%); split: -0.16%, +0.14%
   Uniforms: 683643 -> 683677 (+0.00%); split: -0.01%, +0.01%
   Preamble instrs: 1163325 -> 1159314 (-0.34%)

control results alone:

   Totals:
   Instrs: 110168 -> 107171 (-2.72%)

   Totals from 71 (22.26% of 319) affected shaders:
   Instrs: 48895 -> 45898 (-6.13%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
2025-07-08 17:09:16 +00:00
Alyssa Rosenzweig
042adf3cc5 nir/opt_algebraic: optimize signed pow in Control
used in a post-processing shader which goes 896 instrs -> 749 instrs.

In my Control fossil:

   Totals from 2 (0.63% of 319) affected shaders:
   Instrs: 2078 -> 1841 (-11.41%)
   CodeSize: 14540 -> 12800 (-11.97%)
   ALU: 1779 -> 1626 (-8.60%)
   FSCIB: 1779 -> 1626 (-8.60%)
   Uniforms: 370 -> 372 (+0.54%)

In radv_fossils, there are affected shaders in Dredge.

   Totals from 4 (0.01% of 54019) affected shaders:
   Instrs: 2306 -> 2294 (-0.52%)
   CodeSize: 16594 -> 16534 (-0.36%)
   ALU: 2010 -> 2004 (-0.30%)
   FSCIB: 2010 -> 2004 (-0.30%)
   Uniforms: 1138 -> 1146 (+0.70%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
2025-07-08 17:09:16 +00:00
Alyssa Rosenzweig
2765017553 nir: fuse ffma even with float controls
The fmul+fadd -> fma rules in nir_opt_algebraic are marked imprecise,
because they are a contraction. However, they respect signed zero/Inf/NaN rules.
As such, it is legal to do this fusion with shader float controls as long as the
exact bit is not set (mapping to SPIR-V NoContract).

Unfortunately, NIR's imprecise rules do not distinguish between contraction
issues versus float special case issues, forcing nir_search to skip all
imprecise rules when any shader float control modes are used. This notably
affects DXVK, which sets shader float controls to get D3D11 float behaviour and
hence loses FMA fusing.

Therefore, we plumb in the exact bit to express NoContract independent of the
float controls, and weaken the requirement for fma fusion to allowable
contraction. For fma splitting, it's a similar issue, as inexact GLSL fma in
SPIR-V is just a multiply add that we're allowed to contract rather than the
real deal.

Drivers that use their own FMA fusing passes (notably, Intel and AMD) are
unaffected, but DXVK-capable drivers using fuse_ffma should like this. Results
on hk shown:

Totals from 2194 (4.06% of 54019) affected shaders:
MaxWaves: 2174272 -> 2175936 (+0.08%); split: +0.08%, -0.01%
Instrs: 1173283 -> 1131494 (-3.56%); split: -3.57%, +0.01%
CodeSize: 8568168 -> 8381724 (-2.18%); split: -2.18%, +0.01%
Spills: 1094 -> 747 (-31.72%)
Fills: 988 -> 681 (-31.07%)
Scratch: 4444 -> 3820 (-14.04%)
ALU: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
FSCIB: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
IC: 215398 -> 215274 (-0.06%)
GPRs: 139865 -> 139032 (-0.60%); split: -1.56%, +0.96%
Uniforms: 414886 -> 414466 (-0.10%); split: -0.14%, +0.04%
Preamble instrs: 646398 -> 644017 (-0.37%); split: -0.43%, +0.07%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
2025-07-08 17:09:16 +00:00
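To see why fmul+fadd -> fma is a contraction that the exact bit must be able to forbid, the sketch below compares the two forms on float inputs where the single rounding of the fused form changes the answer. All names are hypothetical. The fused path is emulated in double so the example needs no fma() from libm; that emulation is exact for the inputs shown but can differ from a true fma in rare double-rounding cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Separate multiply and add: the product is rounded to float first. */
static float toy_mul_then_add(float a, float b, float c)
{
    volatile float product = a * b;  /* volatile forces the intermediate rounding */
    return product + c;
}

/* Contracted form, emulated with double arithmetic and one final rounding. */
static float toy_fused(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}

/* Toy version of the rule: fuse unless the operation is marked exact. */
static float toy_ffma_rule(float a, float b, float c, bool exact)
{
    return exact ? toy_mul_then_add(a, b, c) : toy_fused(a, b, c);
}
```

With a = 1 + 2^-13, b = 1 - 2^-13 and c = -1, the exact product 1 - 2^-26 rounds to 1.0 in float, so the unfused path yields 0, while the fused path keeps the low bits and yields -2^-26. Both results obey the signed zero/Inf/NaN rules; they differ only in rounding, which is the distinction the commit draws.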
Daniel Schürmann
2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
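The callback contract described in the commit above (return the desired number of components for a phi, or 0 for no lowering) can be sketched as follows. The callback type here is a hypothetical simplification that receives only a component count, not the real nir_vectorize_cb signature, and the NULL-callback heuristic mentioned in the commit is deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback shape: desired components per phi, or 0 to keep it. */
typedef unsigned (*toy_width_cb)(unsigned num_components, const void *data);

/* Number of phis a vector phi is split into under the callback's answer. */
static unsigned toy_count_lowered_phis(unsigned num_components,
                                       toy_width_cb cb, const void *data)
{
    assert(cb != NULL);  /* the NULL (heuristic) case is out of scope here */

    unsigned width = cb(num_components, data);
    if (width == 0 || width >= num_components)
        return 1;                                 /* phi is left as it is */
    return (num_components + width - 1) / width;  /* ceiling division */
}

/* Always ask for scalar phis. */
static unsigned toy_always_scalar(unsigned num_components, const void *data)
{
    (void)num_components;
    (void)data;
    return 1;
}

/* Ask for two-component phis. */
static unsigned toy_pairs(unsigned num_components, const void *data)
{
    (void)num_components;
    (void)data;
    return 2;
}

/* Never lower. */
static unsigned toy_keep(unsigned num_components, const void *data)
{
    (void)num_components;
    (void)data;
    return 0;
}
```

A vec4 phi becomes four phis under toy_always_scalar, two under toy_pairs, and stays a single phi under toy_keep.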
Daniel Schürmann
23b7b3b919 nir/lower_phis_to_scalar: remove exec_list dead_instrs
No need to free the instructions at this point.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Daniel Schürmann
f6e0f4813c nir: remove recursive check in nir_lower_phis_to_scalar()
This check causes unnecessary overhead and can be replaced by simply
checking whether a phi_src is from a loop continue block.
Except for rare edge cases, the result will be the same.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Marek Olšák
656675a490 nir: change nir_lower_mem_access_bit_sizes to an intrinsics pass
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
1cc5f7f868 nir: add nir_shift_channels helper
for later use

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
5760f92e08 nir: print lowp/mediump/highp next to deref types
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
070aaa1c9f nir/lower_io: validate that location and num_slots fit in the bitfields
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
5aa3748b26 nir: remove deprecated nir_io_dont_optimize
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
80ed5653a7 nir: invert the meaning of has_indirect_* flags in nir_lower_io_passes
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00
Marek Olšák
a065a09d22 glsl: don't lower outputs to temps unconditionally
It's done later in nir_lower_io_passes only for shader stages not
supporting indirect access.

Unfortunately we have to add a hack into nir_lower_io_passes to get rid of
output loads. A later commit will remove it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00