Commit graph

10704 commits

Author SHA1 Message Date
Alyssa Rosenzweig
042adf3cc5 nir/opt_algebraic: optimize signed pow in Control
used in a post-processing shader which goes 896 instrs -> 749 instrs.

In my Control fossil:

   Totals from 2 (0.63% of 319) affected shaders:
   Instrs: 2078 -> 1841 (-11.41%)
   CodeSize: 14540 -> 12800 (-11.97%)
   ALU: 1779 -> 1626 (-8.60%)
   FSCIB: 1779 -> 1626 (-8.60%)
   Uniforms: 370 -> 372 (+0.54%)

In radv_fossils, there are affected shaders in Dredge.

   Totals from 4 (0.01% of 54019) affected shaders:
   Instrs: 2306 -> 2294 (-0.52%)
   CodeSize: 16594 -> 16534 (-0.36%)
   ALU: 2010 -> 2004 (-0.30%)
   FSCIB: 2010 -> 2004 (-0.30%)
   Uniforms: 1138 -> 1146 (+0.70%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
2025-07-08 17:09:16 +00:00
Alyssa Rosenzweig
2765017553 nir: fuse ffma even with float controls
The fmul+fadd -> fma rules in nir_opt_algebraic are marked imprecise,
because they are a contraction. However, they respect signed zero/Inf/NaN rules.
As such, it is legal to do this fusion with shader float controls as long as the
exact bit is not set (mapping to SPIR-V NoContract).

Unfortunately, NIR's imprecise rules do not distinguish between contraction
issues versus float special case issues, forcing nir_search to skip all
imprecise rules when any shader float control modes are used. This notably
affects DXVK, which sets shader float controls to get D3D11 float behaviour and
hence loses FMA fusing.

Therefore, we plumb in the exact bit to express NoContract independent of the
float controls, and weaken the requirement for fma fusion to allowable
contraction. For fma splitting, it's a similar issue, as inexact GLSL fma in
SPIR-V is just a multiply add that we're allowed to contract rather than the
real deal.

Drivers that use their own FMA fusing passes (notably, Intel and AMD) are
unaffected, but DXVK-capable drivers using fuse_ffma should like this. Results
on hk shown:

Totals from 2194 (4.06% of 54019) affected shaders:
MaxWaves: 2174272 -> 2175936 (+0.08%); split: +0.08%, -0.01%
Instrs: 1173283 -> 1131494 (-3.56%); split: -3.57%, +0.01%
CodeSize: 8568168 -> 8381724 (-2.18%); split: -2.18%, +0.01%
Spills: 1094 -> 747 (-31.72%)
Fills: 988 -> 681 (-31.07%)
Scratch: 4444 -> 3820 (-14.04%)
ALU: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
FSCIB: 953032 -> 913149 (-4.18%); split: -4.19%, +0.01%
IC: 215398 -> 215274 (-0.06%)
GPRs: 139865 -> 139032 (-0.60%); split: -1.56%, +0.96%
Uniforms: 414886 -> 414466 (-0.10%); split: -0.14%, +0.04%
Preamble instrs: 646398 -> 644017 (-0.37%); split: -0.43%, +0.07%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35989>
2025-07-08 17:09:16 +00:00
Daniel Schürmann
2c51a8870d nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar()
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Similar to nir_lower_alu_width(), the callback can return the
desired number of components for a phi, or 0 for no lowering.

The previous behavior of nir_lower_phis_to_scalar() with lower_all=true
can be elicited via nir_lower_all_phis_to_scalar() while the previous
behavior with lower_all=false now corresponds to nir_lower_phis_to_scalar()
with NULL callback.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Daniel Schürmann
23b7b3b919 nir/lower_phis_to_scalar: remove exec_list dead_instrs
No need to free the instructions at this point.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Daniel Schürmann
f6e0f4813c nir: remove recursive check in nir_lower_phis_to_scalar()
This check causes unnecessary overhead and can be replaced by simply
checking whether a phi_src is from a loop continue block.
Except for rare edge cases, the result will be the same.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35783>
2025-07-08 15:33:59 +00:00
Marek Olšák
656675a490 nir: change nir_lower_mem_access_bit_sizes to an intrinsics pass
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
1cc5f7f868 nir: add nir_shift_channels helper
for later use

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
5760f92e08 nir: print lowp/mediump/highp next to deref types
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
070aaa1c9f nir/lower_io: validate that location and num_slots fit in the bitfields
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
5aa3748b26 nir: remove deprecated nir_io_dont_optimize
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35999>
2025-07-08 14:01:56 +00:00
Marek Olšák
80ed5653a7 nir: invert the meaning of has_indirect_* flags in nir_lower_io_passes
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00
Marek Olšák
1c4929645b glsl: don't call nir_lower_global_vars_to_local twice in preprocess_shader
it's called again below

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00
Marek Olšák
425a89cb75 glsl: don't call nir_split_var_copies in preprocess_shader
it seems to have no effect

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00
Marek Olšák
a065a09d22 glsl: don't lower outputs to temps unconditionally
It's done later in nir_lower_io_passes only for shader stages not
supporting indirect access.

Unfortunately we have add a hack into nir_lower_io_passes to get rid of
output loads. A later commit will remove it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:44 +00:00
Marek Olšák
1124587495 glsl: don't lower inputs to temps unconditionally
It's done later in nir_lower_io_passes only for shader stages not
supporting indirect access.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:43 +00:00
Marek Olšák
9083e8b984 glsl: fix a possible crash in gl_nir_lower_xfb_varying
If the last block is empty, nir_block_last_instr returns NULL, which
sets the cursor to NULL, which crashes.

I think this can't crash currently because if xfb is present, there is
always at least 1 output store in the last block due to
lower_io_vars_to_temporaries, but that won't be true after we stop
calling it in a later commit.

Fixes: fa9cee4247 - glsl: implement lower_xfb_varying() as a NIR pass

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:43 +00:00
Marek Olšák
89285e25b6 nir: remove nir_shader_compiler_options::lower_all_io_to_temps
All drivers should report support_indirect_* correctly, so this
is redundant.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35945>
2025-07-08 06:11:43 +00:00
Alyssa Rosenzweig
d31cb824df treewide: use VARYING_BIT_*
Some checks failed
macOS-CI / macOS-CI (dri) (push) Has been cancelled
macOS-CI / macOS-CI (xlib) (push) Has been cancelled
Via Coccinelle patch generated by the following Python:

  varys = [ "POS", "COL0", "COL1", "FOGC", "TEX0", "TEX1", "TEX2", "TEX3", "TEX4",
           "TEX5", "TEX6", "TEX7", "PSIZ", "BFC0", "BFC1", "EDGE", "CLIP_VERTEX",
           "CLIP_DIST0", "CLIP_DIST1", "CULL_DIST0", "CULL_DIST1", "PRIMITIVE_ID",
           "PRIMITIVE_COUNT", "LAYER", "VIEWPORT", "FACE",
           "PRIMITIVE_SHADING_RATE", "PNTC", "TESS_LEVEL_OUTER",
           "TESS_LEVEL_INNER", "PRIMITIVE_INDICES", "BOUNDING_BOX0",
           "BOUNDING_BOX1", "VIEWPORT_MASK", "CULL_PRIMITIVE" ]
  t = """
  @@
  @@

  -(1 << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -(1ull << VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  @@
  @@

  -BITFIELD64_BIT(VARYING_SLOT_${V})
  +VARYING_BIT_${V}

  """
  for v in varys:
      from mako.template import Template
      print(Template(t).render(V = v))

Closes: #13453
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [panfrost, common]
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [broadcom]
Reviewed-by: Corentin Noël <corentin.noel@collabora.com> [virgl]
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> [zink]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35917>
2025-07-04 19:01:04 +00:00
Georg Lehmann
045ddb992a nir/opt_algebraic: optimize 16bit vec2 comparison followed by b2i16 using usub_sat
Helps vectorized emulated fp16 -> fp8 conversions

No Foz-DB changes.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35876>
2025-07-03 20:08:39 +00:00
Alyssa Rosenzweig
f853d285ef nir/lower_tex: optimize LOD bias lower for txl
make sure we can fold the f2f away. alternatively f2fmp would work
here but details.

elden ring:

Totals from 137 (4.27% of 3206) affected shaders:
Instrs: 485455 -> 484904 (-0.11%)
CodeSize: 3218638 -> 3215338 (-0.10%)
ALU: 308071 -> 307520 (-0.18%)
FSCIB: 308071 -> 307520 (-0.18%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35909>
2025-07-03 16:41:51 +00:00
Alyssa Rosenzweig
b992703477 nir/lower_system_values: optimize global ID
for drivers where we need to lower a base_workgroup_id but not global IDs.
rather than lowering the whole global ID to stick the base workgroup ID in
there, just add the workgroup offset to the final thread position.

Elden ring fossils:

Totals from 52 (1.62% of 3206) affected shaders:
Instrs: 48355 -> 48233 (-0.25%); split: -0.31%, +0.06%
CodeSize: 331912 -> 331148 (-0.23%); split: -0.28%, +0.05%
ALU: 30853 -> 30674 (-0.58%); split: -0.70%, +0.12%
FSCIB: 30853 -> 30674 (-0.58%); split: -0.70%, +0.12%
IC: 9054 -> 8958 (-1.06%)
GPRs: 4184 -> 4216 (+0.76%)
Uniforms: 6703 -> 6677 (-0.39%); split: -1.61%, +1.22%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35909>
2025-07-03 16:41:51 +00:00
Alejandro Piñeiro
0003e16fc6 nir/lower_clip: update comment
As the lowering mentioned there got renamed twice:

  commit b085016f94
  Author: Rob Clark <robclark@freedesktop.org>
  Date:   Fri Mar 25 13:52:26 2016 -0400

    nir: rename lower_outputs_to_temporaries -> lower_io_to_temporaries

    Since it will gain support to lower inputs, give it a more generic name.

  commit 1754507d49
  Author: Marek Ol¨ák <maraeo@gmail.com>
  Date:   Wed Jun 25 19:05:19 2025 -0400

    nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries

    Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>

Reviewed-by: Marek Ol¨ák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35855>
2025-07-02 20:56:53 +00:00
Marek Olšák
4263b49778 ac/nir: remove ngg_scratch LDS ABI, allocate it in the lowering pass
This is a cleanup.

Old gs LDS layout: [es outputs][gs outputs][scratch]
Old nogs LDS layout: [xfb/cull][scratch]

New gs LDS layout: [es outputs][scratch|gs outputs]
New nogs LDS layout: [scratch|xfb/cull]

The LDS scratch is moved to the beginning of the preceding buffer in LDS,
while the addresses in that LDS buffer are offset by the scratch size.
It effectively merges the LDS scratch with the preceding buffer in LDS.
Thanks to that, we no longer need the ngg_scratch ABI and the offset
in a user SGPR.

The lowering passes now return the LDS scratch size, which is used
by the drivers to determine the final LDS size.

The ngg_lds_layout SGPR is now unused without GS in RADV.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35352>
2025-07-02 20:27:41 +00:00
David Neto
673f684ddd nir: Support printing cmat constants
A cooperative matrix can only be constructed from a single
scalar value. Print that value, wrapped by a function call that
looks like a type-constructor.

This adds a test case that will otherwise assert out in spirv2nir.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35757>
2025-07-02 16:48:51 +00:00
Alyssa Rosenzweig
3c2f46fcac treewide: use nir_break_if with named if
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Via Coccinelle patch:

    @@
    expression builder, condition;
    identifier nif;
    @@

    -nir_if *nif = nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, nif);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:54 -04:00
Alyssa Rosenzweig
67237b6f1b treewide: use nir_break_if
Via Coccinelle patch:

    @@
    expression builder, condition;
    @@

    -nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, NULL);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:24 -04:00
Karol Herbst
b3c245ecf2 clc: add support for cl_ext_image_unorm_int_2_101010
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35469>
2025-06-30 18:04:59 +00:00
Alyssa Rosenzweig
7fd7b18b38 nir: rename AGX geom/tess intrinsics
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
to the new common code name.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
d13b321201 nir/lower_gs_intrinsics: drop stuff added for AGX
AGX now vendors a significantly different version of this pass, so the common
one doesn't need the stuff added for AGX.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:10 +00:00
Alyssa Rosenzweig
16b53d356a nir: add rasterization_stream sysval
for plumbing transformFeedbackRasterizationStreamSelect (in turn for exercising
more CTS and proving out my design).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:06 +00:00
Alyssa Rosenzweig
805ef6cc17 nir: add intrinsics for geometry shader lowering
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:05 +00:00
Alyssa Rosenzweig
4f7cae5e61 nir/opt_algebraic: add trichotomy identity
In https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802 we will
significantly rework geometry shaders & transform feedback. In the new approach,
transform feedback is executed as part of the hardware vertex shader, meaning
the vertex shader needs to write out all the "copies" of the same value into
different parts of the XFB buffer. In the general case of a GS writing triangle
strips, we get 0-3 copies. This is good and lets us parallelize XFB better with
GS.

In the case of a VS alone with XFB, we insert a passthrough GS. In that case
special case, we can only get at most 1 copy, so if we can prove the length of
the output strip is 3 we can delete 2/3 of the shader.

Anyway, the only thing preventing NIR from doing that optimization is failing to
see through some conditionals, fixed by optimizing with the law of trichotomy.

We could add other variants of this pattern (signed vs unsigned, iand vs
ior/ixor) if we expect anything else to hit this other than my boutique use
case.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802>
2025-06-30 16:24:04 +00:00
Robert Mader
a166d7609f gles: Add support for 10/12/16 bit SW decoder YCbCr formats
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Co-Authored-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34303>
2025-06-30 11:56:23 +00:00
Rhys Perry
7b291a33d4 nir/search: fix dumping of conversions
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Rhys Perry
08859cbe50 nir/lower_bit_size: fix bitz/bitnz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: 6585209cdd ("nir/lower_bit_size: mask bitz/bitnz src1 like shifts")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35770>
2025-06-30 10:41:39 +00:00
Mel Henning
8795006994 nir/opt_uniform_subgroup: Handle vote_feq
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Brings the vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_dvec4_vertex
from 234 to 169 instructions on NAK.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
70fccc59fc nir/opt_uniform_subgroup: Handle vote_ieq
No shader-db changes here, but it does improve some cts shaders, eg. the
vertex shader in
dEQP-VK.subgroups.vote.framebuffer.subgroupallequal_i64vec4_vertex
goes from 80 to 56 instructions with NAK

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Mel Henning
10acb44c64 nir: Split lower_vote_eq into int/float versions
Recent nvidia hardware has a native instruction for
nir_intrinsic_vote_ieq but not for nir_intrinsic_vote_feq. So, split
this boolean into two so we can contol the lowering separately for each
instruction.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35778>
2025-06-28 16:10:50 +00:00
Lionel Landwerlin
fcf4401824 brw: handle wa_18019110168 with independent shader compilation
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35103>
2025-06-28 05:55:35 +00:00
Matt Turner
102d7409ef nir: Add convert_cmat_intel intrinsic
This intrinsic will be used to implement matrix type and layout
conversions in the backend compiler.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35616>
2025-06-27 01:26:22 +00:00
James Price
10ae673368 spirv: Fix cooperative matrix in OpVariable initializer
Check for cooperative matrix types first in the
nir_lower_variable_initializers pass, since they are also considered
to be scalar types.

Fixes: 7e6cd395c7 ("nir: Handle cmat types in lower_variable_initializers")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13388
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35668>
2025-06-26 22:24:31 +00:00
Konstantin Seurer
aacfc663cb nir: Add nir_lower_halt_to_return
This is a lowering pass that was implemented by multiple drivers.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33003>
2025-06-26 20:12:12 +00:00
Marek Olšák
1754507d49 nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
1e03827c77 nir: rename nir_lower_io_arrays_to_elements -> nir_lower_io_array_vars_to_elements
same for *_no_indirects

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:54 +00:00
Marek Olšák
3713e2d580 nir: rename nir_lower_clip_cull_distance_arrays -> nir_lower_clip_cull_distance_array_vars
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:53 +00:00
Marek Olšák
adb17a8609 nir: move nir_recompute_io_bases into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:53 +00:00
Marek Olšák
97743980ce nir: remove unused nir_force_mediump_io & nir_unpack_16bit_varying_slots
I think I added these.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
aefea49dad nir: move lots of code from nir_lower_io.c into new nir_lower_explicit_io.c
nir_lower_io is just for regular inputs/outputs.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
5bd3e0c08c nir: move nir_assign_var_locations to freedreno (its only use)
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:52 +00:00
Marek Olšák
c8cda0dc1a nir: move nir_io_add_const_offset_to_base into its own file
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35760>
2025-06-26 18:20:51 +00:00