Commit graph

11511 commits

Author SHA1 Message Date
Aitor Camacho
fcf53988c4 nir/opt_varyings: Support implementations that cannot compact 16-bits
Add nir_io_compact_to_higher_16 flag so that the pass knows if it can
compact 16-bit varyings into the higher 16 bits of a 32-bit varying.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Aitor Camacho <aitor@lunarg.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38994>
2026-01-14 20:44:41 +00:00
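As a rough illustration of what commit fcf53988c4 above describes, the sketch below packs 16-bit values into 32-bit slots, with a flag deciding whether the upper 16 bits of a slot may be used. The function `pack_16bit_varyings` and its fallback behavior are hypothetical and do not mirror the actual pass; only the flag name nir_io_compact_to_higher_16 comes from the commit message.

```python
def pack_16bit_varyings(values, compact_to_higher_16):
    """Pack 16-bit values into 32-bit slots (illustrative only)."""
    if not compact_to_higher_16:
        # Assumed fallback: each value sits in the low half of its own slot.
        return [v & 0xFFFF for v in values]

    slots = []
    for i in range(0, len(values), 2):
        lo = values[i] & 0xFFFF
        # The second value of each pair goes into the higher 16 bits.
        hi = values[i + 1] & 0xFFFF if i + 1 < len(values) else 0
        slots.append(lo | (hi << 16))
    return slots
```

With the flag set, three 16-bit values fit in two slots; without it, they need three.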
Georg Lehmann
fdfe3acdf0 nir/constant_expression: remove fquantize2f16 denorm special case
Unnecessary, as any fp32 denorm would be 0 here already.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39266>
2026-01-14 17:05:24 +00:00
Georg Lehmann
631a7ef92a nir: make fquantize2f16 32bit only
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39266>
2026-01-14 17:05:24 +00:00
Natalie Vock
cc81c7de23 nir,aco: Clean up useless lowering of sbt_base_amd
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:07 +00:00
Natalie Vock
0a1911b220 radv,aco: Use function call structure for RT programs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:07 +00:00
Natalie Vock
c5d796c902 radv/rt: Use function call structure in NIR lowering
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:06 +00:00
Natalie Vock
9d2c3c3db2 nir/intrinsics: Add incoming/outgoing payload load/store instructions
With RT function calls, these are going to get lowered to:
- load/store_param (incoming payload)
- load/store_var (outgoing payload)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29580>
2026-01-14 14:19:05 +00:00
Alyssa Rosenzweig
41cdc548ee nir/builder: infer txf_ms/txl/txb opcodes
I'm not convinced these really should be separate opcodes at all in NIR, but
that's not what this patch is about. Here we just infer the opcodes in the
texture builder to allow simplified usage.

This lets us drop nir_txl() & nir_txb() helpers in favour of nir_tex(.lod/bias)
which is more normalized. We could also drop nir_txf_ms in favour of nir_txf but
that affects more callsites and is not obviously a win (unlike nir_txl which is
used once and nir_txb which is unused).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39271>
2026-01-14 08:18:15 +00:00
Jesse Natalie
7b82b52fd7 nir: Suppress 'potentially uninitialized local pointer variable used' warning
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39181>
2026-01-13 23:31:28 +00:00
John Anthony
50682ec22c pan: Use correct architecture name for v12+
The official name for the architecture after Valhall is 'Arm 5th
Gen'. In code we can use 'FIFTHGEN' or 'fifthgen', while in
documentation and printed output we should use 'Arm 5th Gen' or '5th
Gen'.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39267>
2026-01-13 13:28:34 +01:00
Lars-Ivar Hesselberg Simonsen
ce3e13774a nir: Add channels to pan texel_buf intrinsics
Rather than loading a single 64bit channel with
load_texel_buf_index_address_pan, load three channels of 32bit each. The
last channel is required by the next commit.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
2026-01-13 10:00:58 +01:00
Lars-Ivar Hesselberg Simonsen
46b44cf941 glsl/nir: Add texture_buffers to shader info
While analyzing glsl shaders, keep a bitmask of texture buffers. This
information is needed by panfrost.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38490>
2026-01-13 10:00:58 +01:00
Faith Ekstrand
6fc1030e4f nir: Add some new panfrost fragment shader intrinsics
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39244>
2026-01-12 18:14:43 +00:00
Lionel Landwerlin
6d19b898e7 anv/brw: prep work for SIMD32 ray queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36181>
2026-01-12 12:19:21 +00:00
Konstantin Seurer
f156743b0f spirv: Add internal f2f16 opcodes
The OpFConvert+FPRoundingModeRTP/FPRoundingModeRTN cannot be used
because GL_EXT_spirv_intrinsics does not allow decorations. Instead,
we need opcodes that encode the rounding mode so that they can be used
in glsl code.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:34:07 +01:00
Konstantin Seurer
6d9cd36db6 nir: Add f2f16_ru/rd opcodes
Those are variants of f2f16 that always round up/down. Constant folding
requires nextafter that supports half floats (util_nextafter).

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:33:23 +01:00
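The role of a half-float nextafter in folding the f2f16_ru/rd opcodes (commit 6d9cd36db6 above) can be sketched as follows: round to the nearest half first, then step one representable value up or down if the nearest result landed on the wrong side. This is a minimal sketch, not Mesa's implementation; `nextafter_half` is a hypothetical stand-in for util_nextafter, and Python's struct format `e` (IEEE 754 binary16) is used for the round-to-nearest conversion.

```python
import math
import struct

def half_bits(h):
    return struct.unpack("<H", struct.pack("<e", h))[0]

def bits_half(b):
    return struct.unpack("<e", struct.pack("<H", b))[0]

def to_half_nearest(x):
    """Round x to the nearest half float; finite overflow becomes +/-inf."""
    try:
        return bits_half(half_bits(x))
    except OverflowError:
        return math.copysign(math.inf, x)

def nextafter_half(h, toward):
    """Next representable half from h in the direction of `toward`."""
    if math.isnan(h) or math.isnan(toward) or h == toward:
        return h
    if h == 0.0:
        # Step from zero to the smallest subnormal of the right sign.
        return bits_half(0x0001 if toward > 0 else 0x8001)
    b = half_bits(h)
    # Sign-magnitude encoding: bits + 1 grows the magnitude.
    b += 1 if (toward > h) == (h > 0) else -1
    return bits_half(b)

def f2f16_ru(x):
    h = to_half_nearest(x)
    return nextafter_half(h, math.inf) if h < x else h

def f2f16_rd(x):
    h = to_half_nearest(x)
    return nextafter_half(h, -math.inf) if h > x else h
```

For a value such as 1.0 + 2**-12, which lies between two adjacent halves, the two functions return the upper and lower neighbor respectively.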
Karol Herbst
2e2b86c64f clc: handle all optional subgroup extensions
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38015>
2026-01-09 21:53:28 +00:00
Alyssa Rosenzweig
4e59199cbb nir: add nir_is_shared_access helper
This is helpful to identify shared mem access for writing more generic code
operating on nir intrinsics.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39219>
2026-01-09 20:51:12 +00:00
Lionel Landwerlin
26e4632f64 nir: add a new push_data_intel intrinsic
We're finally moving on from misusing various intrinsics:
  - load_uniform
  - load_push_constant
  - load_ubo*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
2026-01-09 14:19:46 +00:00
Lionel Landwerlin
799258fdde nir: use load() helper for inline_data_intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
2026-01-09 14:19:45 +00:00
Lionel Landwerlin
c84760a185 nir: add missing divergence handling for ray_query_global_intel
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38975>
2026-01-09 14:19:45 +00:00
Georg Lehmann
93d05cdfd8 nir/opt_algebraic: move fsat last for fsqrt(fsat(a))
This should be exact, even for all special values:

fsqrt(NaN) -> NaN
fsqrt(-0.0) -> 0.0
fsqrt(-Inf) -> NaN
fsqrt(negative finite) -> NaN

So all of these get saturated to +0.0

All numbers >= 1.0 will have a square root >= 1.0,
which will be saturated to 1.0

Moving the fsat guarantees that it can use an output modifier
for hardware that has those, and shouldn't harm other hardware either.

Foz-DB Navi21:
Totals from 255 (0.31% of 82151) affected shaders:
Instrs: 664906 -> 664194 (-0.11%)
CodeSize: 3623500 -> 3619188 (-0.12%)
Latency: 11336397 -> 11335688 (-0.01%); split: -0.01%, +0.00%
InvThroughput: 2716430 -> 2715726 (-0.03%); split: -0.03%, +0.00%
VALU: 442603 -> 441891 (-0.16%)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39202>
2026-01-09 07:34:46 +00:00
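The exactness argument in commit 93d05cdfd8 above can be spot-checked with a small model. The sketch below compares fsqrt(fsat(a)) with fsat(fsqrt(a)) bit for bit over the special values the message lists. It is a model under assumptions: it uses Python doubles rather than fp32, and `fsat` maps NaN to +0.0, following the message's statement that all of these get saturated to +0.0.

```python
import math
import struct

def bits(x):
    """Raw encoding, so that -0.0 and +0.0 compare as different."""
    return struct.pack("<d", x)

def fsat(x):
    # NaN, negative values, -0.0 and +0.0 all become +0.0.
    if not x > 0.0:
        return 0.0
    return min(x, 1.0)

def fsqrt(x):
    if math.isnan(x) or x < 0.0:
        return math.nan
    return math.sqrt(x)  # sqrt(-0.0) is -0.0, sqrt(+inf) is +inf

SAMPLES = [math.nan, -math.inf, -2.0, -0.0, 0.0, 0.25, 1.0, 4.0, math.inf]

def rewrite_is_exact(samples=SAMPLES):
    """True if moving fsat after fsqrt changes no result."""
    return all(bits(fsat(fsqrt(a))) == bits(fsqrt(fsat(a))) for a in samples)
```

A handful of samples is not a proof, but it covers each case the commit message reasons about.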
Ian Romanick
aba079b3af nir/algebraic: Detect missing f on F-strings
Missing f in other cases seems to be caught either elsewhere in the
script or by the C compiler.

Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031>
2026-01-08 13:20:48 -08:00
Ian Romanick
d4a87e85b3 nir/algebraic: Add missing f on F-strings
Without this, nir_algebraic.py was treating "f2i{int_sz}_sat" as the
literal opcode name when it should have been "f2i8_sat" or similar.

Fixes: c49d6e0480 ("nir/algebraic: Elide range clamping of f2u sources")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39031>
2026-01-08 13:19:35 -08:00
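The bug fixed by the two commits above (d4a87e85b3 and aba079b3af) is easy to reproduce in isolation: without the `f` prefix, Python keeps the braces as literal text. The `looks_like_missing_f` check below is a hypothetical illustration of how such a slip could be caught; the commit messages do not say how nir_algebraic.py detects it.

```python
import re

int_sz = 8

# Missing prefix: the placeholder is kept verbatim.
broken = "f2i{int_sz}_sat"

# With the prefix, the placeholder is substituted.
fixed = f"f2i{int_sz}_sat"

def looks_like_missing_f(opcode_name):
    """Hypothetical check: a literal '{name}' survived in an opcode name."""
    return re.search(r"\{\w+\}", opcode_name) is not None
```

Here `broken` stays the literal string "f2i{int_sz}_sat", while `fixed` evaluates to "f2i8_sat".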
Juan A. Suarez Romero
a6330ed4d0 nir: add ACCESS to load_uniforms
v3d/v3dv drivers require this information.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
2026-01-08 12:59:44 +00:00
Georg Lehmann
a706769a0b nir: move exact bit to nir_fp_math_control
Unifies nir per instruction float control.

In the future this can be split into contract/reassoc/transform
like SPIR-V.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (except SPIR-V)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:57 +00:00
Georg Lehmann
ce27703768 spirv: don't set float control for integer dot
As the name says, integer dot products do not operate on floats.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:57 +00:00
Georg Lehmann
eb4737a1dd nir: add nir_alu_instr_is_exact helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:57 +00:00
Georg Lehmann
b70294b91f nir: document signed zero, inf, nan preserve flags
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:56 +00:00
Georg Lehmann
9d027fc870 nir/opt_varyings: actually clone alu math control to different shader
Cc: mesa-stable

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39103>
2026-01-07 09:40:56 +00:00
Marek Olšák
1912a00a91 ALL: use SHA1_DIGEST_LENGTH etc. instead of hardcoding the numbers
Only build_id is switched to use a literal 20 instead of SHA1_DIGEST_LENGTH,
because we will increase SHA1_DIGEST_LENGTH to BLAKE3_KEY_LEN.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39110>
2026-01-07 08:32:33 +00:00
Emma Anholt
1e8a1e9285 nir/algebraic: Apply autopep8.
I needed to reformat the nir_algebraic unit test generation, but we
weren't in pep8 to begin with.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Konstantin Seurer
e2ac22a068 nir: Allow using nir_eval_const_opcode in C++ code
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Konstantin Seurer
295b67f7bf nir: Allow shaders in tests to be annotated
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Konstantin Seurer
2ed16ed1a6 nir/print: Print annotations as comments
Also prints them in the same line as the instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Georg Lehmann
17615b412b nir: prevent undefined behavior in idiv/imod/irem constant folding
Prevents SIGFPE when doing constant evaluation in the upcoming
nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
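For context on commit 17615b412b above: in C, signed division by zero and INT_MIN / -1 are both undefined behavior, and on x86 either typically raises SIGFPE. The sketch below models a guarded 32-bit truncating division. The values returned in the guarded cases (0 and INT32_MIN) are assumptions for illustration; the commit message does not state what the constant folder returns.

```python
INT32_MIN = -2**31

def safe_idiv32(a, b):
    """Truncating 32-bit signed division with the trapping cases guarded."""
    if b == 0:
        return 0              # assumed result for division by zero
    if a == INT32_MIN and b == -1:
        return INT32_MIN      # assumed wrap-around; the true quotient overflows
    q = abs(a) // abs(b)      # truncate toward zero, as C does
    return -q if (a < 0) != (b < 0) else q
```

Python integers never overflow, so the guards are what carry the meaning here; in C they would have to run before the division is evaluated.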
Emma Anholt
feffd0e445 nir: Avoid UB of (int)0xff << 24 evaluating usadd_4x8_vc4.
Caught by UBSan on introduction of nir_opt_algebraic_pattern_test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Konstantin Seurer
a8224e3e00 nir/opt_algebraic: Do not emit patterns for 64bit booleans
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:48 +00:00
Konstantin Seurer
211c7db8e3 nir/opt_algebraic: Remove a pattern for 8bit floats
Avoids assertion failures trying to constant-evaluate the pattern with the
new nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:48 +00:00
Emma Anholt
afece95101 nir/opt_algebraic: Fix return type of fdot(vec(a, 0.0, ...), b).
The replace pattern was generating a vector when it should have been
scalar.  Fixes validation failures with the new algebraic unit tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:47 +00:00
Georg Lehmann
9c6d294111 nir/opcodes: use util_max_num/util_min_num for fmin/fmax constant folding.
Hopefully, this is easier to read.

The SPIR-V behavior has also since been clarified to require associativity.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39137>
2026-01-06 10:55:03 +00:00
Georg Lehmann
026d4cd200 nir/opcodes: fix fsat signed zero correctness
fsat(-0.0) must return +0.0.

Cc: mesa-stable

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39137>
2026-01-06 10:55:03 +00:00
Marek Olšák
86b74563a0 nir/clip_cull_distance_utils: add more assertions validating the type & sizes
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39146>
2026-01-05 21:24:10 +00:00
Marek Olšák
bba2536bb0 nir/clip_cull_distance_utils: fix assertion failures with GL_EXT_mesh_shader
Those outputs are never compact in GLSL mesh shaders. The assertions might
not be needed.

Cc: mesa-stable

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39146>
2026-01-05 21:24:10 +00:00
Alyssa Rosenzweig
347a0ac212 panfrost,nir: drop my lonely Authors tags
We all know who wrote a bunch of Panfrost code. No need to repeat this a million
places, the copyright line is plenty.

In cases where there's a joint me & Italo/Eric/.. tag, I've left it alone to
respect others' potential wishes.

$ find . -type f -exec perl -i -p0e 's/ \*\s+\* Author[^\n]+\s+\*\s+Alyssa[^\n]+\n \*\// \*\//' \{} \;

v2: delete more tags (Boris).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39136>
2026-01-05 17:47:52 +00:00
Georg Lehmann
c8ce0df2d2 nir/opt_algebraic: replace is_negative_zero with constant -0.0
Now that nir_search respects the sign of zero, we don't need
a manual helper for this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00
Georg Lehmann
0d255011ae nir/search: respect sign of zero when comparing floats
Floating point comparison treats -0.0 and 0.0 as equal,
but doing this in nir_search makes patterns signed-zero incorrect.

Foz-DB Navi21:
Totals from 1460 (1.16% of 125360) affected shaders:
MaxWaves: 33704 -> 33710 (+0.02%)
Instrs: 2559362 -> 2558823 (-0.02%); split: -0.02%, +0.00%
CodeSize: 14502684 -> 14496352 (-0.04%); split: -0.05%, +0.00%
VGPRs: 71800 -> 71776 (-0.03%)
Latency: 19274782 -> 19274267 (-0.00%); split: -0.01%, +0.00%
InvThroughput: 3307870 -> 3299091 (-0.27%); split: -0.27%, +0.00%
SClause: 158698 -> 158703 (+0.00%); split: -0.00%, +0.00%
Copies: 240291 -> 241003 (+0.30%); split: -0.03%, +0.32%
PreSGPRs: 73203 -> 73206 (+0.00%); split: -0.00%, +0.01%
PreVGPRs: 62515 -> 62508 (-0.01%)
VALU: 1564970 -> 1564331 (-0.04%); split: -0.04%, +0.00%
SALU: 378546 -> 378654 (+0.03%); split: -0.00%, +0.03%

This difference is surprisingly positive; the only patterns affected
previously did a signed-zero-incorrect bcsel -> b2f.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00
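The distinction behind commit 0d255011ae above can be shown in a few lines: an ordinary float comparison cannot tell -0.0 from 0.0, whereas a comparison of the raw encodings can. The helper `const_matches` is a hypothetical stand-in; the actual nir_search code is not shown in the commit message.

```python
import struct

def float_bits(x):
    """Raw fp32 encoding of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def const_matches(pattern_const, value):
    """Match a pattern constant against a value, respecting the sign of zero."""
    return float_bits(pattern_const) == float_bits(value)
```

A pattern written for 0.0 then no longer fires on -0.0, which is why the neighboring commits in this merge request add explicit -0.0 variants where a rewrite is valid for both zeros.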
Georg Lehmann
7d2a946730 nir/opt_algebraic: canonicalize scmp with -0.0
We already do this for non fused comparisons.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00
Georg Lehmann
2824c12252 nir/opt_algebraic: explicitly add some -0.0 variants of patterns
Foz-DB Navi21:
Totals from 5 (0.00% of 125360) affected shaders:
CodeSize: 28812 -> 28744 (-0.24%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39123>
2026-01-03 12:42:23 +00:00
Timur Kristóf
2ecb7a9e18 nir: Add pass to lower workgroup size
Lowers a shader to use a smaller workgroup to do the same work,
while it will still appear as a bigger workgroup to applications.

To achieve this, the pass augments the CF of the shader
so that each real subgroup will execute two or more logical
subgroups. A logical subgroup represents what the application
can observe as a subgroup.

The size of a logical subgroup is the same as a real subgroup.
Only one logical subgroup may be executed per real subgroup
at the same time. This ensures that all subgroup operations
keep working and the subgroup invocation ID stays the same.

- When the CF contains barriers, we can't just repeat
  the code and we need to augment each CF node individually
  so that they are aware of logical subgroups.

- In case parts of the CF don't contain any barriers, we can simply
  repeat and predicate that CF for each logical subgroup.
  It is technically not necessary to implement this strategy, but
  in practice it helps reduce the number of branches in the shader
  and therefore improves compile times.

The pass is mainly intended for working around HW limitations,
for example when the HW has an upper limit on the workgroup size
or doesn't support workgroups at all, but the API requires a
certain minimum.

Notes:
- Only applicable to shader stages that use workgroups
- Hits an assertion when called on smaller workgroups
- Always flattens workgroup size to 1D
- Creates local variables
- Does not change subgroup size
- Variable workgroup size not supported yet, maybe later

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Anna Maniscalco <anna.maniscalco2000@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37985>
2026-01-02 13:33:54 -06:00
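The barrier-free "repeat and predicate" strategy from commit 2ecb7a9e18 above can be modeled as a loop in which each real subgroup runs several logical subgroups one after another. This is a simplified simulation, not the pass itself: the function name, the mapping from real to logical subgroup IDs, and the divisibility requirements are assumptions, and barriers are not modeled.

```python
def run_lowered(kernel, logical_wg_size, real_wg_size, subgroup_size):
    """Run `kernel` once per logical invocation using a smaller real workgroup."""
    # Assumed restrictions for this sketch only.
    assert logical_wg_size > real_wg_size
    assert logical_wg_size % real_wg_size == 0
    assert real_wg_size % subgroup_size == 0

    repeats = logical_wg_size // real_wg_size
    real_subgroups = real_wg_size // subgroup_size

    for real_sg in range(real_subgroups):
        # One logical subgroup at a time per real subgroup.
        for rep in range(repeats):
            logical_sg = rep * real_subgroups + real_sg  # assumed mapping
            for lane in range(subgroup_size):
                # Flattened 1D logical invocation index; the lane is unchanged.
                kernel(logical_sg * subgroup_size + lane, lane)
```

Because a logical subgroup has the same size as a real one and only one runs at a time, the lane passed to `kernel` always equals the logical index modulo the subgroup size, which matches the message's point that the subgroup invocation ID stays the same.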