Commit graph

242 commits

Author SHA1 Message Date
Emma Anholt
ce7ad2639a nir: Fix C UB in imad24_ir3 evaluation.
Same fix as imul24, technically you can't shift into the top bit of the
int32, but the util helper does it right.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39369>
2026-01-26 05:39:39 +00:00
Rhys Perry
625afb0d29 nir: add fcanonicalize
v2(Georg Lehmann):
Always remove fcanonicalize if denorms must be neither flushed nor preserved.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39180>
2026-01-19 16:11:29 +00:00
Emma Anholt
7dbd170a7f nir/opcodes: Cast isub/iadd3's args to uint to avoid UB integer underflow.
Same treatment as iadd itself got.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:37 +00:00
Emma Anholt
8529aaa399 nir/opcodes: Avoid technical UB left shifting ints.
We all know that (int)0xff << 24 is fine, but UBSan doesn't like it.
These were triggered by nir_opt_algebraic_pattern_tests.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:37 +00:00
Konstantin Seurer
079d416e99 nir: Fix the types of udot_.*_uadd_sat
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:37 +00:00
Emma Anholt
0dc3276a26 nir: Define udot_2x16_uadd_sat to have UB according to the SPIRV spec.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:32 +00:00
Emma Anholt
f638eb1b85 nir: Define extract/insert_i8 and friends to be UB if the shift is too large.
These opcodes are generated inside NIR algebraic when the shift is
constant, but this will help us do automated algebraic pattern testing
with arbitrary inputs that are unaware of the opcode's restrictions.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:32 +00:00
Emma Anholt
045ae759a5 nir: Specify f2i/f2u as undefined if the float is out of range of the int.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:32 +00:00
Emma Anholt
94f0e2dbaf nir/constant_expressions: Set the poison flag during i/ubitfield_extract.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:32 +00:00
Emma Anholt
b375da7f2a nir: Let nir_eval_const_opcode() return a poison mask in case of UB.
This is unused by any callers currently, but will be useful for nir
algebraic pattern testing, and as a way to turn our comments in
nir_opcodes.py into actual C code.  For now, always returns false.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:32 +00:00
Emma Anholt
f6008645f6 nir: Fix constant evaluation of non-32-bit bitfield_extract.
Caught by nir_opt_algebraic_pattern_tests.

Fixes: 226b0e28db ("nir: generalize bitfield insert/extract sizes")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39076>
2026-01-15 19:09:29 +00:00
Georg Lehmann
631a7ef92a nir: make fquantize2f16 32bit only
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39266>
2026-01-14 17:05:24 +00:00
Konstantin Seurer
6d9cd36db6 nir: Add f2f16_ru/rd opcodes
Those are variants of f2f16 that always round up/down. Constant folding
requires nextafter that supports half floats (util_nextafter).

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37883>
2026-01-10 11:33:23 +01:00
Georg Lehmann
17615b412b nir: prevent undefined behavior in idiv/imod/irem constant folding
Prevents SIGFPE when doing constant evaluation in the upcoming
nir_opt_algebraic_pattern_tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Emma Anholt
feffd0e445 nir: Avoid UB of (int)0xff << 24 evaluating usadd_4x8_vc4.
Caught by UBSan on introduction of nir_opt_algebraic_pattern_test.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39184>
2026-01-06 21:27:49 +00:00
Georg Lehmann
9c6d294111 nir/opcodes: use util_max_num/util_min_num for fmin/fmax constant folding.
Hopefully, this is easier to read.

The SPIR-V behavior has also since been clarified to require associativity.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39137>
2026-01-06 10:55:03 +00:00
Georg Lehmann
026d4cd200 nir/opcodes: fix fsat signed zero correctness
fsat(-0.0) must return +0.0.

Cc: mesa-stable

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39137>
2026-01-06 10:55:03 +00:00
Job Noorman
0b82b803d9 nir,ir3: rename umul_low to umul_16x16
This is more in line with similar opcodes like umul_32x16.

Also change its const expr: the masking based on bit size was
unnecessary as it is only defined for 32 bits. Use simple casts instead.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37863>
2025-10-14 12:54:54 +00:00
Ian Romanick
986086c846 nir: Add saturating float to integer conversion opcodes
v2: Add a comment around has_f2[ui]_sat explaining which opcodes it
enables. Suggested by Georg. Cast u_uintN_max and friends to double in
nir_opcodes.py. This ensures that an exact conversion is made.
Eliminate duplicate conversions from half float to double. Both noticed
by Georg.

v3: Apply "NaN should be zero" fix suggested by Georg.

Co-authored-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37186>
2025-10-10 17:25:05 +00:00
Ella Stanforth
082e6369f9 nir: add v3d specific intrinsic normalised to float conversion
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35820>
2025-09-30 12:48:42 +00:00
Pierre-Eric Pelloux-Prayer
cc4b50b023 nir/opcodes: use u_overflow to fix incorrect checks
Operands of an addition will be promoted to int making the a+b<a
kind of checks ineffective.

Use u_overflow.h helpers to perform the check correctly.
The commit would be simpler if it used __typeof__ like so:

   util_add_check_overflow(__typeof__(src0), src0, src1)

But typeof only became a standard in C23 so this commit instead extends
nir_opcodes a bit to allow opcodes that need the dest_type to get it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37331>
2025-09-23 09:09:55 +02:00
Simon Perretta
6dd0a5ee2d pvr, pco: switch to clc query shaders
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37439>
2025-09-22 14:52:04 +01:00
Simon Perretta
6edb72d28b pco: replace {un,}packing alu ops with intrinsics
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:19 +00:00
Simon Perretta
8104ef4e01 pco: support 1010102 snorm, [us]scaled formats
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:19 +00:00
Simon Perretta
672541d036 nir, asahi: commonize interleave_agx
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:12 +00:00
Simon Perretta
78062fbb75 pvr, pco: improved image write (with format) support, handle 111110
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:11 +00:00
Simon Perretta
ed652e10fc pco: force image/texture array coordinate f2i32 conversions to be rtne
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:11 +00:00
Simon Perretta
b50f0b47d2 pco: add support for sscaled8* formats
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:09 +00:00
Simon Perretta
db686e190a pvr, pco: per frag/vertex input/output rework
Adds support for packing and unpacking r10g10b10a2 unorm and
r11g11b10 float formats, as well as partial 2x16 and 4x8 formats.

Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:09 +00:00
Simon Perretta
b7c0863b97 pco: add uadd64_32 op
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:08 +00:00
Simon Perretta
8ec174b3f9 pco: add support for various selection, complex, trig ops
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36412>
2025-09-16 18:26:08 +00:00
Alyssa Rosenzweig
b9c2579ae0 nir: unmark 24b multiply as associative
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig
076f245df8 nir: restrict associativity to binary operations
mathemtically, associativity is only defined for binary operations. I have no
idea what "associativity" would even mean for imad. I can kinda see the idea for
iadd3 but iadd3 should not be formed until after reassociating adds so the point
is moot. Unmark the
"associative" ternary operations, and assert that associativity implies binary.

nothing uses associativity yet, so this doesn't cause any functional change.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:19 +00:00
Alyssa Rosenzweig
e466b8735b nir: introduce "inexact associative" property
nothing currently uses the associative flag, but they will change soon. we need
to stop incorrectly marking fmul/fadd/etc as associative, because they're not,
but they almost are. distinguish these properties so we can correctly
handle floating point rules without any opcode-based special casing.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:19 +00:00
Georg Lehmann
d672737372 nir,aco: add byte_perm_amd
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36115>
2025-07-16 11:46:52 +00:00
Georg Lehmann
f047a67fba nir,aco: optimize FP16_OFVL pattern created by vkd3d-proton
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:27 +00:00
Georg Lehmann
5addbf63f9 nir: add float8 conversion opcodes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Samuel Pitoiset
226b0e28db nir: generalize bitfield insert/extract sizes
Original patch from Alyssa Rosenzweig

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35209>
2025-06-04 09:37:53 +00:00
Rhys Perry
397920c16e nir: fix left shift of negative value in ibfe constant folding
Fixes "left shift of negative value -128" with parallel_rdp/00f93a9497dfbb3b
and UBSan.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
2025-06-03 09:45:01 +00:00
Rhys Perry
78aae4b1ba nir: fix signed overflow in pack_half_2x16 constant folding
Without this cast, the left shift is promoted to 'int'.

Fixes "left shift of 50432 by 16 places cannot be represented in type 'int'"
with horizon_zero_dawn/001064f580f8e3be and UBSan.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
2025-06-03 09:45:01 +00:00
Rhys Perry
6852538ba0 nir: fix unpack_unorm_2x16/unpack_snorm_2x16 constant folding
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255>
2025-06-03 09:45:01 +00:00
Alyssa Rosenzweig
759dc70bde nir: generalize bitfield_reverse bit size
No reason we can't reverse other bit sizes, we just need to generalize the
constant folding & bit size lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35198>
2025-05-28 16:29:30 +00:00
Georg Lehmann
ba63263f32 nir: add bfdot2_bfadd and use it for lowering bfdot if supported
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34768>
2025-05-09 11:20:26 +00:00
Caio Oliveira
cf4021f93c nir: Add opcodes for BFloat16
SPV_KHR_bfloat16 requires a small set of operations,
since it doesn't support all the arithmetic ops.

This patch adds conversions to/from Float32 and also
the necessary ops (bfdot, bffma, bfmul) to implement
SpvOpDot using the same lowering approach than the
Float32 counterpart.

Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34105>
2025-04-29 16:29:36 +00:00
Benjamin Lee
252c59602e panfrost: implement 16-bit ldexp
Bifrost LDEXP.v2f16 takes a 16-bit exponent, which requires messy
lowering. The codegen for this is quite bad currently, but would be
improved by implementing unpack_32_2x16_split_*, and by fusing
comparisons with CSEL.

The main alternative is converting to F32, then LDEXP.f32, then
converting back to F16. This has better codegen for dynamic exponents
currently, but worse in the common case with a constant exponent where
all the saturating cast logic can be folded.

Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.ldexp.compute.vec2
when shaderFloat16 is enabled in panvk.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33637>
2025-02-27 16:49:11 +00:00
Mel Henning
11b8c8b8e6 nak,nir: Add 64-bit lea_nv
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>
2025-02-13 17:36:41 +00:00
Mel Henning
0470643047 nak,nir: Add 32-bit nir_op_lea_nv and use it
Changes code size by -0.80% on shaderdb.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32517>
2025-02-13 17:36:41 +00:00
Alyssa Rosenzweig
bd89279dd4 nir: add lower_scratch_to_var pass
to ease opencl pain.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32529>
2024-12-12 21:16:13 +00:00
Karmjit Mahil
b79994e92d nir,ir3: Add icsel_eqz
In IR3 `sel.b32` works based on the 0 so add `icsel_eqz` to fuse the
cmp and sel that we'd otherwise need.

total Instruction Count in shared programs: 1112814 -> 1110473 (-0.21%)
Instruction Count in affected programs: 162701 -> 160360 (-1.44%)
helped: 81
HURT: 29
Instruction count are helped.

total MOV Count in shared programs: 86777 -> 88671 (2.18%)
MOV Count in affected programs: 28119 -> 30013 (6.74%)
helped: 1
HURT: 292
Mov count are HURT.

total COV Count in shared programs: 15070 -> 14962 (-0.72%)
COV Count in affected programs: 5770 -> 5662 (-1.87%)
helped: 76
HURT: 2
Cov count are helped.

total Last helper instruction in shared programs: 592729 -> 590638 (-0.35%)
Last helper instruction in affected programs: 91331 -> 89240 (-2.29%)
helped: 30
HURT: 1
Last helper instruction are helped.

total Instructions with SS sync bit in shared programs: 29336 -> 29546 (0.72%)
Instructions with SS sync bit in affected programs: 4702 -> 4912 (4.47%)
helped: 8
HURT: 43
Instructions with ss sync bit are HURT.

total Estimated cycles stalled on SS in shared programs: 111590 -> 112401 (0.73%)
Estimated cycles stalled on SS in affected programs: 27708 -> 28519 (2.93%)
helped: 21
HURT: 61
Estimated cycles stalled on ss are HURT.

total cat1 instructions in shared programs: 101933 -> 103695 (1.73%)
cat1 instructions in affected programs: 35804 -> 37566 (4.92%)
helped: 18
HURT: 290
Cat1 instructions are HURT.

total cat2 instructions in shared programs: 380299 -> 377499 (-0.74%)
cat2 instructions in affected programs: 128609 -> 125809 (-2.18%)
helped: 322
HURT: 0
Cat2 instructions are helped.

Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32189>
2024-12-06 08:42:36 +00:00
Job Noorman
22fc90a116 nir: add ir3-specific bitwise triop opcodes
ir3 has a number of bitwise triops (e.g., shrm == (src0 >> src1) & src2)
that don't have NIR-equivalents. Doing instruction selection for them is
a lot more convenient using algebraic patterns than to have to manually
match for them. This patch add NIR opcodes for these instructions.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32181>
2024-11-28 06:19:59 +00:00