Rhys Perry
ae6ad8977b
nir/uub: improve iand with constant sources
...
fossil-db (navi21):
Totals from 9 (0.01% of 79653) affected shaders:
Instrs: 11878 -> 11868 (-0.08%)
CodeSize: 61572 -> 61508 (-0.10%)
Latency: 44585 -> 44581 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 9697 -> 9660 (-0.38%)
VALU: 8889 -> 8876 (-0.15%)
SALU: 1339 -> 1342 (+0.22%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514 >
2025-06-17 13:27:59 +00:00
Rhys Perry
8ee5440073
nir/uub: improve ishl/imul with constant sources
...
fossil-db (navi21):
Totals from 1 (0.00% of 79653) affected shaders:
Instrs: 1339 -> 1338 (-0.07%)
CodeSize: 7244 -> 7240 (-0.06%)
Latency: 19827 -> 19822 (-0.03%)
InvThroughput: 9913 -> 9911 (-0.02%)
SALU: 419 -> 418 (-0.24%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514 >
2025-06-17 13:27:59 +00:00
Olivia Lee
88ac602cc2
panvk: implement shaderInputAttachmentArrayNonUniformIndexing
...
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35408 >
2025-06-13 19:02:19 +00:00
Christian Gmeiner
b30b87c096
nir/inline_uniforms: Convert to use nir_shader_intrinsics_pass(..)
...
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35463 >
2025-06-12 22:35:48 +02:00
Marek Olšák
fa2e7c3dfd
nir: return progress from nir_group_loads, nir_inline_uniforms
...
so that NIR_PASS is usable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:37 +00:00
Marek Olšák
0cbcb72869
nir/opt_vectorize_io: work around a 16-bit IO bug for RADV
...
If nir_opt_vectorize_io isn't called, 16-bit IO is broken.
This is a workaround to keep RADV working and consume incorrect NIR
while other drivers consume correct NIR.
Hopefully this will be removed ASAP.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:37 +00:00
Marek Olšák
b636e5ca66
nir: add nir_clear_mediump_io_flag
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:36 +00:00
Marek Olšák
13005d5e4e
nir/xfb_info: don't merge incompatible XFB outputs to fix mediump
...
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:36 +00:00
Marek Olšák
118c0e6991
nir/opt_vectorize_io: fix vectorizing 16-bit XFB
...
Tested with mediump.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:36 +00:00
Marek Olšák
caddd67b8c
nir/opt_vectorize_io: don't vectorize 16-bit IO to vec8 - it's illegal
...
NIR represents low bits of 16-bit IO as a separate vec4, and high bits as
another separate vec4. There is no representation that allows vec8.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:35 +00:00
Marek Olšák
1f80ff5550
nir/opt_vectorize_io: convert bool merge_low_high_16_to_32 to an enum
...
refactoring for the next commit
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:35 +00:00
Marek Olšák
6270136b7d
nir/opt_varyings: set prev_stage/next_stage if they are NONE and validate them
...
Doing it here ensures that any linked shader will have the correct values
there.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:34 +00:00
Marek Olšák
e3d122ed7b
nir/opt_varyings: completely exclude mediump from type changes
...
It broke mediump XFB, which needs the correct type for the up-conversion.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:34 +00:00
Marek Olšák
aba7b0831c
nir: add shader_info::prev_stage
...
When lowering mediump to 16 bits, this will allow drivers to enable
the lowering only for certain pairs of stages, e.g. a driver can lower
mediump for VS->FS, but not GS->FS.
This could also be useful for other things.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315 >
2025-06-12 19:35:33 +00:00
Georg Lehmann
ad80b554f4
spirv: use feq for OpIsInf
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
This effectively reverts fcca6a83cd because feq was clarified to be ordered
when used with exact and without fast math flags.
It's common for HW to only have free abs for floating point instructions.
Foz-DB Navi21:
Totals from 63 (0.08% of 80065) affected shaders:
Instrs: 337027 -> 336667 (-0.11%); split: -0.12%, +0.02%
CodeSize: 1846752 -> 1845000 (-0.09%); split: -0.13%, +0.03%
Latency: 3401087 -> 3400633 (-0.01%); split: -0.04%, +0.03%
InvThroughput: 847299 -> 845939 (-0.16%); split: -0.19%, +0.03%
VClause: 7693 -> 7694 (+0.01%)
Copies: 45175 -> 45240 (+0.14%); split: -0.12%, +0.27%
PreSGPRs: 3555 -> 3553 (-0.06%)
PreVGPRs: 4565 -> 4564 (-0.02%)
VALU: 225473 -> 225245 (-0.10%); split: -0.13%, +0.03%
SALU: 44735 -> 44625 (-0.25%)
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35437 >
2025-06-11 18:34:21 +00:00
Marek Olšák
f2c48652da
nir: add shader_info::tess::tcs_*outputs_read_by_tes*
...
Gather no_varying for AMD.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780 >
2025-06-07 16:29:39 +00:00
Marek Olšák
a59464b6e3
radv,radeonsi: precompute and pass TCS per-vertex output stride via a user SGPR
...
It's a stride of 1 output, which isn't 16. It's 16 * num_threads,
aligned to 256.
tcs_offchip_layout has 5 unused bits, so let's use them.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780 >
2025-06-07 16:29:39 +00:00
Marek Olšák
534b282573
ac/nir/tess: adjust memory layout of TCS outputs to have aligned store offsets
...
There is a comment that explains it.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34780 >
2025-06-07 16:29:38 +00:00
Mel Henning
d15b5fadbb
nir/divergence_analysis: Update LCSSA comment
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35271 >
2025-06-06 18:15:05 +00:00
Karol Herbst
33fb1eca3e
nir/scale_fdiv: handle fp16 fdiv
...
Not strictly scaling, but we upcast fo fp32, do the fdiv there and cast
back again.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34053 >
2025-06-05 13:17:27 +00:00
Mike Blumenkrantz
208450fc57
nir/lower_to_scalar: fix opt_varying with output reads
...
no_varying cannot be used to eliminate stores on locations which may
be subsequently read
Fixes: 0058989357 ("nir/lower_io_to_scalar: don't create output stores that have no effect")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35325 >
2025-06-04 18:21:16 +00:00
Marek Olšák
c3034fa82c
amd: replace most u_bit_consecutive* with BITFIELD_MASK/RANGE
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35346 >
2025-06-04 17:46:38 +00:00
Lionel Landwerlin
978933c015
nir/opt_algebraic: extend lowering for (i|u)bitfield_extract
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35334 >
2025-06-04 16:28:39 +00:00
Georg Lehmann
1c4070f3e9
nir/opt_if: limit rewrite_uniform_uses iand recursion
...
https://github.com/doitsujin/dxvk/issues/4970 has a shader
where unrolled loops caused large iand chains and if we don't
limit this we won't finish compiling in reasonable time.
Cc: mesa-stable
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35312 >
2025-06-04 10:49:05 +00:00
Georg Lehmann
eaeaf9554d
nir/opt_if: don't replace constant uses with other uniform values
...
If constant folding wasn't run, this could replace constant uses with different
constants.
Additional, it could also create worse code for "if (subgroupXor(1) == 1)".
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13281
Cc: mesa-stable
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35312 >
2025-06-04 10:49:05 +00:00
Samuel Pitoiset
226b0e28db
nir: generalize bitfield insert/extract sizes
...
Original patch from Alyssa Rosenzweig
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35209 >
2025-06-04 09:37:53 +00:00
Caio Oliveira
542836afe5
intel: Don't require dpas_intel src2 to match destination
...
With upcoming configurations, the number of elements in the src2
slice might not match the destination.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35301 >
2025-06-03 21:31:23 +00:00
Rhys Perry
dd45bf5bce
nir/load_store_vectorize: stabilize entry sort
...
I think this was unlikely to cause issues, even if the qsort()
implementation is unstable.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255 >
2025-06-03 09:45:01 +00:00
Rhys Perry
397920c16e
nir: fix left shift of negative value in ibfe constant folding
...
Fixes "left shift of negative value -128" with parallel_rdp/00f93a9497dfbb3b
and UBSan.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255 >
2025-06-03 09:45:01 +00:00
Rhys Perry
78aae4b1ba
nir: fix signed overflow in pack_half_2x16 constant folding
...
Without this cast, the left shift is promoted to 'int'.
Fixes "left shift of 50432 by 16 places cannot be represented in type 'int'"
with horizon_zero_dawn/001064f580f8e3be and UBSan.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255 >
2025-06-03 09:45:01 +00:00
Rhys Perry
6852538ba0
nir: fix unpack_unorm_2x16/unpack_snorm_2x16 constant folding
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.0
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35255 >
2025-06-03 09:45:01 +00:00
Marek Olšák
bf2ed20eb9
nir: remove unused nir_io_semantics::invariant
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Acked-by: Alyssa on IRC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35256 >
2025-06-02 23:08:58 +00:00
Marek Olšák
44fcda9631
nir/opt_clip_cull_const: support GS
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35256 >
2025-06-02 23:08:58 +00:00
Marek Olšák
6677d087c0
nir/xfb_info: add new fields to describe 16-bit XFB better
...
for drivers that need this information
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35256 >
2025-06-02 23:08:58 +00:00
Marek Olšák
7b70b419b5
nir: always index SSA defs before printing
...
This makes the output more readable.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35256 >
2025-06-02 23:08:58 +00:00
Marek Olšák
cf94ae8544
nir: change the type of shader_info::patch_* fields to 32 bits
...
Patch outputs only use 32 bits.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35256 >
2025-06-02 23:08:58 +00:00
Jesse Natalie
f0dde6ca7f
nir_gather_output_deps: Fix incorrect enum in switch
...
Cc: mesa-stable
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35247 >
2025-05-30 17:04:18 +00:00
Lionel Landwerlin
f0e18c475b
intel: remove GRL/intel-clc
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35227 >
2025-05-29 20:17:13 +00:00
Samuel Pitoiset
cecf6675be
nir/lower_int64: add bitfield_extract lowering
...
This will be used by RADV for ACO/LLVM.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35187 >
2025-05-29 08:45:40 +02:00
Alyssa Rosenzweig
d696b19dd0
nir/lower_int64: add bitfield_reverse lowering
...
now that we can represent 64-bit bitfield_reverse in NIR, we need a lowering for
it as well.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35198 >
2025-05-28 16:29:30 +00:00
Alyssa Rosenzweig
c3fb0645d8
nir/lower_alu: compact bitcount lowering
...
while in the area.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35198 >
2025-05-28 16:29:30 +00:00
Alyssa Rosenzweig
759dc70bde
nir: generalize bitfield_reverse bit size
...
No reason we can't reverse other bit sizes, we just need to generalize the
constant folding & bit size lowering.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35198 >
2025-05-28 16:29:30 +00:00
Marek Olšák
35c76bc7f7
nir/tcs_info: use range analysis to determine the range of tess levels
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35195 >
2025-05-28 06:46:56 +00:00
Marek Olšák
24c3f30e4a
nir/tcs_info: gather which patch outputs are only read/written by invoc 0
...
Tested thoroughly by a shader test.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35195 >
2025-05-28 06:46:56 +00:00
Marek Olšák
a3632d7d88
nir/tcs_info: gather for all patch outputs whether they're written by all invocs
...
This substantially rewrites the pass. It also makes it easier to read.
Tested thoroughly by a shader test.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35195 >
2025-05-28 06:46:56 +00:00
Lorenzo Rossi
2c0d0bad01
nak: Remove unused intrinsic image_load_raw_nv
...
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34975 >
2025-05-28 01:47:19 +00:00
Lorenzo Rossi
5fbcdd6e32
nir,nak: Add NV-specific image intrinsics
...
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34975 >
2025-05-28 01:47:19 +00:00
Lorenzo Rossi
47f6c74b71
nir,nak: Add KeplerB shared atomics intrinsics and lowering
...
Kepler cards do not support shared atomic operations directly, but they
have special ldslk and stsul that can implement mutex locks on
addresses. Shared atomics can be lowered into operations in mutexes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35028 >
2025-05-26 16:29:05 +00:00
Qiang Yu
6f2a1e19da
nir/opt_varyings: fix mesh shader miss promote varying to flat
...
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
We still allow mesh shader promote constant output to flat, but
mesh shader like geometry shader may store multi vertices'
varying in a single thread. So mesh shader may store different
constant values to different vertices in a single thread, we
should not promote this case to flat.
I'm not using shader_info.mesh.ms_cross_invocation_output_access
because OpenGL does not require IO to have explicit location, so
when nir_shader_gather_info is called in OpenGL GLSL compiler to
compute ms_cross_invocation_output_access, some implicit output
has -1 location which causes ms_cross_invocation_output_access
unset for it.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13134
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35081 >
2025-05-26 02:07:50 +00:00
Eric Engestrom
162f1f5566
delete xa leftovers
...
Fixes: 3be2c47db2 ("delete the XA frontend")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35136 >
2025-05-23 18:54:04 +00:00