6280 commits

Author SHA1 Message Date
Rob Clark
6f5ff6be44 nir: Fix lower_readonly_images_to_tex bitsize
The txf instruction could be returning something smaller than 32b.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35758>
2025-06-26 02:48:16 +00:00
Georg Lehmann
e6d208b1f9 nir/opt_shrink_vectors: also split vecs into distinct smaller vecs if possible
Foz-DB Navi48:
Totals from 17 (0.02% of 80265) affected shaders:
Instrs: 75085 -> 74912 (-0.23%); split: -0.23%, +0.00%
CodeSize: 428968 -> 427028 (-0.45%); split: -0.45%, +0.00%
Latency: 1306841 -> 1306080 (-0.06%); split: -0.06%, +0.00%
InvThroughput: 598998 -> 598719 (-0.05%)
Copies: 15733 -> 15561 (-1.09%)
Branches: 2435 -> 2422 (-0.53%)
PreVGPRs: 1723 -> 1721 (-0.12%)
VALU: 43019 -> 42847 (-0.40%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>
2025-06-25 05:34:48 +00:00
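The percentages in the Foz-DB report for e6d208b1f9 are relative changes of the "after" total against the "before" total. A minimal sketch of that arithmetic follows; the function name `percent_change` is made up for illustration and is not part of any Mesa or fossil-db tooling.

```python
import math


def percent_change(before: int, after: int) -> float:
    """Relative change in percent, as shown in parentheses in the report."""
    if before == 0:
        # A stat that starts at zero has no finite relative change; other
        # reports in this log print that case as "+inf%".
        return math.inf if after > 0 else 0.0
    return (after - before) / before * 100.0


# Values copied from the Navi48 report above.
print(f"Instrs:   {percent_change(75085, 74912):+.2f}%")
print(f"CodeSize: {percent_change(428968, 427028):+.2f}%")
print(f"Copies:   {percent_change(15733, 15561):+.2f}%")
```

Negative values mean the stat shrank, which is an improvement for counters such as Instrs or CodeSize.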
Georg Lehmann
22d7dd69b2 nir/shrink_vectors: shrink larger vectors too
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35676>
2025-06-25 05:34:48 +00:00
Matt Turner
6100dbc3d0 compiler: Generate files with newline at end
These generator scripts use the `write` function that, unlike `print`,
doesn't print a trailing newline. So let's add one to the template.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35697>
2025-06-24 14:01:04 +00:00
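The behaviour that 6100dbc3d0 relies on can be reproduced with plain Python. This is only a sketch: the string `template` below is a hypothetical stand-in for the real generator templates, which are not shown in this log.

```python
import io

buf_write = io.StringIO()
buf_print = io.StringIO()

buf_write.write("generated line")        # write() emits exactly what it is given
print("generated line", file=buf_print)  # print() appends "\n" by default

assert not buf_write.getvalue().endswith("\n")
assert buf_print.getvalue().endswith("\n")

# The fix described above amounts to ending the template text itself
# with a newline, so that write() produces a properly terminated file.
template = "generated line\n"  # hypothetical template content
buf_fixed = io.StringIO()
buf_fixed.write(template)
```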
Georg Lehmann
b729ad1742 nir/loop_analyze: consider movs/vecs free
They are free more likely than not.

Foz-DB Navi31:
Totals from 462 (0.58% of 80251) affected shaders:
Instrs: 1464013 -> 1868466 (+27.63%)
CodeSize: 8476352 -> 10745544 (+26.77%)
VGPRs: 27412 -> 27560 (+0.54%)
SpillSGPRs: 0 -> 16 (+inf%)
SpillVGPRs: 83 -> 76 (-8.43%)
Scratch: 6072832 -> 6071808 (-0.02%)
Latency: 19282476 -> 19552323 (+1.40%); split: -0.11%, +1.51%
InvThroughput: 2198357 -> 2258490 (+2.74%); split: -0.47%, +3.21%
VClause: 32986 -> 43491 (+31.85%)
SClause: 72760 -> 126112 (+73.33%)
Copies: 165286 -> 223368 (+35.14%)
Branches: 60530 -> 79743 (+31.74%); split: -0.03%, +31.77%
PreSGPRs: 24885 -> 25077 (+0.77%)
PreVGPRs: 23004 -> 22494 (-2.22%); split: -2.26%, +0.04%
VALU: 760978 -> 898136 (+18.02%)
SALU: 187786 -> 252995 (+34.73%); split: -0.03%, +34.75%
VMEM: 58469 -> 69164 (+18.29%); split: -0.07%, +18.36%
SMEM: 87926 -> 158175 (+79.90%); split: -0.00%, +79.90%
VOPD: 580 -> 732 (+26.21%); split: +31.38%, -5.17%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
b1290fdf20 nir/loop_analyze: handle vector selections properly
Consider all conditions, not just the first.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
47aba15489 nir/loop_analyze: always consider comparisons between induction var and constant free
There is no reason why this should be restricted to single uses.

Foz-DB Navi31:
Totals from 21 (0.03% of 80251) affected shaders:
Instrs: 54424 -> 65851 (+21.00%)
CodeSize: 286688 -> 346896 (+21.00%)
Latency: 2980310 -> 2959904 (-0.68%)
InvThroughput: 403744 -> 400782 (-0.73%)
VClause: 923 -> 1316 (+42.58%)
SClause: 1217 -> 1705 (+40.10%)
Copies: 3226 -> 3393 (+5.18%); split: -0.87%, +6.04%
Branches: 1014 -> 1130 (+11.44%); split: -0.39%, +11.83%
PreSGPRs: 1327 -> 1306 (-1.58%)
PreVGPRs: 1896 -> 1868 (-1.48%)
VALU: 36083 -> 43560 (+20.72%)
SALU: 4471 -> 4708 (+5.30%); split: -2.75%, +8.05%
VMEM: 2225 -> 2743 (+23.28%)
SMEM: 1662 -> 2273 (+36.76%); split: -0.06%, +36.82%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35686>
2025-06-24 12:18:47 +00:00
Georg Lehmann
8c4225b99b nir: add cmat_transpose
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34793>
2025-06-24 07:14:34 +00:00
Alyssa Rosenzweig
d37bf148d2 nir/lower_blend: fix snorm factor clamping
The spec says (emphasis mine):

  If the color attachment is fixed-point, the components of the source and
  destination values **AND BLEND FACTORS** are each clamped to [0,1] or [-1,1]
  respectively for an unsigned normalized or signed normalized color attachment
  prior to evaluating the blend operations. If the color attachment is
  floating-point, no clamping occurs.

However, neither the CTS nor any hardware implement this semantic.

For unsigned normalized formats, the definitions are roughly equivalent (except
perhaps around constant colours). 0 <= x <= 1 implies that 0 <= 1 - x <= 1.
Therefore if the source/destination colours are clamped to [0, 1], then their
complements are also in [0, 1], so clamping any blend factor (except constant
colour) has no effect if the source/dest were already clamped.

For signed normalized formats, however, this difference matters. -1 <= x <= 1
implies that 0 <= 1 - x <= 2... so to implement the spec text faithfully, we
would need to clamp again the complemented colour blend factors to return back
to signed normalized range. Software blending implementations can of course do
that... but doing so causes CTS fails, as the CTS reference renderer does not do
this.

This commit adjusts nir_lower_blend to match what actual hardware does, what CTS
requires, and what the spec should have said.

See https://gitlab.khronos.org/vulkan/vulkan/-/issues/4293 for the spec
resolution.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35519>
2025-06-23 19:38:27 +00:00
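The range argument in d37bf148d2 can be checked numerically. The sketch below samples colour values, forms the complemented blend factor 1 - x, and looks at the resulting range. It is not the nir_lower_blend code; `clamp` and `one_minus` are illustrative helpers.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def one_minus(x: float) -> float:
    """Complemented blend factor, e.g. ONE_MINUS_SRC_COLOR."""
    return 1.0 - x


# Colours already clamped to the attachment's range.
unorm_samples = [i / 10 for i in range(0, 11)]    # [0, 1]
snorm_samples = [i / 10 for i in range(-10, 11)]  # [-1, 1]

unorm_factors = [one_minus(x) for x in unorm_samples]
snorm_factors = [one_minus(x) for x in snorm_samples]

# UNORM: the complement stays inside [0, 1], so clamping it changes nothing.
print(min(unorm_factors), max(unorm_factors))

# SNORM: the complement reaches 2.0, outside [-1, 1]. A literal reading of
# the spec text would clamp it back to 1.0; hardware and the CTS do not.
print(min(snorm_factors), max(snorm_factors))
print(clamp(max(snorm_factors), -1.0, 1.0))
```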
Emma Anholt
bc8994cb48 nir: Add a pass to reassociate multiplication of mat*mat*vec.
The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but
mat4*(mat4*vec4) is only 32.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35622>
2025-06-23 17:49:51 +00:00
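The multiplication counts quoted in bc8994cb48 follow from the cost of each product: an n×n matrix times an n×n matrix takes n³ scalar multiplications, and an n×n matrix times an n-vector takes n². A short sketch of the count, with helper names chosen for illustration only:

```python
def mat_mat_muls(n: int) -> int:
    """Scalar multiplications in an n x n by n x n matrix product."""
    return n * n * n


def mat_vec_muls(n: int) -> int:
    """Scalar multiplications in an n x n matrix by n-vector product."""
    return n * n


n = 4
left_to_right = mat_mat_muls(n) + mat_vec_muls(n)  # (A * B) * v: 64 + 16
right_to_left = 2 * mat_vec_muls(n)                # A * (B * v): 16 + 16

print(left_to_right, right_to_left)
```

Both orders give the same mathematical result because matrix multiplication is associative; floating-point rounding may differ slightly between them.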
Timothy Arceri
21ea8c205f nir: raise NIR_SEARCH_MAX_VARIABLES limit to 24
This is required to process the pattern in the following patch.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35622>
2025-06-23 17:49:51 +00:00
Faith Ekstrand
bb4c5edda1 nir: Add more tex_src helpers
This adds a variant of nir_steal_tex_src() which is for derefs as well
as versions that just return the source without removing it.

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35623>
2025-06-23 14:25:30 +00:00
Faith Ekstrand
2b40fa09f2 nir: Move nir_steal_tex_src() to nir.h
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35623>
2025-06-23 14:25:30 +00:00
David Neto
bff2b1b947 mesa: flush stderr when dumping nir validation errors
When dumping nir validation errors, flush stderr before
calling abort. Otherwise the errors might not be emitted.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35665>
2025-06-23 13:36:09 +00:00
Georg Lehmann
f047a67fba nir,aco: optimize FP16_OFVL pattern created by vkd3d-proton
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:27 +00:00
Georg Lehmann
30ec9ed1cf spirv,nir: emit saturating float8 cmat convert
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Georg Lehmann
5addbf63f9 nir: add float8 conversion opcodes
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
Georg Lehmann
9da23499ff compiler: add float8 glsl types
e4m3fn: 8bit floating point format with 4bit exponent, 3bit mantissa
        and no infinities (finite only)
e5m2:   8bit floating point format with 5bit exponent, 2bit mantissa
        and with infinities.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35434>
2025-06-23 07:59:24 +00:00
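To make the two layouts from 9da23499ff concrete, here is a small decoder sketch. It assumes the usual conventions for these formats, which the commit message does not spell out: exponent bias 7 for e4m3fn and 15 for e5m2, IEEE-style inf/NaN encodings for e5m2, and a single NaN bit pattern per sign (all exponent and mantissa bits set) for e4m3fn. The function names are hypothetical and this is not Mesa's conversion code.

```python
import math


def decode_float8(byte: int, exp_bits: int, man_bits: int, has_inf: bool) -> float:
    """Decode one 8-bit float with 1 sign bit, exp_bits and man_bits."""
    assert 0 <= byte <= 0xFF and 1 + exp_bits + man_bits == 8
    sign = -1.0 if byte >> 7 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    exp_max = (1 << exp_bits) - 1
    man_max = (1 << man_bits) - 1

    if has_inf and exp == exp_max:
        # IEEE-style: all-ones exponent is inf (mantissa 0) or NaN.
        return sign * math.inf if man == 0 else math.nan
    if not has_inf and exp == exp_max and man == man_max:
        # Finite-only (assumed "fn" convention): one NaN pattern per sign;
        # the rest of the top exponent encodes ordinary numbers.
        return math.nan
    if exp == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


def decode_e4m3fn(byte: int) -> float:
    return decode_float8(byte, exp_bits=4, man_bits=3, has_inf=False)


def decode_e5m2(byte: int) -> float:
    return decode_float8(byte, exp_bits=5, man_bits=2, has_inf=True)


print(decode_e4m3fn(0x7E))  # largest finite e4m3fn value
print(decode_e5m2(0x7B))    # largest finite e5m2 value
print(decode_e5m2(0x7C))    # positive infinity in e5m2
```

Under these assumptions, dropping infinities lets e4m3fn reuse most of the top exponent for finite values, trading range for one extra mantissa bit compared with e5m2.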
Lionel Landwerlin
50dab62f57 nir/opt_offsets: add support for intel intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
16fca611d7 nir: add new intel ssbo intrinsics
Similar to ir3 ones, to optimize offsets in the backend.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Lionel Landwerlin
ba119c73c6 intel: replace RANGE_BASE by BASE for uniform block loads
We're not currently using RANGE_BASE and we'll use BASE for offset
optimizations on Xe2+.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:23 +00:00
Rohan Garg
909ec6ff1f nir/lower_io: add io_offset support for more intrinsics
This will be used by upcoming changes in the intel compiler.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:22 +00:00
Lionel Landwerlin
8ea124f877 nir/divergence: add missing intel intrinsics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35252>
2025-06-22 10:55:21 +00:00
Alyssa Rosenzweig
5795c8595f nir: model dynamic uniform layout on hk
add some new intrinsics so we can defer lowering until we have the information.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35658>
2025-06-20 18:46:13 +00:00
Alyssa Rosenzweig
b0f8c22682 nir/opt_sink: sink agx backfacing
helps an elden ring shader:

Totals from 1 (0.03% of 3206) affected shaders:
Instrs: 4146 -> 4141 (-0.12%)
CodeSize: 27374 -> 27334 (-0.15%)
ALU: 2554 -> 2549 (-0.20%)
FSCIB: 2554 -> 2549 (-0.20%)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35559>
2025-06-20 16:09:28 +00:00
Alyssa Rosenzweig
3eeba6efdd nir/opt_preamble: hoist reorderable SSBO loads on AGX
basically asahi version of 2f93137308 ("nir/opt_preamble: Handle
load_global_ir3"). elden ring:

Totals from 2409 (75.14% of 3206) affected shaders:
MaxWaves: 2068416 -> 2081984 (+0.66%); split: +0.74%, -0.08%
Instrs: 2439078 -> 1849792 (-24.16%)
CodeSize: 15570886 -> 12196612 (-21.67%)
Spills: 11623 -> 11678 (+0.47%); split: -0.63%, +1.10%
Fills: 9815 -> 9762 (-0.54%); split: -1.37%, +0.83%
Scratch: 31200 -> 31328 (+0.41%); split: -0.23%, +0.64%
ALU: 1154256 -> 1038680 (-10.01%); split: -10.22%, +0.21%
FSCIB: 1154256 -> 1038479 (-10.03%); split: -10.24%, +0.21%
IC: 248318 -> 237344 (-4.42%); split: -4.47%, +0.05%
GPRs: 266323 -> 254621 (-4.39%); split: -4.72%, +0.33%
Uniforms: 639557 -> 794085 (+24.16%); split: -0.07%, +24.23%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reacted-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35559>
2025-06-20 16:09:28 +00:00
Faith Ekstrand
9f9cde04ec nir: Add a new load_input_attachment_coord intrinsic
This hoists all the annoyance of figuring out the current pixel's input
attachment coordinates to the driver.  The pass still deals with all the
annoyance of turning an image instruction into a texture instruction but
it gives the driver more control over the position.  For most drivers,
this will be something like ivec3(int(gl_FragCoord.xy), gl_Layer) or
similar, some drivers need something more nuanced.  Turnip, for
instance, needs unscaled coordinates for some attachments and NVK
doesn't really want gl_Layer or gl_ViewIndex for the layer.  It's better
to just have a new system value that drivers can make what they want.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
2025-06-19 02:14:04 +00:00
Faith Ekstrand
2c13e1e655 nir/lower_input_attachments: Don't ignore tex coordinates
The SPIR-V spec is pretty clear that coordinates on subpass attachments
are relative to the current pixel.  They're required to be zero but we
should stay consistent with ourselves (we already do this for image
intrinsics) and with the spec.

Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
2025-06-19 02:14:04 +00:00
Faith Ekstrand
9a52b9372c nir/lower_input_attachments: Stop assuming tex src indices
There's nothing in NIR which guarantees that the deref is the first
source or that the coordinate is the second.  Use
nir_tex_instr_src_index() to get the actual indices.

Fixes: 84b08971fb ("nir/lower_input_attachments: lower nir_texop_fragment_{mask}_fetch")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35551>
2025-06-19 02:14:03 +00:00
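The idea behind 9a52b9372c, looking a source up by its type instead of trusting its position, can be modelled in a few lines. This is a Python model, not NIR: `TexSrc`, `tex_src_index`, and the string type names are hypothetical, and returning -1 for a missing type is this sketch's own convention.

```python
from dataclasses import dataclass


@dataclass
class TexSrc:
    src_type: str  # e.g. "coord", "texture_deref"
    value: str     # placeholder for the SSA value


def tex_src_index(srcs: list[TexSrc], src_type: str) -> int:
    """Return the position of the first source of the given type, or -1."""
    for i, src in enumerate(srcs):
        if src.src_type == src_type:
            return i
    return -1


# Nothing forces the deref to come first, so positional access would
# pick up the wrong source here.
srcs = [TexSrc("coord", "ssa_7"), TexSrc("texture_deref", "ssa_3")]

coord_idx = tex_src_index(srcs, "coord")
deref_idx = tex_src_index(srcs, "texture_deref")
print(coord_idx, deref_idx)
```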
Emma Anholt
908bfb2ac9 nir: Add support for load_frag_coord_zw to nir_opt_fragdepth.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:36 +00:00
Emma Anholt
3b28604be2 nir: Make pixel_coord/frag_coord_zw be peephole-able sysvals.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:36 +00:00
Emma Anholt
8fa6d473e7 nir: Add SYSTEM_VALUE_FRAG_COORD_Z/W.
Intel's going to use these.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:36 +00:00
Emma Anholt
7db62e6dad nir: Split nir_load_frag_coord_zw to separate z/w intrinsics.
This will be a win for Intel for tracking which payload values need to be
set up.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25190>
2025-06-18 23:11:36 +00:00
Job Noorman
78f62d6d6d nir: remove unused global_atomic(_swap)_ir3 intrinsics
ir3 switched to using the generic ones.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33797>
2025-06-18 19:06:33 +00:00
Job Noorman
2490ecf5fc ir3: ingest global addresses as 64b values from NIR
There are currently two places where we have to handle values that are
logically 64b: 64b atomics and 64b global addresses. For the former, we
ingest the values as 64b from NIR, while the latter uses 2x32b values.
This commit makes things more consistent by using 64b NIR values for
global addresses as well.

Of course, we could also go the other way around and use 2x32b values
everywhere, which would make things consistent as well. Given that ir3
doesn't actually have 64b registers, and 64b values are represented by
collected 2x32b registers, this could actually make more sense.

In the end, both methods are mostly equivalent and it probably doesn't
matter too much one way or the other. However, the reason I have a
slight preference for ingesting things as 64b is that it allows us to
use more of the generic NIR intrinsics, which use 1-component values for
64b addresses or atomic values. This commit already makes
global_atomic(_swap)_ir3 obsolete and I'm planning to create generic
intrinsics to support ldg.a/stg.a as well.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33797>
2025-06-18 19:06:32 +00:00
Georg Lehmann
e9c886c331 nir/opt_intrinsic: fix inclusive scan rewrite with multiple uses
Modifying the iterated list is a footgun, so just create a new instruction.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13364
Fixes: 5c70a55bf3 ("nir/opt_intrinsics: optimize (exclusive_scan(op, a) op a) to inclusive scan")

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35577>
2025-06-18 18:18:15 +00:00
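The hazard named in e9c886c331, modifying a list while iterating over it, is not specific to NIR. The sketch below shows the same class of bug with a plain Python list and the safer alternative of building a new result. It is a generic illustration, not the NIR use-list code that the commit changes.

```python
def rewrite_in_place(items: list[int]) -> list[int]:
    # Buggy pattern: removing while iterating shifts the remaining
    # elements, so the iterator silently skips some of them.
    for item in items:
        if item % 2 == 0:
            items.remove(item)
    return items


def rewrite_into_new(items: list[int]) -> list[int]:
    # Safer pattern: leave the iterated list alone and build a new one.
    return [item for item in items if item % 2 != 0]


print(rewrite_in_place([2, 4, 6, 7]))  # an even number survives
print(rewrite_into_new([2, 4, 6, 7]))  # all even numbers are dropped
```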
Rhys Perry
ea0670dfb5 nir: simplify nir_addition_might_overflow
nir_unsigned_upper_bound is good enough that this isn't needed anymore.

No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514>
2025-06-17 13:28:00 +00:00
Rhys Perry
f3b7ac730c nir/uub: improve ior/ixor with constant sources
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514>
2025-06-17 13:28:00 +00:00
Rhys Perry
ae6ad8977b nir/uub: improve iand with constant sources
fossil-db (navi21):
Totals from 9 (0.01% of 79653) affected shaders:
Instrs: 11878 -> 11868 (-0.08%)
CodeSize: 61572 -> 61508 (-0.10%)
Latency: 44585 -> 44581 (-0.01%); split: -0.02%, +0.01%
InvThroughput: 9697 -> 9660 (-0.38%)
VALU: 8889 -> 8876 (-0.15%)
SALU: 1339 -> 1342 (+0.22%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514>
2025-06-17 13:27:59 +00:00
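As background for ae6ad8977b: a bitwise AND can never exceed either operand, so a constant source bounds the result on its own. The sketch below states that property and checks it exhaustively over a small range. It is a generic fact about unsigned integers; whether nir_unsigned_upper_bound computes exactly this is an assumption, and `uub_iand_const` is a made-up name.

```python
def uub_iand_const(src_upper_bound: int, const: int) -> int:
    """Upper bound of (x & const) for any unsigned x <= src_upper_bound."""
    # x & const clears bits, so the result is <= x and <= const.
    return min(src_upper_bound, const)


# Exhaustive check over a small range of bounds and constants.
holds = all(
    (x & c) <= uub_iand_const(ub, c)
    for ub in range(64)
    for c in range(64)
    for x in range(ub + 1)
)
print(holds)
```

A tighter bound lets later passes prove more facts about a value, for example that an addition cannot overflow.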
Rhys Perry
8ee5440073 nir/uub: improve ishl/imul with constant sources
fossil-db (navi21):
Totals from 1 (0.00% of 79653) affected shaders:
Instrs: 1339 -> 1338 (-0.07%)
CodeSize: 7244 -> 7240 (-0.06%)
Latency: 19827 -> 19822 (-0.03%)
InvThroughput: 9913 -> 9911 (-0.02%)
SALU: 419 -> 418 (-0.24%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35514>
2025-06-17 13:27:59 +00:00
Olivia Lee
88ac602cc2 panvk: implement shaderInputAttachmentArrayNonUniformIndexing
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35408>
2025-06-13 19:02:19 +00:00
Christian Gmeiner
b30b87c096 nir/inline_uniforms: Convert to use nir_shader_intrinsics_pass(..)
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35463>
2025-06-12 22:35:48 +02:00
Marek Olšák
fa2e7c3dfd nir: return progress from nir_group_loads, nir_inline_uniforms
so that NIR_PASS is usable

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:37 +00:00
Marek Olšák
0cbcb72869 nir/opt_vectorize_io: work around a 16-bit IO bug for RADV
If nir_opt_vectorize_io isn't called, 16-bit IO is broken.
This is a workaround to keep RADV working and consume incorrect NIR
while other drivers consume correct NIR.

Hopefully this will be removed ASAP.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:37 +00:00
Marek Olšák
b636e5ca66 nir: add nir_clear_mediump_io_flag
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:36 +00:00
Marek Olšák
13005d5e4e nir/xfb_info: don't merge incompatible XFB outputs to fix mediump
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:36 +00:00
Marek Olšák
118c0e6991 nir/opt_vectorize_io: fix vectorizing 16-bit XFB
Tested with mediump.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:36 +00:00
Marek Olšák
caddd67b8c nir/opt_vectorize_io: don't vectorize 16-bit IO to vec8 - it's illegal
NIR represents low bits of 16-bit IO as a separate vec4, and high bits as
another separate vec4. There is no representation that allows vec8.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:35 +00:00
Marek Olšák
1f80ff5550 nir/opt_vectorize_io: convert bool merge_low_high_16_to_32 to an enum
refactoring for the next commit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:35 +00:00
Marek Olšák
6270136b7d nir/opt_varyings: set prev_stage/next_stage if they are NONE and validate them
Doing it here ensures that any linked shader will have the correct values
there.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35315>
2025-06-12 19:35:34 +00:00