Commit graph

25 commits

Author SHA1 Message Date
Daniel Schürmann
87cb42f953 treewide: don't lower to LCSSA before calling nir_divergence_analysis()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Daniel Schürmann
8d1abd4996 treewide: use nir_src_is_divergent() rather than checking the divergence of the SSA
Without LCSSA, divergence between src and def might differ.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30787>
2024-10-24 10:06:17 +00:00
Georg Lehmann
dbf63a0788 nir: remove nir_op_is_derivative
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Marek Olšák
948f94b8c5 nir/opt_varyings: pack TCS inputs with cross-invocation access together
Unigine Heaven has a TCS that reads pos.xyz and tescoord.w from all
invocations in every invocation. By putting those two in the same vec4,
AMD hw can reduce the amount of shared memory that is allocated for those
inputs from 2 vec4s to 1 vec4.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
2024-10-17 03:30:07 +00:00
Marek Olšák
8e93907b7c nir/opt_varyings: assign locations of no_varying IO for TCS outputs only
Skip the code for other shader stages because it doesn't do anything there.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31670>
2024-10-17 03:30:07 +00:00
Marek Olšák
9bfea3183a nir/opt_varyings: improve convergent input handling to fix data corruption
Backward inter-shader code motion can move any code into the previous
shader if it only uses convergent inputs. The problem is the final input
type can end up being integer or FP64, which is incompatible with
the assumption that convergent inputs can always be interpolated.

If such a case occurs and the type is integer or FP64, either don't
do any code motion, or if the driver exposes the new flag, rewrite
convergent  loads to use load_input.

If the new flag is supported, all convergent loads are rewritten to use
load_input, and flat varyings are allowed to be classified as convergent,
which means they are packed into interpolated vec4 slots if there are
unused components.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:16 +00:00
Marek Olšák
b2d32ae246 nir: add nir_intrinsic_load_per_primitive_input, split from io_semantics flag
Instead of having 1 bit in nir_io_semantics indicating a per-primitive
FS input, add a dedicated intrinsic for it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:16 +00:00
Alyssa Rosenzweig
da752ed7c1 treewide: use nir_def_replace sometimes
Two Coccinelle patches here. Didn't catch nearly as much as I would've liked but
it's a start.

Coccinelle patch:

    @@
    expression intr, repl;
    @@

    -nir_def_rewrite_uses(&intr->def, repl);
    -nir_instr_remove(&intr->instr);
    +nir_def_replace(&intr->def, repl);

Coccinelle patch:

    @@
    identifier intr;
    expression instr, repl;
    @@

    nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
    ...
    -nir_def_rewrite_uses(&intr->def, repl);
    -nir_instr_remove(instr);
    +nir_def_replace(&intr->def, repl);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> [etna]
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> [r300]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29817>
2024-06-21 15:36:56 +00:00
Alyssa Rosenzweig
15257b65c6 treewide: use nir_metadata_control_flow
Via Coccinelle patch:

    @@
    @@

    -nir_metadata_block_index | nir_metadata_dominance
    +nir_metadata_control_flow

...plus some manual fixups for call sites missed by coccinelle.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>
2024-06-17 16:28:14 -04:00
Natanael Copa
0274518615 nir/opt_varyings: reduce stack usage
Avoid put a huge struct on stack to fix a stack overflow on musl libc.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10988
Fixes: c66967b5cb (nir: add nir_opt_varyings, new pass optimizing and compacting varyings)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29375>
2024-05-24 13:15:33 +00:00
Timur Kristóf
c23c5c0a07 nir/opt_varyings: Don't promote flat inputs when moving post-dominator.
Promoting flat inputs should only happen while assigning FS input
slot groups. Otherwise we risk adding extra input slots, which
is undesireable.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29208>
2024-05-23 13:14:46 +00:00
Timur Kristóf
9dad0ced52 nir/opt_varyings: Print FS VEC4 type when debugging relocate_slot.
Useful when debugging this pass.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29208>
2024-05-23 13:14:46 +00:00
Timur Kristóf
2b1031ec10 nir/opt_varyings: Add workaround for RADV mesh shader multiview.
The layer output is added in ac_nir_lower_ngg which is called
later than this pass; prevent deleting layer input from FS here.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:12 +00:00
Timur Kristóf
91dd9c35be nir/opt_varyings: Fix relocate_slot so it doesn't mix up 32-bit and 16-bit I/O.
Previously, nir_opt_varyings was unable to distinguish between
a fully occupied 32-bit flat input and the low part of a 16-bit
flat input, and would assign them the same slot, thereby messing
up both I/O slots in the process.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:12 +00:00
Timur Kristóf
7e43c2d08f nir/opt_varyings: Debug print during relocate_slot.
VERY useful when debugging issues with this pass.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Timur Kristóf
bf2227d0d0 nir/opt_varyings: Only propagate constant MS outputs, not other uniforms.
Due to how mesh shaders work, we'll need a workgroup divergence
pass in order to really prove that an output is uniform.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Timur Kristóf
5dd1461ca4 nir/opt_varyings: Add early return when producer stage is task.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Timur Kristóf
a083a25a80 nir/opt_varyings: Fix explicit and per-vertex FS inputs.
Fixes: 772149b15a
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Timur Kristóf
586acb47c8 nir/opt_varyings: Support per-primitive I/O.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Timur Kristóf
21ff2907c7 nir/opt_varyings: Allow optimizing primitive ID for MS -> FS.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28685>
2024-04-14 19:51:11 +00:00
Marek Olšák
772149b15a nir/opt_varyings: handle load_input_vertex
Explicit interpolation just loads raw vertex data as-is and lets the FS do
the interpolation manually.

This adds handling of nir_intrinsic_load_input_vertex, which has 2 different
behaviors: undefined vertex ordering and strict vertex ordering.

- dead IO removed correctly
- constants and uniform expressions are propagated normally
- outputs are deduplicated within their own category (strict and non-strict)
- outputs used by explicit interpolation are never treated as "convergent"
- backward inter-shader code motion is skipped
- compaction has 2 new types of vec4 slots:
    - mixed 32-bit and 16-bit explicit strict (sharing the same vec4)
    - mixed 32-bit and 16-bit explicit non-strict (sharing the same vec4)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28247>
2024-04-04 01:25:06 +00:00
Marek Olšák
b6a93058b9 nir/opt_varyings: simplify nir_io_semantics::num_slots of directly-indexed slots
Compaction only moves directly-indexed slots. This prevents unnecessary
num_slots > 1 from appearing in random slots.

Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
2024-03-31 03:02:51 +00:00
Marek Olšák
71becd1b44 nir/opt_varyings: don't generate IO with unsupported bit sizes
Backward inter-shader code motion turns ALU results into outputs,
which led to getting IO with unsupported bit sizes. This prevents
that.

There is a new NIR option flag that indicates 16-bit support.

Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
2024-03-31 03:02:51 +00:00
Mike Blumenkrantz
b5877e0501 nir/opt_varyings: update alu type when rewriting src/dest for moved ops
this otherwise retains the old bit size

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28304>
2024-03-27 23:29:36 +00:00
Marek Olšák
c66967b5cb nir: add nir_opt_varyings, new pass optimizing and compacting varyings
Highlights:
- all shader stages and all input/output types are handled, including
  inputs and outputs with multiple vertices
- the optimizations performed are: unused input/output removal, constant
  and uniform propagation, output deduplication, inter-shader code motion,
  and compaction
- constant and uniform propagation and output deduplication work even
  if a shader contains multiple stores of the same output, e.g. in GS
- the same optimizations are also performed between output stores and
  output loads (for TCS)
- FS inputs are packed agressively. Only flat, interp FP32, and interp
  FP16 can't be in the same vec4. Also, if an output value is
  non-divergent within a primitive, the corresponding FS input is
  opportunistically promoted to flat.

The big comment at the beginning of nir_opt_varyings.c has a detailed
explanation, which is the same as:
    https://gitlab.freedesktop.org/mesa/mesa/-/issues/8841

dEQP and GLCTS have incorrect tests that fail with this, see:
    https://gitlab.freedesktop.org/mesa/mesa/-/issues/10361

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26819>
2024-03-15 19:55:46 +00:00