Explicit interpolation just loads raw vertex data as-is and lets the FS do
the interpolation manually.
This adds handling of nir_intrinsic_load_input_vertex, which has 2 different
behaviors: undefined vertex ordering and strict vertex ordering.
- dead IO removed correctly
- constants and uniform expressions are propagated normally
- outputs are deduplicated within their own category (strict and non-strict)
- outputs used by explicit interpolation are never treated as "convergent"
- backward inter-shader code motion is skipped
- compaction has 2 new types of vec4 slots:
- mixed 32-bit and 16-bit explicit strict (sharing the same vec4)
- mixed 32-bit and 16-bit explicit non-strict (sharing the same vec4)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28247>
Some hardware requires that per-primitive FS inputs are
sorted last, and nir_assign_io_var_locations can already
take care of this.
Add the same consideration to nir_recompute_io_bases.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28489>
Previously, this information would have been lost when the
shader has no I/O variables.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28489>
Previously, this information would have been lost when the
shader has no I/O variables.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28489>
Compaction only moves directly-indexed slots. This prevents unnecessary
num_slots > 1 from appearing in random slots.
Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
Backward inter-shader code motion turns ALU results into outputs,
which led to getting IO with unsupported bit sizes. This prevents
that.
There is a new NIR option flag that indicates 16-bit support.
Fixes: c66967b5cb - nir: add nir_opt_varyings, new pass optimizing and compacting varyings
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28431>
I'm not measuring a significant perf difference in
-bshading:shading=phong:model=bunny -bideas -brefract so this seems Good Enough
For Me.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
unfortunately, shader stage merging is bogus when coherent images are used, so
we need an unmerged path. i'd rather not maintain two paths, so let's just
stop merging. as a bonus this makes ESO a lot easier, and lets us reuse the same
VS for both VS->GS and VS->TCS.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28483>
In nir_lower_terminate_to_demote(), we were deleting the rest of the
block contents when we added a halt instruction but left any subsequent
CF nodes in the list. While this may be technically okay, that much
dead code makes the rest of NIR pretty grumpy. It's better to delete
everything to the end of the CF list, not just everything to the end of
the block.
Fixes: 75861c64b8 ("nir: Add a lower_terminate_to_demote pass")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28456>
It was by pure luck that all sources (and the result) of nir_dpas_intel
had the same number of components. It is possible to support matrix
sizes where the accumlator matrix and the result matrix are larger
(e.g., 16x8 * 8x16 = 16x16).
This breaks all of the assumptions of NIR's infrastructure for code
generating intrinsics. Fix the by making the accumulator matrix be the
first source. The accumulator and the result will always have the same
dimensions (due to rules of matrix multiplication) and the same type
(due to restructions of the cooperative matrix extension). This forces
them to have the same number of components.
This doesn't fix all the potential problems. NIR expects that all
0-sized sources will have the same number of components. This just
ensures that the result has the correct number of components.
Fixes: 6b14da33ad ("intel/fs: nir: Add nir_intrinsic_dpas_intel")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28404>
These will be needed to implement some tessellation dynamic
states within the TCS as opposed to using an epilog.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28408>
These were intended to be shared with (e.g.) rusticl, but they're
unused and I expect they will continue to be. The spirv options
are also hardcoded to be what CLOn12 expects.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26803>
Mali GPUs have a 32-bit alignment constraint on push constants. Extend
nir_lower_mem_access_bit_sizes() so it can lower bit sizes on push
constant accesses.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
Will be needed to support push constants in
nir_lower_mem_access_bit_sizes().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
This is just a little handle to tell the back-end where to do the copy.
Ideally, we'd have a NIR intrinsic that does the copy but we need to be
able to copy any number of registers up to 34 and NIR intrinsics just
aren't that flexible.
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
This way we avoid destroying divergence information which may be used by
passes which also need to lower phis.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
This fixes nir_print for unstructured control-flow. It's safe to
backport just this patch because the worst case is that we don't set as
many types and not as much gets printed.
Fixes: 260a9167db ("nir/print: Improve NIR_PRINT=print_consts by using nir_gather_ssa_types()")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>
For any reg we can lower, we remove it whenever we remove the last read
or write. For regs that aren't used at all, however, there are no reads
or writes so there's nothing to trigger the removal. Instead, we need
to do it in setup_reg.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28300>