In Valhall multiview, position/varying shaders are invoked once per
draw. Each invocation write separate outputs for all views. Fragment
processing is handled by the existing multilayer support. Note that
because the hardware only supports up to 8 views, we don't have to care
about the case where there are too many layers to fit in one tiler when
multiview is enabled.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
This is needed for implementing multiview in panvk, where the address
calculation for multiview outputs is not well-represented by lowering to
nir_intrinsic_store_output with a single offset.
The case where a variable is both per-view and per-{vertex,primitive} is
now unsupported. This would come up with drivers implementing
NV_mesh_shader or using nir_lower_multiview on geometry, tessellation,
or mesh shaders. No drivers currently do either of these. There was some
code that attempted to handle the nested per-view case by unwrapping
per-view/arrayed types twice, but it's unclear to what extent this
actually worked.
ANV and Turnip both rely on per-view outputs being assigned a unique
driver location for each view, so I've added on option to configure that
behavior rather than removing it.
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31704>
Move away from NIR_PASS_V() like other drivers have done long ago.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>
This is a prerequisite for enabling nir_opt_varyings for all gallium
drivers.
nir_lower_io_passes (called by the GLSL linker) only uses NIR options
to lower indirect IO access before lowering IO and calling
nir_opt_varyings.
Most drivers report full support for indirect IO and lower it themselves,
which prevents compaction of lowered indirectly accessed varyings because
nir_opt_varyings doesn't touch indirect varyings.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> (Rb for asahi)
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> (for r300)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32423>
This will need to be set to true when the GLSL linker lowers IO, which
can later be unlowered by st/mesa, and then drivers can lower it again
without load_interpolated_input. Therefore, it can't be a global
immutable option.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32229>
nir_lower_pack can generate split operations, execute algebraic again
to handle them.
This fix an assert on
"dEQP-VK.spirv_assembly.instruction.compute.opphi.vartype_float16" and
probably others tests.
Fixes: 3904cfabd6 ("bi: Use nir_opt_load_store_vectorize")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32131>
On v7-, use TEXC(op=GRDESC_DER) to convert user-provided gradient into a
gradient descriptor consumed by the hardware, and then supply that
descriptor to the TEXC instruction.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29521>
On v9+, use TEX_GRADIENT to convert user-provided gradient into a
gradient descriptor consumed by the hardware, and then supply that
descriptor to TEX_SINGLE.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29521>
Rather than adding memcpy()s to a local u32 variable, add a union
to bifrost_texture_operation so we can directly access the packed
value.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29521>
The TEXC(GRDESC) instruction returns the LOD for a given texture
coordinates. Use it to implement nir_texop_lod.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31742>
The TEX_GRADIENT instruction returns the LOD for a given texture
coordinates. Use it to implement nir_texop_lod.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31742>
Define the TEX_GRADIENT instruction in valhall/ISA.xml, and add the
necessary bits to the compiler to expose it.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31742>
These bits are reserved in the spec. Even if setting them is harmless,
we'd rather keep them zero just in case.
Suggested-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31742>
It will be used to allow merging loads with a hole between them.
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
V3D can use these too.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
We don't have a generic v4i8 on Valhall, we have to lower it to two
v2i8. Fortunately, bi_make_vec_to() hides the Bifrost/Valhall
differences, so use that for nir_op_pack_uvec4_to_uint.
Fixes: 934b0f1add ("pan/bi: Respect swizzles for more vector ops")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31280>
Instead of special-casing 3D image handling in the gallium driver, use
the actual image type and extend the compiler to deal with cube/3D
image coordinates.
This fixes panvk without resorting to the image type casting that was
in place in the gallium driver.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Tested-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31227>
For reads, we use the LD_PKA (AKA LD_BUFFER) so we can directly
pass the buffer index. For writes, we still convert the SSBO index
into a global address before doing a global load/store/atomic
operation, but we do that with an LEA_PKA instruction that takes
care of bounds checking.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>
On Mali(Valhall), we have a way to load SSBO data without going through
an SSBO index -> global address translation, so let's provide a way
to tell nir_lower_ssbo() when it shouldn't lower loads.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>
We don't merge subpasses, so we can't turn subpass attachment
loads into tile buffer reads yet. Let's just treat those as
regular 2D textures for now (as we do on Bifrost).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11875
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Tested-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31166>
On valhall, the sample index should go in the R component
of the image load/store/lea instruction. This provides a
straightforward way to implement image2DMS and
image2DMSArray image load and store for valhall.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
The nir_lower_image_atomics_to_global pass can create some image
load/stores, so we need to do the multisample image load/store
lowering after this.
Also, the pass only actually works on bifrost and below, so skip it
for valhall.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
The pan_arch function is useful elsewhere, and doesn't rely on
anything else within genxml/gen_macros.h.
It's useful, for example, to find the architecture from the
GPU id in bifrost_compile.c, where before we were using ad-hoc
shifting.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30521>
Previous changes of the helper invocation pass fixed missing conditional
control flow tracking but this is not enough.
Propagation of the dependency chain also need to handle value outside of
direct predecessors.
This fix "dEQP-VK.graphicsfuzz.cov-nested-loops-sample-opposite-corners"
for real this time.
Fixes: 33fef27356 ("bi: Do not mark tex ops as skip when dest is used by control flow")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30754>
On Valhall, we can store the layer index in PositionFIFO attributes and
have the primitives dispatched to the appropriate list in the tiler
context, which means we no longer have to issue N IDVS jobs when doing
layered rendering.
On the fragment shader side, we can pass the layer index through the
frame_argument field, which can be preloaded in r62-r63, so do that to
save a push uniform slot.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30695>
Make pseudo instructions for the IR separate from real Bifrost and
Valhall instructions, which are kept in their own ISA.xml files.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30179>
Make the valhall ISA parser valhall.py have a functional interface
returning a tuple, rather than making users directly access variables
within it.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30179>
Updates the ISA.xml parser to be able to handle some of the constructs
from the Valhall ISA.xml (which differs in significant ways from the
Bifrost ISA.xml). The eventual intent is to avoid duplicating instructions
in the two files, although that isn't enabled in this patch.
The new features aren't used yet, that will be in a future commit.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30179>
We were using the first character of names to indicate the execution unit
('+' for add, '*' for fma). Change the ISA.xml file to have an explicit
`unit` attribute for instructions; this makes the XML more flexible
for future architectures and matches what the valhall ISA.xml does.
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30179>
Apply the same optimisation as ACO and AGX.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30484>
This trim vector srcs to the appropriate component count
based on the write mask.
This also should help with image store as the vector srcs
will be trimed according to the format if its known.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30484>
Ensure we vectorize load/store when possible.
Also move lower pack after loop optimization.
This drastically reduce the shader size of
"dEQP-VK.graphicsfuzz.spv-stable-maze-flatten-copy-composite" and allow
it to pass instead of timing out but it might greatly help others.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30484>
Previously, it was possible to have a texture operation marked as SKIP
while one of the dests was in use in conditional control flow.
If an helper thread was to execute that instruction, it would result
in an undefined value being used.
This fix
"dEQP-VK.graphicsfuzz.cov-nested-loops-sample-opposite-corners" where
helper threads would get stuck inside a loop depending on the result of
a TEXS_2D invocation.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30484>
Will be used for DCE and helper invocations pass changes.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30484>
rewrote most of the impl but shrug.
regresses code gen for mediump but I'm not too bothered given the lackluster
perf of fp16 on bifrost :(
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30567>
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.
nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
Not the most efficient approach but functional.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Rebecca Mckeever <rebecca.mckeever@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>