this is a special version of indirect deref lowering which is used by
mesa/st to remove dynamic indexing from builtin uniforms for the lowering
pass in non-packed uniform case
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9741>
Some backends, like r600 support a fused version of int and float compare
against zero and and csel. Adding these opcodes here makes it possible to
optimize this in nir.
v2: Add rules for float compare + csel
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>
Some hardware supports a version of find_msb where the bits are counted
starting at the high bit, and this needs some lowering to obtain the
value that is expected by *find_msb
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>
This moves the dxil pass to common code and makes dxil
use the new code.
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9643>
This is a global address format where you have a 64-bit base pointer and
a 32-bit offset. It's intentionally identical to 64bit_bounded_global
except nir_lower_explicit_io does no bounds checking with it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>
This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after()
with an SSA def, and rewrites all the users as needed.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This is currently an alias for nir_ssa_def_rewrite_uses but we move all
the instances which used it to write a non-SSA source to the newly named
helper.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
Some games declare the wrong format, so we might want to disable this
optimization in that case.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: e4d75c22 ("nir/opt_shrink_vectors: shrink image stores using the format")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>
E.g. r600 a cube texture lookup uses a specific cube instruction
to evaluate the sample coordinates and the face ID, so that the cube
texture lookup can be lowered to a array texture lookup, thereby sharing
the code with the 2D array texture lopkup.
However, for TXD the given gradients still need to be three-component
vectors, so add a flag that the NIR validation knows that we deal with
cube texture that was lowered to an array and can validate accordingly.
v2: Handle new flag in serialization (Marek)
v3: Rebase so that the change does not require the patch to deduct the
number of offset and grad components from sampler type
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>
Compile-time (nir_opt_dce):
Difference at 95.0% confidence
-319.51 +/- 5.67632
-12.0627% +/- 0.208076%
(Student's t, pooled s = 6.70399)
Compile-time (overall):
Difference at 95.0% confidence
-385.025 +/- 42.1124
-0.929489% +/- 0.10139%
(Student's t, pooled s = 49.7367)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7691>
Some nir lowerings might need to know if txs is supported by
the backend.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8898>
Basically every pass in NIR uses nir_ssa_def_rewrite_uses which calls
nir_instr_rewrite_src which is fairly complex because it handles all
sorts of non-SSA cases. Since we already know a priori that every
source written by nir_ssa_def_rewrite_uses is SSA, we can check new_src
once at the top of the function and cut out all that complexity.
While we're at it, we expose a new SSA-only nir_ssa_def_rewrite_uses_ssa
helper which takes an SSA def which avoids the one SSA check. It's also
more convenient 90% of the time.
Compile time as tested by Rhys Perry <pendingchaos02@gmail.com>
Difference at 95.0% confidence
-797.166 +/- 418.649
-0.566174% +/- 0.296441%
(Student's t, pooled s = 325.459)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8790>
The function is very convenient for lowering any type of instruction
that can be easily filtered, but so far instructions that didn't return
a value were siletly ignored.
Fix this by
- not requiring a return value in the instruction
- add a new special return value from the lowering implementation
function to indicated that an instruction that doesn't have a
return value must be removed, and
- don't try to collect and replace uses in this case.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8177>
These are all pretty trivial because we can just split the op into one
subgroup op per half of the value. There's some question as to whether
these belong in lower_int64 or lower_subgroups but, on Intel, they key
decider of whether or not we need the lowering is based on whether or
not we have hardware int64 support.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
This patch also replaces lower_negate with lower_ineg / lower_fneg.
The fneg semantics have been clarified as of Version 1.5, Revision 1
of the SPIR-V specification, which means that the previous lowering
to fsub is not a viable solution anymore, and is replaced with
lowering to fmul(x, -1.0).
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
If the instruction being coalesced would be vectorized but the target
doesn't support vectorizing that op, skip coalescing.
Reuse the callbacks from alu_to_scalar to describe which ops should not
be vectorized.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506>
Like SPIR-V and GL_ARB_sparse_texture2, these return a residency code. It
is placed in the destination after the rest of the result. If it's zero,
then the texel is resident. Otherwise, it's not resident.
Besides the larger destination and the residency code, sparse fetches
work the same as normal fetches.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
These will be useful for sparse texture instructions and image load
intrinsics.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
The resulting point-coord origin not only depends on whether
the draw buffer is flipped but also on GL_POINT_SPRITE_COORD_ORIGIN
state. Which makes its transform differ from a transform of wpos.
On freedreno fixes:
gl-3.2-pointsprite-origin
gl-3.2-pointsprite-origin -fbo
Fixes: d934d320 "nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform."
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8200>
This pass creates a SSBO var for the printf buffer. It does an atomic increment
at the beginning of the buffer to determine where to write, then dumps
the args after that.
v2: [airlied]
Enhanced to use an index into a set of format info that is passed
back to the caller. The format info contains the number of args,
argument sizes and the format string.
v3: move format string lowering to vtn
v4: Jason reworked it.
v5: assume buffer has initial offset prebaked in and work from there.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8254>
This just adds the basic nir support for printf,
intrinsic, and support for storing the printf info.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8254>
With the block's end_ip accidentally being the ip of the next instruction,
contrary to the comment, you would end up doing end-of-block freeing early
and have the value missing when it came time to emit the next instruction.
Just expand the ips to have separate ones for start and end of block --
while it means that nir_instr->index is no longer incremented by 1 per
instruction, it makes sense for use in liveness because a backend is
likely to need to do other things at block boundaries (like emit the if
statement's code), and having an ip to identify that stuff is useful.
Fixes: a206b58157 ("nir: Add a block start/end ip to live instr index metadata.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7658>
This involves determining the variables referenced by intrinsics, setting and
using the access qualifier correctly and considering that images and buffers
can alias.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6483>
v2: Fixup comment about bits in nir_intrinsics.py
v3: Use varying for primitive shading rate builtin (samuel)
v4: Reoder switch alphabetically
Make divergence of frag_shading_rate an option
v5: Remove stage check for frag_shading_rate in divergence (Samuel)
v6: s/frag_shading_rate_per_subgroup/single_frag_shading_rate_per_subgroup/ (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7795>