If a deref_cast parent is also a deref, and the parent mode is not
generic, we can safely propagate the parent mode down to the
deref_cast.
This will allow the fixup to work when the deref_cast is used just
to change the glsl_type when accessing a variable/deref-chain.
Reviewed-by: Faith Ekstrand <faith@gfxstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23734>
Done by hand at each call site but going very quickly with funny Vim motions and
common regexes. This is a very common idiom in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23807>
The var splitting pass can rearrange the variables as long as their
position in memory doesn't matter. For block-arranged variables,
or things like memcpys or casts, the layout matters, but atomics
don't imply anything about the layout of the overall variable, so
don't treat them as "complex" for this use case.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23173>
Via Coccinelle patches
@@
expression a, b, c;
@@
-nir_channels(b, a, (1 << c) - 1)
+nir_trim_vector(b, a, c)
@@
expression a, b, c;
@@
-nir_channels(b, a, BITFIELD_MASK(c))
+nir_trim_vector(b, a, c)
@@
expression a, b;
@@
-nir_channels(b, a, 3)
+nir_trim_vector(b, a, 2)
@@
expression a, b;
@@
-nir_channels(b, a, 7)
+nir_trim_vector(b, a, 3)
Plus a fixup for pointless trimming an immediate in RADV and radeonsi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23352>
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:
@@
expression A, B, C, D, E;
@@
-nir_ssa_dest_init(A, B, C, D, E);
+nir_ssa_dest_init(A, B, C, D);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
Every nir_ssa_def is part of a chain of uses, implemented with doubly linked
lists. That means each requires 2 * 64-bit = 16 bytes per def, which is
memory intensive. Together they require 32 bytes per def. Not cool.
To cut that memory use in half, we can combine the two linked lists into a
single use list that contains both regular instruction uses and if-uses. To do
this, we augment the nir_src with a boolean "is_if", and reimplement the
abstract if-uses operations on top of that list. That boolean should fit into
the padding already in nir_src so should not actually affect memory use, and in
the future we sneak it into the bottom bit of a pointer.
However, this creates a new inefficiency: now iterating over regular uses
separate from if-uses is (nominally) more expensive. It turns out virtually
every caller of nir_foreach_if_use(_safe) also calls nir_foreach_use(_safe)
immediately before, so we rewrite most of the callers to instead call a new
single `nir_foreach_use_including_if(_safe)` which predicates the logic based on
`src->is_if`. This should mitigate the performance difference.
There's a bit of churn, but this is largely a mechanical set of changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>
The result might be used in a deref_ptr_as_array, which requires a proper
stride within lower_explicit_io. If we'd lose that information or end up
with a different stride don't execute this optimization.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8289
Fixes: b779baa9bf ("nir/deref: fix struct wrapper casts. (v3)")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21458>
This also removes the loop so opt_remove_cast_cast() will only optimize
cast(cast(x)) and not cast(cast(cast(x))). However, since nir_opt_deref
walks instructions top-down, there will almost never be a tripple cast
because the parent cast will have opt_remove_cast_cast() run on it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21252>
This is almost always a nir_instr and updating the src of a nir_if will
have to work slightly differently in the future.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12910>
Preserving information about inbounds access and
the required bit size for the bounds will help
with avoiding 64-bit operations when lowering io.
Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16729>
Instead of just checking for the variables to match, check that the
entire deref up to the interface type matches.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Instead of having a bunch of mode checks as special cases, assert that
the modes equal and then switch on the mode. This should make the
special cases a bit easier to understand. Handling of `a_var == b_var`
looks redundant now but it won't be in the next patch.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
This will let us use it to compare only the first part of a pair of
deref paths and continue the comparison later.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Instead of incrementing pointers, use an integer index. This makes it
clear that we always increment them together. It'll also make the next
change a bit easier. We use a pointer to an integer because the next
patch is going to let us abort the walk and we want to be able to
continue where we left off.
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Whether it's coherent should be irrelevant and the ACCESS_RESTRICT check
above should consider all cases aliasing unless NIR makes it clear they're
not.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Tested-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16894>
Casts shouldn't change the bit pattern of the deref and you have to cast
again after you're done with the ALU anyway so we can ignore casts on
ALU sources. This means we can actually start constant folding NULL
checks even if there are annoying casts in the way.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15673>
Found while running valgrind :
==3583454== Invalid read of size 4
==3583454== at 0xF48336: glsl_get_struct_field_offset (nir_types.cpp:84)
==3583454== by 0xC7CD0D: opt_replace_struct_wrapper_cast (nir_deref.c:1068)
==3583454== by 0xC7CDD9: opt_deref_cast (nir_deref.c:1087)
==3583454== by 0xC7DD8E: nir_opt_deref_impl (nir_deref.c:1369)
==3583454== by 0xC7DF4E: nir_opt_deref (nir_deref.c:1428)
==3583454== by 0xA63F3C: brw_kernel_from_spirv (brw_kernel.c:325)
==3583454== by 0xA3BC2C: main (intel_clc.c:481)
==3583454== Address 0xe4f7e88 is 24 bytes after a block of size 48 in arena "client"
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13952>
We say that they're for debug only but we don't really have a good
policy around when to set them and when not to. In particular,
nir_lower_system_values and nir_lower_vars_to_ssa which are the chief
producers of SSA values which might reasonably have a name do not bother
to set one. We have some names set from things like BLORP and RADV's
meta shaders but AFAICT, they're setting a name more because it's there
than because they actually care.
Also, most things other than nir_clone and nir_serialize don't bother to
try and preserve them. You can see in the diffstat of this commit
exactly what passes attempt to preserve names. Notably missing from the
list is opt_algebraic which is the single largest source of SSA def
churn and it happily throws names away.
These observations lead me to question whether or not names are actually
useful at all or if they're just taking up space (8B per instruction)
and wasting CPU cycles (to ralloc_strdup on the off chance we do have
one). I don't think I can think of a single time in recent history
where I've been debugging a shader issue and a SSA value name has been
there and been useful. If anything, the few times they are there, they
just throw me off because they mess up the indentation in nir_print.
iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5)
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>
This breaks CLOn12's handling of CL CTS test_basic vector_creation for char3 (at least).
Removing this cast causes us to try to load from a deref with no alignment info.
Fixes: 99bb2a4d ("nir/opt_deref: Don't remove casts with alignment information")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after()
with an SSA def, and rewrites all the users as needed.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
nir_build_deref_offset() can be extended to support calculating an
offset relative to a base pointer.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7565>
If opt_restrict_deref_modes makes progress, we may be able to figure out
the mode well enough to turn a deref_mode_is intrinsic into a constant.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
We rename it to "modes" to make it clear that it may contain more than
one mode and adjust all the uses of nir_deref_instr::modes to attempt to
handle multiple modes.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6332>
LLVM loves take advantage of the fact that vec3s in OpenCL are 16B
aligned so it can just read/write them as vec4s. This is questionably
legal except that it uses a xyz write-mask when it does it. The result
is a LOT of vec4->vec3 casts on loads and stores. This optimization
detects this case as well as other bit-cast cases and rewrites them to
get rid of the cast.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6871>
If we have a cast deref with alignment information and we can get equal
or better alignment information from something further up the deref
chain, we can strip the alignment information from the cast and allow
other optimizations to potentially eliminate the cast.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Generally, if a cast has alignment information, that information is
useful and we don't want to loose it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
This renames it to drop the ptr_as and makes it handle all of the stride
cases. There's a bit of a tricky bit in here around Booleans but we
currently use 32-bit for those always.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
These are safe to call on either structured or unstructured NIR but
don't provide the nice ordering guarantees of nir_foreach_block and
friends. While we're here, we use them for a very small selection of
passes which are known to be safe for unstructured control-flow. The
most important such pass is nir_dominance which is required for
structurizing.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>
One day, we may want copy_prop_vars or other passes to be able to see
through certain types of casts such as when someone casts a uint64_t to
a uvec2. However, for now we should just avoid casts all together.
Fixes: d8e3edb784 "nir/deref: Support casts and ptr_as_array in..."
Tested-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6072>
Allow casts in a deref chain so we can calculate an offset from
a base pointer dereference or have pointer type casts in the
middle of the chain (both are pretty common in CL).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5682>