This reverts commit 939ddf3f67.
Intel has a separate pass for fusing FFMAs selectively. We split
these flags in commit 1b72c31e1f and
the reasoning still stands. The patch being reverted was just a
cleanup, so there should be no issue with reverting it.
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>
In CL 1.2, images are required to be either read-only or write-only. We
can always translate the read-only image ops to texture ops. In CL 2.0
(and an extension), the ability is added to have read-write images but
sampling (with a sampler) is only allowed on read-only images. As long
as we only lower read-only images to texture ops, everything should stay
consistent.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6578>
We don't do full dominance validation of SSA values in nir_validate
because it requires generating valid dominance information and, while
that's not extremely expensive, it's probably more than we want to do on
every pass. Also, dominance information is generated through the
metadata system so if we ran it by default in nir_validate, we would get
different beavior of the metadata system based on whether or not you
have a debug build and metadata bugs would be very hard to find.
However, having a pass for it that can be run occasionally, should help
detect and expose bugs. For ease of use, we add a NIR_VALIDATE_SSA_DOMINANCE
environment variable which can be set to manually enable dominance
validation as a standard part of nir_validate.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
For UBO accesses to be the same performance as classic GL default uniform
block uniforms, we need to be able to push them through the same path. On
freedreno, we haven't been uploading UBOs as push constants when they're
used for indirect array access, because we don't know what range of the
UBO is needed for an access.
I believe we won't be able to calculate the range in general in spirv
given casts that can happen, so we define a [0, ~0] range to be "We don't
know anything". We use that at the moment for all UBO loads except for
nir_lower_uniforms_to_ubo, where we now avoid losing the range information
that default uniform block loads come with.
In a departure from other NIR intrinsics with a "base", I didn't make the
base an be something you have to add to the src[1] offset. This keeps us
from needing to modify all drivers (particularly since the base+offset
thing can mean needing to do addition in the backend), makes backend
tracking of ranges easy, and makes the range calculations in
load_store_vectorizer reasonable. However, this could definitely cause
some confusion for people used to the normal NIR base.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
This renames it to drop the ptr_as and makes it handle all of the stride
cases. There's a bit of a tricky bit in here around Booleans but we
currently use 32-bit for those always.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
For NIR-to-TGSI, we don't want to revectorize 64-bit ops that we split to
scalar beyond vec2 width. We even have some ops that we would rather
retain as scalar due to TGSI opcodes being scalar, or having more unusual
requirements.
This could be used to do the vectorize_vec2_16bit filtering, but that
shader compiler option is also used in algebraic so leave it in place for
now.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>
It would be nice if we could do swizzling of an expression on the
replacement side so that we could have a single ieq/ine of the vector
after CSE. However, if you do want vector operations, nir_opt_vectorize()
does just fine.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>
This commit adds support for nir_var_mem_constant various places. It
also adds a pass similar to nir_lower_vars_to_explicit_types except it
also scrapes out the constants and stuffs them into constant_data.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6379>
This doesn't fix the problem that no one knows what any of these mean
half the time but it at least makes them better documented to hopefully
make people's expectations more accurate.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6524>
In nir-to-tgsi, I want to free temps storing SSA values when they go dead,
and NIR liveness has most of the information I need. Hoever, when I reach
the end of a block, I need to free whatever temps were in liveout which
are dead at that point. If liveout is indexed by live_index, then I don't
know the maximum live_index for iterating the live_out bitset, and I also
don't have a way to map that index back to the def->index that my temps
are stored under.
We can use the more typical def->index for these bitsets, which resolves
both of those problems. The only cost is that ssa_undefs don't get merged
into a single bit in the bitfield, but there are generally 1-4 of them in
a shader and we don't track liveness for those anyway so splitting them
apart is fine.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6408>
This enables drivers and utils to get all IO information from intrinsics,
so that they don't have to walk the complex types of NIR variables to find
out other information about IO intrinsics.
NIR in/out variables can be removed after nir_lower_io. We could remove
the variables in the pass, but for now I just decided to remove
the variables in radeonsi before shaders are returned to st/mesa.
(st/mesa just needs adjustments to work without NIR in/out variables)
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6442>
This is very common for backends -- r600, freedreno, and nir_to_tgsi all
needed versions of it. Make a common intrinsic to use for it with a
shared, slightly-tuned-from-ir3 lowering pass.
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6378>
Microsoft's DXIL is based on LLVM, which doesn't have an integer ABS
opcode, but instead needs it lowered to NEG + MAX. We need to do this
with an option, to prevent an already existing optimization rule from
undoing this.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5211>
If no options are provided, existing intrinsics are used.
If the lowering pass indicates there should be offsets used for global
invocation ID or work group ID, then those instructions are lowered to
include the offset.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
The actual variable -> intrinsic lowering stays where it is, but
ops which convert one intrinsic to be implemented in terms of
another have moved.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
For D3D12, we don't want to lower this, as there's a dedicated global-id
system-value that might be faster to use, depending on the hardware.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5891>
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
MSVC treats enums as signed, so storing values that use the topmost
bit of the explicitly sized field loads as a negative value instead.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6393>
This compacts the code and makes it easier to extend.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6220>
v2 (Karol):
renamed pathes to paths
use more bool
use _mesa_set_intersects
deduplicated some code
fixed some typos
v3 (Karol):
don't enable structurizer as we do this in vtn now
v4 (Jason):
A few clean-ups due to unstructured NIR changes
v5 (Jason):
Misc whitespace and style cleanups
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>
These are safe to call on either structured or unstructured NIR but
don't provide the nice ordering guarantees of nir_foreach_block and
friends. While we're here, we use them for a very small selection of
passes which are known to be safe for unstructured control-flow. The
most important such pass is nir_dominance which is required for
structurizing.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>
v2 (Jason Ekstrand):
- Make "structured" a property of nir_function_impl not nir_shader
- More validation and asserts
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2401>