SPIR-V's OpKill is a control-flow instruction but NIR's discard is not.
Therefore, it can be valid SPIR-V to have
if (...) {
foo = /* something */
} else {
discard;
}
use(foo);
without any phi between the definition of foo and its use. This is not
true in NIR, however, because NIR's discard isn't considered
control-flow. Arguably, this is a NIR bug but making discard control-
flow is a very deep change that can have serious ans subtle
side-effects. The easier thing to do is just fix up the SSA in case we
have an OpKill which might have gotten us into the above case.
Fixes dEQP-VK.graphicsfuzz.vectors-and-discard-in-function with the new
NIR dominance validation pass enabled.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
We don't do full dominance validation of SSA values in nir_validate
because it requires generating valid dominance information and, while
that's not extremely expensive, it's probably more than we want to do on
every pass. Also, dominance information is generated through the
metadata system so if we ran it by default in nir_validate, we would get
different beavior of the metadata system based on whether or not you
have a debug build and metadata bugs would be very hard to find.
However, having a pass for it that can be run occasionally, should help
detect and expose bugs. For ease of use, we add a NIR_VALIDATE_SSA_DOMINANCE
environment variable which can be set to manually enable dominance
validation as a standard part of nir_validate.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
Fixes a case where opt_if_merge created code like:
if (...) {
break;
loop {
...
}
}
which caused opt_peel_loop_initial_if to complain that the loop pre-header
wasn't a predecessor of the loop header. This patch prevents this
(invalid, I think) unreachable code from being created.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3496
Fixes: 4d3f6cb973 ('nir: merge some basic consecutive ifs')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6633>
freedreno results (note that cat6 is loads from memory as opposed to
pushed constants from the constant file):
total instructions in shared programs: 8044344 -> 8022085 (-0.28%)
total constlen in shared programs: 1411384 -> 1461964 (3.58%)
total cat6 in shared programs: 89983 -> 87065 (-3.24%)
Over the last 3 commits, we increased Manhattan31 performance by ~10%
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
For UBO accesses to be the same performance as classic GL default uniform
block uniforms, we need to be able to push them through the same path. On
freedreno, we haven't been uploading UBOs as push constants when they're
used for indirect array access, because we don't know what range of the
UBO is needed for an access.
I believe we won't be able to calculate the range in general in spirv
given casts that can happen, so we define a [0, ~0] range to be "We don't
know anything". We use that at the moment for all UBO loads except for
nir_lower_uniforms_to_ubo, where we now avoid losing the range information
that default uniform block loads come with.
In a departure from other NIR intrinsics with a "base", I didn't make the
base an be something you have to add to the src[1] offset. This keeps us
from needing to modify all drivers (particularly since the base+offset
thing can mean needing to do addition in the backend), makes backend
tracking of ranges easy, and makes the range calculations in
load_store_vectorizer reasonable. However, this could definitely cause
some confusion for people used to the normal NIR base.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
I remembered doing this analysis and was arguing in another MR that this
pass didn't have any driver dependency, but it actually does based on
PIPE_CAP_PACKED_UNIFORMS.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6359>
The current align_mul calculation in the unknown-array-index
calculation is
align_mul = MIN3(parent_mul,
min_pow2_divisor(parent_offset),
min_pow2_divisor(stride))
which is certainly correct if parent_offset > 0. However, when
parent_offset = 0, min_pow2_divisor(parent_offset) isn't well-defined
and our calculation for it is 1 << -1 which isn't well-defined. That
said.... it's not actually needed.
The offset to the base of the array is
array_base = parent_mul * k + parent_offset
for some integer k. When we throw in an unknown array index i, we get
elem = parent_mul * k + parent_offset + stride * i.
If we set new_align = MIN2(parent_mul, min_pow2_divisor(stride)), then
both parent_mul and stride are divisible by new_align and
elem = (parent_mul / new_alig) * new_align * k +
(stride / new_align) * new_align * i + parent_offset
= new_align * ((parent_mul / new_alig) * k +
(stride / new_align) * i) + parent_offset
so elem = new_align * j + parent_offset where
j = (parent_mul / new_alig) * k + (stride / new_align) * i.
That's a very long-winded way of saying that we can delete one parameter
from the align_mul calculation and it's still fine. :-)
Fixes: 480329cf8b "nir: Add a helper for getting the alignment of a deref"
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Tested-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6628>
The argument handling of this little tool was pretty rubbish. It had no
help and it required the filename to come first which is just strange.
This reworks it and makes things much nicer. It's still rubbish but at
least there's a chance people can figure out how to use it now.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6607>
As shown in the valid SPIR-V below, if one switch case statement
directly jumps to the merge block, it has no branches at all and
we have to reset the fall variable. Otherwise, it creates an
unintentional fallthrough.
OpSelectionMerge %97 None
OpSwitch %96 %97 1 %99 2 %100
%100 = OpLabel
%102 = OpAccessChain %_ptr_StorageBuffer_v4float %86 %uint_0 %uint_37
%103 = OpLoad %v4float %102
%104 = OpBitcast %v4uint %103
%105 = OpCompositeExtract %uint %104 0
%106 = OpShiftLeftLogical %uint %105 %uint_1
OpBranch %97
%99 = OpLabel
OpBranch %97
%97 = OpLabel
%107 = OpPhi %uint %uint_4 %75 %uint_5 %99 %106 %100
This fixes serious corruption in Horizon Zero Dawn.
v2: Changed the code to skip the entire if-block instead of resetting
the fallthrough variable.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3460
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6590>
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
We're casting pointers to const memory to const pointers. MSVC complains
about this with the following warning:
warning C4090: 'initializing': different 'const' qualifiers
In this case, we can easily use both constnesses, because all we do is
read here. So let's avoid the warning by adding another const-keyword.
Fixes: 193765e26b ("nir/lower_goto_if: Sort blocks in select_fork")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6582>
nir_shader::num_inputs isn't supposed to be a count of how many input
variables we have. It's a size of the lowered input space.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
This commit propagates the alignment information provided either through
the Alignment decoration on pointers or via the alignment mem operands
to OpLoad, OpStore, and OpCopyMemory to the NIR deref chain. It does so
by wrapping the deref in a cast. NIR should be able to clean up most
unnecessary casts only leaving us with the useful alignment information.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
If we have a cast deref with alignment information and we can get equal
or better alignment information from something further up the deref
chain, we can strip the alignment information from the cast and allow
other optimizations to potentially eliminate the cast.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Generally, if a cast has alignment information, that information is
useful and we don't want to loose it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
If the deref has no explicit alignment in the chain, we assume component
alignment which is what we currently assume for all derefs today. This
should be correct for all APIs in the sense that we can usually assume
at least component alignment. However, for some APIs such as OpenCL, we
could potentially make larger alignment assumptions. The intention is
that those will be handled via alignment-increasing casts.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
This renames it to drop the ptr_as and makes it handle all of the stride
cases. There's a bit of a tricky bit in here around Booleans but we
currently use 32-bit for those always.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
When creating explicit type, the alignment information is lost, thus
forcing explicit type users to recalculate the alignment using the same
size_align() function. Let's add a new field to cache this information.
Only structs, matrices, and vectors have and explicit alignment. Arrays
alignment is implicitly set to its element alignment and matrices are
required to have an alignment that matches that of its vector columns.
the concept of alignment simply doesn't apply to other types.
We make the strategic choice to not allow explicit alignments on
scalars. This is for a couple of reasons:
1. There are no cases today where we use explicit types where we want
any other alignment for scalars than natural alignment.
2. Vectors don't have a component alignment that's separate from the
explicit_alignment so it's impossible to get an explicitly aligned
scalar type which is the component of the explicitly aligned vector
type.
This choice may cause problems if we ever want to use explicitly laid
out types for things like varyings where we sometimes want vec4
alignment of scalars. We can deal with that when the time comes.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
OpenCL doesn't mandate a size and this is consistent with the rest of
the glsl_type system. While we're here, we also clean ::cl_size() up a
bit and use a new explicit_type_scalar_byte_size() helper.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
This should help code calculating field offsets to get it right when
the structure is marked packed.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Right now, when calling get_explicit_type_for_size_align() on a packed
struct, the packed attribute is lost and field offsets are wrong.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
We need to parse the CPacked decoration early enough to apply it when
calculating field offsets and creating the struct type.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
CPacked decoration is only allowed on struct definitions, not struct
members.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
If there were no constant variables, we would bail out entirely.
However, we may still have constant input pointers coming in from the
client.
Fixes: 4360a8a2b3 "nir/lower_io: Add support for nir_var_mem_constant"
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6472>
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909>
Especially a problem for OpImage/OpSampledImage. Note that the problem
doesn't seem to be propagation through glslang, but only in emitting
the SPIR-V. So it is fine if we are somewhat lossy in handling this, as
long as direct Op(Sampled)Image -> texture op chains work.
Fixes: af81486a8c "spirv: Simplify our handling of NonUniform"
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3406
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6451>
For NIR-to-TGSI, we don't want to revectorize 64-bit ops that we split to
scalar beyond vec2 width. We even have some ops that we would rather
retain as scalar due to TGSI opcodes being scalar, or having more unusual
requirements.
This could be used to do the vectorize_vec2_16bit filtering, but that
shader compiler option is also used in algebraic so leave it in place for
now.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>
It would be nice if we could do swizzling of an expression on the
replacement side so that we could have a single ieq/ine of the vector
after CSE. However, if you do want vector operations, nir_opt_vectorize()
does just fine.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>