The current approach has an issue detected by static analyzer: use of
memory after it is freed.
Using a proper iterator makes things safer.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34050>
It is possible that shader comes with output stores executed before
loading inputs. As the memory to read the inputs and store the outputs
is the same, this mean it could be overwriting the inputs before reading
them.
This move avoids this situation.
This partially improves
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33053.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33310>
Annotating ssa defs without affecting compilation is impossible with
debug info instructions since referencing a nir_def from the debug info
instr will add uses.
The old approach also stops worrking if passes reorder instructions.
This patch proposes a solution which should not regress performance just
like the old approach. The difference is that this one allocates a bit
more space for debug info instead of adding a new instruction for it.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33141>
This pass used to unconditionally use divergence information
which forced the caller to either call divergence_analysis or
ensure that the divergence is properly reset.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33009>
rather than try to enumerate everything a driver might want with an unmanageable
collection of booleans, just do a filter callback + data. this ends up simpler
overall, and will allow Intel to use this pass for just 64-bit images without
needing to add even more booleans.
while we're churning the pass signature, also do a quick port to
nir_shader_intrinsics_pass
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [NIR and V3D]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32907>
This will need to be set to true when the GLSL linker lowers IO, which
can later be unlowered by st/mesa, and then drivers can lower it again
without load_interpolated_input. Therefore, it can't be a global
immutable option.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32229>
Sprinkle around a few traces that were useful in locating submit and
fence waits.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31575>
Adds a new instruction type that stores metadata that might be useful
for debugging purposes. Passes must ignore these instructions when
making decisions.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18903>
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.
nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
Let's rely on nir_lower_mem_access_bit_sizes doing all the heavy work, so
v3d_nir_lower_scratch can be cleaned up quite a lot.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29711>
This is achieved by the following steps:
#ifndef DEBUG => #if !MESA_DEBUG
defined(DEBUG) => MESA_DEBUG
#ifdef DEBUG => #if MESA_DEBUG
This is done by replace in vscode
excludes
docs,*.rs,addrlib,src/imgui,*.sh,src/intel/vulkan/grl/gpu
These are safe because those files should keep DEBUG macro is already excluded;
and not directly replace DEBUG, as we have some symbols around it.
Use debug or NDEBUG instead of DEBUG in comments when proper
This for reduce the usage of DEBUG,
so it's easier migrating to MESA_DEBUG
These are found when migrating DEBUG to MESA_DEBUG,
these are all comment update, so it's safe
Replace comment /* DEBUG */ and /* !DEBUG */ with proper /* MESA_DEBUG */ or /* !MESA_DEBUG */ manually
DEBUG || !NDEBUG -> MESA_DEBUG || !NDEBUG
!DEBUG && NDEBUG -> !(MESA_DEBUG || !NDEBUG)
Replace the DEBUG present in comment with proper new MESA_DEBUG manually
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28092>
We don't rely in any lowerings for these (other than
scalarization). The only noteworthy aspect is that these
instructions, like ballot, use the condition mask to
filter out valid invocations that are inactive because of
control flow.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
This maps to our native shuffle instruction. For shuffle relative
and shuffle xor, we rely on the nir lowering to lower this to
ALU and regular shuffle.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
This adds support in our compiler for the subgroup ballot
feature. To this end we start using the NIR lowering for
subgroups which can lowers some of these intrinsics into
things more amenable to our hardware and takes care of
scalarization.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27211>
Demoting means that we don't execute any writes to memory but
otherwise the invocation continues to execute. Particularly,
subgroup operations and derivatives must work.
Our implementation of discard does exactly this by using
setmsf to prevent writes for the affected invocations, the
only difference for us is that with discard/terminate we
want to be more careful with emitting quad loads for tmu
operations, since the invocations are not supposed to be
running any more and load offsets may not be valid, but with
demote the invocations are not terminated and thus we should
emit memory reads for them to ensure quad operations and
derivatives from invocations that have not been demoted still
work.
Since we use the sample mask to implement demotes we can't tell
whether a particular helper invocation was originally such
(gl_HelperInvocation in GLSL) or was later demoted
(OpIsHelperInvocationEXT added with SPV_EXT_demote_to_helper_invocation),
so we use nir_lower_is_helper_invocation to take care of this.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26949>
We only support the variablePointersStorageBuffer feature of variable pointers,
which basically ensures that pointers may only target one buffer. This means
that a particular pointer may change where it points within a given buffer but
it cannot change its value to point to some other buffer. This is a requirement
from us since we expect buffer indices on buffer loads and stores to be
constant, so we can't have a buffer load come through a pointer that may
be assigned to different buffers, since in that case the buffer index
would need to come from bcsel.
There is, however, a small complication: the spec still allows pointers to
be null, and NIR defines null pointers to use 0xffffffff for both the buffer
index and the offset, which will cause a problem in a scenario like this:
int *b = ...
if (cond) {
b = null;
discard;
}
ubo_load(b);
Here the buffer index for the ubo load may come from a bcsel choosing between
the null pointer (0xffffffff) and the valid address (let's say 0), so we don't
have a constant and we assert fail.
This change detects this scenario and upon finding it will rewrite the buffer
index on the null pointer branch of the bcsel to match that of the valid
branch so that later optimizations passes can remove the bcsel and we end up
with a constant index. This is fine because a null pointer dereference is
undefined behavior and it is not something we should see in valid applications.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26683>
Vulkan shaderdb stats with pattern dEQP-VK.image.*.with_format.*.*:
total instructions in shared programs: 35993 -> 33245 (-7.63%)
instructions in affected programs: 21153 -> 18405 (-12.99%)
helped: 394
HURT: 1
Instructions are helped.
total uniforms in shared programs: 8550 -> 7418 (-13.24%)
uniforms in affected programs: 5136 -> 4004 (-22.04%)
helped: 399
HURT: 0
Uniforms are helped.
total max-temps in shared programs: 6014 -> 5905 (-1.81%)
max-temps in affected programs: 473 -> 364 (-23.04%)
helped: 58
HURT: 0
Max-temps are helped.
total nops in shared programs: 1515 -> 1504 (-0.73%)
nops in affected programs: 46 -> 35 (-23.91%)
helped: 14
HURT: 2
Inconclusive result (%-change mean confidence interval includes 0).
FWIW, that one HURT on the instructions count is for just one
instruction.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25726>
On V3D 7.1 there is not a flag on the Shader State Record to specify
if we are using shared or separate segments. This is done by setting
the vpm input size to 0 (so we need to ensure that the output would be
the max needed for input/output).
We were already doing the latter on the prog_data_vs, so we just need
to use those values, instead of assigning default values.
As we are here, we also add some comments on the compiler part.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
At this point it just tidy up a little the alu_instr structure.
But also serves to prepare the structure for new changes, as 7.x uses
raddr instead of mux, and it is just easier to add the raddr to the
new input structure.
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
We already renamed the type, we just need to rename the enum and the
casting helper functions.
Generated with sed:
sed -i -e 's/nir_instr_type_ssa_undef/nir_instr_type_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
sed -i -e 's/nir_instr_as_ssa_undef/nir_instr_as_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
and two tiny whitespace fixups in lima.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>
Instead, we replace every use of it with nir_def. Most of this commit
was generated by sed:
sed -i -e 's/dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
A few manual fixups were required in lima and the nir_legacy code.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
For tex instructions that don't have sampler state use backend_flags
instead of sampler index to bind default sampler state.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24537>
It doesn't do anything yet. We leave that to the subsequent patches so we can
keep the tree-wide refactor as simple as possible.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
It seems NIR is tracking this for us now so we can stop doing this
in the backend.
Also, new CTS tests seem to add the requirement where in the presence of
some builtin's like gl_SampleID in a shader, even if unused, sample shading
is expected to be enabled, which is something we can't track in the backend
since the variable may have been dropped by then.
Fixes 2 failures in:
dEQP-VK.draw.renderpass.implicit_sample_shading.sample*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23984>
`cl_aligned_packet_length()` expand literals, so use ALIGN_POT to compute it
at compile time.
`v3dv_AllocateMemory()` uses a 64-bit `allocationSize`, so use `align64()`.
`v3d_lower_nir()` uses a 32-bit `shared_size`, so use `align()`.
Extracted out of https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23932
for easier review.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23938>