Currently, we have an atomic intrinsic for each combination of memory type
(global, shared, image, etc) and atomic operation (add, sub, etc). So for m
types of memory supported by the driver and n atomic opcodes, the driver has to
handle O(mn) intrinsics. This makes a total mess in every single backend I've
looked at, without fail.
It would be a lot nicer to unify the intrinsics. There are two obvious ways:
1. Make the memory type a constant index, keep different intrinsics for
different operations. The problem with this is that different memory types
imply different intrinsic signatures (number of sources, etc). As an
example, it doesn't make sense to unify global_atomic_amd with
global_atomic_2x32, as an example. The first takes 3 scalar sources, the
second takes 1 vector and 1 scalar. Also, in any single backend, there are a
lot more operations than there are memory types.
2. Make the opcode a constant index, keep different intrinsics for different
operations. This works well, with one exception: compswap and fcompswap
take an extra argument that other atomics don't, so there's an extra axis of
variation for the intrinsic signatures.
So, the solution is to have 2 intrinsics for each memory type -- for atomics
taking 1 argument and atomics taking 2 respectively. Both of these intrinsics
take an nir_atomic_op enum to describe its operation. We don't use a nir_op for
this purpose, as there are some atomics (cmpxchg, inc_wrap, etc) that don't
cleanly map to any ALU op and it would be weird to force it.
The plan is to transition to these new opcodes gradually. This series adds a
lowering pass producing these opcodes from the existing opcodes, so that
backends can opt-in to the new forms one-by-one. Then we can convert backends
separately without any cross-tree flag day. Once everything is converted, we can
convert the producers and core NIR as a flag day, but we have far fewer
producers than backends so this should be fine. Finally we can drop the old
stuff.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22914>
We have a helper, don't open code it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22967>
The pattern shows up all the time open-coded. Use the macro instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22967>
Serious preprocessor voodoo here. There are two tricks here.
1. Iterating only phis. We know that phis come only at the beginning of a block,
so all over the tree, we open-code iteration like:
nir_foreach_instr(instr, block) {
if (instr->type != phi)
break;
/* do stuff */
}
We can express this equivalently as
nir_foreach_instr(instr, block)
if (instr->type != phi)
break;
else {
/* do stuff */
}
So, we can define a macro
#define nir_foreach_phi(instr, block)
if (instr->type != phi)
break;
else
and then
nir_foreach_phi(..)
statement;
and
nir_foreach_phi(..) {
...
}
will expand to the right thing.
2. Automatically getting the phi as a phi. We want the instruction to go to some
hidden variable, and then automatically insert nir_phi_instr *phi =
nir_instr_as_phi(instr_internal); We can't do that directly, since we need to
express the assignment implicitly in the control flow for the above trick to
work. But we can do it indirectly with a loop initializer.
for (nir_phi_instr *phi = nir_instr_as_phi(instr_internal); ...)
That loop needs to break after exactly one iteration. We know that phi
will always be non-null on its first iteration, since the original
instruction is non-null, so we can use phi==NULL as a sentinel and express a
one-iteration loop as for (phi = nonnull; phi != NULL; phi = NULL).
Putting these together gives the macros implemented used.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22967>
We already document a lot of ALU opcodes, let's make this machine-readable so we
can put the descriptions in our generated HTML documentation.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22929>
Models `(a * b) + (c << d)` in general, as implemented in various forms on AGX.
This will be fused with backend NIR opt algebraic rules, both for the literal
pattern as well as to strength reduce certain multiplications, e.g. replacing
a * 5 with `a + (a << 2)` expressed as imadshl_agx(a, 1, a, 2).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
We have a few ALU instructions that take a constant source. Technically, they
have a swizzle so you can't just nir_src_as_uint them, even though a bunch of
backends do. To help backends do the right thing, add a helper that's just as
easy to use that will chase the swizzle properly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22695>
Similar to SpvDecorationRestrict, looks like it's also incorrectly
generated by glslang.
This will allow RADV/CI to leave MESA_SPIRV_LOG_LEVEL as default
(ie. only warnings).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22917>
if a sampler is never used (no derefs) then its binding will never be
applied here, leaving it with binding=0. this will clobber the real binding=0
sampler in driver backends, leading to errors, so try to iterate using
the same criteria as above and apply bindings in the same way
fixes#8974
cc: mesa-stable
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22902>
When rendering a scaled tile, we need to use the original, hardware
FragCoord when accessing input attachments that are on-tile (i.e. were
rendered to in a previous subpass) because they are also scaled in the
same way that FragCoord is scaled. For input attachments that aren't
already on-tile, however, we need to use the fixed gl_FragCoord. Add a
new intrinsic and a bitfield of input attachments which should use it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20304>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
Contains a global wave ID of legacy GS waves.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
This intrinsic is going to be used for simplifying GS code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22690>
It is undefined behavior when an ARB assembly or shadow2d GLSL func
uses SHADOW2D target with a texture in not depth format.
In this case AMD and NVIDIA automatically replaces SHADOW sampler
with a normal sampler and some games like Penumbra Overture which abuses
this UB works fine but breaks with mesa.
Replace the shadow sampler with a normal one here by recompiling
the ARB program variant
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8425
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22147>
Prevent GCC warning:
```
[230/1401] Compiling C object src/compiler/nir/libnir.a.p/nir_lower_io_to_vector.c.o
In function 'get_flat_type',
inlined from 'create_new_io_vars' at ../src/compiler/nir/nir_lower_io_to_vector.c:300:10:
../src/compiler/nir/nir_lower_io_to_vector.c:208:14: warning: 'base' may be used uninitialized [-Wmaybe-uninitialized]
208 | return glsl_array_type(glsl_vector_type(base, 4), slots, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/compiler/nir/nir_lower_io_to_vector.c: In function 'create_new_io_vars':
../src/compiler/nir/nir_lower_io_to_vector.c:163:24: note: 'base' was declared here
163 | enum glsl_base_type base;
| ^~~~
```
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8957
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22840>
We need to do full pow if 64-bit, and we can do fpow() otherwise. Not
the other way around.
Fixes: 9076c4e289 ("nir: update opcode definitions for different bit sizes")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22774>
Without this, we get some trailing spaces in some cases, and some
double space in some other cases. Let's clean that up a bit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22550>
For GFX11 export dual source blend outputs when ACO.
ACO need a pseudo instruction to emit a block of
code which can't be done in nir currently.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22199>
Previously the passthrough gs shader loaded some values with uniform
loads using sevaral hardcoded values.
This was not flexible for other drivers and started becoming too
unflexible for zink itself.
Use system values instead and use a lowering pass in zink.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22667>
Extracts some per-impl code to nir_lower_vec3_to_vec4 and then
converts to use the nir_shader_instructions_pass helper.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11683>
The new code splits the work into a few passes instead of trying to do
everything with a single pass. This helps to apply the new clarified
rules for structured control flow in the SPIR-V specification, in
particular the "exit construct" rules.
First find an appropriate ordering for the blocks, based on the
approach taken by Tint (WebGPU compiler). Then, with those blocks
in order, identify the SPIR-V constructs start and end positions.
Finally, walk the blocks again to emit NIR for each of them, "opening"
and "closing" the necessary NIR constructs as we reach the start and
end positions of the SPIR-V constructs.
There are a couple of interesting choices when mapping the constructs
to NIR:
- NIR doesn't have something like a switch, so like the previous code,
we lower the switch construct to a series of conditionals for each
case.
- And, unlike the previous code, when there's a need to perform a
break from a construct that NIR doesn't directly support (e.g. inside
a case construct, conditionally breaking early from the switch), we
now use a combination of a NIR loop and an NIR if. Extra code is
added to ensure that loop_break and loop_continues are propagated
to the right loop.
This should fix various issues with valid SPIR-V that previously
resulted in "Invalid back or cross-edge in the CFG" errors.
Thanks to Alan Baker and David Neto for their explanations of
ordering the blocks, in the Tint code and in presentations to
the SPIR-V WG.
Thanks to Jack Clark for providing a lot of valuable tests used to
validate this MR.
Closes: #5973, #6369
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17922>
Apply some basic NIR optimizations to clean up the result. Useful in some
situations when comparing the parsing code from different mesa branches.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22180>