Fixes some tests when bindless is disabled, where the image format is
R32, we do atomics on it, but we didn't set the "typed UAV load with
additional formats" feature bit because when we loaded from it, we
only loaded one component. Since the image format on the DXIL side
was declared as U32x4, the DXIL validator said that we should have.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23266>
For loads/stores without formats, let's guess one based on how it's used.
The actual format doesn't matter, we just want to use it for the number
of components it has.
Also copy image formats from variables to intrinsics, to ensure that
deref-based intrinsics have formats assigned and lowered intrinsics
are up to date.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23266>
When doing nir_fsub(b, x, imm), we can negate the immediate value, and
replace the fsub with nir_fadd_imm() and get the same result. This makes
the code a bit shorter and easier to read.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>
This simplifies things a bit. Note that in some cases, the arguments are
swapped, because multiplications are commutative, and nir_fmul_imm only
allows the second operand to be an immediate.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23179>
Using nir_gather_ssa_types works much better. There's 2 differences
compared to what I was doing before:
1. Multiple passes to allow data to propagate forward and backward
through the whole shader.
2. Allowing a value to have indeterminate types due to having both
int and float usages.
So this deletes some code and gets better results. Wish I'd known
this existed last week.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23062>
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:
@@
expression A, B, C, D, E;
@@
-nir_ssa_dest_init(A, B, C, D, E);
+nir_ssa_dest_init(A, B, C, D);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
There are no more producers of legacy atomics so these calls are inert.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23036>
For ALU ops where input types are known, we can store that info on
the input sources. This can be used to produce the correct overloads
of load instructions that don't immediately need to be followed by
bitcasts, or similarly to produce a constant value which can be directly
consumed by the relevant instruction without needing a bitcast.
Similarly for values that will be stored in an output, we know type
information. And using that info, we can use more-correct information
for phis instead of forcing all phi sources to be bitcast to int just
to be bitcast back to float on the other side for an alu or an output
store.
One missing piece is SSBO stores, where we can support int or float.
If the input is coming from a phi, we don't influence the phi's type,
so it'll be int, even though the incoming sources might've been float.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22972>
For each phi src, ensure that it's only used as a phi src.
This lets us give each phi their own unique types without worrying
about them stomping on each other. Also scalarize phis.
For each constant, ensure that it's only used once. The DXIL backend
will already dedupe these consts within the module, but this lets a
single load_const have multiple types depending on how it's used.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22972>
This should be faster, since we're not iterating pointlessly over all the
non-phis after the phi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22967>
The base nir options were assuming all bit sizes were supported at
shader model 6.2. Multiple callers were then changing properties
based on actual support.
Standardize behavior by providing the majority of things that can
impact nir options when getting them. Some callers (e.g. meta blit
shaders or libclc) don't bother, because they are known to have
contents that are unaffected by these options. Other callers might
munge more properties afterwards, but this minimizes that.
Note that lower_helper_invocation was incorrectly being turned off
for SM6.6+ by some callers, despite load_helper_invocation being
unimplemented by the backend.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22952>
We can now determine whether a nir_src is for an if without a sideband, so
simplify the function signature.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Suggested-by: Faith Ekstrand <faith@gfxstrand.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>
Every nir_ssa_def is part of a chain of uses, implemented with doubly linked
lists. That means each requires 2 * 64-bit = 16 bytes per def, which is
memory intensive. Together they require 32 bytes per def. Not cool.
To cut that memory use in half, we can combine the two linked lists into a
single use list that contains both regular instruction uses and if-uses. To do
this, we augment the nir_src with a boolean "is_if", and reimplement the
abstract if-uses operations on top of that list. That boolean should fit into
the padding already in nir_src so should not actually affect memory use, and in
the future we sneak it into the bottom bit of a pointer.
However, this creates a new inefficiency: now iterating over regular uses
separate from if-uses is (nominally) more expensive. It turns out virtually
every caller of nir_foreach_if_use(_safe) also calls nir_foreach_use(_safe)
immediately before, so we rewrite most of the callers to instead call a new
single `nir_foreach_use_including_if(_safe)` which predicates the logic based on
`src->is_if`. This should mitigate the performance difference.
There's a bit of churn, but this is largely a mechanical set of changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>
Unlike DXBC, DXIL's shift instructions don't have the implicit behavior
that they only take the 5 bits. This is observable if you try to have
DXC do a shift of a dynamic value, e.g. a constant buffer value, where
the compiler inserts the appropriate 'and' op. We need to do the same.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Shifts should always use 32bit shift values, and when lowering to
masked, we need to use 32-bit atomics. That means that we should also
treat 24bit stores as a single masked op rather than one 16bit unmasked
and one 8bit masked.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22225>
Currently we can get one by looking up already-emitted resource
metadata, but in the future we'll want to be able to get this
info from a call site alone. Depending on the type of call site,
we'll have different sets of info, so add helpers for the
various different kinds of call sites we can support.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21913>
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
The shader key structure is quite large and memsetting it to zero to be
able to create or often simply find an existing shader is responsible
for a large portion of CPU usage during benchmarks.
This change is more surgical about what, when, and how things get
cleared.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21247>