With no registers seen, it is now a no-op.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
We use nir_builder a variety of places where we may not yet have
back-end compiler options such as for building meta shaders. Don't
assume we always have an options struct.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24459>
sed + ninja clang-format + fix up spacing for common code.
If you are unhappy that I did not manually change the whitespace of your driver,
you need to enable clang-format for it so the formatting would happen
automatically.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>
True for all users. I intentionally didn't add is_ssa asserts because they're
pointless and will be deleted, like, next week and will just make that churn
even more annoying.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24319>
a variable with a component offset may span multiple slots, and this cannot
be inferred from its type alone (e.g., compacted clip+cull distances)
cc: mesa-stable
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>
If a deref_cast parent is also a deref, and the parent mode is not
generic, we can safely propagate the parent mode down to the
deref_cast.
This will allow the fixup to work when the deref_cast is used just
to change the glsl_type when accessing a variable/deref-chain.
Reviewed-by: Faith Ekstrand <faith@gfxstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23734>
In 9ffd00bcf1 ("nir_to_tgsi: Pack our tex coords into vec4
nir_tex_src_backend[12]"), Emma added a pair of back-end sources to
nir_tex_instr to allow complex lowering to be done in NIR. This adds a
tiny bit more hw-specific back-end information that a NIR lowering pass
can communicate to the back-end compiler.
While the opcode contains most of the information needed, some thing
such as the presence of offsets is currently only communicated via the
presence of specific source types in the source list. This information
is gone when the texture instruction is lowered to back-end sources.
Adding a backend_flags field fixes this by allowing the lowering pass to
communicate a small amount of side-band information if needed.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22303>
Just drop the store. Written while debugging
dEQP-VK.pipeline.monolithic.logic_op.r8_uint.no_op.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
nir_const_value_for_int asserts signed bounds on the input, but we pass in an
unsigned value that would be out-of-bounds for 32-bit channels, causing the
assert to fail for 32-bit channel formats.
Fixes dEQP-VK.pipeline.monolithic.logic_op.r32_uint.* on AGXV (and probably
PanVK).
Fixes: dbd0615e7a ("nir/lower_blend: Avoid useless iand with logic ops")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24252>
These are tracked the same way as register reads and writes, allowing
them to be re-arranged as long as they respect dependencies within the
same reg.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
This makes it so we never find a reg_decl in between a reg_store and the def
for its value, which helps avid inserting copy movs.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
Consider the snippet of NIR:
div 32 %447 = @load_reg (%442) (base=0, legacy_fabs=0, legacy_fneg=0)
div 32 %463 = @load_reg (%442) (base=0, legacy_fabs=0, legacy_fneg=0)
con 32 %409 = iadd %17 (0x3), %447
@store_output (%182 (0x601), %463) (base=0, wrmask=x, component=0, src_type=invalid...
@store_reg (%409, %442) (base=0, wrmask=x, legacy_fsat=0)
The load_reg's are trivial, so the %442 read will get folded into store_output.
But under the old definition, the store_reg is also trivial so it gets folded
into the iadd... causing a read-after-write hazard and invalid code generation.
The fix is to amend our definition of store_reg triviality to account for loads
getting folded in. It's not good enough that there's no intervening load_reg,
there can also be no intervening source that gets chased to a load_reg. Handle
that case as well.
Identified in dEQP-VK.geometry.input.basic_primitive.triangles_adjacency on
V3DV.
Fixes: d313eba94e ("nir: Add pass for trivializing register access")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
In order for a register load to be trivial, it cannot be used in any
block other than the one in which it is loaded. We're not currently
explicitly doing anything to ensure this invariant holds. It may be
that it holds regardless but I couldn't find any documented reason why
it should so let's explicitly handle that case. Worst case, the newly
added code does nothing.
Fixes: d313eba94e ("nir: Add pass for trivializing register access")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
Because this pass is intended to be run after out-of-SSA and directly
before injesting the NIR into the back-end, it may come after divergence
analysis and needs to preserve the divergence information. Fortunately,
since all we ever do is insert nir_op_mov, this is easy.
Fixes: d313eba94e ("nir: Add pass for trivializing register access")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
This commit makes three changes:
1. Default all newly created registers divergent because this is the
safer default.
2. Make divergence analysis do something sane with register divergence.
It's not perfect because divergence analysis isn't able to prove
registers divergent based on stores but at least if someone uses
registers a bit they'll end up with safe defaults. This matches
what they'd get with nir_ssa_def_init().
3. Make the load_reg() helper automatically propagate divergence from
the register. Because the defaults for both nir_ssa_def_init() and
nir_decl_reg() are to mark everything divergent, this only means
that nir_load_reg() of a uniform reg is now uniform.
Putting all these together, nir_from_ssa should now be producing
load_reg intrinsics with the proper uniform information.
Fixes: 7229bffcb1 ("nir: Add intrinsics for register access")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
In the case of:
halt
// succs: b9
if %618 {
block b3:// preds:
break
// succs: b6
} else {
block b4: // preds: , succs: b5
}
block b5: // preds: b4
32 %556 = iadd %617, %2 (0x1)
opt_constant_if() doesn't work because stitch_blocks() can't join blocks if the
before ends in a jump and the after isn't empty.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
After the previous commit there are so few to reuse that this is no
longer worth doing and actually causes compilation to slow down.
The Blender shader compile time in issue #9326 improves as folows:
21.11 seconds -> 9.90 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
0.92 seconds -> 0.68 seconds
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9326
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Most of the variables in the hash table will never actually be looked up
for any given block so cloning every possible value just creates a bunch
of unrequired memcpy calls.
Here we change the code to only clone the copies array once it is
actually looked up for the first time.
The Blender shader compile time in issue #9326 improves as folows:
151.09 seconds -> 21.11 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
1.67 seconds -> 0.92 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
If kill alias results in the hash table entry holding an empty
copies array then remove the hash entry and return the dynamic array
to the unused pool.
This helps avoid hash table size getting out of control in very large
shaders.
151.09 seconds -> 118.60 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Here we change things to simply clone the entire hash table. This
is much faster than trying to rebuild it and is needed to avoid
slow compilation of very large shaders.
The Blender shader compile time in issue #9326 improves as folows:
251.29 seconds -> 151.09 seconds
The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:
2.38 seconds -> 1.67 seconds
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
There is no point doing an expensive clone of the copies if the
if-branch is empty.
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
Nothing produces them any more, so remove them from NIR. This massively reduces
the size of nir_src, which should improve performance all over.
nir_src size reduced from 56 bytes -> 40 bytes (pahole results on arm64, x86_64
should be similar.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
Non of the know etnaviv GPUs support this feature in hardware
and the binary blob provides sizes via uniforms too.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24217>