in the sense of operating on register intrinsics instead of nir_registers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
These are registerful versions of core nir_src/nir_dest which will become
SSA-only soon enough, and modifierful versions of nir_alu_src/nir_alu_dest.
The latter will let us remove modifiers from nir_alu_instr finally.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
After running the pass, all register access intrinsics are guaranteed to be
"trivial" in the sense that the program is free of hazards preventing
propagating them away without inserting any copies.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
Note the writemask handling is chosen for consistency with the rest of NIR. In
every other instance, writemask=w requires a vec4 source. This is hardcoded into
nir_validate and nir_print as what it means to have a writemask.
More importantly, consistency with how register writemasks currently work.
nir_print hides it, but r0.w = fneg ssa_1.x is actually a vec4 instruction with
source ssa_1.xxxx. As a silly example nir_dest_num_components(that) = 4 in the
old model. I realize this is quite strange coming from a scalar ISA, but it's
perfectly natural for the class of vec4 hardware for which this was designed. In
that hardware, conceptually all instructions are vec4`, so the sequence "fneg
ssa_1 and write to channel w" is implemented as "fneg a vec4 with ssa_1.x in the
last component and write that vec4 out but mask to write only the w channel".
Isn't this inefficient? It can be. To save power, Midgard has scalar ALUs in
addition to vec4 ALUs. Those details are confined to the backend VLIW scheduler;
the instruction selection is still done as vec4. This mechanism has little in
common with AMD's SALUs. Midgard has a wave size of 1, with special hacks for
derivatives.
As a result, all backends consuming register writemasks are expecting this
pattern of code. Changing the store to take a vec1 instead of a vec4 would
require changing every backend to reswizzle the sources to resurrect the vec4. I
started typing a branch to do this yesterday, but it made a mess of both Midgard
and nir-to-tgsi. Without any good reason to think it'd actually help
performance, I abandoned the idea. Getting all 15 backends converted to the
helpers is enough of a challenge without forcing 10 backends to reswizzle their
sources too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
Previously, nir_opt_dead_cf could skip dead CF nodes because overwriting
cur after dead_cf_block is not enough to cover the whole CF list.
foreach_list_typed would select the next node, skipping the node that
previously made progress:
block 1
if (true) {}
block 2
if (true) {}
block 3
if (true) {}
Would turn into:
block 1, then, block 2
if (true) { }
block 3, then
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22064>
This avoids spilling deref instructions by wrapping shader calls inside
dummy blocks, rematerializing derefs in their use blocks and removing
the dummy blocks.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22064>
`y_vu` will be used to convert NV21 to RGB.
`yv_yu` will be used to convert YVYU and VYUY to RGB when the
subsampling formats PIPE_FORMAT_R8B8_R8G8 and PIPE_FORMAT_B8R8_G8R8
are supported.
`yx_xvxu` and `xy_vxux` will be used to convert YVYU and VYUY to RGB
when those subsampling formats are not supported.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21219>
Image sparse loads can be stripped from their sparse component if
unused and turned into non sparse variants.
Texture sparse accesses can also be turned off if unused.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23995>
When I reading through some of my older commits I noticed that `break` in
`nir_foreach_phi` is broken because I used the two-loop trick wrong. Rewrite the
macros to fix this, and also to generally be a lot cleaner.
Fixes: 7dc297cc14 ("nir: Add nir_foreach_phi(_safe) macro")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23957>
Makes it easier to copy snippets of shaders into code or
test comments without worrying about conflict with `/* */`.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23564>
Follow the same syntax as the intrinsic indices, since they
are conceptually similar.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23564>
- For SSA destination, padding is applied before `%`.
- For Reg destination, pad to the SSA size (to align div/con),
then remaining padding is applied before `r`.
- For instructions without destination, padding is applied so
they start right after the ` = ` of the cases above.
If the block doesn't have any destinations, there's no padding
is applied to the instructions without destinations in that
block.
For now registers with array access will be unaligned.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23564>
Gets rid of all the
struct nir_*_indices {
int _; /* exists to avoid empty initializers */
};
declarations. 14293 loc -> 12900 loc
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23906>
If a then/else block ends in a jump, the phi nodes do not necessarily
have to reference the always taken branch because they are dead code.
Avoid crashing in this case by only rewriting phis, if the block does
not end in a jump.
cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23150>
Creates and returns a nir_builder from a cursor. The nir_function_impl
is retrieved using said cursor. This should be fine as long as it is not
used on extracted control flow.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23883>
The function iterator should be able to modified in this foreach loop
And the latter patches needs this
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23960>
This frees up the shorter names for the intrinsic-based versions that will
replace them.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23956>
This frees up the shorter names for the new register-based intrinsics.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23956>
Midgard has both int and float version of b32csel. The backend needs some way to
pick between the two, and it's a lot more convenient to choose in NIR before
going out-of-SSA than in the backend.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23769>
Add a pass for bounds checking UBOs, SSBOs, and images to implement robustness.
This pass is based on v3d_nir_lower_robust_access.c, with significant
modifications to be appropriate for common code. Notably:
* v3d-isms are removed.
* Stop generating invalid imageSize() instructions for cube maps, this
blows up nir_validate with asahi's lowerings.
* Logic to wrap an intrinsic in an if-statement is extracted in anticipation of
future robustness2 support that will reuse that code path for buffers.
* Misc cleanups to follow modern NIR best practice. This pass is noticeably
shorter than the original v3d version.
For future support of robustness2, I envision the booleans turning into tristate
enums.
There's a few more knobs added for Asahi's benefit. Apple hardware can do
imageLoad and imageStore to non-buffer images (only). There is no support for
image atomics. To handle, Asahi implements software lowering for buffer images
and for image atomics. While the hardware is robust, the software paths are not.
So we would like to use this pass to lower robustness for the software paths but
not the hardware paths.
Or maybe we want a filter callback?
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>
We can't txf_ms on non-MS images and we can't txf on MS images. This would have
caught a regression on Asahi.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23892>