With no registers seen, it is now a no-op.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>
In 9ffd00bcf1 ("nir_to_tgsi: Pack our tex coords into vec4
nir_tex_src_backend[12]"), Emma added a pair of back-end sources to
nir_tex_instr to allow complex lowering to be done in NIR. This adds a
tiny bit more hw-specific back-end information that a NIR lowering pass
can communicate to the back-end compiler.
While the opcode contains most of the information needed, some things, such as
the presence of offsets, are currently only communicated via the presence of
specific source types in the source list. This information is gone once the
texture instruction is lowered to back-end sources.
Adding a backend_flags field fixes this by allowing the lowering pass to
communicate a small amount of side-band information if needed.
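For illustration, a lowering pass might use it like this (a sketch: the
flag name is hypothetical, only the backend_flags field itself is real):

  /* in the hw-specific NIR lowering pass */
  #define MY_HW_TEX_HAS_OFFSET BITFIELD_BIT(0)

  if (nir_tex_instr_src_index(tex, nir_tex_src_offset) >= 0)
     tex->backend_flags |= MY_HW_TEX_HAS_OFFSET;
  /* ...then fold the offset into nir_tex_src_backend1... */

  /* in the back-end compiler, where the source list no longer says so */
  bool has_offset = tex->backend_flags & MY_HW_TEX_HAS_OFFSET;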
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22303>
Nothing produces them any more, so remove them from NIR. This massively reduces
the size of nir_src, which should improve performance all over.
nir_src size is reduced from 56 bytes to 40 bytes (pahole results on arm64;
x86_64 should be similar).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
Lowers tess_coord to tess_coord_xy and math. Based on ir3's version.
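For the triangle domain the lowering is roughly (a nir_builder sketch, not
the literal pass; intr is the load_tess_coord being replaced, and for quads
and isolines z is just zero):

  b->cursor = nir_before_instr(&intr->instr);
  nir_ssa_def *xy = nir_load_tess_coord_xy(b);
  nir_ssa_def *x = nir_channel(b, xy, 0);
  nir_ssa_def *y = nir_channel(b, xy, 1);
  /* barycentric coordinates sum to 1 */
  nir_ssa_def *z = nir_fsub(b, nir_imm_float(b, 1.0f), nir_fadd(b, x, y));
  nir_ssa_def_rewrite_uses(&intr->dest.ssa, nir_vec3(b, x, y, z));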
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24159>
Convert it to an opt-in, so backends that prefer it can use
nir_load_texture_scale instead of txs in NIR lowerings.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
It is now unused, as all internal producers of registers have been switched over
to intrinsics and no drivers call it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
This is a variant of nir_lower_vec_to_movs that produces register intrinsics
(store_reg with write masks) instead of masked moves.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
This isn't so bad. I still duplicated the pass because it makes it a lot easier
to have the two coexist, switch users over one by one, and then garbage collect
the old pass when we're done.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
A number of passes lower SSA partially to registers, do work that would be
invalid in SSA, and then go back into SSA with nir_lower_regs_to_ssa. As a step
towards replacing nir_register with intrinsics,
the nir_lower_{phis,ssa_defs}_to_regs passes are changed to produce intrinsics
instead of nir_registers, and their callers are updated to call
nir_lower_reg_intrinsics_to_ssa instead of nir_lower_regs_to_ssa to compensate.
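Schematically, a caller changes like this (a sketch; the pass-specific work
in the middle is elided):

  nir_lower_phis_to_regs_block(block);     /* now emits reg intrinsics */
  /* ...CFG surgery that would be invalid in SSA... */
  nir_lower_reg_intrinsics_to_ssa(shader); /* was nir_lower_regs_to_ssa */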
Jointly authored with Faith.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
It doesn't do anything yet. We leave that to the subsequent patches so we can
keep the tree-wide refactor as simple as possible.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
in the sense of operating on register intrinsics instead of nir_registers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
After running the pass, all register access intrinsics are guaranteed to be
"trivial", in the sense that the program is free of hazards that would prevent
the backend from propagating them away without inserting any copies.
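Schematically (illustrative IR, not literal nir_print output), a trivial
load_reg immediately precedes its single use and a trivial store_reg
immediately follows the def it consumes, so a backend can fold them into
the ALU instruction as source/destination modifiers:

  ssa_2 = @load_reg (ssa_0 /* decl_reg */)
  ssa_3 = fneg ssa_2
  @store_reg (ssa_3, ssa_1 /* decl_reg */)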
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
Note the writemask handling is chosen for consistency with the rest of NIR. In
every other instance, writemask=w requires a vec4 source; this is hardcoded into
nir_validate and nir_print as what it means to have a writemask.
More importantly, it matches how register writemasks currently work.
nir_print hides it, but r0.w = fneg ssa_1.x is actually a vec4 instruction with
source ssa_1.xxxx. As a silly example, nir_dest_num_components(that) = 4 in the
old model. I realize this is quite strange coming from a scalar ISA, but it's
perfectly natural for the class of vec4 hardware for which this was designed. In
that hardware, conceptually all instructions are vec4, so the sequence "fneg
ssa_1 and write to channel w" is implemented as "fneg a vec4 with ssa_1.x in the
last component and write that vec4 out, but mask the write to only the w
channel".
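With the intrinsics, that silly example is built roughly like so (a sketch;
helper spellings may differ, and src stands in for ssa_1):

  nir_ssa_def *r0 = nir_decl_reg(b, 4, 32, 0);            /* vec4 register */
  nir_ssa_def *neg = nir_fneg(b, nir_channel(b, src, 0)); /* ssa_1.x */
  /* writemask=w demands a vec4 source, so replicate the scalar */
  nir_ssa_def *vec = nir_swizzle(b, neg, (unsigned[]){0, 0, 0, 0}, 4);
  nir_store_reg(b, vec, r0, .write_mask = BITFIELD_BIT(3) /* .w */);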
Isn't this inefficient? It can be. To save power, Midgard has scalar ALUs in
addition to vec4 ALUs. Those details are confined to the backend VLIW scheduler;
the instruction selection is still done as vec4. This mechanism has little in
common with AMD's SALUs. Midgard has a wave size of 1, with special hacks for
derivatives.
As a result, all backends consuming register writemasks are expecting this
pattern of code. Changing the store to take a vec1 instead of a vec4 would
require changing every backend to reswizzle the sources to resurrect the vec4. I
started typing a branch to do this yesterday, but it made a mess of both Midgard
and nir-to-tgsi. Without any good reason to think it'd actually help
performance, I abandoned the idea. Getting all 15 backends converted to the
helpers is enough of a challenge without forcing 10 backends to reswizzle their
sources too.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
`y_vu` will be used to convert NV21 to RGB.
`yv_yu` will be used to convert YVYU and VYUY to RGB when the
subsampling formats PIPE_FORMAT_R8B8_R8G8 and PIPE_FORMAT_B8R8_G8R8
are supported.
`yx_xvxu` and `xy_vxux` will be used to convert YVYU and VYUY to RGB
when those subsampling formats are not supported.
Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21219>
While reading through some of my older commits, I noticed that `break` in
`nir_foreach_phi` is broken because I used the two-loop trick wrong. Rewrite the
macros to fix this, and also to be generally a lot cleaner.
Fixes: 7dc297cc14 ("nir: Add nir_foreach_phi(_safe) macro")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23957>
The function iterator should be able to be modified in this foreach loop,
and later patches need this.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23960>
This frees up the shorter names for the intrinsic-based versions that will
replace them.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23956>
Add a pass for bounds checking UBOs, SSBOs, and images to implement robustness.
This pass is based on v3d_nir_lower_robust_access.c, with significant
modifications to be appropriate for common code. Notably:
* v3d-isms are removed.
* Stop generating invalid imageSize() instructions for cube maps; these
blow up nir_validate with Asahi's lowerings.
* Logic to wrap an intrinsic in an if-statement is extracted (sketched
below) in anticipation of future robustness2 support that will reuse that
code path for buffers.
* Misc cleanups to follow modern NIR best practice. This pass is noticeably
shorter than the original v3d version.
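The extracted if-wrapping helper looks roughly like this (a sketch under
assumed spellings: offset and size_B stand in for the pass's real offset
arithmetic, and the caller rewrites the old uses to the returned phi):

  static nir_ssa_def *
  wrap_in_bounds_check(nir_builder *b, nir_intrinsic_instr *intr,
                       nir_ssa_def *offset, unsigned size_B)
  {
     b->cursor = nir_before_instr(&intr->instr);
     nir_ssa_def *bound = nir_get_ssbo_size(b, intr->src[0].ssa);
     nir_ssa_def *in_bounds =
        nir_uge(b, bound, nir_iadd_imm(b, offset, size_B));

     /* create the zero before the if so it dominates the phi */
     nir_ssa_def *zero =
        nir_imm_zero(b, intr->num_components, intr->dest.ssa.bit_size);

     nir_push_if(b, in_bounds);
     nir_instr_remove(&intr->instr);
     nir_builder_instr_insert(b, &intr->instr); /* move load into the if */
     nir_pop_if(b, NULL);

     /* out-of-bounds loads return zero */
     return nir_if_phi(b, &intr->dest.ssa, zero);
  }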
For future support of robustness2, I envision the booleans turning into tristate
enums.
There are a few more knobs added for Asahi's benefit. Apple hardware can do
imageLoad and imageStore to non-buffer images (only); there is no support for
image atomics. To handle those cases, Asahi implements software lowering for
buffer images and for image atomics. While the hardware is robust, the software
paths are not, so we would like to use this pass to lower robustness for the
software paths but not the hardware paths.
Or maybe we want a filter callback?
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23895>
The nir_foreach_function_with_impl macro can be used when both func and
func->impl are accessed in the loop body.
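Typical usage (a sketch; lower_impl is a stand-in):

  nir_foreach_function_with_impl(func, impl, shader) {
     /* both the nir_function and its nir_function_impl are bound */
     progress |= lower_impl(func, impl);
  }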
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23903>