Commit graph

4627 commits

Author SHA1 Message Date
Faith Ekstrand
4fd257d20f nir: Properly handle divergence for load_reg
This commit makes three changes:

 1. Default all newly created registers divergent because this is the
    safer default.

 2. Make divergence analysis do something sane with register divergence.
    It's not perfect because divergence analysis isn't able to prove
    registers divergent based on stores but at least if someone uses
    registers a bit they'll end up with safe defaults.  This matches
    what they'd get with nir_ssa_def_init().

 3. Make the load_reg() helper automatically propagate divergence from
    the register.  Because the defaults for both nir_ssa_def_init() and
    nir_decl_reg() are to mark everything divergent, this only means
    that nir_load_reg() of a uniform reg is now uniform.

Putting all these together, nir_from_ssa should now be producing
load_reg intrinsics with the proper uniform information.

Fixes: 7229bffcb1 ("nir: Add intrinsics for register access")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24153>
2023-07-25 15:36:52 +00:00
Rhys Perry
a53d3ff0b3 nir/tests: add nir_opt_dead_cf_test.jump_before_constant_if
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
2023-07-24 14:06:16 +01:00
Rhys Perry
21f0aca948 nir/opt_dead_cf: remove nodes after a jump earlier
In the case of:
   halt
   // succs: b9
   if %618 {
       block b3:// preds:
       break
       // succs: b6
   } else {
       block b4:  // preds: , succs: b5
   }
   block b5:    // preds: b4
   32    %556 = iadd %617, %2 (0x1)
opt_constant_if() doesn't work because stitch_blocks() can't join blocks if the
before ends in a jump and the after isn't empty.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24235>
2023-07-24 14:06:16 +01:00
Konstantin Seurer
1c8577b493 nir/tests: Use a single binary
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24249>
2023-07-24 11:44:46 +00:00
Konstantin Seurer
6eb0a3a5b7 nir/tests: Refactor boilerplate into a common header
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24249>
2023-07-24 11:44:46 +00:00
Timothy Arceri
2cf8c8cba4 nir/opt_copy_prop_vars: drop reuse of dynamic arrays
After the previous commit there are so few to reuse that this is no
longer worth doing and actually causes compilation to slow down.

The Blender shader compile time in issue #9326 improves as folows:
21.11 seconds -> 9.90 seconds

The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:

0.92 seconds -> 0.68 seconds

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9326

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
2023-07-24 02:29:54 +00:00
Timothy Arceri
d56e739417 nir/opt_copy_prop_vars: skip cloning of copies arrays until needed
Most of the variables in the hash table will never actually be looked up
for any given block so cloning every possible value just creates a bunch
of unrequired memcpy calls.

Here we change the code to only clone the copies array once it is
actually looked up for the first time.

The Blender shader compile time in issue #9326 improves as folows:
151.09 seconds -> 21.11 seconds

The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:

1.67 seconds -> 0.92 seconds

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
2023-07-24 02:29:54 +00:00
Timothy Arceri
869b5a562e nir/opt_copy_prop_vars: remove var hash entry on kill alias
If kill alias results in the hash table entry holding an empty
copies array then remove the hash entry and return the dynamic array
to the unused pool.

This helps avoid hash table size getting out of control in very large
shaders.

151.09 seconds -> 118.60 seconds

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
2023-07-24 02:29:54 +00:00
Timothy Arceri
9b4c7cc611 nir/opt_copy_prop_vars: speedup cloning of copy tables
Here we change things to simply clone the entire hash table. This
is much faster than trying to rebuild it and is needed to avoid
slow compilation of very large shaders.

The Blender shader compile time in issue #9326 improves as folows:
251.29 seconds -> 151.09 seconds

The CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20
improves as follows:

2.38 seconds -> 1.67 seconds

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
2023-07-24 02:29:54 +00:00
Timothy Arceri
e9804bdc4c nir/opt_copy_prop_vars: don't clone copies if branch empty
There is no point doing an expensive clone of the copies if the
if-branch is empty.

Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24227>
2023-07-24 02:29:54 +00:00
Bas Nieuwenhuizen
c2e3986326 nir: Fix 16-component nir_replicate.
Fixes: f534c2c539 ("nir/builder: Add nir_replicate helper")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24286>
2023-07-22 22:11:15 +00:00
Alyssa Rosenzweig
03b2c34793 nir: Remove register arrays
Nothing produces them any more, so remove them from NIR. This massively reduces
the size of nir_src, which should improve performance all over.

nir_src size reduced from 56 bytes -> 40 bytes (pahole results on arm64, x86_64
should be similar.)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
2023-07-21 11:25:49 +00:00
Alyssa Rosenzweig
1466014184 nir: Rename lower_locals_to_reg_intrinsics back
The short name is freed up.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
2023-07-21 11:25:49 +00:00
Alyssa Rosenzweig
d2c94f9e71 nir: Remove nir_lower_locals_to_regs
No more users, all switched to the intrinsic version.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24253>
2023-07-21 11:25:49 +00:00
Christian Gmeiner
fb48d3d1da nir: add enta specific intrinsic used for txs lowering
Non of the know etnaviv GPUs support this feature in hardware
and the binary blob provides sizes via uniforms too.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24217>
2023-07-21 08:52:03 +00:00
Christian Gmeiner
019e5cbd39 nir/print: print instr pass_flags
From time to time it can be helpful to "see" the pass_flags.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24234>
2023-07-20 18:03:47 +00:00
Caio Oliveira
97c79cdf19 nir: Use instructions_pass() for nir_fixup_deref_modes()
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24220>
2023-07-19 20:15:12 +00:00
Alyssa Rosenzweig
d0f0afc6a4 nir: Initialize workgroup_size in builder_init_simple_shader
It can't be 0 in Vulkan.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24158>
2023-07-17 19:53:49 +00:00
Alyssa Rosenzweig
4f0f76346e nir: Add nir_lower_tess_coord_z pass
Lowers tess_coord to tess_coord_xy and math. Based on ir3's version.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24159>
2023-07-17 17:31:52 +00:00
Alyssa Rosenzweig
9109830bb0 nir: Promote tess_coord_r600 to tess_coord_xy
This intrinsic (vec2 tess_coord) is generally useful for non-r600 backends.
Promote it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24159>
2023-07-17 17:31:52 +00:00
Rhys Perry
5e4029bfe5 nir/tests: add test for unsigned_upper_bound with loop header phis
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23990>
2023-07-17 14:45:21 +00:00
Rhys Perry
1139d870f3 nir/unsigned_upper_bound: fix phi(bcsel)
This was looking at the wrong sources. src0 is the condition.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Fixes: 72ac3f6026 ("nir: add nir_unsigned_upper_bound and nir_addition_might_overflow")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23990>
2023-07-17 14:45:21 +00:00
Alyssa Rosenzweig
9bcdc45ee7 nir: Devendor load_sample_mask
AGX will use this too for its MSAA lowerings.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24148>
2023-07-15 19:48:30 +00:00
Alyssa Rosenzweig
56d61d9a64 nir: Add fence_{pbe,mem}_to_tex(_pixel)_agx intrinsics
Read-after-write hazards require special handling on AGX, since image loads are
implemented with texturing. Add intrinsics to handle these hazards.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24148>
2023-07-15 19:48:30 +00:00
Alyssa Rosenzweig
d54aa28b97 nir/legacy: Fix handling of fsat(fabs)
Consider code like:

    32x4  %2 = @load_interpolated_input (%1, %0 (0x0)) (base=0, component=0, dest_type=float32, io location=VARYING_SLOT_VAR0 slots=1 mediump)  // Color
    32x4  %3 = fabs %2
    32x4  %4 = fsat %3
    32x4  %5 = fsin %4

The existing logic would incorrectly tell the backend that both fabs and fsat
could be folded, and then half the shader disappears. Whoops. Fix by stopping
the folding in this case. I choose to do this check in the fsat rather than the
fabs because it's more straightforward (1 source vs N uses) but it's somewhat
arbitrary.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
2023-07-13 22:43:36 +00:00
Alyssa Rosenzweig
34fcf6d479 nir/legacy: Fix fneg(load_reg) case
Consider the IR:

   %0 = load_reg
   %1 = fneg %0
   %2 = ffloor %1
   %3 = bcsel .., .., %1

Because the fneg has both foldable and non-foldable users, nir/legacy does not
fold the fneg into the load_reg. This ensures that the backend correctly emits a
dedicated fneg instruction (with the load_reg folded in) for the bcsel to use.
However, because the chasing helpers did not previously take other uses of a
modifier into account, the helpers would fuse in the fneg to the ffloor. Except
that doesn't work, because the load_reg instruction is supposed to be
eliminated. So we end up with broken chased IR:

   1 = fneg r0
   2 = ffloor -NULL
   3 = bcsel, ..., 1

The fix is easy: only fold modifiers into ALU instructions if the modifiers can
be folded away. If we can't eliminate the modifier instruction altogether, it's
not necessarily beneficial to fold it anyway from a register pressure
perspective. So this is probably ok. With that check in place we get correct IR

   1 = fneg r0
   2 = ffloor 1
   3 = bcsel, ..., 1

Fixes carchase/230.shader_test under softpipe.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24116>
2023-07-13 22:43:36 +00:00
Alyssa Rosenzweig
a2d56c4c73 nir/lower_blend: Use util enums
This avoids the silly compiler versions. Some bits are slightly more
complicated, because they have to account for inverted enum values (rather than
a separate invert bit), but this is a LOT friendlier to drivers using the pass
and it makes the pass itself more readable.

The conversion functions in panfrost/panvk will go away momentarily.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Italo Nicola <italonicola@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24076>
2023-07-13 21:03:32 +00:00
Christian Gmeiner
f831883af6 nir/lower_tex: optimize offset lowering for has_texture_scaling
Generates much better code and even helps to beat a blob driver.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Christian Gmeiner
9383009809 nir: rename has_txs to has_texture_scaling
Convert it to an opt-in for backends to prefer and use nir_load_texture_scale
instead of txs for nir lowerings.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Christian Gmeiner
9ddedf4554 nir: rename intrinsic to have a more generic nameing
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24054>
2023-07-12 10:03:06 +00:00
Alyssa Rosenzweig
fded7e7b66 nir: Remove nir_register-based unit tests
Non-SSA functionality will become obsolete after nir_register is removed, so
there's no need to keep the tests around, and they will interfere with the
nir_register de-clawing.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
e96a9a1b71 nir: Remove nir_lower_regs_to_ssa
It is now unused, as all internal producers of registers have been switched over
to intrinsics and no drivers call it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
9eab1e7521 nir/lower_shader_calls: Convert to register intrinsics
Yet another internal use of nir_register that gets lowered back to SSA after the
pass. Easy enough to replace with intrinsic-based registers instead.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
61010e5255 nir: Add lower_vec_to_regs pass
This is a variant of nir_lower_vec_to_movs that produces register intrinsics
(store_reg with write masks) instead of masked moves.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
aea8a70200 nir: Add intrinsics version of locals_to_regs
This isn't so bad. I still duplicated the pass because it makes a lot easier to
have them coexist, switch users over one by one, and then garbage collect the
old when we're done.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Faith Ekstrand
ae0408be1b nir/from_ssa: Support register intrinsics
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Faith Ekstrand
29b2ace184 nir/from_ssa: Make additional assumptions in coalescing
At this point, everything is SSA.  Also, NIR no longer allows different
numbers of components on the two sides of a phi so we can just assert
rather than trying to gracefully handle mismatches.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
36b29201fa nir: Produce intrinsics in lower_{phis,ssa_defs}_to_regs
A number of passes lower SSA partially to registers, do work that would be
invalid in SSA, and then go back into SSA with nir_lower_regs_to_ssa. As a step
towards replacing nir_register with intrinsics,
the nir_lower_{phis,ssa_defs}_to_regs passes are changed to produce intrinsics
instead of nir_registers, and their callers are updated to call
nir_lower_reg_intrinsics_to_ssa instead of nir_lower_regs_to_ssa to compensate.

Jointly authored with Faith.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Faith Ekstrand
73e191924c nir: Add a reg_intrinsics flag to nir_convert_from_ssa
It doesn't do anything yet. We leave that to the subsequent patches so we can
keep the tree-wide refactor as simple as possible.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
bcf3a622d1 nir: Add new version of lower_regs_to_ssa
in the sense of operating on register intrinsics instead of nir_registers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:27 +00:00
Alyssa Rosenzweig
2eb554af48 nir: Add legacy data structures & helpers
These are registerful versions of core nir_src/nir_dest which will become
SSA-only soon enough, and modifierful versions of nir_alu_src/nir_alu_dest.
The latter will let us remove modifiers from nir_alu_instr finally.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:26 +00:00
Alyssa Rosenzweig
d313eba94e nir: Add pass for trivializing register access
After running the pass, all register access intrinsics are guaranteed to be
"trivial" in the sense that the program is free of hazards preventing
propagating them away without inserting any copies.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:26 +00:00
Alyssa Rosenzweig
1d6c06e4b9 nir: Add helpers for walking register uses
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:26 +00:00
Alyssa Rosenzweig
7229bffcb1 nir: Add intrinsics for register access
Note the writemask handling is chosen for consistency with the rest of NIR. In
every other instance, writemask=w requires a vec4 source. This is hardcoded into
nir_validate and nir_print as what it means to have a writemask.

More importantly, consistency with how register writemasks currently work.
nir_print hides it, but r0.w = fneg ssa_1.x is actually a vec4 instruction with
source ssa_1.xxxx. As a silly example nir_dest_num_components(that) = 4 in the
old model. I realize this is quite strange coming from a scalar ISA, but it's
perfectly natural for the class of vec4 hardware for which this was designed. In
that hardware, conceptually all instructions are vec4`, so the sequence "fneg
ssa_1 and write to channel w" is implemented as "fneg a vec4 with ssa_1.x in the
last component and write that vec4 out but mask to write only the w channel".

Isn't this inefficient? It can be. To save power, Midgard has scalar ALUs in
addition to vec4 ALUs. Those details are confined to the backend VLIW scheduler;
the instruction selection is still done as vec4. This mechanism has little in
common with AMD's SALUs. Midgard has a wave size of 1, with special hacks for
derivatives.

As a result, all backends consuming register writemasks are expecting this
pattern of code. Changing the store to take a vec1 instead of a vec4 would
require changing every backend to reswizzle the sources to resurrect the vec4. I
started typing a branch to do this yesterday, but it made a mess of both Midgard
and nir-to-tgsi. Without any good reason to think it'd actually help
performance, I abandoned the idea. Getting all 15 backends converted to the
helpers is enough of a challenge without forcing 10 backends to reswizzle their
sources too.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23089>
2023-07-12 01:34:26 +00:00
Konstantin Seurer
f160f7c525 nir/opt_dead_cf: Clarify comment
Make it obvious that the comment is about the block stitching behavior
of nir_cf_node_remove.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22064>
2023-07-11 17:32:55 +00:00
Konstantin Seurer
4b5737b785 nir/opt_dead_cf: Run dead_cf_block while it makes progress
Previously, nir_opt_dead_cf could skip dead CF nodes because overwriting
cur after dead_cf_block is not enough to cover the whole CF list.
foreach_list_typed would select the next node, skipping the node that
previously made progress:

block 1
if (true) {}
block 2
if (true) {}
block 3
if (true) {}

Would turn into:

block 1, then, block 2
if (true) { }
block 3, then

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22064>
2023-07-11 17:32:55 +00:00
Konstantin Seurer
6532751e4d nir/lower_shader_calls: Remat derefs after shader calls
This avoids spilling deref instructions by wrapping shader calls inside
dummy blocks, rematerializing derefs in their use blocks and removing
the dummy blocks.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22064>
2023-07-11 17:32:55 +00:00
Yonggang Luo
97014036a9 nir: Remove nir_builder_init, it's not used anymore
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24038>
2023-07-10 19:20:18 +00:00
Yonggang Luo
48a25ef700 treewide: Remove all usage of nir_builder_init with nir_builder_create and nir_builder_at
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24038>
2023-07-10 19:20:17 +00:00
Italo Nicola
65d6f5aed2 nir: add options to lower y_vu, yv_yu, yx_xvxu and xy_vxux
`y_vu` will be used to convert NV21 to RGB.
`yv_yu` will be used to convert YVYU and VYUY to RGB when the
subsampling formats PIPE_FORMAT_R8B8_R8G8 and PIPE_FORMAT_B8R8_G8R8
are supported.
`yx_xvxu` and `xy_vxux` will be used to convert YVYU and VYUY to RGB
when those subsampling formats are not supported.

Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21219>
2023-07-10 16:29:13 +00:00