In V3D 4.x only a couple of simultaneous accesses where allowed, but
V3D 7.x is a bit more flexible, so rather than trying to check for all
the allowed combinations it is easier to check if we are one of the
disallows.
Shader-db (pi5):
total instructions in shared programs: 11338883 -> 11307386 (-0.28%)
instructions in affected programs: 2727201 -> 2695704 (-1.15%)
helped: 12555
HURT: 289
Instructions are helped.
total max-temps in shared programs: 2230199 -> 2229260 (-0.04%)
max-temps in affected programs: 20508 -> 19569 (-4.58%)
helped: 608
HURT: 4
Max-temps are helped.
total sfu-stalls in shared programs: 15236 -> 15293 (0.37%)
sfu-stalls in affected programs: 148 -> 205 (38.51%)
helped: 38
HURT: 64
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11354119 -> 11322679 (-0.28%)
inst-and-stalls in affected programs: 2732262 -> 2700822 (-1.15%)
helped: 12550
HURT: 304
Inst-and-stalls are helped.
total nops in shared programs: 273711 -> 274095 (0.14%)
nops in affected programs: 9626 -> 10010 (3.99%)
helped: 186
HURT: 397
Nops are HURT.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
As for v71 the payload registers are not the same. Specifically now
rf3 is used as payload register, so this is needed to avoid rf3 being
selected as a instruction dst by the register allocator, overwriting
the payload value that could be still used.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
For v42 and below ldunif/ldvary write both on r5, but with a different
delay, so we need to take that into account when scheduling both.
For v71 the register used is rf0, but the behaviour is the same. So
the scheduling code can be the same, but the comment needs update.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
V3D 4.x has pixel center W in rf0 and V3D 7.x has it in rf3. We already
account for this when we setup the c->payload_w, so use that.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
In v3d 4.x there were restrictions based on the number of raddrs used
by the combined instructions, but we don't have these restrictions in
v3d 7.x.
It should be noted that while there are no restrictions on the number
of raddrs addressed, a QPU instruction can only address a single small
immediate, so we should be careful about that when we add support for
small immediates.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
ldvary writes rf0 implicitly on the next cycle so they would clash.
This case is not handled correctly by our normal dependency tracking,
which doesn't know anything about delayed writes from instructions
and thinks the rf0 write happens on the same cycle ldvary is emitted.
Fixes (v71):
dEQP-VK.glsl.conversions.matrix_to_matrix.mat2x3_to_mat4x2_fragment
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
In platforms that don't have accumulators and have implicit writes to
the register file we need to be careful and avoid assigning a physical
register to a temp that lives across an implicit write to that same
physical register.
For now, we have the case of implicit writes to rf0 from various
signals, but it should be easy to extend this to include additional
registers if needed.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
We just need to add a write dep if rf0 is written implicitly.
Note that we don't need to check if we have accumulators when checking
for r3/r4/r5, as v3d_qpu_writes_rX would return false for hw version
that doesn't have accumulators.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Note that what we do is to just return false on the favor/select accum
methods. We could just avoid to call them, but as the select is called
more than once, it is just easier this way.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
At this point it just tidy up a little the alu_instr structure.
But also serves to prepare the structure for new changes, as 7.x uses
raddr instead of mux, and it is just easier to add the raddr to the
new input structure.
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
small_imm_a, small_imm_c and small_imm_d added on top of the already
existing small_imm_b, as V3D 7.1 defines 4 small immediates, tied to
the 4 raddr. Note that this is only the definition, and just a inst
validation rule to check that are not used before v71. Any real use is
still pending.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Current small_imm is associated with the "B" read address.
We do this change in advance for v71 support, where we will have 4
different small_imm (a/b/c/d), so we start with a renaming.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
Having this pointer in the key is undesirable since it makes
copying keys difficult and error prone (as seen in previous
patches), also, it is only there for convenience and we don't
strictly need it (in fact the vulkan driver doesn't use it at
all), so let's just get rid of it so our v3d_key is fully
static.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>
Our shader key includes a void pointer that we can't just memcmp,
so add helpers that allow us toget the 'static' portion and size
of a key. We will use this to fix up the shader cache in v3d in
a later patch.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25418>
We already renamed the type, we just need to rename the enum and the
casting helper functions.
Generated with sed:
sed -i -e 's/nir_instr_type_ssa_undef/nir_instr_type_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
sed -i -e 's/nir_instr_as_ssa_undef/nir_instr_as_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
and two tiny whitespace fixups in lima.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>
...when trying to ignore components. Trim functionality with nir_ssa_for_src is
deprecated.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24654>
Instead, we replace every use of it with nir_def. Most of this commit
was generated by sed:
sed -i -e 's/dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
A few manual fixups were required in lima and the nir_legacy code.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
Instead, we replace it directly with nir_def. We could replace it with
nir_dest but the next commit gets rid of that so this avoids unnecessary
churn. Most of this commit was generated by sed:
sed -i -e 's/dest.dest.ssa/def/g' src/**/*.h src/**/*.c src/**/*.cpp
There were a few manual fixups required in the nir_legacy.c and
nir_from_ssa.c as nir_legacy_reg and nir_parallel_copy_entry both have a
similar pattern.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
We could add a nir_def_bit_size() helper but we use ->bit_size about 3x
as often as nir_dest_bit_size() today so that's a major Coccinelle
refactor anyway and this doesn't make it much worse. Most of this
commit was generated byt the following semantic patch:
@@
expression D;
@@
<...
-nir_dest_bit_size(D)
+D.ssa.bit_size
...
Some manual fixup was needed, especially in cpp files where Coccinelle
tends to give up the moment it sees any interesting C++.
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>
For tex instructions that don't have sampler state use backend_flags
instead of sampler index to bind default sampler state.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24537>