Commit graph

178993 commits

Author SHA1 Message Date
Alejandro Piñeiro
146ceadcf4 v3dv: add support for TFU jobs in v71
This includes update the simulator.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
8f2704a28d v3dv: handle Z clipping in v71
Fixes the following tests:

dEQP-VK.clipping.clip_volume.*
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_* (except deltazero)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
acd99e08b4 v3dv: don't convert floating point border colors in v71
The TMU does this for us now.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Alejandro Piñeiro
452421dfe5 v3dv: no specific separate_segments flag for V3D 7.1
On V3D 7.1 there is not a flag on the Shader State Record to specify
if we are using shared or separate segments. This is done by setting
the vpm input size to 0 (so we need to ensure that the output would be
the max needed for input/output).

We were already doing the latter on the prog_data_vs, so we just need
to use those values, instead of assigning default values.

As we are here, we also add some comments on the compiler part.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
ef60f6db0d v3dv: handle RTs with no color targets in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
f4fa1c8586 v3dv: handle early Z/S clears for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
8c191d1103 broadcom/compiler: update thread end restrictions validation for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
f7b16f91e1 v3dv: GFX-1461 does not affect V3D 7.x
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
84ca72ace2 v3dv: handle render pass global clear for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
76a019f8cc v3dv: implement noop job for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
5ad415c0e6 v3dv: handle new texture state transfer functions in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
fe6594c4c1 v3dv: fix up texture shader state for v71
There are some new fields for YCbCr with pointers for the various
planes in multi-planar formats. These need to match the base address
pointer in the texture state, or the hardware will assume this is a
multi-planar texture.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
f514ad5dd2 v3dv: setup TLB clear color for meta operations in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Iago Toral Quiroga
b5e3322d93 v3dv: setup render pass color clears for any format bpp in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Alejandro Piñeiro
9c92a758cc v3dv/pipeline: handle GL_SHADER_STATE_RECORD changed size on v71
It is likely that we would need more changes, as this packet changed,
but this is enough to get basic tests running. Any additional support
will be handled with new commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:43 +00:00
Alejandro Piñeiro
5750926d0e v3dv/pipeline: default vertex attributes values are not needed for v71
There are not part of the shader state record.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
53773f3ea7 v3dv: default vertex attribute values are gen dependant
Content, structure and size would depend on the generation. Even if it
is needed at all.

So let's move it to the v3dvx files.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
0abf7c1407 v3dv/cmd_buffer: just don't fill up early-z fields for CFG_BITS for v71
For v71 early_z_enable/early_z_updates_enable is configured with
packet 121.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
1a822ba3e6 v3dv/uniforms: update VIEWPORT_X/Y_SCALE uniforms for v71
As the packet CLIPPER_XY scaling, this needs to be computed on 1/64ths
of pixel, instead of 1/256ths of pixels.

As this is the usual values that we get from macros, we add manually a
v42 and v71 macro, and define a new helper (V3DV_X) to get the value
for the current hw version.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
0fdd9ea9bc v3dv/cmd_buffer: emit CLIPPER_XY_SCALING for v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
e2eed3fff6 v3dvx/cmd_buffer: emit CLEAR_RENDER_TARGETS for v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
33886d5f26 v3dv/cmd_buffer: emit TILE_RENDERING_MODE_CFG_RENDER_TARGET_PART1 for v71
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
5cc035a750 v3dv: emit TILE_BINNING_MODE_CFG and TILE_RENDERING_MODE_CFG_COMMON for v71
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
6f8d331188 v3dv/device: handle new rpi5 device (bcm2712)
This includes both master and primary devices.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
15a0ad216a v3dv: expose V3D revision number in device name
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
4606904215 v3dv/meson: add v71 hw generation
Starting point for v71 version inclusion.

This just adds it as one of the versions to be compiled (on meson),
updates the v3dX/v3dv_X macros, and update the code enough to get it
compiling when building using the two versions. For any packet not
available on v71 we just provide a generic asserted placeholder of
generation not supported.

Any real v71 support will be implemented on following commits.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
1f5a3391bb broadcom/compiler: only assign rf0 as last resort in V3D 7.x
So we can use it for ldunif(a) and avoid generating ldunif(a)rf which
can't be paired with conditional instructions.

shader-db (pi5):

total instructions in shared programs: 11357802 -> 11338883 (-0.17%)
instructions in affected programs: 7117889 -> 7098970 (-0.27%)
helped: 24264
HURT: 17574
Instructions are helped.

total uniforms in shared programs: 3857808 -> 3857815 (<.01%)
uniforms in affected programs: 92 -> 99 (7.61%)
helped: 0
HURT: 1

total max-temps in shared programs: 2230904 -> 2230199 (-0.03%)
max-temps in affected programs: 52309 -> 51604 (-1.35%)
helped: 1219
HURT: 725
Max-temps are helped.

total sfu-stalls in shared programs: 15021 -> 15236 (1.43%)
sfu-stalls in affected programs: 6848 -> 7063 (3.14%)
helped: 1866
HURT: 1704
Inconclusive result

total inst-and-stalls in shared programs: 11372823 -> 11354119 (-0.16%)
inst-and-stalls in affected programs: 7149177 -> 7130473 (-0.26%)
helped: 24315
HURT: 17561
Inst-and-stalls are helped.

total nops in shared programs: 273624 -> 273711 (0.03%)
nops in affected programs: 31562 -> 31649 (0.28%)
helped: 1619
HURT: 1854
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
c8e4ee8ecb broadcom/compiler: don't assign registers to unused nodes/temps
In programs with a lot of unused temps, if we don't do this, we may
end up recycling previously used rfs more often, which can be
detrimental to instruction pairing.

total instructions in shared programs: 11464335 -> 11444136 (-0.18%)
instructions in affected programs: 8976743 -> 8956544 (-0.23%)
helped: 33196
HURT: 33778
Inconclusive result

total max-temps in shared programs: 2230150 -> 2229445 (-0.03%)
max-temps in affected programs: 86413 -> 85708 (-0.82%)
helped: 2217
HURT: 1523
Max-temps are helped.

total sfu-stalls in shared programs: 18077 -> 17104 (-5.38%)
sfu-stalls in affected programs: 8669 -> 7696 (-11.22%)
helped: 2657
HURT: 2182
Sfu-stalls are helped.

total inst-and-stalls in shared programs: 11482412 -> 11461240 (-0.18%)
inst-and-stalls in affected programs: 8995697 -> 8974525 (-0.24%)
helped: 33319
HURT: 33708
Inconclusive result

total nops in shared programs: 298140 -> 296185 (-0.66%)
nops in affected programs: 52805 -> 50850 (-3.70%)
helped: 3797
HURT: 2662
Inconclusive result

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
ce13aa4ee7 broadcom/compiler: improve allocation for final program instructions
The last 3 instructions can't use specific registers so flag all the
nodes for temps used in the last program instructions and try to
avoid assigning any of these. This may help us avoid injecting nops
for the last thread switch instruction.

Because regisster allocation needs to happen before QPU scheduling
and instruction merging we can't tell exactly what the last 3
instructions will be, so we do this for a few more instructions than
just 3.

We only do this for fragment shaders because other shader stages
always end with VPM store instructions that take an small immediate
and therefore will never allow us to merge the final thread switch
earlier, so limiting allocation for these shaders will never improve
anything and might instead be detrimental.

total instructions in shared programs: 11471389 -> 11464335 (-0.06%)
instructions in affected programs: 582908 -> 575854 (-1.21%)
helped: 4669
HURT: 578
Instructions are helped.

total max-temps in shared programs: 2230497 -> 2230150 (-0.02%)
max-temps in affected programs: 5662 -> 5315 (-6.13%)
helped: 344
HURT: 44
Max-temps are helped.

total sfu-stalls in shared programs: 18068 -> 18077 (0.05%)
sfu-stalls in affected programs: 264 -> 273 (3.41%)
helped: 37
HURT: 48
Inconclusive result (value mean confidence interval includes 0).

total inst-and-stalls in shared programs: 11489457 -> 11482412 (-0.06%)
inst-and-stalls in affected programs: 585180 -> 578135 (-1.20%)
helped: 4659
HURT: 588
Inst-and-stalls are helped.

total nops in shared programs: 301738 -> 298140 (-1.19%)
nops in affected programs: 14680 -> 11082 (-24.51%)
helped: 3252
HURT: 108
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
818fc41e7e broadcom/compiler: don't allocate spill base to rf0 in V3D 7.x
Otherwise it can be stomped by instructions doing implicit rf0 writes.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
dc6ed98aae broadcom/qpu: new packing/conversion v71 instructions
This commits adds the qpu definitions for several new v71
instructions.

Packing:
  * vpack does a 2x32 to 2x16 bit integer pack
  * v8pack: Pack 2 x 2x16 bit integers into 4x8 bits
  * v10pack packs parts of 2 2x16 bit integer into r10g10b10a2.
  * v11fpack packs parts of 2 2x16 bit float into r11g11b10 rounding
    to nearest

Conversion to unorm/snorm:
  * vftounorm8/vftosnorm8: converts from 2x16-bit floating point
    to 2x8 bit unorm/snorm.
  * ftounorm16/ftosnorm16: converts floating point to 16-bit
    unorm/snorm
  * vftounorm10lo: Convert 2x16-bit floating point to 2x10-bit unorm
  * vftounorm10hi: Convert 2x16-bit floating point to one 2-bit and one 10-bit unorm

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
84c912c1d4 broadcom/compiler: fix up copy propagation for v71
Update rules for unsafe copy propagations to match v7.x.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
1e85be415a broadcom/compiler: lift restriction on vpmwt in last instruction for V3D 7.x
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
2774601780 broadcom/compiler: validate restrictions after TLB Z write
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
d4285d7f2a broadcom/compiler: start allocating from RF 4 in V7.x
In V3D 4.x we start at RF3 so that we allocate RF0-2 only if there
aren't any other RFs available. This is useful with small shaders to
ensure that our TLB writes don't use these registers because these are
the last instructions we emit in fragment shaders and the last
instructions in a program can't write to these registers, so if we do,
we need to emit NOPs.

In V3D 7.x the registers affected by this restriction are RF2-3, so we
choose to start at RF4.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
2b39bb35c5 broadcom/compiler: lift restriction for branch + msfign after setmsf for v7.x
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
5e9b405aa7 broadcom/compiler: update ldvary thread switch delay slot restriction for v7.x
In V3D 7.x we don't have accumulators which would not survive a thread
switch, so the only restriction is that ldvary can't be placed in the
second delay slot of a thread switch.

shader-db results for UnrealEngine4 shaders:

total instructions in shared programs: 446458 -> 446401 (-0.01%)
instructions in affected programs: 13492 -> 13435 (-0.42%)
helped: 58
HURT: 3
Instructions are helped.

total nops in shared programs: 19571 -> 19541 (-0.15%)
nops in affected programs: 161 -> 131 (-18.63%)
helped: 30
HURT: 0
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
526c1889e5 broadcom/compiler: update thread end restrictions for v7.x
In 4.x it is not allowed to write to the register file in the last 3
instructions, but in 7.x we only have this restriction in the thread
end instruction itself, and only if the write comes from the ALU
ports.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
ced83e7803 broadcom/compiler: implement small immediates for v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
e4d30600a4 broadcom/compiler: convert mul to add when needed to allow merge
V3D 7.x added 'mov' opcodes to the ADD alu, so now it is possible to
move these to the ADD alu to facilitate merging them with other MUL
instructions.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
cbedf14687 broadcom/compiler: don't assign rf0 to temps that conflict with ldvary
ldvary writes to rf0 implicitly, so we don't want to allocate rf0 to
any temps that are live across ldvary's rf0 live ranges.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
3a36a618d7 broadcom/compiler: try to use ldunif(a) instead of ldunif(a)rf in v71
The rf variants need to encode the destination in the cond bits, which
prevents these to be merged with any other instruction that need them.

In 4.x, ldunif(a) write to r5 which is a special register that only
ldunif(a) and ldvary can write so we have a special register class for
it and only allow it for them. Then when we need to choose a register
for a node, if this register is available we always use it.

In 7.x these instructions write to rf0, which can be used by any
instruction, so instead of restricting rf0, we track the temps that
are used as ldunif(a) destinations and use that information to favor
rf0 for them.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
d8a25bdb07 broadcom/compiler: enable ldvary pipelining on v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
a8014be2b0 broadcom/compiler: handle rf0 flops storage restriction in v71
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
852274d00a broadcom/qpu: add packing for fmov on ADD alu
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
d1281d857f broadcom/compiler: update peripheral access restrictions for v71
In V3D 4.x only a couple of simultaneous accesses where allowed, but
V3D 7.x is a bit more flexible, so rather than trying to check for all
the allowed combinations it is easier to check if we are one of the
disallows.

Shader-db (pi5):

total instructions in shared programs: 11338883 -> 11307386 (-0.28%)
instructions in affected programs: 2727201 -> 2695704 (-1.15%)
helped: 12555
HURT: 289
Instructions are helped.

total max-temps in shared programs: 2230199 -> 2229260 (-0.04%)
max-temps in affected programs: 20508 -> 19569 (-4.58%)
helped: 608
HURT: 4
Max-temps are helped.

total sfu-stalls in shared programs: 15236 -> 15293 (0.37%)
sfu-stalls in affected programs: 148 -> 205 (38.51%)
helped: 38
HURT: 64
Inconclusive result (%-change mean confidence interval includes 0).

total inst-and-stalls in shared programs: 11354119 -> 11322679 (-0.28%)
inst-and-stalls in affected programs: 2732262 -> 2700822 (-1.15%)
helped: 12550
HURT: 304
Inst-and-stalls are helped.

total nops in shared programs: 273711 -> 274095 (0.14%)
nops in affected programs: 9626 -> 10010 (3.99%)
helped: 186
HURT: 397
Nops are HURT.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
ce66c9aead broadcom/compiler: update payload registers handling when computing live intervals
As for v71 the payload registers are not the same. Specifically now
rf3 is used as payload register, so this is needed to avoid rf3 being
selected as a instruction dst by the register allocator, overwriting
the payload value that could be still used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
d72e57fe30 broadcom/compiler: update ldunif/ldvary comment for v71
For v42 and below ldunif/ldvary write both on r5, but with a different
delay, so we need to take that into account when scheduling both.

For v71 the register used is rf0, but the behaviour is the same. So
the scheduling code can be the same, but the comment needs update.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Alejandro Piñeiro
a3aba3f352 broadcom/compiler: update one TMUWT restriction for v71
TMUWT not allowed in the final instruction restriction doesn't apply
for v71.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00
Iago Toral Quiroga
c9fcd5d786 broadcom/compiler: v71 isn't affected by double-rounding of viewport X,Y coords
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
2023-10-13 22:37:42 +00:00