Commit graph

15 commits

Author SHA1 Message Date
Iago Toral Quiroga
6b9bd3f038 broadcom/compiler: make opt passes set current block
Typically, optimization passes go through all the blocks in a shader
and make adjustments on the fly, so we always want them to update
the current block or the current block pointer will become outdated.

Also, we don't need to keep track of the previous current block
pointer to restore it, since optimization passes run after we have
completed conversion to VIR, and therefore, anything that comes after
that should always set the current block before emitting code.

Fixes debug assert crashes when running shader-db:
vir.c:1888: try_opt_ldunif: Assertion `found || &c->cur_block->instructions == c->cursor.link' failed

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13625>
2021-11-02 11:17:01 +00:00
Juan A. Suarez Romero
2a86d51960 broadcom/compiler: set current block on incrementing unifa
When incrementing unifa address in DCE optimization, ensure that we
setup correctly the current block, so the ldfunif optimization is also
executed correctly.

This fixes
dEQP-VK.graphicsfuzz.cov-struct-float-array-mix-uniform-vectors
heap-buffer overflow with address sanitizer enabled.

v2 (Iago):
 - Save and restore current block

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12339>
2021-08-12 12:33:46 +00:00
Iago Toral Quiroga
2897a83ff8 broadcom/compiler: drop the destination for unused ldunifa
We can't remove unused ldunifa that are not the first or last
in a sequence, but we can still ignore their destination
to reduce register pressure.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9384>
2021-03-04 09:00:15 +01:00
Iago Toral Quiroga
e1cf2406da broadcom/compiler: add a constant alu optimization pass
Currently this is useful to clean up after DCEing leading ldunifa
instructions, but it can be expanded to handle more cases which
may allow to simplify the compiler code in places where we have
been trying to optimize manually for similar cases.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9128>
2021-02-23 08:08:01 +00:00
Iago Toral Quiroga
89de085055 broadcom/compiler: remove unused leading ldunifa
This requires that we go back to the unifa write and update the address
to jump over the unused leading component.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9128>
2021-02-23 08:08:01 +00:00
Iago Toral Quiroga
9d16d2d0be broadcom/compiler: allow dead code elimination of unused trailing ldunifa
If a ldunifa is the last in a sequence and is not used, we can safely
eliminate it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9128>
2021-02-23 08:08:01 +00:00
Eric Anholt
5c655c47db v3d: Drop the V3D 3.x vpm read dead code elimination.
We now have NIR dead code eliminating our VPM reads, so this shouldn't be
necessary.
2019-03-05 12:57:39 -08:00
Eric Anholt
5a84d46896 v3d: Stop tracking num_inputs for VPM loads.
It's unused in the VS (since we need vattr_sizes[] anyway), so move it to
FS prog data.
2019-02-18 18:09:07 -08:00
Eric Anholt
b36757448d v3d: Dead-code eliminate unused flags updates.
The greedy comparison folding in bcsel means that we may have left the
original bool-generating NIR ALU instruction dead, but DCE wasn't
eliminating the VIR code for it because of the flags updates.

total instructions in shared programs: 5186024 -> 5100894 (-1.64%)
instructions in affected programs: 1448695 -> 1363565 (-5.88%)
2018-12-30 08:05:11 -08:00
Eric Anholt
ebde5afb93 v3d: Move "does this instruction have flags" from sched to generic helpers.
I wanted to reuse it for DCE of flags updates.
2018-12-30 08:03:51 -08:00
Eric Anholt
a7e15a5086 v3d: Avoid assertion failures when removing end-of-shader instructions.
After generating VIR, we leave c->cursor pointing at the end of the
shader.  If the shader had dead code at the end (for example from preamble
instructions in a shader with no side effects), we would assertion fail
that we were leaving the cursor pointing at freed memory.  Since anything
following DCE should be setting up a new cursor anyway, just clear the
cursor at the start.
2018-12-14 17:48:01 -08:00
Eric Anholt
e7ae900341 v3d: Switch to using the new SFU instructions on V3D 4.x.
These instructions let us write directly to the phys regfile, instead of
just R4.  That lets us avoid moving out of R4 to avoid conflicting with
other SFU results, and to avoid conflicting with thread switches.

There is still an extra instruction of latency, which is not represented
in the scheduler at the moment.  If you use the result before it's ready,
the QPU will just stall, unlike the magic R4 mode where you'd read the
previous value.  That means that the following shader-db results aren't
quite representative (since we now cause some stalls instead of emitting
nops), but they're impressive enough that I'm happy with the change.

total instructions in shared programs: 95669 -> 91275 (-4.59%)
instructions in affected programs:     82590 -> 78196 (-5.32%)
2018-07-23 10:21:43 -07:00
Eric Anholt
8e4cba9d92 broadcom/vc5: Also check the update flags for avoiding DCE.
I was trying to do a NULL-destination UF, and it got removed.
2018-01-12 21:58:11 -08:00
Eric Anholt
368bab43fd broadcom/vc5: Add support for loading varyings in V3D 4.1.
The LDVARY signal now writes an arbitrary register, so I took out the
magic src register file and replaced it with an instruction with LDVARY
set so we have somewhere to hang a QFILE_TEMP destination for register
allocation.
2018-01-12 21:57:21 -08:00
Eric Anholt
ade416d023 broadcom: Add VC5 NIR compiler.
This is a pretty straightforward fork of VC4's NIR compiler to VC5.  The
condition codes, registers, and I/O have all changed, making the backend
hard to share, though their heritage is still recognizable.

v2: Move to src/broadcom/compiler to match intel's layout, rename more
    "vc5" to "v3d", rename QIR to VIR ("V3D IR") to avoid symbol conflicts
    with vc4, use new v3d_debug header, add compiler init/free functions,
    do texture swizzling in NIR to allow optimization.
2017-10-10 11:42:04 -07:00