Commit graph

413 commits

Author SHA1 Message Date
Alyssa Rosenzweig
44971b84b7 panfrost: Remove unused definitions in mali-job.h
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
fa14cdf6e4 panfrost: Cleanup _shader_upper -> shader
I don't believe this is actually a tagged pointer; warn if it is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
f98e9a2771 pan/midgard: Express allocated registers as offsets
Rather than supplying a mask/swizzle to compose with the original, just
supply the offset of the allocated register so we can directly offset
the mask/swizzle, without resorting to composition.

This is simpler, cleaner, and will generalize to non-32-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
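
The offsetting described in f98e9a2771 can be sketched roughly as follows. This is a minimal illustration, assuming a 32-bit vec4 register with a 4-bit component mask and a swizzle stored as four component indices; `sketch_offset_mask` and `sketch_offset_swizzle` are hypothetical names, not the helpers in the tree.

```c
#include <stdint.h>

/* Hypothetical helper: shift a per-component write mask by the component
 * offset at which the value was allocated inside its register. Bits that
 * would fall outside the four components are discarded. */
static inline uint8_t
sketch_offset_mask(uint8_t mask, unsigned offset)
{
    return (uint8_t)((mask << offset) & 0xF);
}

/* Hypothetical helper: add the same offset to every swizzle lane. Lanes
 * are clamped to the last component (an assumption of this sketch). */
static inline void
sketch_offset_swizzle(unsigned swizzle[4], unsigned offset)
{
    for (unsigned c = 0; c < 4; ++c) {
        unsigned s = swizzle[c] + offset;
        swizzle[c] = s > 3 ? 3 : s;
    }
}
```

For a vec2 allocated in the upper half of a register, the mask `0x3` becomes `0xC` and the swizzle `xyxy` becomes `zwzw`, with no composition step involved.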
Alyssa Rosenzweig
c1d36eb115 pan/midgard: Expose more typesize manipulation routines
These internal mir.c routines will help the RA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig
9bba182840 pan/midgard: Add mir_set_bytemask helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00
Alyssa Rosenzweig
b8c4fb235e pan/midgard: Implement SIMD-aware dead code elimination
We would like to eliminate not just entire dead instructions, but also
dead components, which increases scheduler flexibility (since some
vector instructions can become scalar after eliminating dead
components). This also will allow better RA in the future.

Results are meh.

total instructions in shared programs: 3453 -> 3451 (-0.06%)
instructions in affected programs: 60 -> 58 (-3.33%)
helped: 2
HURT: 0

total bundles in shared programs: 1826 -> 1824 (-0.11%)
bundles in affected programs: 33 -> 31 (-6.06%)
helped: 2
HURT: 0

total quadwords in shared programs: 3144 -> 3144 (0.00%)
quadwords in affected programs: 0 -> 0
helped: 0
HURT: 0

total registers in shared programs: 321 -> 321 (0.00%)
registers in affected programs: 45 -> 45 (0.00%)
helped: 11
HURT: 11
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for registers value: -0.45 0.45
95% mean confidence interval for registers %-change: -1.87% 62.18%
Inconclusive result (value mean confidence interval includes 0).

total threads in shared programs: 445 -> 447 (0.45%)
threads in affected programs: 2 -> 4 (100.00%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
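
The per-component elimination in b8c4fb235e might look roughly like the sketch below. It assumes a 4-bit per-component write mask and a liveness mask for the destination taken just after the instruction; `struct sketch_ins` and `sketch_dce_components` are hypothetical and stand in for the real MIR types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction: only the fields this sketch needs. */
struct sketch_ins {
    uint8_t mask; /* per-component write mask, bit i = component i */
    bool dead;    /* set once no written component is live */
};

/* Trim the components that nobody reads afterwards. `live` is the
 * per-component liveness of the destination just after the instruction.
 * Returns true if the instruction was changed. */
static bool
sketch_dce_components(struct sketch_ins *ins, uint8_t live)
{
    uint8_t trimmed = ins->mask & live;

    if (trimmed == ins->mask)
        return false;

    ins->mask = trimmed;
    ins->dead = (trimmed == 0); /* the whole instruction is removable */
    return true;
}
```

A vec4 write whose result is only read in `x` ends up with mask `0x1`, which is what lets the scheduler treat it as a scalar operation.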
Alyssa Rosenzweig
6c4b97011b pan/midgard: Create dependency graph bytewise
This allows for vec16 dependencies in the scheduler, not that we have
any yet (thankfully).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
825f11e739 pan/midgard: Handle nontrivial masks in texture RA
The texture instruction has a mask we need to take into account.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d1d3411ba5 pan/midgard: Implement per-byte liveness tracking
Now that we have a notion of byte masks, liveness tracking can be updated
to reflect this extra granularity without loss of correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
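
As a rough illustration of per-byte liveness (d1d3411ba5, building on the 16-bit masks from fd2216e1fd), the backward step over one instruction could be written as below. The struct layout, the single-source simplification, and the name `sketch_liveness_ins_update` are assumptions made for this sketch.

```c
#include <stdint.h>

/* Hypothetical instruction, with byte masks over a 16-byte register. */
struct sketch_ins {
    unsigned dest;       /* node written */
    uint16_t dest_bytes; /* bytes of dest that are written */
    unsigned src;        /* node read (one source only, for brevity) */
    uint16_t src_bytes;  /* bytes of src that are read */
};

/* Backward transfer function: steps liveness from just after `ins` to
 * just before it. live[] holds one 16-bit byte mask per node. The kill is
 * applied before the gen so that an instruction reading and writing the
 * same node keeps the bytes it reads live. */
static void
sketch_liveness_ins_update(uint16_t *live, const struct sketch_ins *ins)
{
    live[ins->dest] &= (uint16_t)~ins->dest_bytes; /* kill written bytes */
    live[ins->src] |= ins->src_bytes;              /* gen read bytes */
}
```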
Alyssa Rosenzweig
43fd730fc4 pan/midgard: Simplify mir_bytemask_of_read_components
There are easy ways to iterate sources!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
e9202ff3cb pan/midgard: Report byte masks for read components
Read component masks don't have a particular type associated, since the
type of the ALU operation may not match the type of the operands in
question. So let's generate byte masks instead, and update the rest of
the compiler to use byte masks when analyzing reads.

Preparation for mixed types.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d079631248 pan/midgard: Add helpers for manipulating byte masks
There are essentially two formats of masks in play beginning with this
commit: masks per-channel and masks per-byte. The former make sense
within a given fixed-size instruction; the latter are
typesize-independent. It turns out you need the latter to meaningfully
manipulate instructions containing multiple sizes (which is quite
possible with ALU operations).

Similarly, we have mir_srcsize. We calculate the size of the source by
analyzing the size of the instruction itself and stepping down if there
is a half-modifier.

Finally, we have mir_round_bytemask_down, for when we want to take a
byte mask and "round it down" to a given component size, so that we can
use it as a component mask.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
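
To make the two mask formats in d079631248 concrete, here is a minimal sketch of converting between them. It assumes a 16-byte register and component sizes of 1, 2, 4 or 8 bytes. The function names are hypothetical, and returning a component mask from the round-down step is one reading of the commit message rather than the confirmed behaviour of mir_round_bytemask_down.

```c
#include <stdint.h>

/* Expand a per-component mask into a per-byte mask. `bytes` is the
 * component size in bytes (1, 2, 4 or 8) within a 16-byte register. */
static uint16_t
sketch_to_bytemask(uint16_t compmask, unsigned bytes)
{
    uint16_t chunk = (uint16_t)((1u << bytes) - 1);
    uint16_t out = 0;

    for (unsigned c = 0; c < 16 / bytes; ++c) {
        if (compmask & (1u << c))
            out |= (uint16_t)(chunk << (c * bytes));
    }

    return out;
}

/* "Round down" a byte mask to a component mask: a component is kept only
 * if every one of its bytes is set in the byte mask. */
static uint16_t
sketch_round_bytemask_down(uint16_t bytemask, unsigned bytes)
{
    uint16_t chunk = (uint16_t)((1u << bytes) - 1);
    uint16_t out = 0;

    for (unsigned c = 0; c < 16 / bytes; ++c) {
        if (((bytemask >> (c * bytes)) & chunk) == chunk)
            out |= (uint16_t)(1u << c);
    }

    return out;
}
```

With 4-byte components, the component mask `0x5` (x and z) expands to the byte mask `0x0F0F`; a byte mask of `0x0F3F` rounds back down to `0x5`, because the second component is only half covered.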
Alyssa Rosenzweig
e981b69484 pan/midgard: Implement OP_IS_STORE with table
..rather than open-coding.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
8e31b14858 pan/midgard: Tableize load/store ops
This will allow us to encode properties about the load/store ops like we
do for ALU ops. We now include properties about whether we have a store,
and if there are special cases on the load/store op. We also tag each
instruction by its natural size... this is probably not totally right,
but it's a start.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
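
The table-driven approach from 8e31b14858, and the OP_IS_STORE rewrite in e981b69484 that builds on it, can be sketched like this. The opcodes, property flags, and sizes are placeholders invented for the example and do not reflect the real Midgard opcode table.

```c
#include <stdbool.h>

/* Hypothetical property flags for a load/store op. */
enum {
    SKETCH_LDST_STORE = 1 << 0,   /* op writes memory */
    SKETCH_LDST_SPECIAL = 1 << 1, /* op needs special-case handling */
};

struct sketch_ldst_info {
    const char *name;
    unsigned props;
    unsigned size_bits; /* "natural" size of the op */
};

/* Placeholder opcodes; the real enum is much larger. */
enum sketch_ldst_op {
    SKETCH_OP_LOAD,
    SKETCH_OP_STORE,
    SKETCH_OP_COUNT,
};

static const struct sketch_ldst_info sketch_ldst_table[SKETCH_OP_COUNT] = {
    [SKETCH_OP_LOAD] = {"sketch_load", 0, 32},
    [SKETCH_OP_STORE] = {"sketch_store", SKETCH_LDST_STORE, 32},
};

/* A table lookup instead of an open-coded list of store opcodes. */
#define SKETCH_OP_IS_STORE(op) \
    ((sketch_ldst_table[(op)].props & SKETCH_LDST_STORE) != 0)
```

Adding a new property then means adding a flag and filling in table entries, rather than touching every predicate that enumerates opcodes by hand.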
Alyssa Rosenzweig
5952add9a9 pan/midgard: Factor out mir_get_alu_src
This helper is used in a bunch of places ... might as well make that
common.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
f77ea9798d pan/midgard/disasm: Fix printing 8-bit/16-bit masks
The trick is realizing that, even with a destination override, the masks
are encoded in the same mode as the instruction itself, rather than
stepping down. The override means that the smaller type is used, but the
mask is parsed as if it were the higher type. Overriding down is printed
by blindly doing this. Overriding up can be thought of as printing in the
upper size, but shifting the alphabet to use the upper half, i.e.
shifting xyzw to become abcd.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
d49fdca229 pan/midgard: Identify 64-bit atomic opcodes
They are symmetric to their 32-bit counterparts, just shifted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-20 12:02:31 +00:00
Alyssa Rosenzweig
6601570ead pan/midgard: Debug mir_insert_instruction_after_scheduled
Add some comments explaining what's going on in a more natural flow in
order to solve the actual bug.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 2d914ebe81 ("pan/midgard: Fix memory corruption in register spilling")
2019-10-20 12:02:31 +00:00
Erik Faye-Lund
2da792d398 panfrost: do not report alpha-test as supported
This triggers lowering in the state-tracker, which makes things a bit
simpler.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-17 10:41:36 +02:00
Alyssa Rosenzweig
c94ccbf201 pan/midgard: Do not repeatedly spill same value
It doesn't make sense. You already spilled it once, and it didn't help.
Don't try again, or you'll end up in a loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
2d914ebe81 pan/midgard: Fix memory corruption in register spilling
Essentially an off-by-one error ... bit of an edge case, but seems to
occur in some glamor shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
fd2216e1fd pan/midgard: Use 16-bit liveness masks
We'll want liveness per-byte, so we need to accommodate up to 16 bytes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-16 08:17:56 -04:00
Alyssa Rosenzweig
923aa3918c pan/midgard: Fix mir_mask_of_read_components with dot products
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
47b58199f0 pan/midgard: Add perspective ops to mir_get_swizzle
I really need to just make this a table..

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
7db36d94af pan/midgard: Don't try to propagate swizzles to branches
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:12 -04:00
Alyssa Rosenzweig
9c0915ba4a pan/midgard: Allow non-contiguous masks in UBO lowering
We don't really need to impose this condition, but we do need to cope
with the slightly more general case.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:11 -04:00
Alyssa Rosenzweig
a6867fb3fd pan/midgard: Report read mask for branch arguments
Conditionals in particular read values.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-15 21:41:11 -04:00
Alyssa Rosenzweig
dcd2f26b98 pan/midgard: Replace mir_is_live_after with new pass
Now that we have live_out calculated per block as metadata, calculating
liveness of an instruction at a given point in the program becomes O(n)
in the size of the block in the worst case, rather than O(n) in the size
of the program.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
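
One way to picture the block-local query from dcd2f26b98 is the sketch below: scan only the rest of the current block, then fall back to the cached live_out. The types and the name `sketch_is_live_after` are hypothetical, each instruction has a single source for brevity, and the sketch deliberately ignores overwrites, so it may report a value as live when a later write in the block actually kills it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction: one destination node, one source node. */
struct sketch_ins {
    unsigned dest;
    unsigned src;
};

/* Hypothetical block with liveness metadata cached per node. */
struct sketch_block {
    const struct sketch_ins *ins;
    unsigned count;
    const uint16_t *live_out; /* per-node byte mask at the block's end */
};

/* Is `node` live after instruction index `i` of `blk`? The cost is bounded
 * by the size of the block, since nothing outside it is walked. */
static bool
sketch_is_live_after(const struct sketch_block *blk, unsigned i,
                     unsigned node)
{
    /* Any later read inside the block makes the node live. */
    for (unsigned j = i + 1; j < blk->count; ++j) {
        if (blk->ins[j].src == node)
            return true;
    }

    /* Otherwise it is live only if it is live out of the block. */
    return blk->live_out[node] != 0;
}
```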
Alyssa Rosenzweig
39a4b3ebe9 pan/midgard: Calculate temp_count for liveness
This needs to be correct or the analysis fails.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
ad5fcac005 pan/midgard: Invalidate liveness for mir_is_live_after
Callers should have liveness info ready. Ideally we'd have a nice
metadata tracking framework like NIR to handle this automatically, but
for now this will allow us to make forward progress... when we're about
to do something with liveness, invalidate everything ahead to force a
clean calculation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
3450c013c5 pan/midgard: Begin tracking liveness metadata
This will allow us to explicitly invalidate liveness analysis results so
we can cache liveness results.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:51 -04:00
Alyssa Rosenzweig
846e5d5ba8 pan/midgard: Don't try to OR live_in of successors
By definition, once liveness analysis has occurred:

   live_out = OR {succ} succ->live_in

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
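
The identity quoted in 846e5d5ba8 can be written out as a small sketch. The block structure, the two-successor limit, and the 16-bit per-node masks are assumptions made for illustration; `sketch_compute_live_out` is a hypothetical name.

```c
#include <stdint.h>

#define SKETCH_MAX_SUCCS 2

/* Hypothetical basic block with per-node liveness masks. */
struct sketch_block {
    struct sketch_block *succs[SKETCH_MAX_SUCCS];
    unsigned nr_succs;
    uint16_t *live_in;  /* temp_count entries */
    uint16_t *live_out; /* temp_count entries */
};

/* live_out = OR {succ} succ->live_in, evaluated node by node. */
static void
sketch_compute_live_out(struct sketch_block *blk, unsigned temp_count)
{
    for (unsigned n = 0; n < temp_count; ++n)
        blk->live_out[n] = 0;

    for (unsigned s = 0; s < blk->nr_succs; ++s) {
        for (unsigned n = 0; n < temp_count; ++n)
            blk->live_out[n] |= blk->succs[s]->live_in[n];
    }
}
```

Once the analysis has converged, a block's live_out already equals this union, so a consumer that has live_out in hand gains nothing by ORing the successors' live_in again.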
Alyssa Rosenzweig
013cd6bed2 pan/midgard: Move RA's liveness analysis into midgard_liveness.c
There are unfortunately two distinct liveness analysis passes in the
compiler right now -- one good (but complex) pass used by RA based on
solving data flow equations, and one awful (but simple) pass used for
dead code elimination and bundling based on an abstract walk of the AST.

Let's move RA's pass into shared code so we can work on unifying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
76a76de7af pan/midgard: Add mir_calculate_temp_count helper
This allows us to fill in ctx->temp_count explicitly, even if we haven't
squished down the MIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
c59fae0fef pan/midgard: Remove mir_has_multiple_writes
We already enforce this with the SSA/register distinction in the
backend. There is no need to duplicate this logic merely for an assert.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-03 22:29:50 -04:00
Alyssa Rosenzweig
7be00b2a06 pan/midgard: Allow scheduling conditions with constants
Now that we have constant adjustment logic abstracted, we can do this
safely. Along with the csel inversion patch, this allows many more
common csel ops to inline their condition in the bundle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
c20063aa4a pan/midgard: Add csel invert optimization
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
f0f4b39548 pan/midgard: Add mir_flip helper
Useful for various operations on both commutative and anticommutative
ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
10037ce523 pan/midgard: Tightly pack 32-bit constants
If we can reuse constant slots from other instructions, we would like to
do so to include more instructions per bundle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
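
The slot reuse in 10037ce523 might be sketched as follows. The limit of four 32-bit constant slots per bundle is an assumption of this example, as are the struct and the name `sketch_pack_constants`.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOTS 4 /* assumption: four 32-bit constant slots per bundle */

/* Hypothetical view of the constants already embedded in a bundle. */
struct sketch_bundle_consts {
    uint32_t value[SKETCH_SLOTS];
    unsigned count;
};

/* Try to place the `n` constants of an instruction into the bundle,
 * reusing any slot that already holds the same value. On success the slot
 * chosen for each constant is written to slot_out[] and true is returned.
 * On failure the bundle is left unchanged. */
static bool
sketch_pack_constants(struct sketch_bundle_consts *b, const uint32_t *consts,
                      unsigned n, unsigned *slot_out)
{
    struct sketch_bundle_consts tmp = *b; /* work on a copy */

    for (unsigned i = 0; i < n; ++i) {
        unsigned s;

        /* Look for an existing slot with the same value. */
        for (s = 0; s < tmp.count; ++s) {
            if (tmp.value[s] == consts[i])
                break;
        }

        /* None found: claim a fresh slot, if any is left. */
        if (s == tmp.count) {
            if (tmp.count == SKETCH_SLOTS)
                return false;
            tmp.value[tmp.count++] = consts[i];
        }

        slot_out[i] = s;
    }

    *b = tmp; /* commit only once everything fits */
    return true;
}
```

An instruction whose constants already appear in the bundle costs no extra slots, which is what allows more instructions to share one bundle.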
Alyssa Rosenzweig
a3ca283bc1 pan/midgard: Allow writeout to see into the future
If an instruction could be scheduled to vmul to satisfy the writeout
conditions, let's do that and save an instruction+cycle per fragment
shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
12a70ccd9e pan/midgard: Allow 6 instructions per bundle
We never had a scheduler good enough to hit this case before! :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
34ff50cadd pan/midgard: Only one conditional per bundle allowed
There's no r32 to save ya after you use up r31 :)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
2715bd02ee pan/midgard: Schedule to smul/sadd
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
57bac68fff pan/midgard: Extend choose_instruction for scalar units
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
e9edae3ecb pan/midgard: Don't double check SCALAR units
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
d3b3daa9d3 pan/midgard: Use new scheduler
We still emit in-order but we switch to using the bundles created from
the new scheduler, which will allow greater flexibility and room for
out-of-order optimization.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
1409af9fc7 pan/midgard: Add distance metric to choose_instruction
We require chosen instructions to be "close", to avoid ballooning
register pressure. This is a kludge that will go away once we have
proper liveness tracking in the scheduler, but for now it prevents a lot
of needless spilling.

v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders
that spilled excessively are fixed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Derp
2019-09-30 08:40:13 -04:00
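
The distance check from 1409af9fc7 could look roughly like the sketch below. The threshold of 6 comes from the v2 note; measuring distance as the difference between instruction indices, the inclusive comparison, and the name `sketch_choose_instruction` are all assumptions of this example.

```c
#include <stdbool.h>

#define SKETCH_MAX_DISTANCE 6 /* threshold from the v2 note */

/* Pick the first ready candidate that lies within SKETCH_MAX_DISTANCE
 * instructions of `anchor` (the position the scheduler is working from).
 * Returns the candidate's index, or -1 if nothing close enough is ready. */
static int
sketch_choose_instruction(const bool *ready, unsigned count, unsigned anchor)
{
    for (unsigned i = 0; i < count; ++i) {
        unsigned distance = i > anchor ? i - anchor : anchor - i;

        if (ready[i] && distance <= SKETCH_MAX_DISTANCE)
            return (int)i;
    }

    return -1;
}
```

Rejecting far-away candidates keeps values from staying live across long stretches of the program, which is how the heuristic limits register pressure without real liveness information.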
Alyssa Rosenzweig
e9571b53e1 pan/midgard: Add mir_choose_alu helper
Based on a given unit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00
Alyssa Rosenzweig
8462e82467 pan/midgard: Implement load/store pairing
We can bundle two load/store instructions together. This eliminates the need for
explicit load/store pairing in a prepass, as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-30 08:40:13 -04:00