Commit graph

457 commits

Author SHA1 Message Date
Alyssa Rosenzweig
ea232c7cfd pan/midgard: Use generic constant packing for 8/64-bit
Eventually, we will want to combine constants across types, but for now
let's not break the world.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
4c182a6d11 pan/midgard: Pack 64-bit swizzles
64-bit ops have their own funky swizzles. Let's pack them, both for
native 64-bit sources as well as extended 32-bit sources.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
ba2fb98d36 pan/midgard: Fix mir_round_bytemask_down for !32b
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
2655a300a3 pan/midgard: Implement i2i64 and u2u64
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
855eec93b1 pan/midgard: Expand 64-bit writemasks
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-15 20:08:46 +00:00
Alyssa Rosenzweig
095654e3c2 pan/midgard: Prioritize texture registers
On newer GPUs, this is a no-op. On older GPUs, this prevents needless
spilling since texture registers are shared with a subset of work
registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:34 +00:00
Alyssa Rosenzweig
339401b53c pan/midgard: Disassemble with old pipeline always on T720
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
8344d7425b pan/midgard: Use texture, not textureLod, on early Midgard
We have to disable the fixup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
29f5b00e6e pan/midgard: Fix vertex texturing on early Midgard
We use a different set of texture registers, probably to save hardware.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
3866d0776f pan/midgard: Generalize texture registers across GPUs
Early Midgard uses a different set of texture registers; let's not
hardcode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
2019-11-15 18:37:33 +00:00
Alyssa Rosenzweig
ad6b2ac374 pan/midgard: Fix copypropagation for textures
total instructions in shared programs: 3562 -> 3457 (-2.95%)
instructions in affected programs: 575 -> 470 (-18.26%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 6.56 x̃: 10
helped stats (rel) min: 5.71% max: 24.56% x̄: 16.83% x̃: 18.87%
95% mean confidence interval for instructions value: -9.07 -4.06
95% mean confidence interval for instructions %-change: -19.00% -14.66%
Instructions are helped.

total bundles in shared programs: 1846 -> 1830 (-0.87%)
bundles in affected programs: 338 -> 322 (-4.73%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.50% max: 20.00% x̄: 8.85% x̃: 3.33%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -13.02% -4.67%
Bundles are helped.

total quadwords in shared programs: 3191 -> 3144 (-1.47%)
quadwords in affected programs: 606 -> 559 (-7.76%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 2.94 x̃: 3
helped stats (rel) min: 5.17% max: 22.22% x̄: 11.20% x̃: 5.62%
95% mean confidence interval for quadwords value: -4.58 -1.29
95% mean confidence interval for quadwords %-change: -15.16% -7.24%
Quadwords are helped.

total registers in shared programs: 312 -> 303 (-2.88%)
registers in affected programs: 27 -> 18 (-33.33%)
helped: 9
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%
95% mean confidence interval for registers value: -1.00 -1.00
95% mean confidence interval for registers %-change: -33.33% -33.33%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
f72873e6aa pan/midgard: Copypropagate vector creation
total instructions in shared programs: 3457 -> 3431 (-0.75%)
instructions in affected programs: 787 -> 761 (-3.30%)
helped: 14
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.86 x̃: 1
helped stats (rel) min: 1.01% max: 11.11% x̄: 9.22% x̃: 11.11%
95% mean confidence interval for instructions value: -3.55 -0.16
95% mean confidence interval for instructions %-change: -11.41% -7.03%
Instructions are helped.

total bundles in shared programs: 1830 -> 1826 (-0.22%)
bundles in affected programs: 279 -> 275 (-1.43%)
helped: 2
HURT: 0

total quadwords in shared programs: 3144 -> 3121 (-0.73%)
quadwords in affected programs: 645 -> 622 (-3.57%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 11 x̄: 1.77 x̃: 1
helped stats (rel) min: 2.09% max: 16.67% x̄: 12.61% x̃: 14.29%
95% mean confidence interval for quadwords value: -3.45 -0.09
95% mean confidence interval for quadwords %-change: -15.43% -9.79%
Quadwords are helped.

total registers in shared programs: 303 -> 301 (-0.66%)
registers in affected programs: 14 -> 12 (-14.29%)
helped: 2
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
39b5f2fa0b pan/lcra: Use Chaitin's spilling heuristic
Not much of a difference but slightly better and slightly less
arbitrary.

total instructions in shared programs: 3560 -> 3559 (-0.03%)
instructions in affected programs: 44 -> 43 (-2.27%)
helped: 1
HURT: 0

total bundles in shared programs: 1844 -> 1843 (-0.05%)
bundles in affected programs: 23 -> 22 (-4.35%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
23c83f3f05 pan/midgard: Compute spill costs
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 02:36:21 +00:00
Alyssa Rosenzweig
771d23584a pan/midgard: Remove util/ra support
It's now unused, in favour of LCRA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
e343f2ceb9 pan/midgard: Integrate LCRA
Pretty routine, we do have a hack to force swizzle alignment for !32-bit
for until we implement !32-bit the right way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
66ad64d73d pan/midgard: Implement linearly-constrained register allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
fd81916ee5 pan/midgard: Add blend shader selection bits for MRT
This is less complicated than previously thought. Note we have no way of
specifying the work register count for blend shaders; it must be
strictly less than the work register count of the corresponding fragment
shader (which is fine since we force the fragment shader to report a
count of 16 with a blend shader as a major hack until we get register
pressure down for blend shaders).

TODO: pandecode the flags.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
3295edaadf pan/midgard: Pack load/store masks
While most load/store operations on 32-bit/vec4 intriniscally, some are
not and have special type-size-dependent semantics for the mask. We need
to convert into this native format.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
843874c7c3 pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan
We can use the native Midgard ops for this, depending what chip we're
on.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
5885b64e42 pan/midgard: Identify ld_color_buffer_u8_as_fp16*
There are two versions of this opcode, depending what version of the ISA
you're using. I'm not sure if there's a semantic difference; I think
there might be some slight subtleties but it's too early to know at this
stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-11 15:23:44 +00:00
Alyssa Rosenzweig
5f768eda43 pan/midgard: Switch base for vertex texturing on T720
There aren't texture pipeline registers anymore; instead, space is
shared with work and ldst registers for output and input respectively.
We need to shift the base registers to represent this correctly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
ac14facf7a pan/midgard: Pass shader stage to disassembler
Vertex texturing behaves differently from fragment texturing on some
GPUs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
515941202d pan/midgard: Disassemble half-steps correctly
The meaning of some bits shifts; we need to account for this to print
swizzles sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Alyssa Rosenzweig
ec2af6bc97 pan/midgard: Fix printing of half-registers in texture ops
We were using old style half-registers; let's update that to be
consistent, preparing us for more disassmbler changes in this area.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-08 06:45:03 +00:00
Tomeu Vizoso
072207bc18 panfrost: Pipe the GPU ID into compiler and disassembler
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-11-07 08:48:45 +00:00
Tomeu Vizoso
94e6d17043 panfrost: Print the right zero field
Copy paste error.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-06 18:13:16 +01:00
Tomeu Vizoso
8e1ae5fa14 panfrost: Decode blend shaders for SFBD
Also set MALI_HAS_BLEND_SHADER as needed.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
9447a84f69 panfrost: Rework format encoding on SFBD
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:18:46 +01:00
Tomeu Vizoso
23fe7cd2d6 panfrost: Add checksum fields to SFBD descriptor
During tests on T720, these fields were discovered.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-06 16:17:13 +01:00
Alyssa Rosenzweig
12d071024b pan/midgard: Extend default_phys_reg to !32-bit
We can pass through a size.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
762623381d pan/midgard: Extend swizzle packing for vec4/16-bit
We would like to pack not just xyzw swizzles but also efgh swizzles.
This should work for vec4/16-bit. More work will be needed to pack
swizzles for vec8/16-bit and even more work for 8-bit, of course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
bf5508f7b9 pan/midgard: Extend offset_swizzle to non-32-bit
We take a size parameter; use it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
f538981384 pan/midgard: offset_swizzle doesn't need dstsize
This argument should be omitted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
9eac9389fb pan/midgard: Add bizarre corner case
Someone really needs to look into this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
4ae4d82e21 pan/midgard: Compute bundle interference
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
45ac8ea8bd pan/midgard: Fix quadword_count handling
Spilling can mess with this considerably.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Alyssa Rosenzweig
0a77dd3203 pan/midgard: Validate tags when branching
Midgard prefetches instructions based on tag (ALU, LD/ST, texture *
size). To do so, the shader descriptor specifies the tag of the first
instruction, all instructions specify the tag of the next linear
instruction is, and all branches explicitly specify the tag of the
branch target.

If you mess this up, you get an INSTR_TYPE_MISMATCH, which unambiguously
refers to this problem, but it's still annoying to try to work out all
the branch targets in your head to debug.

Instead, let's track the tags of various blocks over time, so we can
automatically validate tags of branch targets, to make
INSTR_TYPE_MISMATCH issues immediately obvious in a disassembly.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 15:36:08 -05:00
Boris Brezillon
28440820ef panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK
MALI_DEPTH_TEST should only be set when depth->writemask is true,
not when the depth test is enabled. Let's rename the flag and patch
panfrost_bind_depth_stencil_state() to do the right thing.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-04 16:14:09 +01:00
Alyssa Rosenzweig
c3a46e7644 pan/midgard: Eliminate blank_alu_src
We don't need it in practice, so this is some more cleanup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
70072a20e0 pan/midgard: Refactor swizzles
Rather than having hw-specific swizzles encoded directly in the
instructions, have a unified swizzle arary so we can manipulate swizzles
generically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
e7fd14ca8a pan/midgard: Add a dummy source for loads
We want symmetry between loads and stores, so we add a dummy source. So
we get, e.g.

   st_int4 _,    val, arg_1, arg_2
   ld_int4 dest,   _, arg_1, arg_2

Semantically, this dummy source represents the data itself, as if the
load is simply a move. That means it has a swizzle that acts as a
source.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:47 +00:00
Alyssa Rosenzweig
b5938be51d pan/midgard: Remove OP_IS_STORE_VARY
Unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-01 01:01:46 +00:00
Robert Foss
f140467b5b
android: Add panfrost support to build scripts
Currently the Android build system doesn't expose the panfrost
driver.

This patch enables the panfrost driver to be build on for the
Android platform.

Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-31 10:03:54 +01:00
Alyssa Rosenzweig
44971b84b7 panfrost: Remove unused definitions in mali-job.h
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
fa14cdf6e4 panfrost: Cleanup _shader_upper -> shader
I don't believe this is actually a tagged pointer; warn if it is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
f98e9a2771 pan/midgard: Express allocated registers as offsets
Rather than supplying a mask/swizzle to compose with the original, just
supply the offset of the allocated register so we can directly offset
the mask/swizzle, without resorting to composition.

This is simpler, cleaner, and will generalize to non-32-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig
c1d36eb115 pan/midgard: Expose more typesize manipulation routines
These internal mir.c routines will help the RA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Alyssa Rosenzweig
9bba182840 pan/midgard: Add mir_set_bytemask helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-25 08:45:39 -04:00
Rhys Perry
8b98d0954e nir/lower_idiv: add new llvm-based path
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-21 18:49:46 +00:00