Commit graph

117072 commits

Author SHA1 Message Date
Rafael Antognolli
ceeaf93c8e anv: Properly initialize device->slice_hash.
When subslices_delta == 0 and we take the early return,
device->slice_hash is not initialized on GEN11. It then causes a
segfault when going through anv_DestroyDevice, if compiled with
valgrind.

Fixes: 7bc022b4bb ("anv/gen11: Emit SLICE_HASH_TABLE when pipes are
                    unbalanced.")

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-15 09:42:48 -07:00
Danylo Piliaiev
72354d43d4 intel/compiler: Fix resource leak in error path
CID: 1452261

Fixes: 04a99515 "intel/compiler: add ability to override shader's assembly"

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-15 08:17:36 +00:00
Alyssa Rosenzweig
44a6c38bd6 panfrost: Implement native RECT textures
We started honouring the normalized_coords flag in the texture
descriptor, but a bisection revealed that broke RECT textures -- since
we were *also* lowering them in the shader. So just remove the
shader-based lowering, use native RECT textures, and enjoy the nominal
reduction in complexity and performance boost.

Fixes: 3e47a1181b ("panfrost: Add MALI_SAMP_NORM_COORDS flag")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:42 -07:00
Alyssa Rosenzweig
6fe4822cca panfrost: Add R10G10B10A2_SSCALED vertex format
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
e823a47f02 pan/midgard: Disassemble UBO index explicitly
It's a bit of a special case but that's fine.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
3d54ed2488 pan/midgard: Account for unaligned UBOs when promoting uniforms
We only know how to promote aligned accesses, although theoretically we
should be able to promote unaligned to swizzles in the future. Check
this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
03350eb8b8 pan/midgard: Add mir_ubo_shift helper
Different UBO reads have different shift requirements.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
cf3bb10f51 pan/midgard: Address emit_ubo_read offset in bytes
We'll want to be smarter about unaligned reads, so let's get this code
all in one place.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
65e6cb4eb0 pan/midgard: Wire writemask into UBO reads
Helps the disassembly be clearer and maybe regalloc be smarter.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
ec2f0b580f pan/midgard: Identify UBO/SSBO op symmetry
It's the same thing, just shifted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig
375d4c2c74 panfrost: Extend blending to MRT
Our hardware supports independent (per-RT) blending, but we need to
route those settings through from Gallium.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
dff4986b1a pan/midgard: Emit store_output branch just-in-time
We'll need multiple branches for MRT, so we can't defer. Also, we need
to track dependencies to ensure r0 is set to the correct value for each
store_output.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
2fc44c4dc8 pan/midgard: Add dont_eliminate flag
We need to treat fragment writes specially.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
6ed3843224 pan/mfbd: Stuff in RT count
Fixes DATA_INVALID_FAULTs with multiple render targets.

We do always allocate space for 4 cbufs just to keep things sane. This
may not be strictly necessary.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
716be7862e pan/decode: Dump FBD tagged pointer
Turns out the RT count is stuffed in here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig
358372b256 pan/decode: Decode invalid access type upon fault
We don't have a good way to confirm this, but it parallels the kernel
definitions for MMU faults nicely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig
f5cc5ef404 pan/decode: Fix duplicate heap_end property
This was supposed to read heap_start. It's the same value but still,
better get this right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:39 -07:00
Alyssa Rosenzweig
b78e04c17b panfrost: Note "MFBD preload disable" bit
It's a chicken bit, as far as I can tell. Buck buck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:39:57 -07:00
Alyssa Rosenzweig
64720d1e9e pan/bifrost: Link in compiler
We enable the standalone compiler, build the new files, and let it
blast.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
b93fa7d232 pan/bifrost: Check in remainder of the Bifrost compiler
What it says on the tin.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
0e126aa0f0 pan/bifrost: Add bifrost_print.c/h
IR printers.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
d8d8b08fe5 pan/bifrost: Style format the disassembler
$ astyle *.c *.h --style=linux -s8

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
fca491c0e1 pan/bifrost: Stub out standalone compiler
We don't actually have a standalone compiler in-tree yet, but let's get
prepared for when we do.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
62bbc23da5 pan/bifrost: Sync disassembler with Ryan's tree
The disassembler was updated to move common code with the compiler into
a shared header. Additionally, some new ops and control registers relating
to rounding were added.

Signed-off-by: Ryan Houdek <Sonicadvance1@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 22:54:07 +00:00
Alyssa Rosenzweig
b73cbd6880 panfrost: Remove standalone pandecode tool
Now that panwrap has gained the ability to trace directly without
dumping to the filesystem, there's no need to lug around this tool.

I can assure you nobody will miss it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
6f4d796911 pan/midgard: Fix disassembly termination condition
Fixes: 863bdd1f8d ("pan/midgard: Break, not return, in disassembler")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
de2efd5ea7 panfrost: Ensure we upload at least 1 blend RT
Otherwise we'll get memory junk.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
54438267c3 panfrost: Zero tripipe on initialize
I don't think the hardware cares, but this adds a lot of noise to traces
that we would rather not need to look at.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
1ab6290746 pan/midgard: Improve disassembler robustness
Some memory corruption / etc issues led to an accidental "fuzzing" of
the disassembler ;) This uncovered some issues leading to a disassembler
hang, so let's fix that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
9c4c7211a3 pan/decode: Split public.h out
We want a defined ABI for tracing; this set of functions should be as
small as strictly necessary to minimize panwrap shenanigans.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
4f03728fb7 pan/decode: Prefer uint64_t to mali_ptr
This removes an unwanted dependency on panfrost-job.h

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig
6c84a2665c pan/midgard: Allocate spill_slot once
Multiple spill moves share a single spill slot. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig
2a9031ea44 pan/midgard: Use hint on midgard_instruction for spill_move
This allows us to have multiple spill moves, whereas otherwise for N
spill moves, the first N-1 would be clobbered. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig
3e6f2e7aba panfrost: Remove panfrost_add_dependency asserts
It doesn't... make a ton of sense to need to assert and this routine is
hotter than you might expect. Doesn't matter for release builds, of
course.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Marek Olšák
aafc95ceb6 radeonsi: add support for Renoir
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-14 17:31:04 -04:00
Eric Engestrom
a3d6024199 meson: add nir tests to the compiler/nir test suite
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 22:17:06 +01:00
Eric Engestrom
d0916edfcb EGL: sync headers with Khronos
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-08-14 21:48:23 +01:00
Christian Gmeiner
2c4fe6af78 relnotes: Add new ext on etnaviv for 19.2.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 21:47:35 +02:00
Christian Gmeiner
17200bb67a etnaviv: fix weird indentation
Fixes: 797a2e4fd0 ("etnaviv: update logic to determine uniform limits")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 21:29:48 +02:00
Ian Romanick
0e6581b87d nir/algebraic: Reassociate shift-by-constant of shift-by-constant
v2: After some review discussion with Alyssa, the replacements now
correctly account for cases where (b+c) >= bitsize.

v3: Use a temporary to simplify the Python code quite a bit.  Suggested
by Jason.

Haswell and all Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16251155 -> 16249576 (<.01%)
instructions in affected programs: 232627 -> 231048 (-0.68%)
helped: 547
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3
helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
95% mean confidence interval for instructions value: -3.12 -2.65
95% mean confidence interval for instructions %-change: -1.20% -1.06%
Instructions are helped.

total cycles in shared programs: 365924392 -> 365372103 (-0.15%)
cycles in affected programs: 59207053 -> 58654764 (-0.93%)
helped: 497
HURT: 34
helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16
helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82%
HURT stats (abs)   min: 2 max: 424 x̄: 101.03 x̃: 63
HURT stats (rel)   min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06%
95% mean confidence interval for cycles value: -1426.41 -653.77
95% mean confidence interval for cycles %-change: -1.66% -1.15%
Cycles are helped.

total spills in shared programs: 8870 -> 8871 (0.01%)
spills in affected programs: 104 -> 105 (0.96%)
helped: 0
HURT: 1

Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956236 -> 11955635 (<.01%)
instructions in affected programs: 94110 -> 93509 (-0.64%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4
helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76%
95% mean confidence interval for instructions value: -6.62 -4.72
95% mean confidence interval for instructions %-change: -2.27% -1.64%
Instructions are helped.

total cycles in shared programs: 179296340 -> 178788044 (-0.28%)
cycles in affected programs: 51009603 -> 50501307 (-1.00%)
helped: 82
HURT: 7
helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16
helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11%
HURT stats (abs)   min: 2 max: 8 x̄: 3.14 x̃: 2
HURT stats (rel)   min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10%
95% mean confidence interval for cycles value: -7649.38 -3773.00
95% mean confidence interval for cycles %-change: -2.71% -1.99%
Cycles are helped.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v2]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:37 -07:00
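
A minimal Python sketch of the shift-by-constant reassociation described in the commit above, including the (b+c) >= bitsize case from the v2 note. It is an illustration under stated assumptions, not the actual nir_opt_algebraic patterns: values are modeled as 32-bit, shift counts are assumed to lie in [0, 32), and the helper names (ishl_twice, ushr_twice, ishr_twice) are hypothetical.

```python
BITS = 32                      # assumed bit size for this sketch
MASK = (1 << BITS) - 1


def ishl(a, n):
    """Left shift of a BITS-wide value; n is assumed to be in [0, BITS)."""
    assert 0 <= n < BITS
    return (a << n) & MASK


def ushr(a, n):
    """Logical (unsigned) right shift; n is assumed to be in [0, BITS)."""
    assert 0 <= n < BITS
    return (a & MASK) >> n


def to_signed(a):
    """Reinterpret a BITS-wide unsigned value as two's complement."""
    a &= MASK
    return a - (1 << BITS) if a & (1 << (BITS - 1)) else a


def ishr(a, n):
    """Arithmetic (signed) right shift; n is assumed to be in [0, BITS)."""
    assert 0 <= n < BITS
    return (to_signed(a) >> n) & MASK


# Hypothetical single-shift replacements for two chained constant shifts.
def ishl_twice(a, b, c):
    # Once b + c reaches the bit size, every bit has been shifted out.
    return 0 if b + c >= BITS else ishl(a, b + c)


def ushr_twice(a, b, c):
    return 0 if b + c >= BITS else ushr(a, b + c)


def ishr_twice(a, b, c):
    # An arithmetic shift saturates: only copies of the sign bit remain.
    return ishr(a, min(b + c, BITS - 1))
```

The point of the special case is that simply emitting a single shift by b + c would be wrong whenever the sum reaches the bit size, so the combined form has to produce zero (or, for the arithmetic shift, the sign fill) instead.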
Ian Romanick
73aaeac0a3 nir/algebraic: Reassociate add-and-shift to be shift-and-add
A common thing in many shaders:

    uniform vs { vec4 bones[...]; };

    ...

    x = some_calculation(bones[i + 0]);
    y = some_calculation(bones[i + 1]);
    z = some_calculation(bones[i + 2]);

This turns into stuff like

    vec1 32 ssa_12 = iadd ssa_11, ssa_0
    vec1 32 ssa_13 = ishl ssa_12, ssa_3
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_15 = iadd ssa_11, ssa_1
    vec1 32 ssa_16 = ishl ssa_15, ssa_3
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_18 = iadd ssa_11, ssa_2
    vec1 32 ssa_19 = ishl ssa_18, ssa_3
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

By reassociating the shift and the add, we can reduce this to

    vec1 32 ssa_12 = ishl ssa_11, ssa_3
    vec1 32 ssa_13 = iadd ssa_12, ssa_0
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_16 = iadd ssa_12, ssa_1
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_19 = iadd ssa_12, ssa_2
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

v2: Add some commentary from Rhys Perry's nearly identical patch.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16277758 -> 16250704 (-0.17%)
instructions in affected programs: 1440284 -> 1413230 (-1.88%)
helped: 4920
HURT: 6
helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4
helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79%
HURT stats (abs)   min: 1 max: 12 x̄: 4.50 x̃: 3
HURT stats (rel)   min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55%
95% mean confidence interval for instructions value: -5.67 -5.31
95% mean confidence interval for instructions %-change: -2.26% -2.16%
Instructions are helped.

total cycles in shared programs: 367118526 -> 365895358 (-0.33%)
cycles in affected programs: 93504145 -> 92280977 (-1.31%)
helped: 2754
HURT: 1269
helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16
helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12%
HURT stats (abs)   min: 1 max: 1500 x̄: 35.85 x̃: 9
HURT stats (rel)   min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75%
95% mean confidence interval for cycles value: -387.31 -220.78
95% mean confidence interval for cycles %-change: -2.11% -1.68%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:32 -07:00
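
The identity behind the add-and-shift reassociation in the commit above can be checked with a short Python sketch. This is an illustration only, assuming 32-bit wrapping arithmetic and constant addends; the function names are hypothetical and are not taken from the patch.

```python
BITS = 32                      # assumed bit size for this sketch
MASK = (1 << BITS) - 1


def iadd(a, b):
    """Wrapping BITS-wide integer add."""
    return (a + b) & MASK


def ishl(a, n):
    """Left shift of a BITS-wide value; n is assumed to be in [0, BITS)."""
    assert 0 <= n < BITS
    return (a << n) & MASK


def offsets_add_then_shift(i, consts, shift):
    """Original form: one add and one shift per access, (i + c) << shift."""
    return [ishl(iadd(i, c), shift) for c in consts]


def offsets_shift_then_add(i, consts, shift):
    """Reassociated form: shift the index once, then add shifted constants.

    Each c << shift involves only constants, so a compiler can fold it,
    leaving one shared shift plus one add per access.
    """
    base = ishl(i, shift)
    return [iadd(base, ishl(c, shift)) for c in consts]


# Both forms yield the same offsets for i = 5, shift = 4:
# offsets_add_then_shift(5, [0, 1, 2], 4) -> [80, 96, 112]
```

Because the arithmetic wraps modulo 2^32 in both forms, the two results agree even when the add or the shift overflows.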
Andrii Simiklit
ff2225cf88 nir/find_array_copies: Reject copies with mismatched lengths
copy_deref for wildcard dereferences requires both arrays to have the
same length; otherwise it leads to a crash in optimizations like
'nir_opt_copy_prop_vars', because these optimizations expect
'copy_deref' only for arrays with the same length.

v2: check was moved to 'try_match_deref' to fix aoa cases
                 (Jason Ekstrand <jason@jlekstrand.net>)
v3: - fixed comment
    - merged the condition with the other one
                 (Jason Ekstrand <jason@jlekstrand.net>)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-08-14 18:11:31 +00:00
Alyssa Rosenzweig
c4a4f3db5a pan/midgard: Prefix blobber-db output for grepping
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig
5f0f9e1333 pan/midgard: Implement blobber-db
We wire through some shader-db-style stats on the current shader in the
disassembler so we can get a quick estimate of shader complexity from a
trace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Rob Clark <robdclark@chromium.org>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig
863bdd1f8d pan/midgard: Break, not return, in disassembler
We'll want to dump some stats after the shader, and I refuse to use one
teensy little goto.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Ian Romanick
f2965fde9b nir/range-analysis: Fail gracefully on non-SSA sources
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 09:02:38 -07:00
Christian Gmeiner
1290cc3e27 etnaviv: split destroy_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
f90b23b8c4 etnaviv: split link_shader
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
0765a1dd0e etnaviv: split dump_shader
Also this adds the missing impl for etna_dump_shader_nir(..).

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00
Christian Gmeiner
a36d04daa1 etnaviv: mv etnaviv_compiler.c etnaviv_compiler_tgsi.c
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-08-14 15:10:07 +02:00