Commit graph

1099 commits

Author SHA1 Message Date
Eric Anholt
112c65825f freedreno/a6xx: Use LDC for UBO loads.
It saves addressing math, but may cause multiple loads to be done and
bcseled due to NIR not giving us good address alignment information
currently.  I don't have any workloads I know of using non-const-uploaded
UBOs, so I don't have perf numbers for it

This makes us match the GLES blob's behavior, and turnip (other than being
bindful).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
ab93a631b4 freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf.
With the upcoming LDC usage in the GL driver, we don't want to be
uploading descriptors for every UBO when they aren't actually in use.
Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling
elsewhere right now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
d5176c453e freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges.
I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges
would move things back in a broken way.  We can just run this pass later
and drop the _ir3 path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
5387c27140 freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift.
Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3()
after the non-offset src's definition, leading to nir_validate() failures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
e0a4d1c4e5 freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa).
Just copy the src through.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
6670475a44 freedreno/a6xx: Fix UBWC mipmapping height alignment.
After fixing the power of two sizing, pitches worked, but 1-pixel high and
unaligned height miplevels were off.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
81f21ff4ef freedreno/a6xx: Fix UBWC mipmap sizing.
The HW requires a log2 width/height of the level 0 meta_* size in the
descriptors, making it pretty clear that UBWC mipmapping is all
power-of-two sized.  Fixes a bunch of failures in the upcoming unit UBWC
layout unit tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
b5db2a2574 freedreno/a6xx: Fix UBWC blockheight for RG8.
Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of
R16I/UI's 6x8.  The other blockw/h I verified other than cpp=1
(R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
9da4ce9953 freedreno: Pull the tile_alignment lookup for a layout to a helper.
The r8g8 case UBWC alignment will be changing in the next commit, so
fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
dc7ccdb3f5 freedreno/a6xx: Add a testcase for UBWC buffer sharing.
These offsets are hand-computed referencing msm_media_info.h, and match
our driver's current behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
e32783c644 freedreno/a6xx: Improve layout testcase logging for UBWC fails.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
2e4ddb6353 freedreno/a4xx+: Increase max texture size to 16384.
Noticed when poking around with texture layouts and found that my big
texture layout from the blob buffer overflowed.  Values come from
http://vulkan.gpuinfo.org for Adreno 418, 512, 630.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Connor Abbott
b408734e5e tu: Implement fallback linear staging blit for CopyImage
Also, rewrite the format decision code so that we correctly decide when
the linear fallback is needed, even if UBWC is disabled. As part of
that, I also moved around some of the code to handle compressed formats
to make sure that copying compressed formats with a linear staging blit
works (this is now possible since we started allowing tiled compressed
textures).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Connor Abbott
40e842c009 tu: Add noubwc debug flag to disable UBWC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Connor Abbott
ed79f805fa tu: Add a "scratch bo" allocation mechanism
This is simpler than a full-blown memory reuse mechanism, but is good
enough to make sure that repeatedly doing a copy that requires the
linear staging buffer workaround won't use excessive memory or be slowed
down due to repeated allocations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Samuel Pitoiset
91c757b796 turnip: use the common code for generating extensions and dispatch tables
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
2020-05-13 08:45:29 +02:00
Rob Clark
d6706fdc46 freedreno/ir3/sched: try to avoid syncs
Similar to what we do in postsched.  It is useful for pre-RA sched to be
a bit aware of things that would cause syncs.  In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
d95a6e3a0c freedreno/ir3/sched: avoid scheduling outputs
If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.

A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these
immed loads to the end of the shader.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
488cf208d5 freedreno/ir3/postsched: try to avoid (sy) syncs
Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well.  This helps us turn an tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
25f4fb346e freedreno/ir3/postsched: reset sfu_delay on sync
Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
f351e1d137 freedreno/ir3: limit # of tex prefetch by shader size
It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works.  For now
this is a super crude heuristic to attempt to approximate a good
solution.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
d69f6fd852 freedreno/ir3: fix indirect cb0 load_ubo lowering
We can no longer assume that `state->ranges[0]` is block 0.  It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, it could end up another entry in the table.
Resulting in the second pass after gathering ubo ranges, not finding a
valid range.  Which results in a `load_ubo` for a thing that is not
actually a ubo making it's way into ir3 frontend.  Resulting in grabbing
what we think is a ubo address out of some unrelated const register, and
trying to dereference that.  Which as you can imagine, fails in amusing
ways.

Fixes: fc850080ee ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Rob Clark
c4dc877cb5 freedreno/ir3: don't allow negative const_offset
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Brian Ho
a43e974064 turnip: Execute ir3_nir_lower_gs pass again
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.

As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
2020-05-12 13:42:55 -07:00
Jonathan Marek
d76e722ed6 turnip: enable tiling for compressed formats
Now that layout code supports this, we can enable it.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
2020-05-12 17:25:38 +00:00
Jonathan Marek
f543d87f23 turnip: update "fetchsize" value to match fdl6_layout changes
It seems this is actually a "minimum pitch" value. For example
TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.

This fixes breakage with compressed formats. For example this test:

dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3

Fixes: a34b3fa198 ("freedreno/fdl: Align after dividing by block size")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
2020-05-12 17:25:38 +00:00
Eric Anholt
f789c5975c freedreno: Fix non-constbuf-upload UBO block indices and count.
The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse
what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers
to the HW for UBO block 1-N, so let's just fix up the shader state.

Fixes an off by one in const state layout setup, and some really dodgy
register addressing trying to deal with dynamic UBO indices when the UBO
pointers happen to be at the start of the constbuf.

There's no fixes tag, though this fixes a bug from September, because it
would require the num_ubos fix in nir_lower_uniforms_to_ubo.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
2020-05-12 17:01:55 +00:00
Eric Anholt
51d7a71bd4 freedreno: Replace OUT_RELOCW with OUT_RELOC.
Final cleanup commit now that they're the same.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
064f395a89 freedreno: Tell the kernel that all BOs are for writing.
Using non-write flags is pretty dubious -- it means the kernel tracking an
array of read-only consumers of the BO and having exclusive consumers wait
on each reader's fence.  It allows multiple readers through dma-bufs to do
work in parallel, but at the cost of kernel CPU time and memory management
of the shared array.  Other drivers have dropped this distinction since
dma-buf sharing is usually producer-consumer, not producer-two-consumers,
and the userspace and kernel space tracking is expensive.

For us, this lets us drop the flags passed in for relocs and tracked in
the ringbuffer reloc lists.  The end result of the flags reduction work is
drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
b2c23b1e48 freedreno: Mark all ringbuffer BOs as to be dumped on crash.
We can avoid passing these flags around in the DRM backends by just
marking ring BOs up front.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
554b959df0 freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
9d8d936dfc freedreno: Start moving relocs flags into the BOs.
It's silly to have all the reloc emitters passing around FD_RELOC_READ
when you have to have it set on all relocs (that don't include WRITE,
which implies read) for the kernel to actually track the fences on the BO.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Mauro Rossi
a92a483ff7 freedreno: android: add adreno-pm4-pack.xml.h generation to android build
Fixes the following building errors:

In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blitter.c:40:
external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_blend.c:36:
external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_const.c:26:
external/mesa/src/gallium/drivers/freedreno/a6xx/fd6_pack.h:42:10: fatal error: 'adreno-pm4-pack.xml.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: ee293160 "freedreno/a6xx: add OUT_PKT()"
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>
2020-05-09 16:19:14 +00:00
Mauro Rossi
5dc3b22dd0 freedreno/drm: android: add libfreedreno_registers static dependency
The dependency is required to get the necessary generated headers

Fixes the following building error:

In file included from external/mesa/src/freedreno/drm/msm_bo.c:27:
In file included from external/mesa/src/freedreno/drm/msm_priv.h:30:
In file included from external/mesa/src/freedreno/drm/freedreno_priv.h:51:
external/mesa/src/freedreno/drm/freedreno_ringbuffer.h:35:10: fatal error: 'adreno_common.xml.h' file not found
#include "adreno_common.xml.h"
         ^~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 6c688ae8 ("freedreno: Deduplicate ringbuffer macros with computerator/fdperf")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4973>
2020-05-09 16:19:14 +00:00
Eric Anholt
c9e8df61dc freedreno: Initialize the bo's iova at creation time.
Avoids repeated conditionals at reloc time checking if we need to go ask
the kernel.

No statistically significant difference on the drawoverhead case I'm
looking at (n=300).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>
2020-05-08 12:35:39 -07:00
Eric Anholt
b3c4e6a597 freedreno: Rename append_bo() in case it doesn't get inlined.
In a debugoptimized build, it wasn't inlined and so I wasn't noticing
where a bunch of CPU usage was going in the DRM functions.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>
2020-05-08 12:35:39 -07:00
Eric Anholt
e1c74f3fac freedreno: Clean up tests around ORing in the reloc flags.
gcc was surprisingly not seeing through this to just do an AND and an OR.
Improves drawoverhead's few uniforms / 1 change throughput 1.64141% +/-
0.188152% (n=60).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>
2020-05-08 12:35:39 -07:00
Eric Anholt
6c688ae81f freedreno: Deduplicate ringbuffer macros with computerator/fdperf
They're sugar around freedreno_ringbuffer.h, so put them there and reuse them.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4957>
2020-05-08 12:35:38 -07:00
Hyunjun Ko
094c7646a3 freedreno,tu: Don't request fragcoord components not being read.
v1. Replace the existed bool type with new bitfield and edit register
files to take a mask instead of duplicating codes to do masking.

v2. Use fragcoord_compmask != 0 instead of fragcoord_compmask > 0 since
it represents a bitfield.

Tested with
  dEQP-VK.glsl.builtin_var.simple.fragcoord_xyz/w
  dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz/w

Closes: #2680

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4723>
2020-05-08 17:45:03 +00:00
Connor Abbott
6d513eb0db tu: Support pipelines without a fragment shader
Apparently this is allowed, and the CTS started doing this more often
recently which resulted in frequent hangs running the entire CTS. I
copied the code to create an empty FS from radv.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4928>
2020-05-07 16:05:53 +00:00
Kristian H. Kristensen
a34b3fa198 freedreno/fdl: Align after dividing by block size
For compressed formats, we need to align the number of blocks, not the
logical number of pixels in the texture.  Only compressed formats have
block width/height > 1, so we can just unconditionally multiply the
alignment by the block width/height.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4868>
2020-05-06 17:11:34 -07:00
Eric Anholt
9a6bbf4c80 freedreno/ir3: Disable sin/cos range reduction for mediump.
robclark noted that the blob wasn't doing range reduction in the mediump
case, and I confirmed it on
dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.mediump_float_fragment
vs
dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.highp_float_fragment.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4893>
2020-05-05 17:23:34 +00:00
Joshua Ashton
785803a2e5 turnip: Remove RANGE_SIZE usage
These were removed from the latest Vulkan headers
https://github.com/KhronosGroup/Vulkan-Docs/issues/1230

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4878>
2020-05-05 00:28:00 +00:00
Eric Anholt
5c81f51c3c freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx.
These come from the disasm tests, and fix our disasm of blob's
uniform/nonuniform cat6 operands.  We also now include human-readable names
for all the modes we know about (though bindless gets distinguished by its
.baseN, like Connor's original disasm).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:15:50 -07:00
Eric Anholt
97b21110b8 freedreno/ir3: Sync some new changes from envytools.
With this I also brought in a few new control flow instruction disasm
tests that I'd made back when I wrote the disasm test, but which were too
far from correct to include until now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:14:46 -07:00
Eric Anholt
1e5b0c92c5 freedreno/ir3: Add some more tests of cat6 disasm.
I put these together from traces I had while trying to do LDC for GL.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
2020-05-04 11:14:46 -07:00
Eric Anholt
29f58cfbd0 freedreno/ir3: Set up outputs for multi-slot varyings.
Necessary to avoid compiler assertion failures in:

dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt
88dcfaf0ee freedreno/ir3: Stop initializing regid of so->outputs during setup.
It's unused and overwritten by ir3_compile_shader_nir().

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt
8c1c218909 freedreno/ir3: Improve shader key normalization.
We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time.  This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).

It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00
Eric Anholt
6f1e3235f2 freedreno: Emit debug messages when doing draw-time recompiles of shaders.
Right now that's "always" unless you have shaderdb set.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00