Commit graph

1118 commits

Author SHA1 Message Date
Rob Clark
a6184eae31 freedreno/drm: handle ancient kernels
Older kernels did not support `MSM_INFO_GET_IOVA`.  But this is only
required for (a) clover (ie. `fd_set_global_binding()`) and drm paths
that are limited to newer kernels.  So move the location of the assert
to fix new userspace on old kernels.

Fixes: c9e8df61dc ("freedreno: Initialize the bo's iova at creation time.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
2020-05-18 19:00:47 +00:00
Rob Clark
106c2a65db freedreno/drm: don't pass thru 'DUMP' flag on older kernels
"softpin" mode was introduced in the same kernel as the 'DUMP' flag.  So
if we are using the legacy non-softpin path, clear the dump flag.  OTOH
the 'DUMP' flag isn't quite so needed on older kernels, since we would
get all cmdstream, even SDS stateobjs, dumped regardless, as they would
have cmd table entries.

Fixes: b2c23b1e48 ("freedreno: Mark all ringbuffer BOs as to be dumped on crash.")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
2020-05-18 19:00:47 +00:00
Rob Clark
5e10506834 freedreno/fdperf: add dependency on generated headers
To fix an issue reported here:
https://bugs.chromium.org/p/chromium/issues/detail?id=1083815

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5088>
2020-05-18 17:53:45 +00:00
Ilia Mirkin
b5accb3ff9 freedreno/a3xx: parameterize ubo optimization
A3xx apparently has higher alignment requirements than later gens for
indirect const uploads. It also has fewer of them. Add compiler
parameters for both settings, and set accordingly for a3xx and a4xx+.
This fixes all the ubo test failures caused by this optimization.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>
2020-05-17 19:51:40 -04:00
Ilia Mirkin
9048adbd24 freedreno/ir3: avoid applying (sat) on bary.f
This causes failures on a3xx resulting in the non-sensical dEQP failures
on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow
(sat) on bary.f on all generations.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>
2020-05-17 21:17:57 +00:00
Ilia Mirkin
ff4df32fae freedreno/a3xx: there's no r8i/ui rb format, only rg8i/rg8ui
This fixes a number of dEQP tests:

  dEQP-GLES3.functional.fbo.blit.conversion.r8*
  dEQP-GLES3.texture.specification.basic_teximage2d.r8*

and others. The reason why this enum showed up in traces for R8 is that
it was an "upgraded" texture to R8G8.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5073>
2020-05-17 14:39:42 -04:00
Eduardo Lima Mitev
e7458f19e1 freedreno/uuid: Generate meaningful device and driver UUID
Device UUID becomes SHA1('freedreno' + gpu_id).
Driver UUID becomes SHA1(mesa-version + git-head-sha1).

v2: Don't use build_id for driver UUID since it generates different
    values for vulkan and gl shared objects. (Kristian)

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
2020-05-14 19:05:02 +00:00
Eduardo Lima Mitev
9623debf48 freedreno: Centralize UUID generation into new files freedreno_uuid.c/h
The new files are created under a 'common' folder under 'src/freedreno',
where shared functionality between GL and Vulkan drivers (that is not
registers, layout or compiler) will be placed.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
2020-05-14 19:05:02 +00:00
Connor Abbott
f293d02dc4 tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats
Whoops. After fixing dual-source blending, dEQP-VK.pipeline.blend.* all
go from skipped to pass, and fixes a bunch of
dEQP-VK.api.info.format_properties.* tests where blending is required.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Connor Abbott
adbdab3ee8 tu: Implement dual-src blending
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Connor Abbott
078aa9df8d tu: Move RENDER_COMPONENTS setting to pipeline state
This needs to be pipeline state because it can change when dual-source
blending is active.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Connor Abbott
2a9d12d513 ir3: Fixup dual-source blending slot
The hardware expects that where MRT0 and MRT1 would normally go are the
dual sources for MRT0, whereas GLSL has an extra "index" parameter that
indicates which source it is. Remap it when handling FS outputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Connor Abbott
0e0580550e freedreno/a6xx: Document dual-src blending enable bits
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
2020-05-14 18:15:31 +00:00
Eric Anholt
b1151cd2ff freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs.
For the piglit drawoverhead case, 5/18 of the objects' relocs were
duplicated.  We can dedupe them at object create time (since objects are
long-lived) and avoid repeated relocation work at emit time.

nohw drawoverhead program statechange throughput 2.34082% +/- 0.645832%
(n=10).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
2020-05-14 14:12:15 +00:00
Eric Anholt
a6fe0799fa freedreno: Fix resource layout dump loop.
Apparently I've never dumped a fully populated slices array, so the 0-init
always terminated the loop.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
2020-05-14 14:12:15 +00:00
Rob Clark
cf21b76383 freedreno/ir3: use lower_wrmasks pass
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:53 -07:00
Rob Clark
a506d49fae nir: add helper to copy const_index[]
It seems less brittle to not assume they are in the same order for src
and dst instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:45 -07:00
Rob Clark
ea6b404294 freedreno/ir3: use const_index accessors
Cleans up a couple spots that were still open-coding this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:38 -07:00
Kristian H. Kristensen
14969aab11 freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics
These intrinsics are supposed to map to the underlying hardware
instructions, which don't have wrmask. We use them when we lower
store_output in the geometry pipeline and since store_output gets
lowered to temps, we always see full wrmasks there.
2020-05-13 20:24:33 -07:00
Eric Anholt
112c65825f freedreno/a6xx: Use LDC for UBO loads.
It saves addressing math, but may cause multiple loads to be done and
bcseled due to NIR not giving us good address alignment information
currently.  I don't have any workloads I know of using non-const-uploaded
UBOs, so I don't have perf numbers for it

This makes us match the GLES blob's behavior, and turnip (other than being
bindful).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
ab93a631b4 freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf.
With the upcoming LDC usage in the GL driver, we don't want to be
uploading descriptors for every UBO when they aren't actually in use.
Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling
elsewhere right now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
d5176c453e freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges.
I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges
would move things back in a broken way.  We can just run this pass later
and drop the _ir3 path.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
5387c27140 freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift.
Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3()
after the non-offset src's definition, leading to nir_validate() failures.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
e0a4d1c4e5 freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa).
Just copy the src through.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Eric Anholt
6670475a44 freedreno/a6xx: Fix UBWC mipmapping height alignment.
After fixing the power of two sizing, pitches worked, but 1-pixel high and
unaligned height miplevels were off.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
81f21ff4ef freedreno/a6xx: Fix UBWC mipmap sizing.
The HW requires a log2 width/height of the level 0 meta_* size in the
descriptors, making it pretty clear that UBWC mipmapping is all
power-of-two sized.  Fixes a bunch of failures in the upcoming unit UBWC
layout unit tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
b5db2a2574 freedreno/a6xx: Fix UBWC blockheight for RG8.
Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of
R16I/UI's 6x8.  The other blockw/h I verified other than cpp=1
(R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
9da4ce9953 freedreno: Pull the tile_alignment lookup for a layout to a helper.
The r8g8 case UBWC alignment will be changing in the next commit, so
fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
dc7ccdb3f5 freedreno/a6xx: Add a testcase for UBWC buffer sharing.
These offsets are hand-computed referencing msm_media_info.h, and match
our driver's current behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
e32783c644 freedreno/a6xx: Improve layout testcase logging for UBWC fails.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Eric Anholt
2e4ddb6353 freedreno/a4xx+: Increase max texture size to 16384.
Noticed when poking around with texture layouts and found that my big
texture layout from the blob buffer overflowed.  Values come from
http://vulkan.gpuinfo.org for Adreno 418, 512, 630.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
2020-05-13 19:18:16 +00:00
Connor Abbott
b408734e5e tu: Implement fallback linear staging blit for CopyImage
Also, rewrite the format decision code so that we correctly decide when
the linear fallback is needed, even if UBWC is disabled. As part of
that, I also moved around some of the code to handle compressed formats
to make sure that copying compressed formats with a linear staging blit
works (this is now possible since we started allowing tiled compressed
textures).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Connor Abbott
40e842c009 tu: Add noubwc debug flag to disable UBWC
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Connor Abbott
ed79f805fa tu: Add a "scratch bo" allocation mechanism
This is simpler than a full-blown memory reuse mechanism, but is good
enough to make sure that repeatedly doing a copy that requires the
linear staging buffer workaround won't use excessive memory or be slowed
down due to repeated allocations.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
2020-05-13 13:39:04 +00:00
Samuel Pitoiset
91c757b796 turnip: use the common code for generating extensions and dispatch tables
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
2020-05-13 08:45:29 +02:00
Rob Clark
d6706fdc46 freedreno/ir3/sched: try to avoid syncs
Similar to what we do in postsched.  It is useful for pre-RA sched to be
a bit aware of things that would cause syncs.  In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
d95a6e3a0c freedreno/ir3/sched: avoid scheduling outputs
If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.

A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these
immed loads to the end of the shader.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
488cf208d5 freedreno/ir3/postsched: try to avoid (sy) syncs
Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well.  This helps us turn an tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
25f4fb346e freedreno/ir3/postsched: reset sfu_delay on sync
Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
f351e1d137 freedreno/ir3: limit # of tex prefetch by shader size
It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works.  For now
this is a super crude heuristic to attempt to approximate a good
solution.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
2020-05-13 03:28:40 +00:00
Rob Clark
d69f6fd852 freedreno/ir3: fix indirect cb0 load_ubo lowering
We can no longer assume that `state->ranges[0]` is block 0.  It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, it could end up another entry in the table.
Resulting in the second pass after gathering ubo ranges, not finding a
valid range.  Which results in a `load_ubo` for a thing that is not
actually a ubo making it's way into ir3 frontend.  Resulting in grabbing
what we think is a ubo address out of some unrelated const register, and
trying to dereference that.  Which as you can imagine, fails in amusing
ways.

Fixes: fc850080ee ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Rob Clark
c4dc877cb5 freedreno/ir3: don't allow negative const_offset
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
2020-05-12 23:51:46 +00:00
Brian Ho
a43e974064 turnip: Execute ir3_nir_lower_gs pass again
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.

As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
2020-05-12 13:42:55 -07:00
Jonathan Marek
d76e722ed6 turnip: enable tiling for compressed formats
Now that layout code supports this, we can enable it.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
2020-05-12 17:25:38 +00:00
Jonathan Marek
f543d87f23 turnip: update "fetchsize" value to match fdl6_layout changes
It seems this is actually a "minimum pitch" value. For example
TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.

This fixes breakage with compressed formats. For example this test:

dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3

Fixes: a34b3fa198 ("freedreno/fdl: Align after dividing by block size")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
2020-05-12 17:25:38 +00:00
Eric Anholt
f789c5975c freedreno: Fix non-constbuf-upload UBO block indices and count.
The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse
what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers
to the HW for UBO block 1-N, so let's just fix up the shader state.

Fixes an off by one in const state layout setup, and some really dodgy
register addressing trying to deal with dynamic UBO indices when the UBO
pointers happen to be at the start of the constbuf.

There's no fixes tag, though this fixes a bug from September, because it
would require the num_ubos fix in nir_lower_uniforms_to_ubo.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
2020-05-12 17:01:55 +00:00
Eric Anholt
51d7a71bd4 freedreno: Replace OUT_RELOCW with OUT_RELOC.
Final cleanup commit now that they're the same.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
064f395a89 freedreno: Tell the kernel that all BOs are for writing.
Using non-write flags is pretty dubious -- it means the kernel tracking an
array of read-only consumers of the BO and having exclusive consumers wait
on each reader's fence.  It allows multiple readers through dma-bufs to do
work in parallel, but at the cost of kernel CPU time and memory management
of the shared array.  Other drivers have dropped this distinction since
dma-buf sharing is usually producer-consumer, not producer-two-consumers,
and the userspace and kernel space tracking is expensive.

For us, this lets us drop the flags passed in for relocs and tracked in
the ringbuffer reloc lists.  The end result of the flags reduction work is
drawoverhead uniforms test throughput 2.37195% +/- 0.365579% (n=15)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
b2c23b1e48 freedreno: Mark all ringbuffer BOs as to be dumped on crash.
We can avoid passing these flags around in the DRM backends by just
marking ring BOs up front.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00
Eric Anholt
554b959df0 freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>
2020-05-12 16:30:57 +00:00