39979 commits

Author SHA1 Message Date
Connor Abbott
8c7ad22adb lima/gpir: Fix fake dep handling for schedule_first nodes
The whole point of schedule_first nodes is that they need to be
scheduled as soon as possible, so if a schedule_first node is the
successor in a fake dependency that prevents it from being scheduled
after its parent, that can cause problems. We need to add these fake
dependencies to the parent as well, and we need to guarantee that the
pre-RA scheduler puts schedule_first nodes right before their parents in
order to prevent this from adding cycles to the dependency graph.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:42:00 +07:00
Connor Abbott
2955875381 lima/gpir: Fix schedule_first insertion logic
The idea was to make sure schedule_first nodes were always first in the
ready list. I made sure they were inserted first, but not that other
nodes wouldn't later be scheduled ahead of them. Fixes
spec@glsl-1.10@execution@built-in-functions@vs-exp-float and probably
others.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:41:35 +07:00
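
A minimal sketch of the invariant described in the commit above: schedule_first nodes stay at the head of the ready list even when ordinary nodes are inserted later. The `struct node` type and both helpers are hypothetical and much simpler than the real gpir scheduler, which also orders nodes by other criteria.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not the real gpir_node. */
struct node {
    int id;
    bool schedule_first;
    struct node *next;
};

/* Insert n into a singly linked ready list so that every schedule_first
 * node stays ahead of every ordinary node, whatever the insertion order. */
static void ready_list_insert(struct node **head, struct node *n)
{
    struct node **link = head;

    assert(head != NULL && n != NULL);

    /* Ordinary nodes skip past the leading run of schedule_first nodes;
     * schedule_first nodes go straight to the front. */
    if (!n->schedule_first) {
        while (*link && (*link)->schedule_first)
            link = &(*link)->next;
    }

    n->next = *link;
    *link = n;
}

/* Remove and return the first node of the ready list, or NULL if empty. */
static struct node *ready_list_pop(struct node **head)
{
    struct node *n = *head;

    if (n) {
        *head = n->next;
        n->next = NULL;
    }
    return n;
}
```

Enforcing the ordering at insertion time means the pop side never has to search for a schedule_first node.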
Connor Abbott
63acdb5ce6 lima/gpir: Ignore unscheduled successors in can_use_complex()
The point of the function is to avoid creating a complex move which is
used by certain slots in the next instruction, but unscheduled
successors will never be in the next instruction. Found while debugging
a crash that the previous commit fixed.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:40:58 +07:00
Connor Abbott
ee8cc90e55 lima/gpir: Do all lowerings before rsched
The scheduler assumes that load nodes are always duplicated so that they
can always be scheduled eventually and therefore they never need to be
spilled. But some lowerings were running after the pre-RA scheduler,
whereas duplication has to happen before then since it's needed for the
scheduler to do a better job reducing register pressure. This meant
that lowerings were introducing multiple uses of a load instruction,
which broke the scheduler's expectation and resulted in infinite loops
in situations where the only nodes available to spill were load nodes.
Spilling load nodes would be silly, so we want to fix the lowerings
rather than the scheduler. Just do all lowerings before the pre-RA
scheduler, which also helps with reducing pressure since the scheduler
can more accurately compute the pressure.

Fixes lima/mesa#104.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-09 17:39:20 +07:00
Boris Brezillon
3ce03374b3 panfrost: Rename pan_bo_cache.c into pan_bo.c
So we can move all the BO logic into this file instead of having it
spread over pan_resource.c, pan_drm.c and pan_bo_cache.c.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:24:54 +02:00
Boris Brezillon
14bfb0cb67 panfrost: Get rid of the now unused SLAB allocator
The last users have been converted to use plain BOs. Let's get rid of
this abstraction. We can always consider adding it back if we need it
at some point.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:24:19 +02:00
Boris Brezillon
2c90045cf2 panfrost: Get rid of unused panfrost_context fields
Some fields in panfrost_context are unused (probably leftovers from
previous refactor). Let's get rid of them.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:23:34 +02:00
Boris Brezillon
76274bcb5e panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOs
ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using
panfrost_drm_allocate_slab() but they never use any of the SLAB-based
allocation logic. Let's convert those fields to plain BOs.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:22:59 +02:00
Boris Brezillon
a2bba567ae panfrost: Make transient allocation rely on the BO cache
Right now, the transient memory allocator implements its own BO caching
mechanism, which is not really needed since we already have a generic
BO cache. Let's simplify things a bit.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:22:26 +02:00
Boris Brezillon
12d8a17957 panfrost: Stop passing a ctx to functions being passed a batch
The context can be retrieved from batch->ctx.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:21:44 +02:00
Boris Brezillon
beb18c6172 panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch()
Given the function name it makes more sense to pass it a job batch
directly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:20:59 +02:00
Boris Brezillon
2c526993bc panfrost: s/job/batch/
What we currently call a job is actually a batch containing several jobs
all attached to a rendering operation targeting a specific FBO.

Let's rename structs, functions, variables and fields to reflect this
fact.

Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-08 16:19:56 +02:00
Tapani Pälli
f83f9d7daa android: fix linking issues with liblog
Fixes Android build errors observed in Intel CI.

Fixes: f9f7cbc1aa "util: android logging support"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-07 13:16:29 +03:00
Kenneth Graunke
dfb86405cf iris: Support the disable_throttling=true driconf option.
2019-09-06 18:35:24 -07:00
Eric Engestrom
4ad99ee961 amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
ba73564b52 gallivm: drop LLVM<3.3 code paths as no build system allows that
Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
2406b35151 llvmpipe: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
ba1e085587 clover: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
1c1c477470 gallivm: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
7527144383 clover: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
08890068c5 gallivm: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
6120c442ee swr: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
19d9e57f2c amd: replace major llvm version checks with LLVM_VERSION_MAJOR
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:26:29 +01:00
Eric Engestrom
bce9c05ca8 svga: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
cf7d186be6 r600: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
28cb16b6f8 aux/draw: replace binary HAVE_LLVM checks with LLVM_AVAILABLE
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
Eric Engestrom
5aebe37b53 gallivm: replace 0x version print with actual version string
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-09-06 22:19:01 +01:00
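
A rough sketch of the kind of check the LLVM version series above moves to. `LLVM_VERSION_MAJOR` and `LLVM_VERSION_MINOR` normally come from LLVM's `llvm/Config/llvm-config.h`; the fallback values here exist only so the snippet compiles on its own. The packed `0xMMmm` encoding, `pack_version()`, `version_at_least()` and `USES_NEW_API` are assumptions for illustration, not code from these commits.

```c
#include <assert.h>

/* Fallback values so the sketch builds without LLVM headers; a real
 * build gets these from llvm/Config/llvm-config.h. */
#ifndef LLVM_VERSION_MAJOR
#define LLVM_VERSION_MAJOR 8
#define LLVM_VERSION_MINOR 0
#endif

/* Assumed old-style packed encoding: major in the high byte, minor in
 * the low byte, e.g. 3.9 -> 0x0309. */
static int pack_version(int major, int minor)
{
    assert(major >= 0 && minor >= 0 && minor < 256);
    return (major << 8) | minor;
}

/* Explicit major/minor comparison: is cur_major.cur_minor at least
 * major.minor? */
static int version_at_least(int cur_major, int cur_minor,
                            int major, int minor)
{
    return cur_major > major ||
           (cur_major == major && cur_minor >= minor);
}

/* Compile-time form: a plain major-version test needs no packed constant. */
#if LLVM_VERSION_MAJOR >= 7
#define USES_NEW_API 1
#else
#define USES_NEW_API 0
#endif
```

Comparing the major and minor components separately keeps each check readable without decoding a hex constant.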
Jordan Justen
9790cfcefa anv,iris: L3ALLOC register replaces L3CNTLREG for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 13:11:25 -07:00
Kenneth Graunke
0d0ae16e8f intel: Stop redirecting state cache to command streamer cache section
This bit redirects the state cache from the unified/RO sections of the
L3 cache to the "CS command buffer" section of the cache, which would
be set up via TCCNTLREG.  The documentation says:

   "Additionaly, this redirection should be enabled only if there is a
    non-zero allocation for the CS command buffer section."

We don't allocate any cache to the CS command buffer section, so
enabling this redirection effectively disabled the state cache.
The Windows driver only sets up that section when using POSH, which
we do not currently use.  So, leave it unallocated and disable the
redirection to get a functional state cache again.

Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%,
and Car Chase by 2%.
2019-09-06 10:57:55 -07:00
Kenneth Graunke
68be5ff8d0 iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESS
Jason pointed out that the caches likely refer to offsets from dynamic
and surface state base addresses, so when we change those, we need to
invalidate the caches.

Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.
2019-09-06 10:57:55 -07:00
Kristian H. Kristensen
30ab3e39fd freedreno/a6xx: Implement primitive count queries on GPU
The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or
PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or
tessellation, since these stages add primitives at runtime.  Use the
WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and
implement a hw query for this.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-06 09:53:28 -07:00
Kristian H. Kristensen
1acf8d2354 freedreno/a6xx: Let the GPU track streamout offsets
The GPU writes out streamout offsets as it goes to the FLUSH_BASE
pointer.  We use that value with CP_MEM_TO_REG when appending to the
stream so that we don't have to track the offsets with the CPU in the
driver.  This ensures that streamout continues to work once we enable
geometry and tessellation shader stages that add geometry.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-09-06 09:53:28 -07:00
Roland Scheidegger
de1c89fd93 llvmpipe: fix CALLOC vs. free mismatches
Should fix some issues we're seeing. Also use REALLOC instead of realloc.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2019-09-06 18:31:34 +02:00
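
A sketch of why allocation and release calls have to come from the same family. It assumes a wrapper allocator that keeps a small header in front of each payload, as debug or tracking allocators often do. The `tracked_*` functions are hypothetical stand-ins, not Mesa's CALLOC/REALLOC/FREE macros, and their signatures differ from Mesa's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every payload. The union keeps the payload
 * suitably aligned for any object type. */
union alloc_header {
    size_t size;
    max_align_t align;
};

static size_t live_bytes; /* payload bytes currently allocated */

static void *tracked_calloc(size_t count, size_t size)
{
    union alloc_header *h;

    /* Reject requests whose total size would overflow size_t. */
    if (size && count > (SIZE_MAX - sizeof(*h)) / size)
        return NULL;

    h = calloc(1, sizeof(*h) + count * size);
    if (!h)
        return NULL;
    h->size = count * size;
    live_bytes += h->size;
    return h + 1; /* the caller sees only the payload */
}

static void *tracked_realloc(void *ptr, size_t new_size)
{
    union alloc_header *h;
    size_t old_size;

    if (!ptr)
        return tracked_calloc(1, new_size);
    if (new_size > SIZE_MAX - sizeof(*h))
        return NULL;

    h = (union alloc_header *)ptr - 1;
    old_size = h->size;
    h = realloc(h, sizeof(*h) + new_size); /* new bytes are not zeroed */
    if (!h)
        return NULL;
    h->size = new_size;
    live_bytes = live_bytes - old_size + new_size;
    return h + 1;
}

static void tracked_free(void *ptr)
{
    union alloc_header *h;

    if (!ptr)
        return;
    /* Step back to the header: this is the pointer calloc returned.
     * Passing ptr itself to plain free() would be undefined behaviour. */
    h = (union alloc_header *)ptr - 1;
    live_bytes -= h->size;
    free(h);
}
```

With wrappers like these, passing a payload pointer to plain `free()` or `realloc()` hands libc an address it never returned, which is why the pairing has to be consistent.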
Tomeu Vizoso
0efc0f8edc panfrost/ci: Increase timeouts
Sometimes LAVA jobs will timeout due to transient issues, and the Gitlab
job will fail in that case. Increase the timeouts to reduce the
likelihood of that happening and reduce false positives.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Tomeu Vizoso
8a5dd61828 panfrost/ci: Use special runner for LAVA jobs
So repositories don't need to be specially configured with a token to
access LAVA, store this token in a bind volume for a special runner.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Tomeu Vizoso
10b60dbd2c panfrost/ci: Re-add support for armhf
Now that Volt supports armhf, build images again and submit them to LAVA
for RK3288.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-06 16:35:16 +02:00
Zhu, James
878439bba3 radeon: Fix mjpeg issue for ARCTURUS
ARCTURUS mjpeg is using direct register access.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-09-06 08:53:52 -04:00
Leo Liu
a3074370d9 radeon/vcn: add RENOIR VCN decode support
It has the same VCN2.x block as Navi1x.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
2019-09-06 08:53:52 -04:00
Timur Kristóf
3debd0ef15 tgsi_to_nir: Remove dependency on libglsl.
This commit removes the GLSL dependency in TTN by manually recording
the textures used and calling nir_lower_samplers
instead of its GL counterpart.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-06 12:20:53 +03:00
Gert Wollny
9b9e1de90e radeonsi: Release storage for sdma_uploads when the context is destroyed
This fixes a memory leak in the flush code:

Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 in __interceptor_realloc .../gcc-8.3.0/libsanitizer/asan/asan_malloc_linux.cc:105
    #1 in si_buffer_do_flush_region src/gallium/drivers/radeonsi/si_buffer.c:573
    #2 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:608
    #3 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:597

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-09-06 09:44:24 +02:00
Vasily Khoruzhick
aa77fc309a lima/ppir: don't lower phis to scalar
Utgard PP is a vec4 architecture, so lowering phis to scalars
increases instruction count and potentially interferes with
spilling.

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-05 19:29:16 -07:00
Jonathan Marek
feea5986a9 freedreno/a2xx: formats update
For render formats, update fd2_pipe2color to only work with HW supported
render formats, and remove the format whitelist in is_format_supported.
This patch enables float render formats (which work).

For vertex/texture formats, use a generic function which translates using
the bitsize of the channels. Since we fake support for some vertex formats,
check for these in is_format_supported to avoid enabling them as sampler
formats.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
21dfa8e486 freedreno/a2xx: fix depth gmem restore
Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16
render formats for gmem restore.

Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color
working with depth formats.

gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src
formats are the same.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
88ca73bcd0 freedreno/a2xx: implement polygon offset
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.polygon_offset.*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
ac4ca24c32 freedreno/a2xx: fix SRC_ALPHA_SATURATE for alpha blend function
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.fragment_ops.*src_alpha_saturate*

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
80906a12d9 freedreno/a2xx: ir2: update register state in scalar insert
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
588cfe4a2b freedreno/a2xx: ir2: fix incorrect instruction reordering
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
2019-09-06 02:24:29 +00:00
Jonathan Marek
a6ebd4ab08 freedreno/a2xx: ir2: check opcode on the right instruction in export cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
19e62fec60 freedreno/a2xx: ir2: fix saturate in cp
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00
Jonathan Marek
c5e6961a58 freedreno/a2xx: ir2: set lower_fdph
The fdph opcode is not supported.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-06 02:24:29 +00:00