Commit graph

39985 commits

Author SHA1 Message Date
Boris Brezillon
1e47c3ee7b panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()
has_draws can be inferred directly from the batch->last_job value, no
need to pass it around.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:28:03 +02:00
Boris Brezillon
07085fe8a4 panfrost: Kill a useless memset(0) in panfrost_create_context()
ctx is allocated with rzalloc() which takes care of zero-ing the memory
region. No need to call memset(0) on top.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:47 +02:00
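
Illustration for 07085fe8a4: a minimal sketch of why the extra memset(0) is redundant. It uses standard calloc() as a stand-in for rzalloc(), on the assumption that both return zero-filled memory. The names demo_context and demo_create_context() are hypothetical and are not panfrost code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context type; the real panfrost_context is much larger. */
struct demo_context {
    int   draw_count;
    void *last_batch;
};

/* calloc() stands in for rzalloc() here: both hand back zero-filled
 * memory, so a follow-up memset(ctx, 0, sizeof(*ctx)) does no useful work. */
static struct demo_context *demo_create_context(void)
{
    struct demo_context *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return NULL;

    /* Already zeroed by the allocator, no memset() needed. */
    assert(ctx->draw_count == 0 && ctx->last_batch == NULL);
    return ctx;
}
```

A caller can use the returned context right away and release it with free().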
Boris Brezillon
4eac1b2008 panfrost: Add polygon_list to the batch BO set at allocation time
That's what we do for other per-batch BOs, and we'll soon add a helper
to automate this create_bo()+add_bo()+bo_unreference() sequence, so
let's prepare the code to ease this transition.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:30 +02:00
Boris Brezillon
c16fb1f48d panfrost: Add missing panfrost_batch_add_bo() calls
Some BOs are used by batches but never explicitly added to the BO set.
This is currently not a problem because we wait for the execution of
a batch to be finished before releasing a BO, but we will soon relax
this rule.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:27:09 +02:00
Boris Brezillon
a94d028065 panfrost: Use the correct type for the bo_handle array
The DRM driver expects an array of u32, let's use the correct type, even
if using an int works in practice because it's still a 32-bit integer.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:49 +02:00
Boris Brezillon
2b771b8424 panfrost: Stop exposing internal panfrost_*_batch() functions
panfrost_{create,free,get}_batch() are only called inside pan_job.c.
Let's make them static.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-09-18 10:26:21 +02:00
Christian Gmeiner
8d5f905faa etnaviv: disable ARB_shadow
Looks like only HALT2 GPUs have support for it but that is not yet
implemented so disable ARB_shadow for now.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:26 +02:00
Christian Gmeiner
dcc0e23438 Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"
There are GPUs that do not support this feature.

This reverts commit e871abe452

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-18 06:47:21 +02:00
Lepton Wu
417d602fda virgl: Remove wrong EAGAIN handling for drmIoctl
drmIoctl handles EAGAIN itself and always returns -1 on errors.
Remove the wrong handling of its return value. Also, print a warning when
it fails.

v2: - use _debug_printf instead of fprintf (Gurchetan Singh)

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
2019-09-18 03:36:10 +00:00
Kenneth Graunke
f8c44e4ed7 iris: Skip allocating a null surface when there are 0 color regions.
The compiler now sets the "Null Render Target" bit in the RT write
extended message descriptor, causing it to write to an implicit null
surface without us needing to set one up in the binding table.

Together with the last patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 14:27:51 -07:00
Adam Jackson
320c36ed3a gallium/xlib: Fix glXMakeCurrent(dpy, None, None, ctx)
This is entirely legal in GL 3.0+. I wonder how many more times I'll
need to fix this specific bug.
2019-09-17 20:16:00 +00:00
Adam Jackson
a693f98e17 gallium/xlib: Remove MakeCurrent_PrevContext
As the comment notes, this is not thread-safe. You can just as easily
use GetCurrentContext instead, so, do that.
2019-09-17 20:16:00 +00:00
Adam Jackson
db8be355d1 gallium/xlib: Remove drawable caching from the MakeCurrent path
AFAICT this only exists to avoid hitting XMesaFindBuffer, which is a
linear search. But you don't have that many GLX drawables, so whatever.
2019-09-17 20:16:00 +00:00
Adam Jackson
6ec1259423 ci: Run tests on i386 cross builds
Yes, some tests fail, but we can turn those into XFAILs at meson time.
Better to keep the things that work working than not cover them at all.
Unfortunately XPASS results will not cause the build to fail until we
update CI to meson 0.51 or newer.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-09-17 14:53:57 -04:00
Tapani Pälli
631255387f iris: close screen fd on iris_destroy_screen
Otherwise it never gets closed. This fixes errors seen with deqp-egl
where we end up opening 1024 files.

Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-17 14:46:45 +03:00
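
Illustration for 631255387f: a minimal sketch of an object that owns a file descriptor and closes it on destruction. The demo_screen type and its functions are hypothetical; how iris actually obtains and stores its fd is not shown in the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical screen object that owns a file descriptor. */
struct demo_screen {
    int fd;
};

/* Takes ownership of fd; returns NULL on allocation failure. */
static struct demo_screen *demo_screen_create(int fd)
{
    struct demo_screen *screen = malloc(sizeof(*screen));

    if (screen)
        screen->fd = fd;
    return screen;
}

/* Releases everything the screen owns. Without the close() call, each
 * create/destroy cycle would leak one descriptor until the process hits
 * its open-file limit. */
static void demo_screen_destroy(struct demo_screen *screen)
{
    assert(screen != NULL);
    close(screen->fd);
    free(screen);
}
```

A test suite that creates and destroys many screens in one process is exactly the kind of workload that exposes a missing close().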
Michel Dänzer
2c278602d8 swr: Limit DEBUG workaround to LLVM < 7
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Michel Dänzer
8218f6e22d gallivm: Limit DEBUG workaround to LLVM < 7
As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-09-17 10:24:29 +00:00
Christian Gmeiner
1c34d19f90 etnaviv: a bit of micro-optimization
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
2019-09-17 05:50:37 +00:00
Icenowy Zheng
d61b67b41d lima: reset scissor state if scissor test is disabled
The PLBU seems to preserve scissor state between draws, and since lima doesn't
emit PLBU_CMD_SCISSORS() if scissor test is disabled, it uses state from previous draw.

Fix it by emitting PLBU_CMD_SCISSORS() for full fb if scissor test is disabled.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-09-17 04:13:24 +00:00
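
Illustration for d61b67b41d: a sketch of the "always emit a scissor" approach. Instead of skipping the update when the scissor test is off, the driver programs a rectangle covering the full framebuffer. The demo_rect type and demo_effective_scissor() are hypothetical and do not reflect lima's command encoding.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical rectangle type; lima's own structures differ. */
struct demo_rect {
    int minx, miny, maxx, maxy;
};

/* Picks the rectangle to program for a draw. When the scissor test is
 * disabled, return the full framebuffer instead of skipping the update,
 * so a rectangle left over from an earlier draw cannot leak through. */
static struct demo_rect demo_effective_scissor(bool scissor_enabled,
                                               struct demo_rect scissor,
                                               int fb_width, int fb_height)
{
    assert(fb_width > 0 && fb_height > 0);

    if (scissor_enabled)
        return scissor;

    struct demo_rect full = { 0, 0, fb_width, fb_height };
    return full;
}
```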
Erik Faye-Lund
9c57b54994 gallium/gdi: use GALLIUM_FOO rather than HAVE_FOO
This matches what other targets do, and makes it easier to port to
meson.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 17:54:00 +00:00
Dylan Baker
9e1f49aae1 scons: Make scons and meson agree about path to glapi generated headers
Currently scons puts them in src/mapi/glapi, while meson puts them in
src/mapi/glapi/gen. This results in some things being compilable only by
one or the other. Put them in the same place so that everyone is happy.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-09-16 17:54:00 +00:00
Vasily Khoruzhick
ca5782f0ee lima: add standalone disassembler with primitive MBS parser
It's useful for analyzing shader binaries produced by the ARM Mali offline
compiler, which outputs files in MBS format. MBS stands for Mali binary
shader; currently the parser just extracts the shader binary and ignores
everything else.

Reviewed-and-tested-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-09-16 09:29:55 -07:00
Timothy Arceri
741cff91d3 radeonsi/nir: fix number of used samplers
Commit f3e978db incorrectly assumed the maximum number of
samplers was equal to the number of defined samplers, which
does not hold e.g. where bindings skip slots.

This fixes an assert in si_nir_load_sampler_desc() for an
Enemy Territory: Quake Wars shader. It also fixes potential bugs with
incorrect bounds limiting in the same code for production builds
of mesa.

Fixes: f3e978db ("radeonsi/nir: Remove uniform variable scanning")

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-16 10:14:48 +00:00
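
Illustration for 741cff91d3: a sketch of the difference between "how many samplers are defined" and "how many slots are needed" when bindings skip slots. The bitmask representation and all demo_* names are assumptions made for the example, not radeonsi's data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Number of sampler bindings that are defined (set bits in the mask). */
static unsigned demo_num_defined_samplers(uint32_t used_mask)
{
    unsigned count = 0;

    for (; used_mask; used_mask &= used_mask - 1)
        count++;
    return count;
}

/* Number of slots needed to cover every binding: highest used index + 1. */
static unsigned demo_num_sampler_slots(uint32_t used_mask)
{
    unsigned slots = 0;

    for (; used_mask; used_mask >>= 1)
        slots++;
    return slots;
}

/* Clamps an index to the valid slot range. Clamping against the defined
 * count instead would wrongly reject the higher bindings. */
static unsigned demo_clamp_sampler_index(uint32_t used_mask, unsigned index)
{
    unsigned slots = demo_num_sampler_slots(used_mask);

    assert(slots > 0);
    return index < slots ? index : slots - 1;
}
```

With bindings 0 and 5 in use, two samplers are defined but six slots are required, so binding 5 is only reachable if the limit is derived from the slot count.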
Danylo Piliaiev
6f5a8617b4 iris: Fix fence leak in iris_fence_flush
Documentation for pipe_context::flush states:
 "NOTE: use screen->fence_reference() (or equivalent) to transfer
  new fence ref to **fence, to ensure that previous fence is unref'd"

Hence we need to unref previous out_fence.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-16 08:47:37 +00:00
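
Illustration for 6f5a8617b4: a sketch of the reference-transfer pattern the quoted documentation asks for. The demo_fence type and demo_fence_reference() are hypothetical; they only model a plain, non-atomic reference count.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted fence. */
struct demo_fence {
    int refcount;
};

static struct demo_fence *demo_fence_create(void)
{
    struct demo_fence *fence = calloc(1, sizeof(*fence));

    if (fence)
        fence->refcount = 1;
    return fence;
}

/* Points *dst at src: takes a reference on src, then drops the reference
 * *dst used to hold, freeing the old fence when its count reaches zero.
 * Assigning "*dst = src" directly would skip the unref and leak the old
 * fence. */
static void demo_fence_reference(struct demo_fence **dst,
                                 struct demo_fence *src)
{
    struct demo_fence *old = *dst;

    if (src)
        src->refcount++;
    if (old) {
        assert(old->refcount > 0);
        if (--old->refcount == 0)
            free(old);
    }
    *dst = src;
}
```

Taking the new reference before dropping the old one keeps the helper safe when both pointers refer to the same fence.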
Lionel Landwerlin
04dc6074cf driconfig: add a new engine name/version parameter
Vulkan applications can register with the following structure:

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion;
} VkApplicationInfo;

This enables the Vulkan implementations to apply workarounds based off
matching this description.

Here we add a new parameter for matching the driconfig options with
the following:

    <device driver="anv">
        <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
            <option name="blaaah" value="true" />
        </application>
    </device>

v2: switch engine name match to use regexps

v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)

v4: Add missing bit that went to the following commit (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
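
Illustration for 04dc6074cf: a sketch of how an engine name and version could be matched against the engine_name_match and engine_versions attributes shown above. It uses POSIX regcomp()/regexec() and treats only a return value of 0 as a match, in line with the v3 note about REG_NOMATCH. The use of REG_EXTENDED, the inclusive reading of "low:high", and all demo_* names are assumptions, not the driconfig implementation.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Returns true when engine_name matches the POSIX extended regex pattern.
 * Only an explicit 0 from regexec() counts as a match; REG_NOMATCH and
 * any error are treated as "no match". */
static bool demo_engine_name_matches(const char *pattern,
                                     const char *engine_name)
{
    regex_t re;
    bool matched;

    assert(pattern != NULL && engine_name != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    matched = regexec(&re, engine_name, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}

/* Returns true when version falls inside one of the comma-separated
 * "low:high" ranges (assumed inclusive), e.g. "10:12,40:42". */
static bool demo_engine_version_matches(const char *ranges, uint32_t version)
{
    const char *cursor = ranges;

    while (*cursor) {
        unsigned low, high;
        int consumed = 0;

        /* Malformed input ends the scan with "no match". */
        if (sscanf(cursor, "%u:%u%n", &low, &high, &consumed) != 2)
            return false;
        if (version >= low && version <= high)
            return true;
        cursor += consumed;
        if (*cursor == ',')
            cursor++;
    }
    return false;
}
```

An application reporting pEngineName "MyOwnEngine2" and engineVersion 11 would satisfy both checks for the XML example above, so the option would apply under these assumptions.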
Christian Gmeiner
9466e4cfab gallium: util_set_vertex_buffers_mask(..): make use of u_bit_consecutive(..)
Also move the clearing of the bits out of if/else.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-14 17:45:47 +00:00
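
Illustration for 9466e4cfab: a sketch of a consecutive-bit mask helper modelled on u_bit_consecutive(), and of clearing the affected range once before setting individual bits. The demo_* functions and the has_buffer array are hypothetical simplifications of the vertex-buffer code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `count` consecutive bits set, starting at bit `start`.
 * The count == 32 case is handled separately because shifting a 32-bit
 * value by 32 is undefined in C. */
static uint32_t demo_bit_consecutive(unsigned start, unsigned count)
{
    assert(start < 32 && count <= 32 - start);
    if (count == 32)
        return ~0u;
    return ((1u << count) - 1) << start;
}

/* Clears the mask bits for slots [start_slot, start_slot + count) once,
 * outside any if/else, then sets the bits for the slots that actually
 * hold a buffer. A NULL has_buffer means "unbind the whole range". */
static uint32_t demo_update_enabled_mask(uint32_t enabled_mask,
                                         const int *has_buffer,
                                         unsigned start_slot, unsigned count)
{
    enabled_mask &= ~demo_bit_consecutive(start_slot, count);
    for (unsigned i = 0; has_buffer && i < count; i++) {
        if (has_buffer[i])
            enabled_mask |= 1u << (start_slot + i);
    }
    return enabled_mask;
}
```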
Lepton Wu
ac175fb168 virgl: replace fprintf with _debug_printf
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
2019-09-14 00:14:41 +00:00
Kenneth Graunke
c9fb704f72 iris: Initialize ice->state.prim_mode to an invalid value
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.

We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.

Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner

Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
2019-09-13 16:31:29 -07:00
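
Illustration for c9fb704f72: a sketch of why a zero-initialized "last primitive" field hides the first change when 0 is itself a valid primitive. The enum, the demo_state type and the update counter are hypothetical stand-ins for the iris state tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical primitive enum where, as in the commit, points is 0. */
enum demo_prim {
    DEMO_PRIM_INVALID = -1, /* sentinel: never a real primitive */
    DEMO_PRIM_POINTS = 0,
    DEMO_PRIM_LINES,
    DEMO_PRIM_TRIANGLES,
};

struct demo_state {
    enum demo_prim prim_mode;
    unsigned       prim_updates; /* how often derived state was refreshed */
};

static struct demo_state *demo_state_create(bool use_sentinel)
{
    struct demo_state *state = calloc(1, sizeof(*state));

    /* calloc() leaves prim_mode == 0, which is DEMO_PRIM_POINTS. */
    if (state && use_sentinel)
        state->prim_mode = DEMO_PRIM_INVALID;
    return state;
}

/* Refreshes derived state only when the primitive mode changes. */
static void demo_draw(struct demo_state *state, enum demo_prim mode)
{
    assert(state != NULL);
    if (state->prim_mode != mode) {
        state->prim_mode = mode;
        state->prim_updates++;
    }
}
```

If the very first draw uses points, the zero-initialized state never refreshes, while the state created with the sentinel refreshes once as intended.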
Andreas Baierl
4b1a14fd47 lima/ppir: Add undef handling
Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark
it as undefined, so that regalloc can set it non-interfering to avoid
register pressure.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-09-13 19:41:32 +00:00
Andreas Baierl
4ddadd6370 lima/ppir: Rename ppir_op_dummy to ppir_op_undef
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
2019-09-13 19:41:32 +00:00
Boris Brezillon
6ddfd37c7e panfrost: Move the batch submission logic to panfrost_batch_submit()
We are about to patch panfrost_flush() to flush all pending batches,
not only the current one. In order to do that, we need to move the
'flush single batch' code to panfrost_batch_submit().

While at it, we get rid of the existing pipelining logic, which is
currently unused, and replace it with an unconditional wait at the end of
panfrost_batch_submit(). New pipelining logic will be introduced later
on.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
2fc91a16ab panfrost: Move the fence creation in panfrost_flush()
panfrost_flush() is about to be reworked to flush all pending batches,
but we want the fence to block on the last one. Let's move the fence
creation logic in panfrost_flush() to prepare for this situation.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
835439b84f panfrost: Delay payloads[].offset_start initialization
panfrost_draw_vbo() might call the primeconvert/without_prim_restart
helpers which will enter the ->draw_vbo() again. Let's delay
payloads[].offset_start initialization so we don't initialize them
twice.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
4166ca92e2 panfrost: Prepare things to avoid flushes on FB switch
panfrost_attach_vt_xxx() functions are now passed a batch, and the
generated FB desc is kept in panfrost_batch so we can switch FBs
without forcing a flush. The postfix->framebuffer field is restored
on the next attach_vt_framebuffer() call if the batch already has an
FB desc.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
e5c7701a0a panfrost: Pass a batch to panfrost_set_value_job()
So we can emit SET_VALUE jobs for a batch that's not currently bound
to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
bc0f6c0b15 panfrost: Use ctx->wallpaper_batch in panfrost_blit_wallpaper()
We'll soon be able to flush a batch that's not currently bound to the
context, which means ctx->pipe_framebuffer will not necessarily be the
FBO targeted by the wallpaper draw. Let's prepare for this case and
use ctx->wallpaper_batch in panfrost_blit_wallpaper().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
aa851a62b9 panfrost: Pass a batch to functions emitting FB descs
So we can emit such jobs to a batch that's not currently bound to the
context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
07a68835a1 panfrost: Pass a batch to panfrost_{allocate,upload}_transient()
We need that if we want to upload transient buffers to a batch that's
not currently bound to the context, which in turn will be needed if we
want to relax the batch serialization we have right now (only flush
batches when we need to: on a flush request, or when one batch depends
on the result of other batches).

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
e46d95d51b panfrost: Allow testing if a specific batch is targeting a scanout FB
Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it
a batch instead of a context and move the code to pan_job.c.

With this in place, we can now test if a batch is targeting a scanout
FB even if this batch is not bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
40e20324e0 panfrost: Get rid of the unused 'flush jobs accessing res' infra
Will be replaced by something similar but using BOs as keys instead
of resources.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Boris Brezillon
1b5873b73c panfrost: Use a pipe_framebuffer_state as the batch key
This way we have all the fb_state information directly attached to a
batch and can pass only the batch to functions emitting CMDs, which is
needed if we want to be able to queue CMDs to a batch that's not
currently bound to the context.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-09-13 16:25:06 +02:00
Indrajit Das
92765f85e1 radeon/vcn: exclude raven2 from vcn 2.0 encode initialization
Signed-off-by: Indrajit Das <indrajit-kumar.das@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
2019-09-13 09:18:43 -04:00
Iago Toral Quiroga
2eace10c62 v3d: fix TF primitive counts for resume without draw
The V3D documentation states that primitive counters are reset when
we emit Tile Binning Mode Configuration items, which we do at the start
of each draw call, however, in the actual hardware this doesn't seem to
take effect when transform feedback is not active (this doesn't happen in
the simulator). This causes a problem in the following scenario:

glBeginTransformFeedback()
   glDrawArrays()
   glPauseTransformFeedback()
   glDrawArrays()
   glResumeTransformFeedback()
glEndTransformFeedback()

The TF pause will trigger a flush of the primitive counters, which results
in a correct number of primitives up to that point. In theory, the counter
should then be reset when we execute the draw after pausing TF, but that
doesn't happen, and since TF is enabled again by the resume command before
we end recording, by the time we end the transform feedback recording we
again check the counters, but instead of reading 0, we read again the same
value we read at the time we paused, incorrectly accumulating that value
again.

In theory, we should be able to avoid this by using the other method to
reset the primitive counters: using operation 1 instead of 0 when we
flush the counts to the buffer at the time we pause, but again, this
doesn't seem to work and we still see obsolete counts by the time we
end transform feedback.

This patch fixes the problem by not accumulating TF primitive counts
unless we know we have actually queued draw calls during transform
feedback, since that seems to effectively reset the counters. This should
also be more performant, since it saves unnecessary stalls for the
primitive counters to be updated when we know there haven't been any
new primitives drawn.

Fixes CTS tests:
dEQP-GLES3.functional.transform_feedback.*

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
ded6ea9209 v3d: remove redundant update of queued draw calls
This was updating the counter for the indexed draw path only, but we are
already updating the counter for all paths a bit later, so this is only
duplicating counts for indexed paths.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
b9a07eed00 v3d: make sure we have enough space in the CL for the primitive counts packet
Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Iago Toral Quiroga
b69f51a5ef v3d: add missing line break for performance debug message
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-09-13 06:53:26 +00:00
Tomeu Vizoso
bc79e5c437 panfrost/ci: Use releases for Volt dEQP
So we can better correlate different results to versions of the runner.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Tomeu Vizoso
c301fc027a panfrost/ci: Update kernel to 5.3-rc8
We haven't updated in a long time, so better do it now and again when
5.3 is released.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Tomeu Vizoso
ca4e6637d0 panfrost/ci: Run dEQP with the surfaceless platform
Instead of running it with the Wayland platform, which introduces
unwanted dependencies and complexity.

Makes tests run 30% faster, as well.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-09-13 08:35:36 +02:00
Rob Clark
b4df115d3f freedreno/a6xx: pre-calculate userconst stateobj size
The AnTuTu "garden" benchmark overflows the fixed-size constbuffer
stateobject, so let's be more clever and calculate the actual size
(potentially a slightly pessimistic one).

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-09-12 18:07:20 -07:00