ctx is allocated with rzalloc() which takes care of zero-ing the memory
region. No need to call memset(0) on top.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
That's what we do for other per-batch BOs, and we'll soon add an helper
to automate this create_bo()+add_bo()+bo_unreference() sequence, so
let's prepare the code to ease this transition.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Some BOs are used by batches but never explicitly added to the BO set.
This is currently not a problem because we wait for the execution of
a batch to be finished before releasing a BO, but we will soon relax
this rule.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The DRM driver expects an array of u32, let's use the correct type, even
if using an int works in practice because it's still a 32-bit integer.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
panfrost_{create,free,get}_batch() are only called inside pan_job.c.
Let's make them static.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Looks like only HALT2 GPUs have support for it but that is not yet
implemented so disable ARB_shadow for now.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
There are GPUs that do not support this feature.
This reverts commit e871abe452
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
drmIoctl handles EAGAIN itself and actually it always return -1 on errors.
Remove the wrong handling of its return value. Also, print a warning when
it fails.
v2: - use _debug_printf instead of fprintf (Gurchetan Singh)
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
The compiler now sets the "Null Render Target" bit in the RT write
extended message descriptor, causing it to write to an implicit null
surface without us needing to set one up in the binding table.
Together with the last patch, this improves performance in Car Chase on
an Icelake 8x8 (locked to 700Mhz) by 0.0445526% +/- 0.0132736% (n=832).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Yes, some tests fail, but we can turn those into XFAILs at meson time.
Better to keep the things that work working than not cover them at all.
Unfortunately XPASS results will not cause the build to fail until we
update CI to meson 0.51 or newer.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Otherwise it never gets closed, this fixes errors seen with deqp-egl
where we end up opening 1024 files.
Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The PLBU seems to preserve scissor state between draws, and since lima doesn't
emit PLBU_CMD_SCISSORS() if scissor test is disabled, it uses state from previous draw.
Fix it by emitting PLBU_CMD_SCISSORS() for full fb if scissor test is disabled.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
This matches what other targets do, and makes it easier to port to
meson.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Currently scons puts them in src/mapi/glapi, meosn puts them in
src/mapi/glapi/gen. This results in some things being compilable only by
one or the other, put them in the same places so that everyone is happy.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It's useful for analyzing shader binaries produced by ARM mali offline
compiler which outputs files in MBS format. MBS is mali binary shader,
currently parser just extracts shader binary and ignores everything else.
Reviewed-and-tested-by: Connor Abbott<cwabbott0@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Commit f3e978db incorrectly assumed the maximum number of
samplers was equal to the max number of defined samplers
e.g. where bindings skip slots.
This fixes an assert in si_nir_load_sampler_desc() for an
enemy territory quake wars shader. And fixes potential bugs with
incorrect bounds limiting in the same code for production builds
of mesa.
Fixes: f3e978db ("radeonsi/nir: Remove uniform variable scanning")
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Documentation for pipe_context::flush states:
"NOTE: use screen->fence_reference() (or equivalent) to transfer
new fence ref to **fence, to ensure that previous fence is unref'd"
Hence we need to unref previous out_fence.
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vulkan applications can register with the following structure :
typedef struct VkApplicationInfo {
VkStructureType sType;
const void* pNext;
const char* pApplicationName;
uint32_t applicationVersion;
const char* pEngineName;
uint32_t engineVersion;
uint32_t apiVersion;
} VkApplicationInfo;
This enables the Vulkan implementations to apply workarounds based off
matching this description.
Here we add a new parameter for matching the driconfig options with
the following :
<device driver="anv">
<application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
<option name="blaaah" value="true" />
</application>
</device>
v2: switch engine name match to use regexps
v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)
v4: Add missing bit that went to the following commit (Eric)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Also move the clearing of the bits out of if/else.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
It was calloc'd to 0 which is PIPE_PRIM_POINTS, which means that we
fail to notice an initial primitive of points being new, and fail at
updating the "primitive is points or lines" field.
We do not need to reset this on device loss because we're tracking
the last primitive mode sent to us on the CPU via draw_vbo, not the
last primitive mode sent to the GPU.
Fixes several tests:
- dEQP-GLES3.functional.clipping.point.wide_point_clip
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_center
- dEQP-GLES3.functional.clipping.point.wide_point_clip_viewport_corner
Fixes: dcfca0af7c ("iris: Set XY Clipping correctly.")
Add a ppir dummy node for nir_ssa_undef_instr, create a reg for it and mark
it as undefined, so that regalloc can set it non-interfering to avoid
register pressure.
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Vasily Khozuzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
We are about to patch panfrost_flush() to flush all pending batches,
not only the current one. In order to do that, we need to move the
'flush single batch' code to panfrost_batch_submit().
While at it, we get rid of the existing pipelining logic, which is
currently unused and replace it by an unconditional wait at the end of
panfrost_batch_submit(). A new pipeline logic will be introduced later
on.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
panfrost_flush() is about to be reworked to flush all pending batches,
but we want the fence to block on the last one. Let's move the fence
creation logic in panfrost_flush() to prepare for this situation.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
panfrost_draw_vbo() Might call the primeconvert/without_prim_restart
helpers which will enter the ->draw_vbo() again. Let's delay
payloads[].offset_start initialization so we don't initialize them
twice.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
panfrost_attach_vt_xxx() functions are now passed a batch, and the
generated FB desc is kept in panfrost_batch so we can switch FBs
without forcing a flush. The postfix->framebuffer field is restored
on the next attach_vt_framebuffer() call if the batch already has an
FB desc.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
So we can emit SET_VALUE jobs for a batch that's not currently bound
to the context.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We'll soon be able to flush a batch that's not currently bound to the
context, which means ctx->pipe_framebuffer will not necessarily be the
FBO targeted by the wallpaper draw. Let's prepare for this case and
use ctx->wallpaper_batch in panfrost_blit_wallpaper().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
So we can emit such jobs to a batch that's not currently bound to the
context.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
We need that if we want to upload transient buffers to a batch that's
not currently bound to the context, which in turn will be needed if we
want to relax the batch serialization we have right now (only flush
batches when we need to: on a flush request, or when one batch depends
on the result of other batches).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it
a batch instead of a context and move the code to pan_job.c.
With this in place, we can now test if a batch is targeting a scanout
FB even if this batch is not bound to the context.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Will be replaced by something similar but using a BOs as keys instead
of resources.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
This way we have all the fb_state information directly attached to a
batch and can pass only the batch to functions emitting CMDs, which is
needed if we want to be able to queue CMDs to a batch that's not
currently bound to the context.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
The V3D documentation states that primitive counters are reset when
we emit Tile Binning Mode Configuration items, which we do at the start
of each draw call, however, in the actual hardware this doesn't seem to
take effect when transform feedback is not active (this doesn't happen in
the simulator). This causes a problem in the following scenario:
glBeginTransformFeedback()
glDrawArrays()
glPauseTransformFeedback()
glDrawArrays()
glResumeTransformFeedback()
glEndTransformFeedback()
The TF pause will trigger a flush of the primitive counters, which results
in a correct number of primitives up to that point. In theory, the counter
should then be reset when we execute the draw after pausing TF, but that
doesn't happen, and since TF is enabled again by the resume command before
we end recording, by the time we end the transform feedback recording we
again check the counters, but instead of reading 0, we read again the same
value we read at the time we paused, incorrectly accumulating that value
again.
In theory, we should be able to avoid this by using the other method to
reset the primitive counters: using operation 1 instead of 0 when we
flush the counts to the buffer at the time we pause, but again, this
doesn't seem to be work and we still see obsolete counts by the time we
end transform feedback.
This patch fixes the problem by not accumulating TF primitive counts
unless we know we have actually queued draw calls during transform
feedback, since that seems to effectively reset the counters. This should
also be more performant, since it saves unnecessary stalls for the
primitive counters to be updated when we know there haven't been any
new primitives drawn.
Fixes CTS tests:
dEQP-GLES3.functional.transform_feedback.*
Reviewed-by: Eric Anholt <eric@anholt.net>
This was updating the counter for the indexed draw path only, but we are
already updating the counter for all paths a bit later, so this is only
duplicating counts for indexed paths.
Reviewed-by: Eric Anholt <eric@anholt.net>
Instead of running it with the Wayland platform, which introduces
unwanted dependencies and complexity.
Makes tests run 30% faster, as well.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
The AnTuTu "garden" benchmark overflows the fixed size constbuffer
stateobject, so lets be more clever and calculate (a potentially
slightly pessimistic) actual size.
Signed-off-by: Rob Clark <robdclark@chromium.org>