Cloning texture loads isn't a good idea since we may move it into
a block that is not shared between all the invocations of the shader.
We'd like to avoid that since it may result in undefined behavior.
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
This should help avoid stalls in the pixel mask array in certain
non-promoted depth cases. It especially helps for Z16, as each bit
in the PMA corresponds to two pixels when using Z16, as opposed to
the usual one pixel.
Improves performance in GFXBench5 TRex by 22% (n=1).
bound_vertex_buffers doesn't include extra draw parameters buffers.
Tracking this correctly is kind of complicated, and iris_destroy_state
isn't exactly in a hot path, so just loop over all VBO bindings.
Fixes: 4122665dd9 (iris: Enable ARB_shader_draw_parameters support)
Reported-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
When only the depth/stencil bufs are cleared, we should make sure the
color content is reloaded into the tile buffers if we want to preserve
their content.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
glClear()s are expected to be the first thing GL apps do before drawing
new things. If there's already an existing batch targetting the same
FBO that has draws attached to it, we should make sure the new clear
gets a new batch assigned to guaranteed that the FB content is actually
cleared with the requested color/depth/stencil values.
We create a panfrost_get_fresh_batch_for_fbo() helper for that and
call it from panfrost_clear().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Hitting any fallback path on Broxton as we require clflushing the whole
buffer even for an upload of a subtexture. However, since gallium
provides a pbo upload path, allow it to sample packed RGB if supported.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This mirrors the intrinsics in the GLSL IR. One could imagine an
alternate definition where reading the semantic would account for the
READ_HELPER functionality, but that feels potentially dodgy and could be
subject to CSE unpleasantness.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
u_upload_mgr sets it, so that util_range_add can skip the lock.
The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2%
in torcs on radeonsi.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 19546108d3.
This commit breaks the build because lima implements
->set_damage_region(). I guess we'll need more discussion before
removing the ->set_damage_region() hook.
This reverts commit 492ffbed63.
BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update(), leading to a damage region update on the
wrong pipe_resource object.
Let's not expose the ->set_damage_region() method until the core is
fixed to handle that properly.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
In preparation for testing drivers other than Panfrost in LAVA labs.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Include Panfrost's gitlab.ci.yml file from Mesa's main .gitlab-ci.yml so
we test on devices with Panfrost.
This uses LAVA to schedule jobs in the devices and will be the base for
testing Etnaviv, Lima, etc.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This is a port of Nanley's 904c2a617d
from i965 to iris.
One concern is that iris uses larger batches, and also emits far fewer
commands, so we may come closer to the 500 limit within a batch, and
could need to supplement this with actual counting. Manhattan 3.0 had
239 3DSTATE_CONSTANT_PS packets in a batch, Unigine Valley had 155.
So it seems like we're still in the realm of safety.
This should improve texture sampling performance on GC3000.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>