it turns out we have explicit control over helper invocations; if a
particular bit in the fragment shader descriptor is set, helper
invocations are launched; if it clear, they are not. Helper invocations
are required whenever computing derivatives, whether explicitly
(dFdx/dFdy) *or* implicitly (any texturing). Accordingly, we set this
bit when texturing to fix edge case behaviour (literally, haha).
Thank you to Jason Ekstrand and Ilia Mirkin for pointing out the
representative dEQP test failed along triangle edges and for suggesting
helper invocations / derivatives as a list of suspect pieces (which led
to discovering the helper invocations enable bit in the first place).
Ideally we would use the new NIR analysis pass for this, but that hasn't
landed quite yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In some cases, Gallium can give us bad info about the texture count,
counting some NULL textures. We pass Gallium's info to the hardware
blindly, which can confuse the hardware in edge cases. This patch
adjusts accordingly.
This worked around a bug in oooold versions of Panfrost. Nowadays, its
presence is, at best, *creating* bugs. Let's wack it.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In a poorly coded app, the framebuffer can be partially drawn, an FBO
switched, switch back to the framebuffer and keep drawing, etc.
Reordering would fix this, but for now we need to just be careful about
flushing scanout too.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
On more complex apps (possibly using desktop GL specific extensions?),
our viewport code was getting wacky results for unclear reasons. Let's
be a little less wacky.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
To do so, we route some basic information through to the FBD creation
routines (currently just a binary toggle of "has draws?"). Eventually,
more refactoring will enable dynamic hierarchy mask selection, but right
now we do the most basic.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Quite a bit of refactoring in the main driver will be necessary to make
use of this effectively, so the implementation is incomplete.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
As per the notes at the beginning of pan_tiler.c, we implement a routine
to calculate the size of the polygon list header given the framebuffer
dimensions and the provided hierarchy mask.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
I'm not sure how the blob does it, but this seems to be a dead simple
test and roughly corresponds to what I've noticed from the blob, so
maybe it's good enough.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Following the research into Midgard's hierarchical tiling
infrastructure, we now understand (in broad stokes) the purpose of each
tiler field in the MFBD. Additionally, we understand more of the tiling
fields in the SFBD and in Bifrost's structures, although this knowledge
is still incomplete.
Update the names, decoder, and comments to reflect this new
understanding.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This explains how the polygon list is allocated, updating the headers
appropiately to sync the terminology.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
These names are from the replay workaround in kbase; they begin to shine
some light on the meaning of these fields. In particular, we now
understand why the "tiler_meta" field has the effect it does on
performance in certain scenes (controlling tile granularity).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Since commit 4f3c82c72c fmod is no longer being lowered in nir, and
ends up crashing lima programs with "unsupported nir_op: fmod" in both
ppir and gpir.
There seems to be no mod operation in hardware in utgard and there is an
optimization in nir to lower fmod to instructions that lima already
implements, so let's use that.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Now that we can blit depth/stencil in a way that plays nicely with UBWC,
re-enable it.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Now that it can turn these blits into rendering to RB6_Z24_UNORM_S8_UINT
it can properly handle cases where only one of depth+stencil is being
blit. And this avoids lying about he format, which completely doesn't
work when UBWC is used.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
For re-written z/s blits, we want to use the re-written `pipe_blit_info`
even if we have to fallback to 3d pipe (`u_blitter`). So handle that
fallback ourself.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
The name 'separate' doesn't make a while lot of sense, as only one of
the cases is the blit actually split. But split out from previous patch
in an attempt to reduce the noise.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
This will get even simpler with the next patch
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
This maps to a special format that recent generations of adreno have,
for blitting z24s8. Conceptually it is similar to doing Z and/or S
blits by pretending it is r8g8b8a8 (with appropriate writemask). But
it differs when bandwidth compression is used, as z24 is a different
type from r8g8b8.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Commit de8a919702 refactored dynarray usage and changed the size of the
allocation in lima_submit_add_bo.
That causes a segfault in programs running with lima.
This commit restores the allocation size back to the previous size.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
When searching for resources in the cache, we previously released all
expired resources even after having found a compatible resource.
This commit changes this behavior to return immediately when finding a
compatible resource, so that the operation finishes more quickly. This
moves more of the burden of releasing expired resources to cache
addition, which, since it happens at resource destruction time, it's
less time critical.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Replace the cache implementation in the vtest winsys with
virgl_resource_cache.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Replace the cache implementation in the drm winsys with
virgl_resource_cache.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Introduce a resource cache implementation that can be used by any virgl
winsys backend.
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
The offsets to read the query results were off-by-one, which causes the
counters to report bogus increasing values.
Also the counter result is u32, so we need to initialize the query type
to reflect that.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
When both the fragment and vertex shaders point to the same varying
location they expect to share the same varying slot.
Make sure vertex and fragment varyings pointing to the same loc have
->src_offset set to the same value.
[Alyssa: In addition a patch implement txs, this fixes GALLIUM_HUD on
Panfrost]
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Used to enable INTEL_shader_atomic_float_minmax.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The src/dst format is overriden from the pipe_blit_info, so this just
logic just serves to confuse the reader.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Since we could only need a subset of the layers, and otherwise we
trigger an assert in util_max_layer()
Signed-off-by: Rob Clark <robdclark@chromium.org>
These are tests that regressed in RK3288 but still pass on RK3399.
So we still have a CI we can rely on, add them to the flip-flop list for
now.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Some tests got fixed since the last update, but also some regressions
crept in.
To keep the CI green, add the regressions to the expected failures.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
The dri2 winsys also uses libdrm (and you can only enable dri3 if
you enable dri2), and the drm winsys only requires libdrm.
So if any winsys is enabled you can also enable the drm winsys, and
since we always want at least one winsys we can always enable it.
I removed the check for the drm platform for VA and OMX since they
do not care anymore. Since we still check for one of r600g, nouveau
or radeonsi, we are guarantueed to still only enable it by default
in a configuration that requires libdrm anyway. So for people using
va=auto, we don't suddenly start requiring libdrm were we did not
before.
This supersedes "vl: Enable DRM by default.", which I pushed, but
rolled back because it used dep_libdrm before its definition.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
We should avoid having potentially dangling pointers to
pipe_resources in general.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
A pipe_transfer is a context object. It is fine for the
constructor/destructor to have access to the context.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
A pipe_transfer is a context object. It is fine for
virgl_transfer_queue to have access to the context.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
This dumps disassembly to the pipe_debug_callback together with shader
stats.
Can be used together with shader-db to get full disassembly of all shaders
in the database.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>