I'm not sure if this is strictly necessary but it makes debugging easier
and minimizes the diff with the experimental scheduler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Scheduling occurs on a per-block basis, strongly assuming that a given
block contains at most a single branch. This does not always map to the
source NIR control flow, particularly when discard intrinsics are
involved. The solution is to allow scheduling barriers, which will
terminate a block early in code generation and open a new block.
To facilitate this, we need to move some post-block processing to a new
pass, rather than relying hackily on the current_block pointer.
This allows us to cleanup some logic analyzing branches in other parts
of the driver us well, now that the MIR is much more well-formed.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
It's sometimes convenient to call this with no instruction specified. By
definition, a missing instruction cannot reference any argument, so
let's check for NULL and shortciruit to false.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
The branch has the writeout specified in its source list, making this
special even if it's not explicitly part of r0.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In order to run register allocation after scheduling, it is sometimes
necessary to be able to insert instructions into an already-scheduled
program. This is suboptimal, since it forces us to do a worst-case
scheduling, but it is nevertheless required for correct handling of
spills/fills. Let's add helpers to insert instructions as standalone
bundles for use in spilling code.
These helpers are minimal -- they *only* work on load/store ops or
moves. They should not be used for anything but register spilling; any
other instructions should be added prior to the schedule.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
While it doesn't matter with an unconditional move to the conditional
register (r31), when we try to elide that move we'll need to track the
swizzle explicitly, and there is no slot for that yet since ALU ops are
normally binary.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.
Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Allow a direct link to the PDF itself from the authors themselves,
rather than a paywall splash page.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
The GLX extension strings are independent of any context, so abusing the
direct_support bit to control this extension's visibility is wrong.
This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7.
Reported-by: Michel Dänzer <michel@daenzer.net>
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Memory allocated through panfrost_allocate_transient() is likely to
come from the transient pool. Let's add the BO backing the allocated
memory region to the job batch so the kernel can retain this BO while
jobs are executed.
In practice that has never been a problem because the transient pool
is never shrinked, and even if it was, we still control the lifetime of
the job, so there's no reason for this BO to be freed before the GPU is
done executing the batch. But it still make sense to add the BO for
debugging purpose.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Both the BO cache and the transient pool are shared across
context's. Protect access to these with mutexes.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Jobs _must_ only be shared across the same context, having
the last_job tracked in a screen causes use-after-free issues
and memory corruptions.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
This fix two dEQP tests for virgl:
dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture
dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* Fix 2D/2DArray/3D tiling parameters:
There is a bottom threshold for width and height.
* Renable tiling for Cubemap, after setting the right parameters.
Reviewed-by: Rob Clark <robdclark@gmail.com>
This way, the test jobs can start running before all build+test jobs
have finished, once the meson-main job has.
Idea suggested by Daniel Stone on IRC.
See https://docs.gitlab.com/ce/ci/directed_acyclic_graph/ and
https://docs.gitlab.com/ce/ci/yaml/README.html#needs for details.
v2:
* Improve commit log (Daniel Stone, Eric Engestrom)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.
Allows YUV buffers with a single buffer and plane offsets to be
passed in.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Input to GS is just a set of attributes, so remove explicit setup of
'position' which is meaningless for GS input processing.
Reviewed-by: Alok Hota <alok.hota@intel.com>
This avoids multiple copies for nothing and it's more elegant.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
RADV no longer uses specific LLVM options compared to the common code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
The patch adds support for 64 bit HAL_PIXEL_FORMAT_RGBA_FP16
for android platform.
Fixes android.graphics.cts.BitmapColorSpaceTest#test16bitHardware
which failed in egl due to "Unsupported native buffer format 0x16"
on chromebooks.
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:
value -= value
with value being undefined.
In this case LLVM replace the operation with an assignment to NaN.
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This reverts commit 5a2e65be89.
Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
v2: Update several of the comments. Drop some redundant uses of
ASSERT_UNION_OF_OTHERS_MATCHES_UNKNOWN_*_SOURCE source. Suggested by
Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>