OpenCL doesn't really allow a wide range of different samplers, so the
cache hit rate is pretty high across all applications.
This also allows us to stop unbinding samplers after each kernel launch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36243>
The pipe_context might never be reused or the new queue won't ever reuse
all shader image slots leading to stale objects being referenced by the
driver.
Fixes: 50dbcb1d00 ("rusticl: stop clearing shader images after every dispatch")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36243>
The existing refcounting code is correct, unless si_texture_from_winsys_buffer
fails in which case we get a refcount error.
The error code path will use si_texture_reference(&tex, NULL), which
will drop a reference to the memobj buffer, but none was taken yet.
Reviewed-by: Ganesh Belgur Ramachandra <ganesh.belgurramachandra@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36228>
The functionality is provided by isl_surf_supports_ccs().
Also, move the protected content restriction to
iris_resource_configure_aux(). I'm not aware of any reason protected
content wouldn't support CCS. However, to keep this series simple,
enabling that combination is left for another time.
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32120>
On XeKMD, BOs need to be created with a vm_id of zero in order to get
prime handles. That only occurs if the image was created with
PIPE_BIND_SHARED/BO_ALLOC_SHARED. Ensure that shareable images have this
flag in iris_flush_resource().
Fixes the dmabufshare demo on BMG with INTEL_DEBUG=noccs and mesa hacked
to disable suballocation.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13511
Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32120>
The AHB type is no longer needed there as the assert check is no longer
used on the export path. One less DETECT_OS_ANDROID.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36252>
Per spec VUID-VkMemoryAllocateInfo-allocationSize-07897
> If the parameters do not define an import or export operation,
allocationSize must be greater than 0.
So early return there is simply messing up with external memory.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36252>
Every PAN_FAULT_INJECTION_RATE allocation from a mempool, we generate an
invalid GPU address which will cause a GPU fault.
This is useful to debug the GPU reset / context recover path.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31868>
Panthor can report the group state, so let's provide a
get_device_reset_status() querying this state and turning it into a
pipe_reset_status status.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31868>
Use the existing DRIVER_NAME mechanism to pick up common skips.
This is less error prone than manually adding the skips.
A redundant freedreno-a618-skips.txt is also dropped, as it's already
included via GPU_VERSION.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36215>
The cheza runners were decommissioned.
Rename the restricted trace results to a618 (same GPU generation) to keep
the history.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36215>
for intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. texture instrs have even more craziness
involved, but we can do the same trick. this commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making it useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.
this series only implements a subset of possible texture operations for now, but
more generalizing could be added as people have need.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
As seen in the Vivante kernel driver context init, GPUs with the icache
feature have a new set of states to specify the shader ranges. While the
old state still seems to work, it limits the size of the shader that can
be executed to 64K instructions. The new range states holds up to 20 bits
according to the comment in the Vivante kernel driver, which allows up
to 1M instructions.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36114>
Cores with unified instruction memory get the start and end points of
the shaders via the shader range registers. Don't emit the unnecessary
START_PC and END_PC states on those cores.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36114>
When writing new shader instructions through the unified state area
we must tell the GPU which caches to flush by setting the appropriate
code steering bit. Failing to do this doesn't seem to have much of an
effect when only loading shaders through the state memory, but when
toggling between using icache (as in load shaders from memory) and
loading instructions from the state area, this fixes severe corruption
and GPU hangs due to old code being executed.
Programming the steering bits is only needed for GPUs with either
unified instruction or unified uniform states.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36114>
Bit 0 of the SH_CONTROL register does not control uniform cache
flushes so stop touching this bit when updating the uniforms.
While it is harmless to change the bit at this time in the emit
sequence, it's confusing and not needed.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36114>
Update from rnndb commit 19bc9377a80a ("rnndb: rename
UNIFORM_CACHE to CONTROL and document code cache flushing")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36114>
The GL driver expects special sync handling when a buffer is newly
exported, and also requires that bo->prime_fd be set so the batch code
can use it later. Add a function to do this for the KMS export case,
which otherwise would not need a PRIME fd.
agx_bo_export() then becomes a simple dup of bo->prime_fd (which is
probably marginally faster than redoing drmPrimeHandleToFD() anyway).
The thread safety story here is that as long as we do all this the first
time a BO is exported (in any way), there is no way for another thread
to have gotten ahold of the BO already, so no need for extra locking.
This does not affect hk, since it doesn't rely on bo->prime_fd for
anything. It also doesn't affect the timestamp BO and other special
cases.
Fixes: 067d820c9d ("asahi: Mark KMS exported resource BOs as shared")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13563
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36241>
add support for specifying the following experimental controls
- CODECAPI_AVEncVideoRateControlFramePreAnalysis
- CODECAPI_AVEncVideoRateControlFramePreAnalysisExternalReconDownscale
to make testing easier.
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36236>