UCHE and CCU use virtually tagged addresses, so whenever an alias may have
changed we always have to flush and invalidate everything. We detect
this through the sparse memory aliasing flag on the buffer/image or, for
plain memory barriers, through whether the feature is enabled.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
Plumb through support for a sparse queue and enable sparse binding using
the kernel interfaces we added earlier. We also support sparse residency
for buffers, which is straightforward, but sparse residency for images
is much more complicated so it will be enabled later.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
Add a "sparse VMA" abstraction, and functions for creating them, destroying
them, and submitting commands to map and unmap BOs into them. This
mirrors the Vulkan API, but with image offsets resolved to page offsets.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
Use a new driver-internal VM_BIND submit queue for mapping and unmapping
"normal" BOs. This will be required for sparse, because we can't mix
the old and new interface, but it should also allow us to stop using
"zombie" VMAs and the bo list.
Also use MSM_BO_NO_SHARE, which we assume is available when VM_BIND is.
This should significantly reduce kernel submit overhead, in addition to
the userspace submit overhead that using VM_BIND already cuts.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
According to the spec and as implemented by other drivers, this should
use the size of the buffer instead of the size of the VkDeviceMemory
it's bound to when VK_WHOLE_SIZE is specified or pSizes is NULL. The
current behavior doesn't make sense at all for sparse buffers which are
not bound to a single VkDeviceMemory. Just use the common helper that
already does the right thing, copied from anv.
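
As a rough illustration of the intended rule, here is a minimal sketch. The
name example_buffer_range() and the EXAMPLE_WHOLE_SIZE macro are made up for
this example and are not the common helper itself; only the value of the
macro (all ones, like VK_WHOLE_SIZE) is taken from Vulkan. A caller that
receives a NULL pSizes would pass EXAMPLE_WHOLE_SIZE as the range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in with the same value as VK_WHOLE_SIZE. */
#define EXAMPLE_WHOLE_SIZE (~UINT64_C(0))

/* Resolve a (offset, range) pair against the size of the buffer itself,
 * not the size of whatever memory the buffer happens to be bound to.
 */
static uint64_t
example_buffer_range(uint64_t buffer_size, uint64_t offset, uint64_t range)
{
   assert(offset <= buffer_size);

   if (range == EXAMPLE_WHOLE_SIZE)
      return buffer_size - offset;

   /* An explicit range must fit in what is left of the buffer. */
   assert(range <= buffer_size - offset);
   return range;
}
```

Because only the buffer size is consulted, the same rule works for sparse
buffers that have no single backing VkDeviceMemory.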
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
The kernel was rounding the size up for us, but it doesn't like a
non-aligned map size, so just sanitize the size here.
tu_cs was relying on the size not being rounded to keep the maximum size
2^20-1 or less, so fix that by using the initial unrounded size.
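
A minimal sketch of the idea, assuming a power-of-two page size (4096 in the
usage below is an assumption, not something this change fixes). The struct
and function names are hypothetical and only illustrate keeping the
requested size next to the rounded size used for mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static uint64_t
example_align_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical BO record: the size the caller asked for, and the
 * page-aligned size that is actually handed to the map call.
 */
struct example_bo {
   uint64_t requested_size;
   uint64_t mapped_size;
};

static struct example_bo
example_bo_init(uint64_t size, uint64_t page_size)
{
   struct example_bo bo = {
      .requested_size = size,
      .mapped_size = example_align_u64(size, page_size),
   };
   return bo;
}
```

A caller with a 2^20-1 limit would check requested_size, since rounding
0xFFFFF up to a 4096-byte page gives 0x100000.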
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
For VM_BIND, BO deletion will have to be implemented differently in
native drm and virtio. We already have a somewhat awkward situation with
native-specific code in the common BO deletion helper, which we only get
away with because it's for kernels without SET_IOVA in which case virtio
isn't supported. Add a few common helpers for some of the guts, and move
the guts into backend-specific functions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533>
Replace loop over all PIPE_MAX_SAMPLERS with u_foreach_bit(..) to iterate
only over active sampler views. This avoids unnecessary iterations.
Improves drawoverhead test 1 performance by ~10%.
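
For readers unfamiliar with the pattern, here is a small standalone sketch
of iterating only the set bits of a mask. It does not use u_foreach_bit
itself; example_lowest_bit() and example_collect_active() are hypothetical
names, and the portable bit scan stands in for the ffs/ctz-style scan a
real helper would use.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit; mask must be non-zero. */
static unsigned
example_lowest_bit(uint32_t mask)
{
   unsigned i = 0;
   assert(mask != 0);
   while (!(mask & (UINT32_C(1) << i)))
      i++;
   return i;
}

/* Write the indices of all set bits to out[] in ascending order and
 * return how many there were. The outer loop runs once per active slot
 * instead of once per possible slot.
 */
static unsigned
example_collect_active(uint32_t mask, unsigned out[32])
{
   unsigned count = 0;
   while (mask) {
      out[count++] = example_lowest_bit(mask);
      mask &= mask - 1; /* clear the lowest set bit */
   }
   return count;
}
```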
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36831>
This adds a generic lowering pass for coop mat flexible dimensions.
It should be suitable for all drivers that implement coop mat2 flexible
dimensions, or that just need to lower sw exposed sizes to hw sizes.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36544>
We're supposed to completely ignore VkTimelineSemaphoreSubmitInfo if
there aren't any timeline semaphores, including the array lengths, which
is made clear by the various VUs already cited by the code. The
vkQueueSubmit() path correctly handled this when asserting but still
dereferenced pWaitSemaphoreValues unconditionally, which could lead to
dereferencing an invalid pointer if waitSemaphoreValueCount is less than
waitSemaphoreCount. The vkQueueSparseBind() path didn't even assert
correctly. Bring vkQueueSparseBind() in line with vkQueueSubmit()
and make both only dereference the wait/signal array once we've
determined it must be present. While we're here, also fix the assert in
vkQueueSubmit() to disallow a waitSemaphoreValueCount of 0 if there are
timeline semaphores present, which conversely is not allowed.
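
The intended access pattern can be sketched roughly as follows. The struct
and function are hypothetical simplifications (a real semaphore is not a
single bool), and the sketch covers only the wait side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified semaphore: only the type matters here. */
struct example_semaphore {
   bool is_timeline;
};

/* Return the wait value for semaphore i. The values array is only
 * validated and read once we know semaphore i is a timeline semaphore,
 * so a NULL or short array is fine when every semaphore is binary.
 */
static uint64_t
example_wait_value(const struct example_semaphore *sems, uint32_t sem_count,
                   const uint64_t *values, uint32_t value_count, uint32_t i)
{
   assert(i < sem_count);

   if (!sems[i].is_timeline)
      return 0; /* binary semaphore: the array is ignored entirely */

   /* A timeline semaphore is present, so the array must be too. */
   assert(values != NULL && value_count == sem_count);
   return values[i];
}
```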
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36989>
We read back some of the parameters from the IR data structure that
were written previously. On some platforms L3 is not coherent, so
explicitly add those flushes.
Cc: mesa-stable
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36952>
The blit code is destructive on the framebuffer state, which means
that it can set new render targets. If you have 2 BGRA surfaces bound
for logic ops, then after setting up the surface for the first one,
the blit for the second will end up destroying + re-creating the
surface for the first one.
Let's be robust to this by putting the blit in a first pass, and
then actually initializing all of the descriptors in a second pass.
This is still woefully inefficient but at least it's correct.
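
A toy model of why the ordering matters, under the assumption that every
blit invalidates descriptors built before it. All names here are
hypothetical, and the generation counter merely stands in for surfaces
being destroyed and re-created.

```c
#include <assert.h>

#define EXAMPLE_MAX_TARGETS 8

struct example_state {
   unsigned generation;                           /* bumped by every blit */
   unsigned desc_generation[EXAMPLE_MAX_TARGETS]; /* generation each descriptor was built at */
};

static void
example_blit(struct example_state *s)
{
   s->generation++; /* anything built earlier is now stale */
}

static void
example_init_descriptor(struct example_state *s, unsigned i)
{
   s->desc_generation[i] = s->generation;
}

/* Old shape: blit and descriptor setup interleaved per target. */
static void
example_setup_interleaved(struct example_state *s, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      example_blit(s);
      example_init_descriptor(s, i);
   }
}

/* New shape: all blits first, then all descriptors. */
static void
example_setup_two_pass(struct example_state *s, unsigned count)
{
   for (unsigned i = 0; i < count; i++)
      example_blit(s);
   for (unsigned i = 0; i < count; i++)
      example_init_descriptor(s, i);
}

/* Count descriptors that were built before the most recent blit. */
static unsigned
example_count_stale(const struct example_state *s, unsigned count)
{
   unsigned stale = 0;
   for (unsigned i = 0; i < count; i++)
      stale += s->desc_generation[i] != s->generation;
   return stale;
}
```

With two targets, the interleaved version leaves the first descriptor stale
while the two-pass version leaves none.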
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36945>
This avoids having to hardcode the proxy in the traces' `download-url`,
and keeps jobs that set `PIGLIT_REPLAY_EXTRA_ARGS` from accidentally
overriding the default args when the author meant to append.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36955>
FDO_HTTP_CACHE_URI is not defined, but LAVA_HTTP_CACHE_URI is and is the
right URL for this.
This job is currently disabled, but fix it in preparation for when
someone eventually brings it back.
Note that this line also has another bug that will be addressed by the
next commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36955>
Without this, I get failures in the following CTS test:
dEQP-VK.api.device_init.create_instance_device_intentional_alloc_fail.basic
Fixes: 05006c21dd ("panvk/utrace: Alloc utrace copy buf from userspace heap")
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36980>
There is a FW issue when using constrained intra prediction with rate
control enabled, causing unexpected quality degradation.
Disable it until a FW fix is available.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36837>
RADV_DYNAMIC_RASTERIZATION_SAMPLES triggers the following states:
- FS (not needed when FS is NULL)
- MSAA (already triggered)
- BINNING (use radv_get_ps_iter_samples() on GFX9)
- OCCLUSION_QUERY (doesn't use the PS)
- DB_SHADER_CONTROL (already triggered)
- RAST_SAMPLES (use radv_get_ps_iter_samples())
- NGGC (doesn't use the PS)
So this can be simplified to BINNING (gfx9) | RAST_SAMPLES.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36912>
This reduces the number of states that are re-emitted, but the logic
is mostly duplicated because sample shading can be set from the
fragment shader or the graphics pipeline. It could be refactored
eventually.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36912>
This is already handled slightly above in the same function. Also
state->dirty isn't for RADV_DYNAMIC_xxx and there is no corresponding
RADV_CMD_DIRTY_xxx either.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36912>