Flush deferred CPU sync ops so we can make CPU changes visible to the GPU.
This is currently a NOP because we haven't enabled cached mappings in
panvk yet, but we need to prepare for that before we progressively
switch each relevant buffer to use writeback CPU mappings.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
This makes it easier to say we want WB maps in various places.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
Now that we have it hooked up at the props level, we can filter
this flag out in panvk_device_adjust_bo_flags() and use this helper
when creating our uncached mempool.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
The buffer descriptor is copied to the descriptor set, and there's no
side-band data to allocate in GPU memory.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
pan_kmod_flush_bo_map_syncs() queues CPU-sync operations, and
pan_kmod_flush_bo_map_syncs_locked() ensures all queued
operations are flushed/executed. Those will be used when we start
adding support for CPU-cached mappings.
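
A minimal sketch of the queue-then-flush pattern described above, assuming a
fixed-size batch of pending ranges. All names here (sketch_sync_queue,
sketch_queue_sync, sketch_flush_syncs) are hypothetical and are not the
pan_kmod API. Locking and the actual cache maintenance are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_SYNCS 16

/* One deferred CPU-sync request covering [start, start + size). */
struct sketch_sync_op {
   uintptr_t start;
   size_t size;
};

/* Hypothetical per-device queue of pending sync operations. */
struct sketch_sync_queue {
   struct sketch_sync_op ops[SKETCH_MAX_SYNCS];
   unsigned count;
   unsigned flushed_total; /* bookkeeping for the example only */
};

/* Execute every queued operation, then empty the queue. A real backend
 * would issue cache maintenance for each queued range here. */
static void
sketch_flush_syncs(struct sketch_sync_queue *q)
{
   assert(q->count <= SKETCH_MAX_SYNCS);
   q->flushed_total += q->count;
   q->count = 0;
}

/* Defer a sync operation; flush first if the batch is already full. */
static void
sketch_queue_sync(struct sketch_sync_queue *q, uintptr_t start, size_t size)
{
   if (q->count == SKETCH_MAX_SYNCS)
      sketch_flush_syncs(q);

   q->ops[q->count++] = (struct sketch_sync_op){ start, size };
}
```

Batching like this keeps the common path cheap: callers only pay for the
flush when they actually need the CPU writes to be visible.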
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
Fail early in pan_kmod_bo_mmap() if PAN_KMOD_BO_FLAG_NO_MMAP is set.
This saves us a user -> kernel round-trip but, most importantly, it
allows us to enforce NO_MMAP at the userspace level on BOs that the
kernel would otherwise let us mmap() (mapping imported BOs requires
extra DMA_BUF_IOCTL_SYNC calls that we don't currently issue).
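
A rough sketch of the early-out, assuming a simplified BO with a flags word.
The struct, the flag value and the function name are stand-ins for
illustration, not the real pan_kmod definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real pan_kmod types carry more state. */
#define SKETCH_BO_FLAG_NO_MMAP (1u << 0)

struct sketch_bo {
   uint32_t flags;
   size_t size;
};

/* Returns 0 if the mapping request may proceed to the kernel, or -EINVAL
 * without issuing any syscall when the BO is flagged NO_MMAP or the
 * requested range does not fit inside the BO. */
static int
sketch_bo_mmap_check(const struct sketch_bo *bo, size_t offset, size_t range)
{
   assert(bo != NULL);

   /* Userspace-level enforcement: reject before reaching the kernel. */
   if (bo->flags & SKETCH_BO_FLAG_NO_MMAP)
      return -EINVAL;

   if (offset > bo->size || range > bo->size - offset)
      return -EINVAL;

   return 0;
}
```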
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
Will be used to skip cache maintenance operations when the GPU is IO
coherent.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
Will be needed to let the frontend know if it can use cached CPU-mappings,
and it allows us to extend the set of supported flags without introducing
a new field if we ever have to.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
We have a few hand-rolled instances of this, which work well enough, but it
gets more complicated as soon as we care about checking a major version
greater than 1. Add a helper to make this more robust.
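
For illustration, a version check of this kind could look like the sketch
below. The struct and function names are hypothetical, not the helper added
by this change. The point is that a naive "major >= M && minor >= m" test
wrongly rejects 2.0 when asked for 1.3, so the major number has to be
compared first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical version pair; not the real pan_kmod layout. */
struct sketch_version {
   uint32_t major;
   uint32_t minor;
};

/* True if *v is at least req_major.req_minor. The minor number only
 * matters when both major numbers are equal. */
static bool
sketch_version_at_least(const struct sketch_version *v,
                        uint32_t req_major, uint32_t req_minor)
{
   assert(v != NULL);

   if (v->major != req_major)
      return v->major > req_major;

   return v->minor >= req_minor;
}
```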
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
The frontend is going to query the device props anyway, so let's just
query it at device creation time and store it in pan_kmod_dev::props.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36385>
It's not good for performance, but it's possible to use for debugging.
Running single-wave GS workgroups could work around any LDS race conditions.
Setting the workgroup size to 64 reliably works around
GLCTS *primitive_counter*line failures, indicating streamout data
corruption with multi-wave GS workgroups.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38328>
This avoids u_upload_data_ref() when cb0 is bound. The u_upload_*_ref()
paths are still problematic to mix with uploaders that the front-end
uses with explicitly managed releasebufs, but this at least side-steps
the issue, and is a legit fix on its own.
Cc: mesa-stable
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
pipe_upload_constant_buffer0() was immediately releasing the
u_upload_alloc() releasebuf. But it is used in various call paths
where the release needs to be deferred further.
Fixes crashes in firefox for any driver that uses the same
u_upload_mgr instance for pipe->const_uploader and
pipe->stream_uploader.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14309
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
Triggering the rollover where the old upload buffer is released is a
good way to catch bugs with a releasebuf being dropped too soon (i.e.
while the frontend still needs a reference).
This makes it easy to reproduce firefox crashes in any driver where
pipe->const_uploader == pipe->stream_uploader.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38896>
URB messages on Xe2 are LSC messages with FLAT addressing. We can
specify a S19 immediate offset in the extended message descriptor,
which should be more than adequate to hold any offsets we need.
We wrote the original URB code before implementing that, and never
doubled back to take advantage of it. But doing so can drop ADDs
near every URB access.
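
As a sketch of the idea, folding a constant into the immediate only works
while the sum still fits in a signed 19-bit field, i.e. in
[-262144, 262143]. The helper names below are made up for this example and
any alignment or encoding constraints of the real descriptor are not
modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Range of a signed 19-bit immediate. */
#define SKETCH_S19_MIN (-(INT32_C(1) << 18))
#define SKETCH_S19_MAX ((INT32_C(1) << 18) - 1)

static bool
sketch_fits_in_s19(int64_t offset)
{
   return offset >= SKETCH_S19_MIN && offset <= SKETCH_S19_MAX;
}

/* Try to absorb a constant ADD into the immediate offset. Returns true
 * and updates *imm_offset when the sum still fits, so the ADD can be
 * dropped. Returns false and leaves *imm_offset untouched otherwise. */
static bool
sketch_fold_const_offset(int32_t *imm_offset, int32_t const_add)
{
   assert(sketch_fits_in_s19(*imm_offset));

   /* Widen before adding so the range check itself cannot overflow. */
   int64_t sum = (int64_t)*imm_offset + const_add;
   if (!sketch_fits_in_s19(sum))
      return false;

   *imm_offset = (int32_t)sum;
   return true;
}
```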
fossil-db results on Battlemage:
Totals:
Instrs: 232239759 -> 231432254 (-0.35%)
Cycle count: 34044435848.0 -> 34055507100.0 (+0.03%); split: -0.00%, +0.04%
Spill count: 520370 -> 520362 (-0.00%); split: -0.00%, +0.00%
Fill count: 470790 -> 470803 (+0.00%); split: -0.00%, +0.00%
Max live registers: 72111853 -> 72111369 (-0.00%); split: -0.00%, +0.00%
Totals from 227920 (28.89% of 788851) affected shaders:
Instrs: 59841897 -> 59034392 (-1.35%)
Cycle count: 683385208.0 -> 694456460.0 (+1.62%); split: -0.14%, +1.76%
Spill count: 17278 -> 17270 (-0.05%); split: -0.10%, +0.06%
Fill count: 17481 -> 17494 (+0.07%); split: -0.03%, +0.10%
Max live registers: 23052652 -> 23052168 (-0.00%); split: -0.00%, +0.00%
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
I recently converted urb->offset to be in bytes on Xe2, but neglected to
update these comments that still said OWord.
Fixes: 9ffae42975 ("brw: Store brw_urb_inst::offset in bytes on Xe2")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38899>
Without this commit, panfrost_batch_update_access receives a parameter
called "batch", and then it uses the same name while iterating for all
batches on the current context. This can be confusing and error-prone,
so let's rename the latter.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38908>
It seems placing the shader at the end has a negative performance
impact.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8ba197c9ef ("anv: Switch shaders to dedicated VMA allocator")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38900>
Before DG2, the value the HW gives us seems to be backwards, but
since DG2 this is supposed to be supported just fine.
However, due to Wa_22012766191, enable it only for Xe2 and up.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
But before ACM, we need to mis-report it to keep the CTS sane, as the
implementation of coarse pixel seems to have all sorts of wrongs in
older HW.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
GLSL defines gl_SampleMaskIn as:
"a fragment language that indicates the set of samples covered
by the primitive generating the fragment during multisample
rasterization"
When variable rate shading is enabled, a single invocation might cover
multiple samples. The lowering done in nir_lower_single_sampled() does
not account for that case, so add an option to selectively disable it.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38641>
This shows off how we don't need to pass an explicit size per CRB instance
in our non-growable CSes.
However, I don't like the additional indentation I added to make a CRB go
out of scope when I needed it to.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38762>
Loosely based on freedreno's, but simplified since a lot of overflow
handling was already there in tu_cs. It successfully catches issues of:
- Overflowing the CRB reservation.
- Starting a new CRB with one in progress.
- Emitting a pkt4 while a CRB emit is in progress.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38762>
Current stack size is stored in layout.sw_stack_size, but the function
that's supposed to update it is comparing layout.total_size instead.
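
A simplified illustration of the fix, assuming the update is a grow-only
maximum. The struct below only borrows the two field names from the
description above; everything else is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with only the two fields discussed above. */
struct sketch_layout {
   uint32_t total_size;
   uint32_t sw_stack_size;
};

/* Grow the recorded stack size if more is needed. The comparison must be
 * against sw_stack_size, the field being updated. Comparing against
 * total_size would skip updates whenever needed <= total_size. */
static bool
sketch_update_stack_size(struct sketch_layout *layout, uint32_t needed)
{
   assert(layout != NULL);

   if (needed <= layout->sw_stack_size)
      return false;

   layout->sw_stack_size = needed;
   return true;
}
```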
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38898>
Replace the duplicated swapchain image detection pattern across all
Vulkan drivers with the new wsi_common_is_swapchain_image() helper.
Since the swapchain handle can be extracted from VkImageCreateInfo's
pNext chain inside wsi_common_create_swapchain_image(), remove the
now-redundant VkSwapchainKHR parameter from that function.
This removes the #ifdef guards for Android/WSI platforms from each
driver, as the helper now handles this uniformly.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38541>
Add a helper function to check if a VkImageCreateInfo represents a
swapchain image by looking for VkImageSwapchainCreateInfoKHR in the
pNext chain.
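
The pNext walk can be sketched as follows. To keep the example
self-contained it uses stand-in types instead of the Vulkan headers, and
the non-null swapchain check is an assumption about what such a helper
would test. The Android case is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for VkStructureType values; hypothetical numbering. */
enum sketch_stype {
   SKETCH_STYPE_IMAGE_CREATE_INFO = 1,
   SKETCH_STYPE_IMAGE_SWAPCHAIN_CREATE_INFO = 2,
   SKETCH_STYPE_OTHER = 3,
};

/* Common header shared by every chained struct. */
struct sketch_base_in {
   enum sketch_stype sType;
   const struct sketch_base_in *pNext;
};

/* Stand-in for VkImageSwapchainCreateInfoKHR. */
struct sketch_image_swapchain_info {
   struct sketch_base_in base;
   void *swapchain;
};

/* Walk the chain hanging off the create info and report whether it
 * carries a swapchain info struct with a non-null swapchain handle. */
static bool
sketch_is_swapchain_image(const struct sketch_base_in *create_info)
{
   assert(create_info != NULL);

   for (const struct sketch_base_in *s = create_info->pNext; s; s = s->pNext) {
      if (s->sType == SKETCH_STYPE_IMAGE_SWAPCHAIN_CREATE_INFO) {
         /* base is the first member, so this cast is well-defined. */
         const struct sketch_image_swapchain_info *info =
            (const struct sketch_image_swapchain_info *)s;
         return info->swapchain != NULL;
      }
   }

   return false;
}
```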
This consolidates the swapchain detection logic that is currently
duplicated across all Vulkan drivers, and handles the Android case
in one place.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Suggested-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38541>
This is causing an OOM, and weston gets killed, which causes all
the remaining jobs to fail after that point.
Until this is sorted out, disable THP for this specific job.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38912>