When a timestamped present is not used (MAILBOX or the very first present),
it's possible that the very last queued present ID won't complete in finite time.
Similar to frame callback based workaround, apply a timeout to present
waits when they target the very last submitted presentID.
Only apply the workaround when we're not guaranteed forward progress.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32556>
When transitioning from FIFO to MAILBOX with swapchain_maintenance1,
we must make sure that the first MAILBOX after FIFO observes the wait
barrier. This was done implicitly in the timestamp path, but not for
the non-commit-timing path.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Cc: mesa-stable
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32556>
When commit-timing was not supported, but FIFO was we would end
up in a situation with throttling on FIFO barrier and legacy fence.
At that point, the entire point of FIFO falls flat.
There are some caveats with this approach, but it's not expected
that compositors will only support FIFO, and not commit-timing long
term.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Reviewed-by: Autumn Ashton <misyl@froggi.es>
Reviewed-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32556>
to integrate debug printf/abort, vulkan drivers need to implement a device
status. we would need to thicken the runtime to do that entirely in common code,
but we can at least add a helper to make it easier for vk drivers to wire.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32564>
For Anv this will make more sense than strings.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32483>
Not every implementation supports VK_EXT_debug_marker.
VK_EXT_debug_utils is also pretty similar, it would be nice to plug
into whatever is available.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32483>
If we pass both, and only one of them is cleared, the other aspect might
be disturbed if the format contains both aspects.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32379>
"constant" is a special keyword in OpenCL C, and we'd like to #define it
suitably in host C23 to facilitate compatiblity between host/device headers.
That means we can't have any identifiers named "global" or "constant".
Fortunately, this is the only 'constant' in any file I'm hitting.
To avoid the clash, don't abbreviate "constant factor", use "constant_factor"
instead. For consistency, "slope factor" then becomes "slope_factor".
The new names are longer but match the Vulkan API exactly.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> [Intel]
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> [NVK and panvk]
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [V3DV]
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com> [IMG]
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32505>
We create NIR shaders here, and we need to free them when we're done with
them as well.
These shaders are created using nir_builder_init_simple_shader(), which
allocates using a NULL ralloc-parent, so ralloc_free should be the right
function to free them with.
Fixes: 514c10344e ("vulkan/meta: Add a concept of rect pipelines")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32486>
Since the leaf node pass doesn't run for updates, leaf_node_count never
got set. This resulted in updates always running on 0 leaves (i.e. being
no-ops).
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32451>
The Vulkan headers add typedefs to fix aliasing issues whenever a type
gets renamed. However, C doesn't allow "enum typedef" so this doesn't
work if people stick the "enum" keyword in front.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32433>
The way the XML is being organized these days, they're doing one
<requires> section for each promoted thing and if something gets
promoted twice by different extensions. As long as we grab the lowest
of those core versions, we should be fine.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32433>
This is mostly adapted from radv's BVH building. This defines a common
"IR" for BVH trees, two algorithms for constructing it, and a callback
that the driver implements for encoding. The framework takes care of
parallelizing the different passes, so the driver just has to split the
encoding process into "stages" and implement just one part for each
stage.
The runtime changes are:
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
The radv changes are;
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
RADV has a pipeline cache for meta shaders that can be used. It is also
required to correctly identify the pipelines as meta pipelines.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
All of these are functions that could reasonably be incorporated into a
Vulkan extension, but are currently missing. While we could in theory do
BVH building without them, using them simplifies the code significantly
and both radv and turnip can reasonably implement them.
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31433>
Y8_U8V8_422_UNORM is more commonly known as NV16. There has been
a fourcc for NV16 for a while now, so let's rename it to be in
line with NV12 and similar formats.
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31854>
When we create a new swapchain to replace the one currently presenting on
a surface, we need to reset all these timing variables. Otherwise we can
lose track of corrections that were made for the old swapchain when we
delete undelivered presentation feedback results.
Also, we use these variables when queuing a presentation, but we also use
them in the dispatch code that can be called by WaitForPresent from another
thread. We need to protect these variables against concurrent usage.
This is all much easier to do when they're stored as part of the swapchain
instead of the surface, so just move them there and adjust the locking.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
When we start using timestamps, the current code will generate an event
stream like:
feedback
set barrier
wait barrier
commit
feedback
set timestamp
set barrier
commit
wait barrier
commit
The second content update can cause the feedback request from the first to
send a discarded event if the timestamp is in the past.
Be less clever and just put waits in both our content updates.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
When occluded, the current math always rounds down to 0 cycles and leads
to improperly throttled frame delivery.
Improve the comment, and use a formula that leads to generating future
times even when occluded.
Also remove some dead code, as we can never get here with a period of 0.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Fixes: c26ab1aee1 ("vulkan/wsi/wayland: Pace frames with commit-timing-v1")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32121>
From the perspective of the gpu, host read or host write has the same
implication (gpu cache flush) in the dst access flags. We should
include host write in the dst access flags.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32102>
Adds an optional region selection, based off percentages of the
starting/ending of an image's X & Y values.
This is intended as a performance enhancement tradeoff for smaller
images to be created.
With a smaller image size, the screenshotting layer will change the
region boundaries on the GPU side, which will decrease the amount of
time it takes to copy the image over to CPU-accessible memory.
Using vkcube as an example, the original image size is 500x500:
mesa-screenshot: DEBUG: Screenshot Authorized!
mesa-screenshot: DEBUG: Needs 2 steps
mesa-screenshot: DEBUG: Time to copy: 123530 nanoseconds
Then, by cropping the area to a 100x100 image, we get the following:
mesa-screenshot: DEBUG: Screenshot Authorized!
mesa-screenshot: DEBUG: Using region: startX = 40% (200), startY = 40% (200), endX = 60% (300), endY = 60% (300)
mesa-screenshot: DEBUG: Needs 2 steps
mesa-screenshot: DEBUG: Time to copy: 12679 nanoseconds
For this example, this is a ~90% time reduction improvement!
Overall, this option reduces the copy time to a point where it can
become negligible, relative to the frame time of the application.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood felix.j.degrood@intel.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32016>
Avoids sanitizer errors like:
```
../src/intel/vulkan/anv_pipeline_cache.c:406:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:406:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:417:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:435:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:435:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:439:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:439:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:443:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:443:4: runtime error: null pointer passed as argument 2, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:447:4: runtime error: null pointer passed as argument 1, which is declared to never be null
../src/intel/vulkan/anv_pipeline_cache.c:447:4: runtime error: null pointer passed as argument 2, which is declared to never be null
```
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32098>
Map VkSwapchainCreateInfoKHR.compositeAlpha to corresponding
DXGI_SWAP_CHAIN_DESC1.alphaMode.
Add VK_COMPOSITE_ALPHA_POST_MULTIPLIED_BIT_KHR to capabilities as
it was missing there.
Signed-off-by: Benjamin Otte <otte@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32048>
This is required, otherwise we regress latency in cases where
applications are using FIFO without explicit KHR_present_wait.
This is an unacceptable regression.
The fix is to normalize the behavior to X11 WSI.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: d052b0201e ("vulkan/wsi/wayland: Use fifo protocol for FIFO")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32029>
Instead of using frame callbacks - which may stop firing if our surface is
occluded - use the new commit-timing-v1 protocol in combination with the
presentation feedback protocol.
If the required protocols are unavailable, or the environment variable
MESA_VK_WSI_DEBUG contains "nowlts", we fall back to frame callback
based pacing behaviour.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
The fifo protocol allows us to ensure that a compositor presents
an image that we submit to it. Use this to reliably implement FIFO
semantics.
Note: On systems where the fifo protocol is available an occluded
surface may find itself unthrottled when previously it would have
been frozen.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
Reduced the looping copy structure of a 2D array down to a single malloc
& memcpy operation to copy the entirety of the image buffer to a local
1D array copy.
With this setup, we can write the image row by row using the associated
libpng API call.
Local testing with vkcube showed ths as a large perf gain, reducing the
time it took to copy images by more than half:
Previous method:
mesa-screenshot: DEBUG: Time to copy: 251907 nanoseconds
Current method:
mesa-screenshot: DEBUG: Time to copy: 112904 nanoseconds
Also reduced swapchain image list malloc operations to one use. This
doesn't have much perf impact, but it's good to reduce the number of malloc
operations overall.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
This frees a fairly large amount of memory from the 2D matrix by
iterating over the rows to free them individually.
Liuqiang spotted some areas that we return early in the threaded
function and don't free some pointers.
To remedy this, we'll reorder the checks so that we don't have to
return early and can instead use an if/else flow to take care of
these problematic areas in a more elegant way.
Co-authored-by: Casey Bowman <casey.g.bowman@intel.com>
Co-authored-by: liuqiang <liuqiang@kylinos.cn>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
This allows larger buffer sizes when using the env config as well
as filepath for the output directory.
This will allow, for example, using a large number of singular frames:
frames=1/2/3/4/5/6/7/8/.../300
Also fixed an issue with filepaths sometimes being appended with garbage
characters due to not being initialized.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
Previously, only the first image in the swapchain was chosen at all times
to be copied to a file.
This meant that if a list of consecutive images were selected, multiple
duplicate images would be saved, instead of the proper frames actually
used in the workload.
Now, the index is properly obtained from AcquireNextImageKHR(), leading
to the same image being used for the workload to be copied and saved to
a file.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
Perfetto is allowed to choose it's own default clock, and before this we just assumed the presentation times reported by the compositor are the same as perfetto's internal clock, which is not always the case. I got a nasty trace where all the wayland presents were in the wrong location. This fixes that by asking the compositor which clock it uses, then passing that along to perfetto.
A workaround for my compositor was setting use_monotonic_clock=true in the perfetto config, as my compositor (and I suspect most others) use the monotonic clock for presentation timestamps. However, asking the compositor is definitely the most correct solution.
I added a clock param to `MESA_TRACE_TIMESTAMP_{BEGIN,END}`, as it's only use that I could see was in wsi_common_wayland, and in general it seems good to be careful about which clock tracing timestamps come from.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31779>