swapchain images are never going to be used as texture views, and zs formats
aren't compatible with any other formats
this enables implicit modifiers in some cases, and more work can be done in the future
to eliminate mutable format usage to further improve performance
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10180>
rendering onto a linear-tiled image is unbelievably slow if any sort of
blending is enabled, so instead always render to optimal tiling and then
copy to linear for scanout
this doubles performance for now and can be deleted in its entirety along
with the rest of the related hacks once real wsi support is implemented
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10180>
this flag cannot be used to infer that a transfer_map call will be matched
by a transfer_unmap call when tc reordering is active
fixes#4600
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10114>
it turns out there's not actually a requirement that resources be unmapped,
which means that a ton of overhead can be saved both in the unmap codepath
(the cpu overhead here is pretty insane) and then also when mapping cached
resource memory, as the map can now be added to the cache for immediate reuse
as seen in radeonsi
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9980>
this is not only more correct according to vk spec, it avoids having a 0-sized
layer_stride, which totally breaks the transfer map
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9969>
this includes:
* async buffer mapping/replacement
* async queue submission
* async (threaded) gallium flush handling
the main churn here is from handling async gallium flushes, which involves
creating multiple gallium fences (zink_tc_fence) for each zink fence (zink_fence).
a tc fence may begin waiting for completion at any time, even before the zink_fence
has had its cmdbuf(s) submitted, so handling this type of desync ends up needing
almost a complete rewrite of the existing queue architecture
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9935>
this massively speeds up memory access across the driver, specifically
for pbo-related operations
it does require that mapped memory is manually invalidated/flushed, however, and
we also need to manually align host-visible memory to be able to do that
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9884>
the STORAGE_TEXEL and STORAGE_IMAGE bits can't be accurately applied due
to opengl allowing all resources to be used everywhere, so instead we can
create a separate object on demand which is used only by shaders and gets
extra barriers inferred along with the base object to avoid desync whenever
it is used
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9788>
now that batches aren't limited and flushing is less costly, there's no
reason to keep these separate
the primary changes here are removing the zink_queue enum and collapsing
related arrays which used it as an index, e.g., zink_batch_usage into single
members
remaining future work here will include removing synchronization flushes which
are no longer necessary and (eventually) removing batch params from a number of
functions since there is now only ever a single batch
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9765>
vk spec disallows mapping memory regions more than once, but we always
map the full memory range, so we can just refcount and store the pointer
onto the resource object for reuse to avoid issues here
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9725>
historically zink has been bound to 4 gfx batches and then a separate compute batch
was added. this is not ideal for a number of reasons, the primary one being that if
an application performs 5 glFlush commands, the fifth one will force a gpu stall
this patch aims to do the following, all of which are necessarily done in the same patch
because they can't be added incrementally and still have the same function:
* rewrite batch tracking for resources/views/queries/descriptors/...
|originally this was done with a single uint32_t as a bitmask, but that becomes cumbersome
to track as batch counts increase, not to mention it becomes doubly-annoying
when factoring in separate compute batches with their own ids. zink_batch_usage gives
us separate tracking for gfx and compute batches along with a standardized api for
managing usage
* flatten batch objects to a gfx batch and a compute batch
|these are separate queues, so we can use an enum to choose between an array[2] of
all batch-related objects
* switch to monotonic batch ids with batch "states"
|with the flattened queues, we can just use monotonic uints to represent batch ids,
thus freeing us from constantly using bitfield operations here and also enabling
batch counts to scale dynamically by allocating/caching "states" that represent a batch
for a given queue
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9547>
I'll be rewriting how resource tracking works, so abstracting it and removing
direct uses is going to reduce the chances of breaking things as well as code churn
plus it's a bit easier to use
downside is that until that rewrite happens, this will be a (very small) perf hit and
there's some kinda gross macros involved to consolidate code
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9626>
this really just means to nuke the counter buffer, and this can be done
by using a special bind_history bit that can be unset when the buffer has
been rebound
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9546>
this creates new backing objects for the invalidated resource and, if
needed, rebinds it to any descriptor sets it might be used in by invalidating
the descriptor state and creating new view objects
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9546>
the resource object is now a state tracker for the inner resource_object,
so it can safely be destroyed even if the backing resource is still in use
this also requires updating various things to keep descriptor states/caches working
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9544>
this was only needed to cover up some other bugs:
* missing barriers for buffer sampler/image descriptors
* weirdness with first frame handling
there's better ways of handling both cases, and now they're handled better
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9566>
this stores a number (currently 5) of backing allocations for resources
for later reuse when creating matching resources
because this is on the screen object it requires locking, but this is still
far faster than allocating new memory each time
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9540>
we can pass the offset of the 'invalid' flag directly to the resources
to let them run through and invalidate their sets in time for us to detect
it when we recycle the set during batch reset and throw it onto our allocation array
additionally, by adding refs for the actual objects used in a descriptor set, we can
ensure that our cache is as accurate as possible
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9348>
this is a lot of churn that more or less amounts to hashing the descriptor
state during draw and then performing lookups with this to determine whether
we can reuse an existing descriptor set instead of allocating one
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9348>
Use debug_printf more consistently, normalize formatting a bit, and
trace a few more places you're likely to care about.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9436>