nvidia claims format feature support for various flags but then doesn't
actually support them for a 3D image, only 2D, meaning that an extra step
is necessary before creating a linear image
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10823>
when emitting barriers, we don't need to end the renderpass in some cases
this handles the case of emitting barriers for new resources which have
never before been accessed and thus do not require synchronization at a
specific point in the batch
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10700>
this breaks the driver!
the uploader always maps its own pointer, so modifying that at any
point just explodes things later
Fixes: d179c5d28e ("zink: implement threaded context")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10787>
sparse buffers are not cpu-readable, so any mapping requires that they
use a staging buffer, either the stream uploader for writes or a manual
copy for readback
future work here should attempt to resolve two perf issues:
* sparse allocations should be allocated dynamically using a suballocator
or some other, more useful strategy
* readback shouldn't allocate a huge staging buffer
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10509>
I think at one point before staging resource flagging was less reliable
this method made sense, but now it's worse
Fixes: 6ff6d01c37 ("zink: don't use cached mem for staging resources")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10363>
this is just for unit tests where the scanout object is redundant and
the only time a flush occurs is from stalling on readback
Fixes: 104603fa76 ("zink: create separate linear tiling image for scanout")
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10239>
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
This fixes basic rendering on top of V3DV, which doesn't seem to expose
the cached memory we expect and love.
Fixes: 598dc3dca4 ("zink: use cached memory for all resources when possible")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10230>
swapchain images are never going to be used as texture views, and zs formats
aren't compatible with any other formats
this enables implicit modifiers in some cases, and more work can be done in the future
to eliminate mutable format usage to further improve performance
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10180>
rendering onto a linear-tiled image is unbelievably slow if any sort of
blending is enabled, so instead always render to optimal tiling and then
copy to linear for scanout
this doubles performance for now and can be deleted in its entirety along
with the rest of the related hacks once real wsi support is implemented
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10180>
this flag cannot be used to infer that a transfer_map call will be matched
by a transfer_unmap call when tc reordering is active
fixes#4600
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10114>
it turns out there's not actually a requirement that resources be unmapped,
which means that a ton of overhead can be saved both in the unmap codepath
(the cpu overhead here is pretty insane) and then also when mapping cached
resource memory, as the map can now be added to the cache for immediate reuse
as seen in radeonsi
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9980>
this is not only more correct according to vk spec, it avoids having a 0-sized
layer_stride, which totally breaks the transfer map
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9969>
this includes:
* async buffer mapping/replacement
* async queue submission
* async (threaded) gallium flush handling
the main churn here is from handling async gallium flushes, which involves
creating multiple gallium fences (zink_tc_fence) for each zink fence (zink_fence).
a tc fence may begin waiting for completion at any time, even before the zink_fence
has had its cmdbuf(s) submitted, so handling this type of desync ends up needing
almost a complete rewrite of the existing queue architecture
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9935>
this massively speeds up memory access across the driver, specifically
for pbo-related operations
it does require that mapped memory is manually invalidated/flushed, however, and
we also need to manually align host-visible memory to be able to do that
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9884>
the STORAGE_TEXEL and STORAGE_IMAGE bits can't be accurately applied due
to opengl allowing all resources to be used everywhere, so instead we can
create a separate object on demand which is used only by shaders and gets
extra barriers inferred along with the base object to avoid desync whenever
it is used
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9788>
now that batches aren't limited and flushing is less costly, there's no
reason to keep these separate
the primary changes here are removing the zink_queue enum and collapsing
related arrays which used it as an index, e.g., zink_batch_usage into single
members
remaining future work here will include removing synchronization flushes which
are no longer necessary and (eventually) removing batch params from a number of
functions since there is now only ever a single batch
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9765>
vk spec disallows mapping memory regions more than once, but we always
map the full memory range, so we can just refcount and store the pointer
onto the resource object for reuse to avoid issues here
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9725>