Layer strides are based on the full miptree, and even for single-layer
images macOS always allocates a full one (possibly relevant for
compression). Make sure we do the same, regardless of how many mip
levels the user asked for.
Fixes Darwinia.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20365>
79ca456b48 reintroduced the use
of dynamic_cast<> in r600/sfn tests. This breaks compilation with
-fno-rtti, as required to build against the LLVM configuration
recommended upstream. Use static_cast<> instead to fix this.
Fixes: #7820
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20239>
Without globalFencing, exportable fences and semaphores must instead
have their proxy vn_renderer_sync installed in the same renderer
ring_idx as ther last queue submissions to ensure they signal after
all work previously submitted to the same ring_idx. Exportable
fences/semaphores with a temporary (imported) payload don't need a proxy
vn_renderer_sync, since they already have a `poll()`able fd available.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19691>
With implicit fencing, the image has a fence that blocks scanout until
rendering is complete. virtgpu doesn't support implicit fencing yet, but
Sommelier (a VM Wayland compositor) does the wait by exposing the bo as
a GEM handle and waiting on all fences in userspace with a
DRM_IOCTL_VIRTGPU_WAIT before issuing the wl_surface commit.
During vkQueueSubmit involving wsi images, we follow with an empty
renderer submission on the corresonding ring_idx to install a fence
on the appropriate virtgpu fence context after the last rendering
submission.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19691>
For submissions to renderers that support multiple timelines, put
them on the virtgpu fencing timeline (dma fence context) specified by
the VkQueue's bound ring_idx. CPU-sync'd renderer submissions
can be sent in the same manner by using ring_idx = 0.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19691>
And remove SIMULATE_CONTEXT_INIT and PARAM_MAX_SYNC_QUEUE_COUNT now that
we expect guest kernel support for CONTEXT_INIT with standard support
for up to 64 rings.
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19691>
Pick DEBUG_GET_ONCE_BOOL_OPTION as a example:
The intention of DEBUG_GET_ONCE_BOOL_OPTION are returned the same value across
thread, before this commit, on different thread call the function generated by
DEBUG_GET_ONCE_BOOL_OPTION may return different value if called setenv in the
middle of debug_get_bool_option, so use debug_get_option_cached along with
new exposed function debug_parse_bool_option to solve this issue
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19554>
struct hash_table is not thread-safety, need guard by mutex,
but with thread local storage, we can simplify the code and also
got the thread safety without the need of mutex.
Another advantage is by using thread local storage, os_get_android_option
will have the same actions like getenv does, that it's not cached the
value, each call will access the property_get, like getenv will be affected
by putenv
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19554>
Getting debug_get_num_option to return int64_t, as long under 64 bit Linux are 64 bit size,
so using fixed int64_t for cross platform consistence, as long under win32 is 32 bit size.
Getting DEBUG_GET_ONCE_FLAGS_OPTION to return uint64_t to getting it to be
consistence with debug_get_flags_option.
DEBUG_GET_ONCE_NUM_OPTION is not accessed in codebase, so add unittest for it, it maybe
used in future, remove it is not consistence
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19554>
error messages:
src/compiler/glsl/glcpp/glcpp-parse.c:1691:9: error: variable 'glcpp_parser_nerrs' set but not used [-Werror,-Wunused-but-set-variable]
int yynerrs = 0;
^
src/compiler/glsl/glsl_parser.cpp:2370:9: error: variable '_mesa_glsl_nerrs' set but not used [-Werror,-Wunused-but-set-variable]
int yynerrs = 0;
^
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
There are multiple error messages, show one of them:
../../src/gallium/auxiliary/pipe-loader/pipe_loader_sw.c:219:54: error: passing arguments to a function without a prototype is deprecated in
all versions of C and is not supported in C2x [-Werror,-Wdeprecated-non-prototype]
sdev->ws = sdev->dd->winsys[i].create_winsys(drisw_lf);
^
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19875>
For cases where both clip and cull are used, and a shader has both
inputs and outputs that can contain them, we need metadata to tell
us where the clip array ends and the cull array begins, since they
get combined into CLIP location registers. For outputs, this is in
the nir info, but for input we pass it in a sideband channel.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20346>
User varyings are linked by both name and register. The name is based
on how many *variables* are before it in final driver_location sort
order, not necessarily how many registers are before it.
In some cases where clip/cull distance are involved, it's possible
for one shader to write into a part of the cull distance that's
ignored by a downstream shader, but because linking is done by
*whole* register locations, and clip/cull can be combined using
*fractional* register locations, this is hard to detect. Since no
non-sysvals end up using fractional locations, just put all non-sysvals
first so they always generate the same semantic names for the same
register locations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20346>
They're too restricted for AFBC. Fix up instead. There are two problems at play:
1. We can't just map the format swizzle to the pixel format ordering on v7,
because the "reordered" values aren't allowed with compression.
2. We can't just compose the format swizzle with the API swizzle, because the
composed swizzle is applied to the border colour, so we need to be able to
apply an inverted swizzle to the border colour. That only works for bijective
format swizzles.
Fortunately, there's a neat solution: decompose the format's swizzle into two
swizzles, the first mapping to a reordering that IS allowed for compression, and
the second a bijection. Then we use the allowed reordering when texturing, apply
the bijective swizzle to the API swizzle, and apply the inverse of the bijective
swizzle to the border colour. When we're sampling a border colour, what's now
happening mathematically is:
(API swizzle o bijective swizzle)((bijective swizzle^-1)(border colour)) =
(API swizzle o (bijective swizzle o bijective swizzle^-1))(border colour) =
API swizzle(border colour)
which is exactly what we wanted.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
On v6 and earlier, the hardware supports arbitrary format swizzles for AFBC, so
there's no restriction on AFBC. On v8 and newer, the format swizzle gets applied
to the *decompressed* interchange format, so we can effectively support BGRA of
AFBC images without any special handling. (Confirmed working on v9. Obviously I
can't test on v8 but the expression is cleaner if we assume optimistically it's
like v9. Without hardware, we get to make that assumption :-p)
That just leaves v7 as the only architecture where format swizzles are
restricted for compression but there are no plane descriptor. Don't apply the
restriction to the newer parts.
This gets us AFBC of window surfaces on v9+. As the limiting case, fullscreen
glmark2-es2-wayland -btexture (1080p) in sway on Mali-G57 from 1300fps to
2353fps.
45% reduction in frame time is nothing to sneeze at.
Achoo.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
Introduce an enum to represent an AFBC compression mode. These modes are not
formats, on Valhall they are decoupled from the format. As such, it does not
make sense to use a pipe_format to represent them. Add an enum that we can use
in a straightforward way on Midgard and Bifrost to fallback for texture views,
and can map 1:1 to the Valhall hardware enum.
In addition to being less overloaded semantically, this lets -Wswitch kick in to
ensure that we handle all enums when translating. The straightforward
translation raises the following warnings:
../src/panfrost/lib/pan_cs.c:437:9: warning: enumeration value ‘PAN_AFBC_MODE_R5G5B5A1’ not handled in switch [-Wswitch]
437 | switch (panfrost_afbc_format(PAN_ARCH, format)) {
| ^~~~~~
...indicating that some formats were missed, leading to assertion fails "unknown
canonical AFBC format" when rendering RGB5A1, which dEQP-GLES31 does. Fixes
regressions in
dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.*
on Valhall.
Given how scarce v9 hardware is, that v10 isn't upstream yet, and the offending
code was merged a week ago, this should not have actually affected anyone. At
any rate, it's a good reminder we really do need CI for v9...
Fixes: 8e125b6c15 ("panfrost: Enable AFBC of more formats")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
The L8_UNORM, A8_UNORM, and L8A8_UNORM v7 formats do not support AFBC,
regardless of swizzling. We're about to lift the restrictions on swizzling with
AFBC on v7, so we'll need to handle these cases explicitly to avoid using AFBC
in these cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>
When calculating legacy WSI strides for tiled AFBC, we need to account for the
greater alignment requirement of tiled AFBC, or importing resources will fail
later.
Since tiled AFBC is only supported on v7 and later, and AFBC of window surfaces
isn't being used on Linux on v7 and later, this probably hasn't been hit in
practice. Probably.
We're about to fix AFBC of window surfaces so we need to fix this side first.
Fixes: 0255f554f3 ("panfrost: Advertise 16x16 tiled AFBC")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20311>