Align with gallium side. When fixed-function blending is not available,
the internal blend shader is used. This is handled by a single ST_TILE
in the blend shader with the current sample ID, which requires sample
shading enablement.
Cc: mesa-stable
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38129>
This is already set in `.panfrost-test`.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38182>
This helps us to more accurately count the number of registers that
need to be spilled to keep us below the maximum.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37188>
This change adds the minimum support for VK_EXT_device_memory_report,
which only reports device memory events at this point. We can make it
more useful later (like what's done in ANV) if desired by some tools.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37987>
Disable the max-fails feature in deqp-runner for this job, since it
aborts the run due to failures that don't occur otherwise.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38051>
This job runs on MT8196 Rauru Chromebooks.
Also remove the old G725 expectation files, as G725 is a smaller variant
of G925.
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38051>
Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
support in Jesse's MSMSVCV. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.
This is done in four steps.
First, rename util_dynarray_append -> util_dynarray_append_typed
bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"
Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.
Next, use type-inferred append as much as possible, via Coccinelle
patch (plus manual fixup):
@@
expression dynarray, element;
type type;
@@
-util_dynarray_append_typed(dynarray, type, element);
+util_dynarray_append(dynarray, element);
Finally, hand fixup cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't used the
untyped append with a literal (since the sizeof won't do what you want).
All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.
No Long Linguine Meals were involved in the making of this patch.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
Assuming disk I/O isn't painfully slow compared to running the whole
compiler (OS file system caching should help), this should massively
reduce the number of shaders compiled.
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38019>
This way we get the same cache UUIDs for identical sources and build
environments, even if they're built at different times.
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38019>
This one just hangs out and prevents unnecessary duplicate compilation.
We set .weak_ref = true so it doesn't actually hold references forever.
It just hangs onto any shaders that exist in some VkPipeline or VkShader
somewhere. But if an app compiles a pipeline it already has, it will
let us avoid a duplicate compile, even if the app doesn't provide a
pipeline cache.
Acked-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38019>
Rather than calling REQ_RES during subqueue_init, only call it during
submit the first time we see a command buffer that requires the use of
specific resources.
This ensures queues processing compute-only workloads (i.e. not actually
requiring tiler/fragment resources) don't preempt vertex/fragment work
on other queues due to resource congestion.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37890>
With this commit we do two things:
* Remove pan_shader_info.fs.outputs_written because it is not used,
as there is already pan_shader_info.outputs_written
* For pan_shader_info.fs.outputs_read we direcly copy the nir
info.outputs_read, without a shift, for consistency and also
because it is not really required.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38022>
Skip waiting/signaling on semaphores with stages not related
to a given subqueue
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>
- rename vk_stage_to_subqueue_mask -> vk_stages_to_subqueue_mask
- handle stage masks instead of single stages.
- Add which sync scope it is reading to better reason with the mask semantics.
- Handle ALL_COMMANDS as well as TOP/BOTTOM (using sync scopes)
- add timestamp utility vk_stage_to_timestamp_subqueue_mask
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37810>
Until now, the driver has been using a single set for internal state
while leaving maxBoundDescriptorSets to the remaining 15.
This gives us no room for optimizations of driver sets, which might
become an issue in the future.
To remedy this, we therefore reduce maxBoundDescriptorSets to 7. This
aligns with the proprietary driver and gives us the space to optimize
the driver sets.
We might increase this in the future if we see that we don't need all
the driver sets we now reserve.
Reviewed-by: John Anthony <john.anthony@arm.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37978>
nir_lower_compute_system_values will attempt to lower
load_workgroup_size unless workgroup_size_variable is set. For precomp
shaders, the workgroup size is set statically for each entrypoint by
nir_precompiled_build_variant. Because we call
lower_compute_system_values early, it sets the workgroup size to zero.
Temporarily setting workgroup_size_variable while we are still
processing all the entrypoints together inhibits this.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 20970bcd96 ("panfrost: Add base of OpenCL C infrastructure")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37799>
Upon acquiring an external image from external/foreign queue family,
skip AFBC metadata invalidation if the app has explicitly requested
acquireUnmodifiedMemory. This also applies to CRC which may or may not
get hooked up later.
Reviewed-by: John Anthony <john.anthony@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37972>
Usage of implicit input file in valhall_parse_isa makes it very
complicated for tools like ninja-to-soong to generate the Android
equivalent build file.
Instead use an explicit argument.
It also deduplicate the location of the input file name to have it
only in 'meson.build'.
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37742>
Trivial but still better than cs_add32 workaround on platforms
supporting MOVE_REG32.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37951>
This change:
1. Use a different loop var for outer multidraw repeat
2. Make sure fau total_count matches the filled faus per loop
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37951>
Fix a regression from an unfortunate typo.
Fixes: 48e8d6d207 ("panfrost, panvk: The size of resource tables needs to be a multiple of 4.")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37951>
Should only set once outside the multidraw loop so that per draw can
patch its own own desc attribs when needed.
Fixes: a5a0dd3ccc ("panvk: Implement multiDrawIndirect for v10+")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37951>
bi_load_tl's dst arg was named src, this was confusing.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37955>
IADD_IMM and uniform loads should not be spilled. Make them
rematerializable. Restrict to loads that write at most one register
for simplicity for now.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37955>
Currently when LCRA spills it also fills on every single usage of the
spilled value. This leads to a lot of cases where a spill is immediately
followed by a fill.
This patch reduces fills by LCRA by simply not filling until reaching
either the index that caused LCRA to fail allocating registers, or the
end of a block.
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37299>
It's valid for the tiler desc to be 0 when the tiler isn't being
used. Currently, we set the descriptor based on an offset over the
pointer in the gfx state, and if this is 0, we end up setting it to
just the offset when there are more than 8 layers on a target.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37837>
Adds opt-in support for AFBC WSI swapchain image creation by
adding the supported modifiers to the lists expected by mesa WSI.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
External images given to us by WSI use this tiling mode instead
of optimal, and we want to allow this.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
This seems too restrictive with our current set of supported
modifiers, as we frequently end up skipping through the entire list
of AFBC modifiers.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
External importers of AFBC dmabufs expect these to be in terms
of bytesizes instead of direct superblock counts. This makes these
calculations aligned with panfrost and fixes WSI imports.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
If there isn't any modifier info coming from the compositor, we go
through our internal image path and pick the best modifier that
supports the image. This causes problems on X11, as it actually
expects the image to be in a linear layout.
Explicitly set the modifier to linear for legacy scanout images,
which specifically indicates that the image doesn't have an
explicit DRM modifier and we should do the safe thing by using
linear.
The naming becomes confusing for scanout with this change,
so the flag is now split into two separate flags, one for controlling
the AFBC optimalness called wsi, the other more directly called
legacy_scanout, which is used for enforcing the linear mod.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
Instead of always adding storage usage on pre_mod_adjustments
and preventing AFBC for all images with usage TRANSFER_DST,
only do this when the image doesn't use AFBC, by adding a
new post_mod_adjustments pass.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
Since we don't have multiplanar AFBC support, this causes issues
when we try to alias an image with a single plane to a plane of
a multiplanar image.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
On panvk, we can use the render area to set fbd bbox extents
instead of setting them based on image sizes. Doing this improves
partial updates (eg. loadOp:load with renderArea < image_size) of
AFBC render targets.
This commit introduces a new structure for this setting and uses
it on both panvk and panfrost. We can't reuse the existing extent
here as that is based on viewport+scissor, which can change within
a renderpass/batch, which causes issues on panfrost.
No functional changes for panfrost, as it doesn't have an equivalent
to renderpass::renderArea so we can't do the same thing there, it
still uses the entire framebuffer extent.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
This represents what this bounding box is being used for better,
as it can be easily confused with the framebuffer bounding box
otherwise.
Also fixes the comment about inclusiveness, as these are being
used as exclusive on both panfrost and panvk.
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37771>
This was something that came up in the slop MR. Not sure it's actually a
good idea or not but kind of curious what people think, given we have a
sound tool (Coccinelle) to do the transform. Saves a redundant branch
but means extra noninlined function calls.. likely no actual perf impact
but saves some code.
Via Coccinelle patches:
@@
expression ptr;
@@
-if (ptr) {
-free(ptr);
-}
+free(ptr);
@@
expression ptr;
@@
-if (ptr) {
-FREE(ptr);
-}
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr) {
-ralloc_free(ptr);
-}
+ralloc_free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-free(ptr);
-}
-
+free(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-FREE(ptr);
-}
-
+FREE(ptr);
@@
expression ptr;
@@
-if (ptr != NULL) {
-ralloc_free(ptr);
-}
-
+ralloc_free(ptr);
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> [v3d]
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org> [venus]
Reviewed-by: Frank Binns <frank.binns@imgtec.com> [powervr]
Reviewed-by: Janne Grunau <j@jannau.net> [asahi]
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> [radv]
Reviewed-by: Job Noorman <jnoorman@igalia.com> [ir3]
Acked-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Job Noorman <jnoorman@igalia.com>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37892>
Non-PCI devices are responsible for checking device matches
themselves by overriding can_present_on_device on wsi_device
to take the non-blitting WSI path.
We can allow on-device presentation for both the display
controller, and any panthor node. The simplest way of
achieving that is allowing this for all devices with bus
type platform, as there's nothing else that has this
property.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37543>