Because we called nir_lower_io_to_temporaries which ensure the
store output and emit vertex in the same block.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20350>
I thought I'd fixed this already, must have gotten lost in a rebase.
fixes
dEQP-VK.pipeline.pipeline_library.graphics_library.misc.bind_null_descriptor_set.1010
Fixes: 20902d1ed6 ("lavapipe: fix descriptor set layout reference counting in layout merge")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20662>
fixes a memory leak seen in lavapipe asan tests
dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.storage_buffer.readwrite.no_fmt_qual.null_descriptor.samples_1.1d.frag
Fixes: 2909c654b0 ("llvmpipe: add fragment shader image support")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20630>
When creating a merged layout, don't use ralloc, just use the
correct reference counting, also only reference a layout if the
pipeline uses it.
Fixes: d4d5a7abba ("lavapipe: implement EXT_graphics_pipeline_library")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20630>
When taking the descriptor set layouts from the pipeline layout, make
sure to take references
Fixes: d4d5a7abba ("lavapipe: implement EXT_graphics_pipeline_library")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20630>
If the blit formats match and the resource formats match, then that's a
memcpy whether or not the blit's view of the resource matches the
resource's format.
Improves perf of portal-2-v2's last frame on zink+anv by 1.33212% +/-
0.302829% (n=5), where there's a blit that is viewing the RGBA8_UNORM
src/dst resources as RGBA8_SRGB.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20594>
In file included from src/vulkan/wsi/wsi_common_drm.c:34:
include/drm-uapi/dma-buf.h:23:10: fatal error: 'linux/types.h' file not found
#include <linux/types.h>
^~~~~~~~~~~~~~~
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16987>
From https://cgit.freedesktop.org/drm-misc/
9cc4853e4781bf0dd0f35355dc92d97c9da02f5d
Author: Antonio Borneo <antonio.borneo@foss.st.com>
Date: Tue Jun 7 23:31:44 2022 +0200
drm: adv7511: override i2c address of cec before accessing it
This version has the new sync_file import/export ioctls.
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16987>
Fixes assertion fails in piglit isinf-and-isnan, which uses a constant infinity,
which has an out-of-bounds mantissa (but the function contract says that's
fine and we just return something undefined.)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20563>
This acts as a depth/stencil write. The AGX compiler checks outputs_written to
determine what conservative depth settings the driver needs. Nominally, this
should work: the original store_output(FRAG_RESULT_DEPTH) intrinsic causes the
DEPTH outputs_written bit to be set, so the metadata is still correct after
lowering store_output to store_zs_agx. However, there are a handful of places
that call nir_gather_info late, which *resets* the existing outputs_written
value and regathers, causing Asahi to use the wrong conservative depth settings
when shuffling NIR pass order and breaking gl_FragDepth.
To fix, handle store_zs_agx conservatively when gathering info so we don't have
to play games with the pass order or stashing info in a sideband.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20563>
Rely on the common address arithmetic optimizations. We don't need the
special formats for UBO loads anyway, so this is simpler and optimizes
out the ushr.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>
This works like store_global, but lets us optimize address arithmetic. Like
load_agx, it is formatted to match the hardware semantic. We don't make use of
any clever formats in this series, though.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20558>
Freedreno needs to know when an image has volatile or coherent access
flags in the shader.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20612>
Don't turn gl_access_qualifier coming from NIR back into GL enums,
losing information in the process.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20612>
Implement the get_decoder_fence vfunc. Note that the waiting for
completion in this driver happens in the end_frame vfunc itself.
Signed-off-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Implement the get_decoder_fence vfunc by waiting on the fence
previously passed in the end_frame vfunc.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Implement the get_decoder_fence vfunc by waiting on the fence
previously passed in the end_frame vfunc.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Implement the get_decoder_fence vfunc by waiting on the fence
previously passed in picture->fence in the end_frame vfunc.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Implement the get_decoder_fence vfunc by waiting on the fence
previously passed in picture->fence in the end_frame vfunc.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Use the new get_decoder_fence vfunc to implement
vaQuerySurfaceStatus and vaSyncSurface in the va state tracker.
A pointer to the surface's fence is passed to the codecs before the
end_frame vfunc and the codec is responsible for allocating a fence on
command stream submission.
This fence is then queried on vaQuerySurfaceStatus and waited on in
vaSyncSurface.
Notably both functions were not implemented as per the VA-API docs for
PIPE_VIDEO_ENTRYPOINT_BITSTREAM.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Add PIPE_DEFAULT_DECODER_FEEDBACK_TIMEOUT_NS as a way to control
how much to wait for decoders if this is supported.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Add a get_decoder_fence vfunc that can be used to query the status
of the previous decode job denoted by 'fence' given 'timeout'.
A pointer to a fence pointer can be passed to the codecs before the
end_frame vfunc and the codec should then be responsible for allocating
a fence on command stream submission.
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20133>
Only construct the key on-demand if the PROG state is dirty. The newly
added "virtual" PROG_KEY state is used to know when other state that the
shader key depends on changes. Worth ~13% at drawoverhead test 0.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20572>
A pretty significant amount of time spent in fd6_draw_vbo is calling
memset to zero init the on-stack struct. And a big part of the size
of the struct is fd6_state, of which we only need to initialize
num_groups to zero. This is worth a 15% improvement in drawoverhead
test 0 ("no state change").
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20572>
We already overwrote the entire descriptor in patch_fb_read_sysmem().
Doing the same in patch_fb_read_gmem() will simplify things for moving
the fb_read descriptor to the FS's bindless descriptor set.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20572>
Split out the build-up of CP_SET_DRAW_STATE packet, as we are going to
want to re-use this for compute state later when we switch to bindless
IBO descriptors.
While we are at it, drop the enable_mask param, as this is determined
solely by the group_id, and it is easier to maintain a table for the
handful of exceptions to ENABLE_ALL. The compiler should be able to
optimize away the table lookup.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20572>
Same thing as https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20530:
newly added `src/vulkan/util/rmv/vk_rmv_tokens.h` (see !17331) includes
`src/util/` files, so anything that includes it needs `idep_mesautil`.
In file included from ../src/vulkan/util/rmv/vk_rmv_common.h:29,
from ../src/vulkan/runtime/vk_device.h:26,
from ../src/vulkan/wsi/wsi_common.c:31:
../src/util/simple_mtx.h:34:12: fatal error: valgrind.h: No such file or directory
34 | # include <valgrind.h>
| ^~~~~~~~~~~~
compilation terminated.
Fixes: 5f30a7538b ("vulkan: Add RMV token definitions")
Signed-off-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Friedrich Vock <friedrich.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20642>
Driver workarounds for game bugs can be easily broken. This one
shouldn't be applied to meta shaders and this restores previous logic.
Fixes: da32cbb5c6 ("aco: fix missing uses of MRT output flags")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20637>
The chance we'll miss anything from non-LTO is minimal, and having
both builds in one is too slow (usually the latest job to finish).
Acked-by: Martin Roukala <martin.roukala@mupuf.org>
Acked-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20623>