etnaviv_ml.h uses dynarray, but the u_inlines.h header is needed by
some of the files that include it.
Fixes: d6473ce28e ("etnaviv: Use NN cores to accelerate convolutions")
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
(cherry picked from commit 70bff0c971)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Because dyn_start and dyn_end are indices into
nvk_root_descriptor_table->dynamic_buffers, we would need to offset
cbuf->dynamic_idx by
nvk_root_descriptor_table->set_dynamic_buffer_start[cbuf->desc_set]
in order to do those comparisons correctly.
We could do that, but it's simpler and no less precise to simply
re-use the same comparison that we do in the other cases here.
This fixes a rendering artifact in Baldur's Gate 3 (Vulkan), which
regressed with the commit listed below.
Fixes: 091a945b57 ("nvk: Be much more conservative about rebinding cbufs")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit dc12c78235)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Previously, we were passing the end index which was incorrect.
Also, improve the macros so that they can take an expression for
the count.
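As a rough illustration of why a macro needs care to accept an expression
for the count, here is a minimal sketch. All names in it
(example_buffer_desc, RANGE_BYTES_UNSAFE, RANGE_BYTES, range_bytes) are
hypothetical and are not the actual nvk helper macros.
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type; not the real nvk root descriptor layout. */
typedef struct {
   uint64_t addr;
   uint32_t size;
   uint32_t pad;
} example_buffer_desc;

/* Unsafe: `count` is pasted without parentheses, so an argument such as
 * `end - start` expands to `end - start * sizeof(...)`. */
#define RANGE_BYTES_UNSAFE(count) (count * sizeof(example_buffer_desc))

/* Safe: the argument is parenthesized, so any expression works. */
#define RANGE_BYTES(count) ((size_t)(count) * sizeof(example_buffer_desc))

/* The helper takes a start and an end index but forwards a count,
 * never the end index itself. */
static inline size_t range_bytes(uint32_t start, uint32_t end)
{
   return RANGE_BYTES(end - start);
}
```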
Fixes: b2d85ca36f ("nvk: Use helper macros for accessing root descriptors")
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
(cherry picked from commit 64f17c1391)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Some limits got stuck to the old binding table limits. Those don't
apply anymore since EXT_descriptor_indexing was implemented.
Fixes: 6e230d7607 ("anv: Implement VK_EXT_descriptor_indexing")
Fixes: 96c33fb027 ("anv: enable direct descriptors on platforms with extended bindless offset")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
(cherry picked from commit d6acb56f11)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
The Early-Z optimization is disabled when there is a discard
instruction in the shader used in the draw call.
But if discard is the only reason to disable Early-Z, and the updates
are disabled at draw call time, we can enable Early-Z using a shader
variant.
If there are occlusion queries active, we also need to disable the
Early-Z optimization.
So this patch enables Early-Z in this scenario.
The performance improvement is significant when running the gfxbench
benchmark, showing an average improvement of 11.15%:
fps_avg helped: gl_gfxbench_aztec_high.trace: 3.13 -> 3.73 (19.13%)
fps_avg helped: gl_gfxbench_aztec.trace: 4.82 -> 5.68 (17.88%)
fps_avg helped: gl_gfxbench_manhattan31.trace: 5.10 -> 6.00 (17.59%)
fps_avg helped: gl_gfxbench_manhattan.trace: 7.24 -> 8.36 (15.52%)
fps_avg helped: gl_gfxbench_trex.trace: 19.25 -> 20.17 ( 4.81%)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable
(cherry picked from commit 5b951bcdd7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
We can end up calling vk_multialloc_alloc with 0 size when
`attachment_count` is 0 and `clearValueCount` is 0.
Addressed:
```
Direct leak of 1 byte(s) in 1 object(s) allocated from:
#0 0x7faf033ee0 in __interceptor_malloc
../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7fada5cc10 in vk_default_alloc ../src/vulkan/util/vk_alloc.c:26
#2 0x7fac50b270 in vk_alloc ../src/vulkan/util/vk_alloc.h:48
#3 0x7fac555040 in vk_multialloc_alloc
../src/vulkan/util/vk_alloc.h:234
#4 0x7fac555040 in void
tu_CmdBeginRenderPass2<(chip)7>(VkCommandBuffer_T*,
VkRenderPassBeginInfo const*, VkSubpassBeginInfo const*)
../src/freedreno/vulkan/tu_cmd_buffer.cc:4634
#5 0x7fac900760 in vk_common_CmdBeginRenderPass
../src/vulkan/runtime/vk_render_pass.c:261
```
seen in:
dEQP-VK.robustness.robustness2.bind.notemplate.r32i.dontunroll.nonvolatile.uniform_texel_buffer.no_fmt_qual.len_252.samples_1.1d.frag
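One possible way to avoid a zero-size allocation is to skip the allocator
entirely when both counts are 0. The sketch below assumes that approach; it
uses plain malloc and hypothetical names (example_pass_state,
example_pass_state_init) rather than the vk_multialloc API, and the real fix
may differ.
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical state holding two arrays carved out of one allocation. */
struct example_pass_state {
   void *attachments;   /* attachment_count entries, or NULL */
   void *clear_values;  /* clear_value_count entries, or NULL */
   void *storage;       /* single backing allocation, or NULL */
};

/* Returns 0 on success and -1 on allocation failure.
 * Alignment padding between the two arrays is omitted for brevity. */
static int example_pass_state_init(struct example_pass_state *state,
                                   size_t attachment_count,
                                   size_t attachment_size,
                                   size_t clear_value_count,
                                   size_t clear_value_size)
{
   size_t attachments_bytes = attachment_count * attachment_size;
   size_t clear_bytes = clear_value_count * clear_value_size;
   size_t total = attachments_bytes + clear_bytes;

   state->attachments = NULL;
   state->clear_values = NULL;
   state->storage = NULL;

   /* Nothing requested: do not call the allocator, so there is no
    * 1-byte block to leak later. */
   if (total == 0)
      return 0;

   state->storage = malloc(total);
   if (state->storage == NULL)
      return -1;

   if (attachments_bytes)
      state->attachments = state->storage;
   if (clear_bytes)
      state->clear_values = (char *)state->storage + attachments_bytes;
   return 0;
}
```
The caller frees `storage` unconditionally, since free(NULL) is a no-op.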
Fixes: 4cfd021e3f ("turnip: Save the renderpass's clear values in the cmdbuf state.")
Signed-off-by: Karmjit Mahil <karmjit.mahil@igalia.com>
(cherry picked from commit c923eff742)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Non-trivial collects (i.e., ones that will introduce moves because the
sources don't line up with the destination) may cause source intervals
to get implicitly moved when they are inserted as children of the
destination interval. Since we don't support moving intervals in shared
RA, this may cause illegal register allocations. Prevent this by
creating a new top-level interval for the destination so that the source
intervals will be left alone.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
(cherry picked from commit b36a7ce0f1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Otherwise anv_descriptor_set is accessed through an unaligned pointer,
which is undefined behavior in C.
```
anv_descriptor_set.c:1620:17: runtime error: member access within misaligned address 0x61900002c2b5
for type 'struct anv_descriptor_set', which requires 8 byte alignment 0x61900002c2b5
```
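A minimal sketch of the usual remedy, assuming the fix rounds the offset up
to the struct's alignment before placing it. The helper name align_up_u64 is
hypothetical and is not the function anv uses.
```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment`, which must be a
 * non-zero power of two. */
static inline uint64_t align_up_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```
With an 8-byte requirement, an offset ending in 0x2b5 (as in the report
above) would be moved to one ending in 0x2b8.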
Fixes: 2570a58bcd ("anv: Implement descriptor pools")
(cherry picked from commit a2c4a34303)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Since $RESULTS_DIR is now centrally defined in setup-test-env.sh it's no
longer necessary to manually add a hard-coded results directory for the
b2c-test job results.
This keeps the results directory consistent between b2c-test jobs and lava.
Fixes: 9b6d14aed1 ("ci: Always create results dir from init")
(cherry picked from commit 276447ef81)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
While the documentation says to use NUM_SIMD_LANES_PER_DSS for the stack
address calculation, what the HW actually uses is
NUM_SYNC_STACKID_PER_DSS. The former may vary depending on the platform,
while the latter is fixed to 2048 for all current platforms.
Fixes: 6c84cbd8c9 ("intel/dev/xe: Set max_eus_per_subslice using topology query")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit aee04bf4fb)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Copy propagation would incorrectly occur in this code
mov(16) v4+2.0:UW, u0<0>:UW NoMask
...
mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
to create
mov(16) v4+2.0:UW, u0<0>:UW NoMask
...
mov(8) v6+2.0:UD, u0<0>:UD NoMask group0
This has different behavior. I think I just made a mistake when I
changed this condition in e3f502e007.
It seems like this condition could be relaxed to cover cases like (note
the change of destination stride)
mov(16) v4+2.0<2>:UW, u0<0>:UW NoMask
...
mov(8) v6+2.0:UD, v4+2.0:UD NoMask group0
I'm not sure it's worth it.
No shader-db or fossil-db changes on any Intel platform. Even the code
for the test case mentioned in the original commit did not change.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: e3f502e007 ("intel/fs: Allow copy propagation between MOVs of mixed sizes")
Closes: #12116
(cherry picked from commit 80a5d158ae)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Specifically, allow two immediate sources for BFE on Gfx12+. I stumbled
on this while trying some stuff with !31852.
v2: Don't be lazy. Add proper assertions for all the things on all the
platforms. Based on a suggestion by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes: 7bed11fbde ("intel/brw: Allow immediates in the BFE instruction on Gfx12+")
(cherry picked from commit c1c09e3c4a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
This is required, otherwise we regress latency in cases where
applications are using FIFO without explicit KHR_present_wait.
This is an unacceptable regression.
The fix is to normalize the behavior to X11 WSI.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: d052b0201e ("vulkan/wsi/wayland: Use fifo protocol for FIFO")
(cherry picked from commit 5f70858ece)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
Fixes: 212f1ab40e ("nvc0: support PIPE_CAP_RESOURCE_FROM_USER_MEMORY_COMPUTE_ONLY")
Acked-by: David Heidelberg <david@ixit.cz>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
(cherry picked from commit 277925471e)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32119>
When there is no dynamic buffer, create_copy_table early returns. Make
sure dummy_sampler_handle is still set.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32018>
CmdSetEvent2 does not call cs_wait_slots. CmdWaitEvents2 should wait
for the syncobj even on the same subqueue. To that end, update
collect_cs_deps to not clear self from wait_subqueue_mask.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31997>
There's not too much point in running tests in general, but also
specifically for wayland-protocols, which requires a newer
wayland-scanner to run the tests (for DTD validation) but not to parse
the protocol files.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: cdef622a0a ("meson: Update wayland-protocols to 1.38")
Closes: mesa#12126
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32036>
Having the RGB* formats before the BGR* formats in the table causes
problems where under some circumstances, some applications end up
with the wrong colors.
The repro case for me is: Xvnc + mutter + chromium
There was an existing comment in dri_fill_in_modes() which explained
the problem. This was lost when dril_target.c was created.
Fixes: ec7afd2c24 ("dril: rework config creation")
Fixes: 3de62b2f9a ("gallium/dril: Compatibility stub for the legacy DRI loader interface")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31950>
It's the first time RADV is Vulkan conformant on GFX6-7! Some chips
are missing because we don't have access but most of the GFX6-7 GPUs
are covered.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32022>
This fixes various crashes that I saw with occlusion query tests.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Fixes: ba2c7fd00a ("panvk: use force_fb_preload for unaligned preload")
Fixes: c108dfc930 ("panvk: force_fb_preload should insert a barrier")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32024>
When etna_screen_create(..) is called with gpu != NULL and npu == NULL,
screen->pipe_nn is incorrectly set up. This leads to an unintended
stream configuration for compute-only contexts, as determined by
pipe = (compute_only && screen->pipe_nn) ? screen->pipe_nn : screen->pipe;
To address this, extend the gpu != npu condition by adding a check for
npu != NULL to ensure pipe_nn is only initialized when both gpu and npu
are provided.
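The guarded condition can be sketched as follows. The types and functions
here (example_pipe, example_screen, example_screen_setup,
example_select_pipe) are hypothetical stand-ins, not the etnaviv structures;
only the condition and the selection expression mirror the text above.
```c
#include <stdbool.h>
#include <stddef.h>

struct example_pipe {
   const void *dev; /* device this pipe submits to */
};

struct example_screen {
   struct example_pipe gpu_pipe_storage;
   struct example_pipe npu_pipe_storage;
   struct example_pipe *pipe;    /* always set */
   struct example_pipe *pipe_nn; /* NULL unless a distinct NPU exists */
};

static void example_screen_setup(struct example_screen *screen,
                                 const void *gpu, const void *npu)
{
   screen->gpu_pipe_storage.dev = gpu;
   screen->pipe = &screen->gpu_pipe_storage;
   screen->pipe_nn = NULL;

   /* Checking only gpu != npu would also be true for gpu != NULL with
    * npu == NULL, and would set up an NPU pipe for a missing device. */
   if (gpu != npu && npu != NULL) {
      screen->npu_pipe_storage.dev = npu;
      screen->pipe_nn = &screen->npu_pipe_storage;
   }
}

static struct example_pipe *
example_select_pipe(struct example_screen *screen, bool compute_only)
{
   return (compute_only && screen->pipe_nn) ? screen->pipe_nn : screen->pipe;
}
```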
Fixes: a4653587cc ("etnaviv: Add a separate NPU pipe")
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32025>
Instead of using frame callbacks - which may stop firing if our surface is
occluded - use the new commit-timing-v1 protocol in combination with the
presentation feedback protocol.
If the required protocols are unavailable, or the environment variable
MESA_VK_WSI_DEBUG contains "nowlts", we fall back to frame callback
based pacing behaviour.
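The fallback decision described above can be sketched like this. The
function name and its parameters are hypothetical, and the plain substring
match is a simplification; it is not how the WSI code parses
MESA_VK_WSI_DEBUG.
```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when pacing should fall back to frame callbacks.
 * `debug_env` is the value of MESA_VK_WSI_DEBUG, or NULL if unset
 * (for example, the result of getenv("MESA_VK_WSI_DEBUG")). */
static bool example_use_frame_callback_pacing(bool protocols_available,
                                              const char *debug_env)
{
   if (!protocols_available)
      return true;
   if (debug_env != NULL && strstr(debug_env, "nowlts") != NULL)
      return true;
   return false;
}
```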
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
The fifo protocol allows us to ensure that a compositor presents
an image that we submit to it. Use this to reliably implement FIFO
semantics.
Note: On systems where the fifo protocol is available, an occluded
surface may find itself unthrottled when previously it would have
been frozen.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
Update the wrap and the dependency, as well as bumping several build tags.
I've also turned off wayland-protocols tests, as we don't want to bump the
wayland-scanner version at this time.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26150>
Preloading is effectively texel fetching. When we force preloading, we
need to insert a barrier for the feedback loop.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>
Extend force_fb_preload to take an optional VkRenderingInfo. When it is
non-NULL, this is the unaligned preload and force_fb_preload should
clear attachments.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31895>