On panthor, VK_SYNC_FEATURE_TIMELINE is always supported. On panfrost,
we can use vk_sync_timeline_get_type.
Note that there is a kernel issue regarding syncobj query that causes
dEQP-VK.synchronization.timeline_semaphore.wait.poll_signal_from_device
to time out when VK_SYNC_FEATURE_TIMELINE is set. It is considered a
kernel bug and is not dealt with here.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
src_stages_to_subqueue_sb_mask calls stages_cover_subqueue, but also has
a special case for VK_PIPELINE_STAGE_2_DRAW_INDIRECT_BIT.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
This bit is not needed for barriers and appears to trigger a
performance regression. So leave it for just for AUX-TT
flushing/invalidation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e3814dee1a ("anv: add plumbing/support for L3 fabric flush")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12090
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31915>
Detect when the temporary index buffer cannot be generated due to too
large primitive count, and simply drop the draw on the floor.
Fixes a webgl reachable asan/crash.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12092
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31914>
This fixes a CTS hang on Hawaii.
We previously only did a CB/DB flush,
but that doesn't include a L2 cache flush.
Also fix the comment that said this is for GFX9+.
Fixes: 7c62f6fa01
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31906>
It looks like the fixed-func hardware is very slow to cull primitives
with zero pos.w but shader based culling helps a lot.
This fixes a massive performance gap with the FSR2 demo compared to
AMDGPU-PRO, +228% on RDNA2.
Based on my investigation, AMDGPU-PRO seems to always cull these
primitives. Note that disabling NGG culling with AMDGPU-PRO reports the
same performance as RADV without that fix. Also note that the FSR2
sample doesn't specify any cull mode (ie. VK_CULL_MODE_NONE is used),
so this is the only reason PRO was culling more than RADV.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7260
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31891>
Out of 5 run during busy hours, all the jobs that once took 5+ min are:
build-for-tests / debian-arm64-asan 6 min
build-only / debian-s390x 6 min
build-only / debian-android 7 min
build-only / debian-clang 8 min
build-for-tests / debian-arm32-asan 8 min
build-only / debian-vulkan 11 min
build-for-tests / debian-testing 12 min
build-only / debian-testing-msan 12 min
build-only / debian-clang-release 13 min
build-only / alpine-build-testing 14 min
build-for-tests / debian-testing-asan 21 min
The jobs at 10+ min are considered to take long enough that they might
risk crossing the 15 min mark, so let's keep these ones at 30 min and
lower the timeout for everyone else to 15 min.
It's worth pointing out that debian-testing-asan is a build-for-tests
job and as such it blocks build-only jobs from running until it's
finished, which can be a problem for a job that has been seen taking 20+
minutes. We should do something about that, but that's not the topic of
this MR.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31846>
GitLab doesn't (yet) support `timeout:` being a variable, so let's put
the real `timeout:` at the max timeout we want, and internally use
another timeout (using coreutils' `timeout`) that we _can_ set using
a variable.
With that, we can set a 1h timeout on nightly LTO builds while keeping
our tighter 30min timeout the rest of the time.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31846>
We may still need to insert a continue block even if there is only one
backedge, in a situation like:
for (...) {
if (...) continue;
foo();
break;
}
We want foo() to be executed before reconverging. This is important for
the BVH encoding kernel, which launches an invocation for each node in
the tree and does a preorder traversal:
while (true) {
if (!ready[node]) continue;
encode();
for (child node)
ready[child] = true;
break;
}
For the first few nodes, which will be in the same wave, we need
encode() for the root node to be called first, then its children spin
until ready, then the children call encode(), and so on. This can only
work if the children that aren't ready yet are parked while the parent
executes encode(), which requires the continue block.
This is also required because divergence analysis will assume that
uniform values written before the continue are still uniform after it,
which isn't the case now and causes an RA validation failure with Godot.
Fixes: 0fa93fb662 ("ir3: Fix convergence behavior for loops with continues")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31905>
We're not yet fully conformant to Vulkan 1.0, but we are getting
reasonably close. But this document isn't about conformance, it's
documenting what features are implemented, and we currently support all
Vulkan 1.0 features on PanVK.
So let's toggle this switch.
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31887>
we can get into weird situations and the clever logic isn't worth it. do
unclever logic instead and fix subtle CTS flakes. GL was a mistake.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
this was definitely the coolest thing I did in my career. but it has a lot of
drawbacks:
* complexity across the whole stack.
* perf #s even for synthetic tess workloads are... lackluster. couple % on
terraintess. and games tend not to be tess heavy (and if they are we're
screwed and this won't save us.) ironically, sascha willem's non-terrain tess
is sped up a ton by the prefix sum path, so.. lol?
* more brittle for (M3?) porting.
* makes it harder to make the tessellator common code.
* doesn't play nice with the indirect path.
* pile of extra tessellator variants.
* harder to test, coverage is already not great here.
so... drop it, for now. the code isn't gone, and the idea may come back in a
future iteration, perhaps based on mesh shaders. but in its current form I don't think it's worth keeping right now.
my main resevation is actually about heap usage from doubling the index buffer
size. hopefully this is tolerable in practice.
this gets us from 24 to 16 tessellator variants.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>