for indirect GS, do it in the indirect kernel (not the pre-GS)
for direct, do it on the host (not the pre-GS)
we don't want pre-GS.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33901>
It is expected that inputs and prefetches are always in the first block.
However, ir3_create_empty_preamble would create blocks before the first
one, leaving inputs after the preamble. This causes issues with
(probably among others) spilling/RA where precolored inputs could
illegally reuse the spill base register.
Fixes RA validation failures on a7xx for
dEQP-VK.ray_query.multiple_ray_queries.vertex_shader
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f3026b3d3e ("ir3: add some preamble helpers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33977>
For empty BVHs we shouldn't emit any leaf nodes, but there is one
invocation to encode the root node. Guard leaf node encoding so that
invocation doesn't try writing any leaves.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33985>
Prime blit can be used in setups like venus on lavapipe over vtest. It's
native env so Venus relies on renderer side driver to tell about the pci
info, while lavapipe doesn't implement that extension, which ends up
with mismatched gpu thus prime blit.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33956>
The swizzle check was too strict, we actually don't care about the
swizzle on the constant source at this point, it is only checked
later whether the constant source actually has the correct form.
So this effectively enables INV and BIAS presub on R300/R400.
RV370 stats:
total instructions in shared programs: 85379 -> 84948 (-0.50%)
instructions in affected programs: 15669 -> 15238 (-2.75%)
helped: 336
HURT: 81
total presub in shared programs: 1318 -> 2991 (126.93%)
presub in affected programs: 797 -> 2470 (209.91%)
helped: 0
HURT: 514
total omod in shared programs: 387 -> 384 (-0.78%)
omod in affected programs: 9 -> 6 (-33.33%)
helped: 3
HURT: 0
total temps in shared programs: 13290 -> 13243 (-0.35%)
temps in affected programs: 1388 -> 1341 (-3.39%)
helped: 91
HURT: 52
total consts in shared programs: 81922 -> 81855 (-0.08%)
consts in affected programs: 173 -> 106 (-38.73%)
helped: 67
HURT: 0
total cycles in shared programs: 126746 -> 126560 (-0.15%)
cycles in affected programs: 30752 -> 30566 (-0.60%)
helped: 255
HURT: 124
LOST: shaders/godot3.4/22-69.shader_test FS
GAINED: shaders/ck2/172.shader_test FS
GAINED: shaders/tesseract/389.shader_test FS
GAINED: shaders/tesseract/393.shader_test FS
GAINED: shaders/unity/64-DeferredPointShadows.shader_test FS
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33915>
Previously asynchronous descriptor set allocation is only enabled when
the VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT bit is not set.
However, some engine would use that bit but alloc/free with identical
descriptor set layout. So this change extends the async set alloc to
cover that since the spec has guaranteed no fragmentation there.
Besides, a pool before any descriptor set free is also considered w/o
fragmentation. so this change extends to cover here as well. Both
would also help with dEQP run time since all descriptor pools involved
are with that bit set.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33966>
We get a kernel message "You are adding an unorder point to timeline!"
on many CTS runs. This stems from us SIGNALing the queue syncobj then
WAITing but not reseting it. It is assumed by the time we get to
panvk_queue_submit_init_signals() that the value is 0, however it is 1
due to the previous calls.
Signed-off-by: Ashley Smith <ashley.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 5544d39f ("panvk: Add a CSF backend for panvk_queue/cmd_buffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33943>
VEGA10, RENOIR, NAVI10, RAPHAEL and NAVI31 are covered, they passed
100% of 25 runs each.
NAVI21 and VANGOGH still don't enable video testing in CI because I
got few hangs during my last stress test. Need to be stress tested
again.
Note that the kernel in Mesa CI is too old and doesn't have latest
firmwares that should fix the remaining failures.
GFX6-8 have different issues like GPU hangs on Polaris10, so it's not
yet enabled in CI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33968>
This is not needed now when deinterlace can handle non-interlaced
buffers. Also this forces the buffer as interlaced which doesn't work
on radeonsi anymore.
This reverts commit 0ee4506c3a.
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33927>
At least, NAVI10, NAVI21 and NAVI24 are affected by this what looks
like a hardware bug when primitive restart is changed and no context
registers are written between draws. It seems the hardware doesn't
consider primitive restart at all in this situation.
Adding SQ_NON_EVENT(0) as suggested by Marek seems to fix it reliably
without introducing any overhead. It's basically a NOP packet that adds
a small delay.
Fixes new VKCTS coverage dEQP-VK.transform_feedback.primitive_restart.*.
Also fixes this old vkd3d-proton issue.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7258
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33929>
IDVS2 uses a new special FAU value shader_output to determine what the
vertex shader is supposed to store as output.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>
This introduce a new pass that wrap store_output to check for
shader_output bitfield.
bifrost_nir_specialize_idvs nows only lower shader_output to a constant
value and removal of store_output is handled by DCE/dead_cf passes.
This is required for Avalon's deferred new ABI.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>
On Avalon, this is a bitfield that holds information on what
values a vertex shader should output.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>
Lowering the indirect derefs multiple times leads to very inefficient
shaders because of all the control flow inserted.
In particular on some DGC tests with mesh shaders, the tests can spin
for 1hour on an i7 and still not complete compilation.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33809>
Per push descriptor spec:
Each element of pDescriptorWrites is interpreted as in
VkWriteDescriptorSet, except the dstSet member is ignored.
Cc: mesa-stable
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33948>
This forces software vertex processing wia the draw module and should
hopefully test the exact same codepaths that the r300 chipsets without
built-in vertex engines use.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28216>
If the allocation didn't fit within the segment the loop incorrectly
freed ids of a range of different segments due to the loop redeclaring
i.
Fixes: d4085aaf56 ("util: add util_idalloc_sparse, solving the excessive virtual memory usage")
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33934>
We need to use nir_is_helper_invocation instead of
nir_load_helper_invocation, to correctly predicate stores after demote.
Identified in a Piglit on AGX a year ago but I forgot to upstream this.
Fixes: 586da7b329 ("nir: Add nir_lower_helper_writes pass")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>
needed to avoid regression from the next patch.
backported because the next patch is too
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Cc: mesa-stable
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33939>
There are some jobs that were missing the FARM variable, which is useful
to lava_job_submitter.py to classify how it should interact with each
LAVA server and how it should assemble the job definition.
Right now, we use a set of regexex with the RUNNER_TAG variable, but
that is error-prone.
Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33888>