Without this, we fail to register-allocate the shader used in the
dEQP-VK.ssbo.phys.layout.random.8bit.scalar.78 VK-CTS test case.
Yeah, this sucks, but failing to compile sucks even more. We need a new
register allocator plan here.
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33124>
We can't use VK_SHADER_STAGE_ALL here, because we don't support geometry
and tessellation shaders. Additionally, the DDK doesn't support the
vertex stage, so let's not even try that for now; it probably won't
work.
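A minimal sketch of building such a mask, for illustration only. The bit
values mirror VkShaderStageFlagBits from the Vulkan headers; the helper name
and the choice to start from the six classic stages are assumptions, since
this message does not say which stages end up being advertised.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirror VkShaderStageFlagBits; redefined locally so the sketch
 * builds without the Vulkan headers. */
#define STAGE_VERTEX_BIT                  0x00000001u
#define STAGE_TESSELLATION_CONTROL_BIT    0x00000002u
#define STAGE_TESSELLATION_EVALUATION_BIT 0x00000004u
#define STAGE_GEOMETRY_BIT                0x00000008u
#define STAGE_FRAGMENT_BIT                0x00000010u
#define STAGE_COMPUTE_BIT                 0x00000020u

/* Hypothetical helper: start from the classic stages and strip the ones
 * the message lists as unsupported (geometry, tessellation, vertex). */
static inline uint32_t supported_stage_mask(void)
{
   const uint32_t classic =
      STAGE_VERTEX_BIT | STAGE_TESSELLATION_CONTROL_BIT |
      STAGE_TESSELLATION_EVALUATION_BIT | STAGE_GEOMETRY_BIT |
      STAGE_FRAGMENT_BIT | STAGE_COMPUTE_BIT;
   const uint32_t unsupported =
      STAGE_GEOMETRY_BIT | STAGE_TESSELLATION_CONTROL_BIT |
      STAGE_TESSELLATION_EVALUATION_BIT | STAGE_VERTEX_BIT;
   const uint32_t mask = classic & ~unsupported;

   /* Sanity check: nothing unsupported leaks into the result. */
   assert((mask & unsupported) == 0);
   return mask;
}
```

Under these assumptions the helper leaves only the fragment and compute bits
set.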
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32710>
Implement as_uniform with a simple mov, as the HW doesn't have
uniform registers (registers shared by all threads in the warp)
like some other hardware does.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32710>
If we determine that the number of varyings will fit within the 8-bit
offset of LD_VAR_BUF[_IMM], instruct the compiler to use it for varyings
and skip setting up Attribute Descriptors.
This should save a bit of memory and overhead in reading varyings.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32969>
Introduce a varying load count pass to get the maximum number of varying
loads from a fragment shader (prior to optimization passes), in order to
only allocate as many Attribute Descriptors as required. This will
generally lead to smaller buffers in SRT0 for fragment shaders.
As the number of ADs is now dynamic based on the shader, we need to
lower varying loads early for fragment shaders in v9+, as the number of
ADs will determine the offset for dummy_sampler, required during
nir_lower_descriptors.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32969>
The current implementation uses LD_VAR_BUF[_IMM] to look up varyings,
which limits the number of varying components to 64 due to an 8-bit
offset value.
As this does not align to maxVertexOutputComponents (128), this change
replaces the use of LD_VAR_BUF[_IMM] with LD_VAR[_IMM] + Attribute
Descriptors, which do not have this limitation.
As allocating Attribute Descriptors is potentially expensive, this can
be further optimized by falling back to LD_VAR_BUF[_IMM] in cases where
we can ensure we do not use more than 64 varying components.
This change currently does not change behavior for gallium/panfrost,
though that should be done as well.
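The 64-component limit and the fallback check can be sketched as below. This
is an illustration, not the driver code: it assumes the 8-bit offset counts
bytes and that every varying component is 32 bits wide, which is one way to
arrive at the limit of 64 quoted above. All names are hypothetical.

```c
#include <stdbool.h>

/* Assumption: the LD_VAR_BUF[_IMM] offset is a byte offset of 8 bits. */
#define LD_VAR_BUF_OFFSET_BITS 8u
/* Assumption: each varying component occupies 4 bytes (32-bit). */
#define BYTES_PER_COMPONENT    4u

/* Number of components addressable through the 8-bit offset. */
static inline unsigned ld_var_buf_max_components(void)
{
   return (1u << LD_VAR_BUF_OFFSET_BITS) / BYTES_PER_COMPONENT;
}

/* True when LD_VAR_BUF[_IMM] could be used; otherwise the shader would
 * need LD_VAR[_IMM] plus Attribute Descriptors. */
static inline bool varyings_fit_ld_var_buf(unsigned num_components)
{
   return num_components <= ld_var_buf_max_components();
}
```

With these assumptions, a shader using all 128 components allowed by
maxVertexOutputComponents would not fit and would take the Attribute
Descriptor path.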
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32969>
The fields "Attribute stride" and "Packet stride" are in the wrong
order, and "Packet stride" should not be shr() modified.
This has probably not shown up as an issue before due to the use of
LD_VAR_BUF[_IMM] for varyings, which does not require us to create
Attribute Descriptors with type vertex_packet.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32969>
The benefit of macros here is that they don't care about constness,
which is going to be beneficial once we tighten constness a bit here.
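To illustrate the constness point, here is a small sketch with hypothetical
names (struct desc, DESC_WORDS and the helpers are not taken from the tree).
A function accessor has to commit to one constness in its signature, while a
macro accessor yields whatever constness its argument has.

```c
/* Hypothetical descriptor type, for illustration only. */
struct desc {
   unsigned words[4];
};

/* Function form: parameter and return type fix the constness, so a
 * second, non-const variant would be needed for writers. */
static inline const unsigned *desc_words_const(const struct desc *d)
{
   return d->words;
}

/* Macro form: evaluates to `unsigned *` for a non-const argument and to
 * `const unsigned *` for a const one. */
#define DESC_WORDS(d) ((d)->words)

/* Reader: works through a const pointer. */
static inline unsigned read_first_word(const struct desc *d)
{
   return DESC_WORDS(d)[0];
}

/* Writer: the same macro works through a non-const pointer. */
static inline void write_first_word(struct desc *d, unsigned v)
{
   DESC_WORDS(d)[0] = v;
}
```

Both helpers share a single accessor, so tightening constness at a call site
does not require touching the accessor itself.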
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32851>
Ensure we've read all the relevant NIR state before freeing it for the
current shader.
Also ensure we free the shaders in the same order we compile them.
Fixes: d93f9d6d1a ("panvk: use static noperspective when statically linking VS and FS")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33011>
We're doing a better job at selecting the tiler hierarchy mask in PanVK,
so let's move that to common code and reuse it for the Gallium driver as
well.
The logic to disable the first level for large tile-sizes has been left
at the call-sites, because this is specific to V10 GPUs and later, so it
doesn't apply to the JM code-paths.
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32866>
* GLES3.x is only valid for x <= 2
* The expected error is GLXBadProfileARB, not BadValue
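A minimal sketch of the version check, with a hypothetical helper name. Only
the 3.x rule comes from this message; the 1.x and 2.x cases are filled in
from the list of published OpenGL ES versions (1.0, 1.1, 2.0). Per the
message, a caller rejecting a version would be expected to report
GLXBadProfileARB rather than BadValue.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true for OpenGL ES versions that exist. */
static inline bool gles_version_is_valid(int major, int minor)
{
   switch (major) {
   case 1:
      return minor == 0 || minor == 1;
   case 2:
      return minor == 0;
   case 3:
      /* GLES 3.x is only valid for x <= 2. */
      return minor >= 0 && minor <= 2;
   default:
      return false;
   }
}
```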
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33036>
This job is somehow failing to expand $RUNNER_TAG, and it seems to have
happened around the time that the last entry from the variables list was
removed.
Let's remove this, it's no longer needed anyway. And it seems to fix the
problem, so yay.
Fixes: 61d9c47944 ("ci/lava: Use CI_JOB_TIMEOUT instead of separate variable")
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33086>
As described in the comment, enabling the round_to_nearest_even results
in the upper 2^-9 of the texel i being sampled at i+1. This appears to
be allowed by the spec, but triggers a CTS bug[1]. Changing this behavior
is not necessary (we could fix the CTS), but is desirable regardless
because of the precision improvement.
[1]: https://gitlab.khronos.org/Tracker/vk-gl-cts/-/issues/5547
Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32985>
The documentation says that if we don't use force_delta, the LOD will be
-infinity for non-active lanes before bias and clamp are applied. This
is not what we want, so let's instead assume all threads are active, and
let helper-invocations do their job to compute correct values.
While this is only needed for the second iteration, let's just leave it
on for both for simplicity.
Fixes: e317136536 ("pan/va: Add support for nir_texop_lod")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33069>
Update expectation files for the test runs with kernel 6.13-rc4.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Reviewed-by: David Heidelberg <None>
Reviewed-by: Sergi Blanch Torné <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32788>
Move to 6.13-rc4 for all mesa-ci jobs except anv-jsl.
Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com>
Reviewed-by: David Heidelberg <None>
Reviewed-by: Sergi Blanch Torné <None>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32788>
Applications tend to forget to describe subpass dependencies, especially
when it comes to write -> read dependencies on attachments. The
proprietary driver forces "others" invalidation as a workaround, and this
invalidation even became implicit (done as part of the RUN_FRAGMENT) on
v13+.
We will consider adding a dri-conf hook for this option in the future,
but for now, let's just keep it as an opt-in debug flag.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33056>
Input attachment reads are lowered to image reads and thus require
a flush of the read-only L1 caches.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33056>
The expansion of DUMP_CL is missing parentheses, making the dumping of
descriptors incorrect.
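The general class of bug looks like the sketch below. These are not the
DUMP_CL macros themselves, just a hypothetical pair showing how a macro
without parentheses lets operator precedence change the result.

```c
/* Without parentheses, the argument is pasted as-is, so
 * WORDS_TO_BYTES_BAD(a + b) expands to `a + b * 4`. */
#define WORDS_TO_BYTES_BAD(n)  n * 4

/* Parenthesizing both the argument and the whole expansion keeps the
 * intended grouping: `((a + b) * 4)`. */
#define WORDS_TO_BYTES_GOOD(n) ((n) * 4)

static inline int bad_size(int a, int b)
{
   return WORDS_TO_BYTES_BAD(a + b);
}

static inline int good_size(int a, int b)
{
   return WORDS_TO_BYTES_GOOD(a + b);
}
```

For a = 1 and b = 2, the first form yields 9 while the second yields the
intended 12.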
Fixes: 3b69edf825 ("pan/genxml: Enforce explicit packed types on pan_[un]pack")
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33040>
Right now, we have a problem when we flush draws inside a render pass
and we don't have enough information to re-emit the framebuffer/tiler
descriptors.
Turns out the only situation where this happens is when an occlusion
query ends, but we shouldn't really flush the draws in that case.
What we should do instead is record the OQ in our command buffer, so we
can signal OQ availability when the fragment job is done.
In order to solve that, we add an OQ chain to the command buffer to
track OQs ending inside the render pass. We then walk this chain at
fragment job emission time to signal the syncobjs attached to each
query.
This also simplifies the whole occlusion query synchronization model:
instead of waiting for each syncobj individually, we now wait on
the iterators to make sure all OQs have landed. Thanks to this new
synchronization, we can batch OQ reset/copy operations and make the
command stream a lot shorter when big query ranges are copied/reset.
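The chain idea can be sketched as a singly linked list. This is a simplified
model with hypothetical names, not the panvk data structures: a bool stands
in for the syncobj, and "emitting the fragment job" just walks the chain and
flags each query as available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical query node; `available` stands in for the syncobj. */
struct oq_node {
   struct oq_node *next;
   bool available;
};

/* Hypothetical command buffer holding the chain of OQs that ended
 * inside the current render pass. */
struct cmdbuf {
   struct oq_node *oq_chain;
};

/* Query end: record the OQ instead of flushing the draws. */
static inline void cmdbuf_end_query(struct cmdbuf *cb, struct oq_node *q)
{
   assert(cb != NULL && q != NULL);
   q->available = false;
   q->next = cb->oq_chain;
   cb->oq_chain = q;
}

/* Fragment job emission: walk the chain, signal every recorded query,
 * then reset the chain. Returns the number of queries signaled. */
static inline unsigned cmdbuf_emit_fragment_job(struct cmdbuf *cb)
{
   unsigned signaled = 0;
   struct oq_node *q = cb->oq_chain;

   while (q != NULL) {
      struct oq_node *next = q->next;
      q->available = true;
      q->next = NULL;
      signaled++;
      q = next;
   }
   cb->oq_chain = NULL;
   return signaled;
}
```

In this model, ending a query never forces a flush; availability is only
published once the fragment job for the render pass has been emitted.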
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
We have wrappers distinguishing staging registers from scratch registers,
so let's use cs_sr_reg64() here.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
The SYSTEM scope triggers CPU interrupts we don't really need, so let's
use the CSG scope to avoid those. Note that the scope doesn't encode
the visibility aspect: changes to a sync object with a CSG scope are
still instantly visible to the CPU, but the CPU needs to poll the value
to detect a change. That is basically what we're already doing for
syncobjs attached to events/queries, so we're good.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
Let's prevent clang-format from adding the semicolon on a new line when
we use cs_{continue,break}();
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
We already do that in the other cs_emit(b, BRANCH, I), so let's fix this
path too.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
The increment was wrong, which ended up generating a lot more stores
than we need.
Fixes: bf05842a8d ("pan/cs: Add an event-based tracing mechanism")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32973>
The spec says:
    vkCmdCopyQueryPoolResults is considered to be a transfer operation,
    and its writes to buffer memory must be synchronized using
    VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT before
    using the results.
While STORE_MULTIPLE is not exactly VK_PIPELINE_STAGE_TRANSFER_BIT /
VK_ACCESS_TRANSFER_WRITE_BIT, we can still rely on user barriers to do
the right thing (e.g., flush caches for host access).
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
When VK_QUERY_RESULT_WAIT_BIT is set, we rely on sync wait. When
VK_QUERY_RESULT_WAIT_BIT is not set, no wait is needed.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
We can guarantee ordering with this sequence of async cmds:
    RUN_FRAGMENT ->
    (signal and wait SB_ITER) ->
    FLUSH_CACHE2 ->
    (signal and wait DEFERRED_FLUSH) ->
    SYNC_SET32
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
The spec says
    VUID-vkCmdBeginQueryIndexedEXT-None-00807
    All queries used by the command must be unavailable
and panvk_cmd_reset_occlusion_queries is synchronous.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
The spec says:
    Resetting a query via vkCmdResetQueryPool or vkResetQueryPool sets the
    status to unavailable and makes the numerical results undefined.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
The spec says:
    VUID-VkQueryPoolCreateInfo-queryCount-02763
    queryCount must be greater than 0
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
The spec says:
    After query pool creation, each query is in an uninitialized state and
    must be reset before it is used.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32697>
This switches all __gen_unpack functions to macros to keep address space
information when working with OpenCL C.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32962>
Because of OpenCL C, we need a way to retain the address space
information carried by the pointers.
As a result, this switches all [un]pack functions to macros, so that
pointers retain their respective address space information.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32962>