So far, we compiled shaders to QPU code when the pipeline was created
(as part of vkCreateGraphicsPipelines).
But this is not correct when some specific descriptors are involved,
like textures. In that case some NIR lowerings depend on the texture
format, and that info is not available until the specific descriptors
are bound to the command buffer. Likewise, the same command buffer
with a given pipeline could get its descriptors bound again.
So we need to support compilation variants of the same shader. The
v3d_key then works as the key, with the variants tracked in a hash
table.
This commit introduces the new structures for that. What we were
building as the final QPU shader becomes the initial default variant
for the pipeline. We also save the keys used at that point, to avoid
fully regenerating them when a new variant is created. This is not
just for performance, but also to avoid needing to track the graphics
pipeline create info structure.
The code to update the current variant will come in following
commits.
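As a rough illustration of the variant tracking described above, the
sketch below keeps compiled variants in a small hash table keyed by the
raw bytes of a key struct. All names here (shader_key, shader_variant,
variant_cache) are hypothetical and heavily simplified; they are not the
structures this commit adds, and the real key carries far more state.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a v3d_key: only state that affects codegen.
 * It has no padding, so hashing and comparing raw bytes is safe. */
struct shader_key {
   uint32_t tex_format[4];   /* format of each bound texture */
};

/* Hypothetical compiled variant; a real one would own the QPU assembly. */
struct shader_variant {
   struct shader_key key;
   int id;                   /* order in which the variant was created */
   int used;
};

#define MAX_VARIANTS 16

struct variant_cache {
   struct shader_variant slots[MAX_VARIANTS];
   int count;
};

/* FNV-1a over the raw key bytes. */
static uint32_t
key_hash(const struct shader_key *key)
{
   const uint8_t *bytes = (const uint8_t *)key;
   uint32_t hash = 2166136261u;
   for (size_t i = 0; i < sizeof(*key); i++) {
      hash ^= bytes[i];
      hash *= 16777619u;
   }
   return hash;
}

/* Returns the variant matching the key, creating one on a miss.
 * Uses linear probing; returns NULL when the table is full. */
static struct shader_variant *
variant_cache_get(struct variant_cache *cache, const struct shader_key *key)
{
   assert(cache != NULL && key != NULL);
   uint32_t start = key_hash(key) % MAX_VARIANTS;
   for (uint32_t i = 0; i < MAX_VARIANTS; i++) {
      struct shader_variant *slot = &cache->slots[(start + i) % MAX_VARIANTS];
      if (!slot->used) {
         /* Miss: this is where a new variant would be compiled. */
         slot->used = 1;
         slot->key = *key;
         slot->id = cache->count++;
         return slot;
      }
      if (memcmp(&slot->key, key, sizeof(*key)) == 0)
         return slot;
   }
   return NULL;
}
```

Looking up the same key twice returns the same variant, so a rebind
with unchanged texture formats would not trigger a recompile.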
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This includes SAMPLER, COMBINED_IMAGE_SAMPLER and SAMPLED_IMAGE
descriptors.
In order to support them we pre-pack TEXTURE_SHADER_STATE and
SAMPLER_STATE when images and samplers (respectively) are created.
Those packets don't need to be tweaked later, so we upload them to a
bo.
A possible improvement would be for the descriptor pool to manage a
single bo for all descriptors, suballocating from it for each
descriptor allocated. This is what other drivers do (and as far as I
understand, one of the reasons for having a descriptor pool).
Immutable samplers are not supported yet; they will be handled in a
following patch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
v3dv_descriptor is going to be expanded with more data, so it no
longer makes sense to handle a fake descriptor for the push
constants. This introduces a new struct that is just a bo/offset
pair. It is initially named v3dv_resource, as it could be the base to
reuse bos for different resources (like the assembly bo).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were configuring the TLB to use ABGR1555, but that doesn't really
give us what we want. There were two issues:
* We were using the wrong Texture Data Format and Output Image
Format. In fact, the ones we need were not included in the
packet file.
* Even with the correct ones, we need to do an RB swap to match the
semantics of the Vulkan format.
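The second point can be illustrated with a small sketch. The bit layout
of VK_FORMAT_A1R5G5B5_UNORM_PACK16 is taken from the Vulkan spec; the
A, B, G, R layout is only an assumed example of a format whose red and
blue channels sit in each other's place, and all function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct rgba5551 {
   uint8_t r, g, b, a;   /* r, g, b in [0, 31], a in [0, 1] */
};

/* VK_FORMAT_A1R5G5B5_UNORM_PACK16:
 * A in bit 15, R in bits 14..10, G in bits 9..5, B in bits 4..0. */
static uint16_t
pack_a1r5g5b5(struct rgba5551 c)
{
   assert(c.r < 32 && c.g < 32 && c.b < 32 && c.a < 2);
   return (uint16_t)((c.a << 15) | (c.r << 10) | (c.g << 5) | c.b);
}

/* Assumed layout with red and blue exchanged:
 * A in bit 15, B in bits 14..10, G in bits 9..5, R in bits 4..0. */
static uint16_t
pack_a1b5g5r5(struct rgba5551 c)
{
   assert(c.r < 32 && c.g < 32 && c.b < 32 && c.a < 2);
   return (uint16_t)((c.a << 15) | (c.b << 10) | (c.g << 5) | c.r);
}

/* Swizzle that exchanges the red and blue channels. */
static struct rgba5551
swap_rb(struct rgba5551 c)
{
   struct rgba5551 swapped = { c.b, c.g, c.r, c.a };
   return swapped;
}
```

Packing the swapped color with the exchanged layout yields the same
bits as packing the original color with the Vulkan layout, which is
the effect the RB swap is meant to achieve.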
This patch fixes both issues. While we are here, we keep the formats
we were already using, which provide support for r5g5b5a1.
So this patch makes tests like the following go from skip to pass:
dEQP-VK.texture.filtering.2d.formats.r5g5b5a1_unorm.nearest
And the following test from fail to pass:
dEQP-VK.texture.filtering.2d.formats.a1r5g5b5_unorm.nearest
Note that R5G5B5A1_UNORM_PACK16 is not mandatory, but as we already
made the effort to understand these formats and get them working,
let's just keep it on the list.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This fixes multi-layer vkCmdClearAttachments CTS tests. The underlying
problem here is that even though this command runs inside a render pass,
it is implemented as a separate job that emits its own RCL to program
render target color clears, so we should not emit the subpass RCL for it.
Fixes 250+ CTS tests (all but a1r5g5b5) in:
dEQP-VK.api.image_clearing.core.clear_color_attachment.cube_layers.*
dEQP-VK.api.image_clearing.core.clear_color_attachment.multiple_layers.*
dEQP-VK.api.image_clearing.core.clear_depth_stencil_attachment.multiple_layers.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We had changed the interface for job starts so they take the subpass index
rather than a boolean indicating whether the job starts a new subpass, but we
forgot to update this accordingly.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were not considering that the depth of the image is minified according
to its miplevel. For some reason this only seemed to show up for tiled
images.
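A minimal sketch of the minification involved: each dimension of a 3D
image, depth included, halves per mip level and never goes below 1. The
extent3d struct and both helpers are hypothetical names used only for
this example.

```c
#include <assert.h>
#include <stdint.h>

struct extent3d {
   uint32_t width, height, depth;
};

/* Size of one dimension at a given mip level, clamped to at least 1. */
static uint32_t
minify(uint32_t size, uint32_t level)
{
   assert(level < 32);
   uint32_t minified = size >> level;
   return minified > 0 ? minified : 1;
}

/* Extent of a 3D image at a given mip level. Note that depth is
 * minified too, not just width and height. */
static struct extent3d
mip_extent(struct extent3d base, uint32_t level)
{
   struct extent3d e = {
      .width = minify(base.width, level),
      .height = minify(base.height, level),
      .depth = minify(base.depth, level),
   };
   return e;
}
```

For a 64x32x8 base image, level 2 is 16x8x2, so clearing 8 slices at
that level would touch memory outside the miplevel.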
Fixes (except a1r5g5b5 format):
dEQP-VK.api.image_clearing.core.clear_color_image.3d.optimal.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
If a format is not supported by the TLB, we can still use the TLB path
if we set up the render target using a compatible format. The only caveat
is that for clears we need to pack the clear value using the original
format of the underlying image, not the compatible format.
With this change we get to use the TLB path successfully for all supported
image formats (except a1r5g5b5, at least for now) so long as the region starts
at (0,0), and we only need to consider fallback paths for partial copies
and clears, not because of the format.
This gets us to pass a few hundred extra tests in:
dEQP-VK.api.image_clearing.core.clear_color_image.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This is required to pass line rasterization tests in CTS while exposing
at least 4 bits of subpixel precision, which is the minimum required
by the spec. We are currently exposing 6 bits. However, if we select
diamond exit instead of perp end caps rasterization, we'd still fail one
of the tests even if we lowered subpixel precision to 4 bits.
Fixes:
dEQP-VK.rasterization.flatshading.line_strip
dEQP-VK.rasterization.flatshading.lines
dEQP-VK.rasterization.interpolation.basic.line_strip
dEQP-VK.rasterization.interpolation.basic.lines
dEQP-VK.rasterization.interpolation.projected.line_strip
dEQP-VK.rasterization.interpolation.projected.lines
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The name suggests that this method emits the full graphics pipeline,
but that is not the case (e.g. scissor is emitted at a different
point).
Right now that method mostly emits the gl_shader state plus some
other packets. So we just rename it to emit_gl_shader_state, and move
the other packet emission to new emission methods.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
First this makes it so we only clear dirty stencil state if we actually
emit the stencil packets. Second, now we check if we need to emit stencil
whenever a new pipeline is bound, since a new pipeline may not change the
dynamic stencil state but might still be changing other aspects of stencil,
which means that even if the dynamic stencil state is not dirty, we might
still need to emit new stencil packets.
This fixes a regression in VkRunner test depth-buffer.vk_shader_test after
we dropped the redundant emission of stencil state, since that redundant
emission was happening unconditionally whenever we had a new pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The current implementation assumed that we would clear all dirty state
after we have emitted a pipeline, but that is not always true. In
particular, we don't emit blend constants unless we need them, so we
can't clear their dirty bit until we have bound a pipeline that actually
requires them.
The change implemented here has individual emit functions clear the
pipeline states they handle as they emit the corresponding state, and we
clear the dirty pipeline bit at the end.
This fixes some CTS pipeline blend tests where we have multiple draws
with blending and only some of them require blend constants. In this case,
the original behavior would clear the blend constants dirty bit on draw
calls that don't actually emit blend constants (because they don't need
them), making the later draw calls that do need them fail because they
don't emit them either (since the previous draw calls cleared the dirty
bit).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Not quite sure why this is required though. Conversion from/to
sRGB happens on tile loads and stores, with the tile buffer
being always linear, so there should be no difference.
Fixes all test failures in:
dEQP-VK.pipeline.blend.format.r8g8b8a8_srgb.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In this mode, which can be activated with V3D_DEBUG=always_flush like
in the GL driver, we flush every draw call separately. For now this
is useful for debugging, but we can also set the flag internally on
specific jobs when we identify scenarios where we need the same behavior.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
It is valid to submit with an empty list of command buffers. However,
we still need to wait on the pWaitSemaphores provided and only signal
the pSignalSemaphores and fence once we have finished waiting on them
to honor the semantics of the submission.
Because waiting and signaling happens in the kernel, the easiest way
to do this is to submit a trivial no-op job to the GPU. To do this,
we need to refactor some of our code so that code that might have been
operating on a command buffer starts operating on a job instead, so we
can reuse most of our infrastructure to create the no-op job.
Additionally, because no-op jobs are created internally by the driver,
we are responsible for destroying them too. For this, we bind a fence
to each no-op job we submit and we test for completion of in-flight
no-op jobs (and destroy them if completed) every time vkQueueSubmit
is called.
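The cleanup of in-flight no-op jobs could look roughly like the sketch
below. The noop_job and queue structs are hypothetical, and the
fence_signaled flag is an assumption standing in for querying a real
fence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical no-op job owned by the driver. */
struct noop_job {
   bool fence_signaled;      /* stand-in for a real fence status query */
   struct noop_job *next;
};

struct queue {
   struct noop_job *noop_jobs;   /* in-flight no-op jobs */
};

/* Creates a no-op job and tracks it on the queue. Returns NULL if the
 * allocation fails. */
static struct noop_job *
queue_add_noop_job(struct queue *q)
{
   assert(q != NULL);
   struct noop_job *job = calloc(1, sizeof(*job));
   if (!job)
      return NULL;
   job->next = q->noop_jobs;
   q->noop_jobs = job;
   return job;
}

/* Destroys every no-op job whose fence has signaled and returns the
 * number of jobs still in flight. Meant to run on every submit. */
static int
queue_reap_noop_jobs(struct queue *q)
{
   assert(q != NULL);
   int in_flight = 0;
   struct noop_job **link = &q->noop_jobs;
   while (*link) {
      struct noop_job *job = *link;
      if (job->fence_signaled) {
         *link = job->next;   /* unlink before freeing */
         free(job);
      } else {
         in_flight++;
         link = &job->next;
      }
   }
   return in_flight;
}
```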
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
There are a handful of tests that simulate 'out of memory' situations
during swapchain image creation, and these can lead to failed job
allocations when the driver is running on the prime blit path, as that
involves creating a command buffer. The tests expect us to handle this
scenario gracefully and return an appropriate OOM error as a result.
This makes sure we don't try to dereference a job if we failed to allocate
it, so we don't crash and can return the OOM error gracefully.
Fixes:
dEQP-VK.wsi.xlib.swapchain.simulate_oom.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Vulkan has Z NDC range in [0, 1], but we were using OpenGL's [-1, 1].
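The difference between the two conventions can be sketched as follows.
Vulkan maps NDC z in [0, 1] onto [minDepth, maxDepth], while OpenGL maps
NDC z in [-1, 1] onto the depth range. The struct and function names are
made up for this example and are not the driver's.

```c
#include <assert.h>

/* Window-space z is computed as z_ndc * scale + translate. */
struct z_transform {
   float scale;
   float translate;
};

/* Vulkan: NDC z in [0, 1] maps to [min_depth, max_depth]. */
static struct z_transform
vk_z_transform(float min_depth, float max_depth)
{
   /* Core Vulkan requires both values to be in [0, 1]. */
   assert(min_depth >= 0.0f && min_depth <= 1.0f);
   assert(max_depth >= 0.0f && max_depth <= 1.0f);
   struct z_transform t = { max_depth - min_depth, min_depth };
   return t;
}

/* OpenGL: NDC z in [-1, 1] maps to [z_near, z_far]. */
static struct z_transform
gl_z_transform(float z_near, float z_far)
{
   struct z_transform t = { (z_far - z_near) / 2.0f,
                            (z_far + z_near) / 2.0f };
   return t;
}

static float
z_apply(struct z_transform t, float z_ndc)
{
   return z_ndc * t.scale + t.translate;
}
```

With an inverted range (min_depth = 1, max_depth = 0) the Vulkan scale
is simply negative, which is the case the tests below exercise.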
Fixes:
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_deltasmall
dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_deltaone
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In this scenario we can end up generating a clip window where
the max coordinates are smaller than the min coordinates and the simulator
asserts.
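One way to avoid the inverted window is sketched below: intersect the
viewport and scissor, then collapse the result to an empty window if
they do not overlap. The box struct and helper names are hypothetical,
and this is not necessarily how the driver handles it.

```c
#include <assert.h>
#include <stdint.h>

/* Rectangle given by its min and max corners. */
struct box {
   int32_t min_x, min_y, max_x, max_y;
};

static int32_t
max_i32(int32_t a, int32_t b)
{
   return a > b ? a : b;
}

static int32_t
min_i32(int32_t a, int32_t b)
{
   return a < b ? a : b;
}

/* Intersection of viewport and scissor, guaranteed to have
 * max >= min on both axes. */
static struct box
clip_window(struct box viewport, struct box scissor)
{
   assert(viewport.min_x <= viewport.max_x);
   assert(viewport.min_y <= viewport.max_y);
   struct box clip = {
      .min_x = max_i32(viewport.min_x, scissor.min_x),
      .min_y = max_i32(viewport.min_y, scissor.min_y),
      .max_x = min_i32(viewport.max_x, scissor.max_x),
      .max_y = min_i32(viewport.max_y, scissor.max_y),
   };
   /* Disjoint rectangles give max < min; collapse to an empty window. */
   if (clip.max_x < clip.min_x)
      clip.max_x = clip.min_x;
   if (clip.max_y < clip.min_y)
      clip.max_y = clip.min_y;
   return clip;
}
```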
Fixes:
dEQP-VK.draw.scissor.dynamic_scissor_outside_viewport
dEQP-VK.draw.scissor.static_scissor_outside_viewport
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were not doing this right for images created with VK_IMAGE_TILING_LINEAR.
Also, only assign a DRM modifier if the image has been created for WSI.
This fixes a bunch of CTS tests that use copies to linear images to verify
the result of rendering.
Fixes multiple failures in:
dEQP-VK.draw.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This triggers when dumping CLIF because the dump process involves
internally mapping all the BOs. We could unmap them there after we
are done, but there is really no reason why we need to assert on this,
so let's just keep things simple and unmap. If the user is really
double mapping, that should be caught by the validation layers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
It is possible to specify a depth/stencil clear but then have no
actual depth/stencil attachment used in the subpass. In that case
we are already skipping the store.
Fixes:
dEQP-VK.renderpass.suballocation.unused_clear_attachments.*depthonly_unused
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For some reason our backend compiler doesn't have an implementation for
usubborrow, only for uaddcarry. We could add it, however, the existing
uaddcarry implementation also seems to fail some of the CTS tests,
which pass if we lower.
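For reference, lowering these operations amounts to expressing the
carry and borrow with plain integer compares. The sketch below shows
that form in C for 32-bit values; the function names are hypothetical,
and check_lowering only compares the result against 64-bit arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Carry out of a 32-bit unsigned add: the wrapped sum is smaller than
 * an operand exactly when the add overflowed. */
static uint32_t
lowered_uadd_carry(uint32_t a, uint32_t b)
{
   return (uint32_t)((uint32_t)(a + b) < a);
}

/* Borrow of a 32-bit unsigned subtract: needed exactly when a < b. */
static uint32_t
lowered_usub_borrow(uint32_t a, uint32_t b)
{
   return (uint32_t)(a < b);
}

/* Cross-checks both lowerings against 64-bit arithmetic. */
static void
check_lowering(uint32_t a, uint32_t b)
{
   uint64_t wide_sum = (uint64_t)a + (uint64_t)b;
   uint64_t wide_diff = (uint64_t)a - (uint64_t)b;
   assert(lowered_uadd_carry(a, b) == (uint32_t)(wide_sum >> 32));
   assert(lowered_usub_borrow(a, b) == (uint32_t)(wide_diff >> 63));
}
```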
Fixes:
dEQP-VK.glsl.builtin.function.integer.uaddcarry.*
dEQP-VK.glsl.builtin.function.integer.usubborrow.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
They still need some review to settle on real final values, but what we
had before was somewhat too low, so increase them a little. This
allows some CTS tests to go from skip to pass, and as far as I can see
those tests are using reasonable values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>