To do this we just implement the stencil blit as a masked color blit
with uint8 format. This allows us to support blitting on combined
depth/stencil formats and, therefore, partial image copies for
these formats.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For this we use blits with nearest filtering and choose a compatible
format for the render target if the copy format is not renderable.
This works for all supported formats except combined depth/stencil
(for which we don't support blitting for now) and compressed formats.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
So far we were just asserting or aborting if any of the internal
methods used during pipeline creation failed.
We needed to change the return value of several methods, in order to
bubble up the proper memory allocation error.
Note that, as pipeline creation is complex and split across several
methods, if an error happens in the middle it just returns and relies
on the higher level to call PipelineDestroy. That method needs to take
into account that some of the resources might not have been allocated
when freeing them.
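A minimal sketch of that error-propagation pattern follows. All names
here (result_t, struct pipeline, alloc_stage, the fail_at parameter) are
hypothetical and only illustrate the idea; they are not the driver's
actual types or functions.

```c
#include <stdlib.h>

/* Hypothetical result codes, for illustration only. */
typedef enum {
   RESULT_SUCCESS = 0,
   RESULT_OUT_OF_HOST_MEMORY = -1
} result_t;

/* Hypothetical pipeline with two separately allocated resources. */
struct pipeline {
   void *vs_variant;
   void *fs_variant;
};

/* An internal step that reports failure instead of asserting. */
static result_t alloc_stage(void **out, size_t size, int simulate_oom)
{
   *out = simulate_oom ? NULL : malloc(size);
   return *out ? RESULT_SUCCESS : RESULT_OUT_OF_HOST_MEMORY;
}

/* Destroy tolerates a partially created pipeline: free(NULL) is a no-op. */
static void pipeline_destroy(struct pipeline *p)
{
   if (!p)
      return;
   free(p->vs_variant);
   free(p->fs_variant);
   free(p);
}

/* fail_at: 0 = no failure, 1 = first step fails, 2 = second step fails. */
static result_t pipeline_create(struct pipeline **out, int fail_at)
{
   *out = NULL;

   struct pipeline *p = malloc(sizeof(*p));
   if (!p)
      return RESULT_OUT_OF_HOST_MEMORY;
   p->vs_variant = NULL;
   p->fs_variant = NULL;

   result_t result = alloc_stage(&p->vs_variant, 64, fail_at == 1);
   if (result == RESULT_SUCCESS)
      result = alloc_stage(&p->fs_variant, 64, fail_at == 2);

   if (result != RESULT_SUCCESS) {
      /* The top level cleans up whatever was allocated so far. */
      pipeline_destroy(p);
      return result;
   }

   *out = p;
   return RESULT_SUCCESS;
}
```

The inner steps only return an error code; cleanup is centralized in
the destroy function, which is why it must cope with NULL members.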
Also note that v3dv_get_shader_variant is used during pipeline
bind since, with the new resources bound, we need to check whether we
need to compile a new variant. At that moment we are not creating a new
Vulkan object, so we can't really return an OOM error. For now we just
assert in that case.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This was allocating image views on the stack, which was kind of
hackish, and of course it assumed that allocated Vulkan objects
could be freed immediately after being recorded in the command buffer,
which is not always safe to do in the general case (even if it was
here). This makes things more consistent and reliable.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This uses the framework to register private command buffer objects
that get freed automatically when the command buffer is destroyed by
the application.
This change also moves the descriptor set pool that the meta blit path
uses to allocate descriptors for the blit source textures, from the
device to the command buffer, so we can have a descriptor pool per
command buffer. This is necessary to ensure correct behavior when
doing multi-threaded command buffer recording (alternatively, we would
have to lock around the descriptor set allocation code, which would be
undesirable).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This allows the driver to register private Vulkan objects it creates as part
of command buffer recording (usually for meta operations) in the command
buffer, so they can be destroyed together with it.
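A rough sketch of such a registration mechanism, assuming a simple
linked list of (object, destroy callback) pairs. The struct and function
names below are hypothetical and do not correspond to the driver's real
API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical device that just counts destroyed objects. */
struct device {
   int destroyed;
};

typedef void (*destroy_cb)(struct device *device, uint64_t obj);

struct private_obj {
   uint64_t obj;              /* opaque handle of the private object */
   destroy_cb destroy;        /* how to destroy it */
   struct private_obj *next;
};

struct cmd_buffer {
   struct device *device;
   struct private_obj *private_objs;
};

/* Example destroy callback: a real one would free the Vulkan object. */
static void destroy_counted(struct device *device, uint64_t obj)
{
   (void)obj;
   device->destroyed++;
}

/* Registers a private object; returns 0 on success, -1 on OOM. */
static int cmd_buffer_add_private_obj(struct cmd_buffer *cb, uint64_t obj,
                                      destroy_cb destroy)
{
   struct private_obj *p = malloc(sizeof(*p));
   if (!p)
      return -1;
   p->obj = obj;
   p->destroy = destroy;
   p->next = cb->private_objs;
   cb->private_objs = p;
   return 0;
}

/* Called when the command buffer is destroyed. */
static void cmd_buffer_destroy_private_objs(struct cmd_buffer *cb)
{
   while (cb->private_objs) {
      struct private_obj *p = cb->private_objs;
      cb->private_objs = p->next;
      p->destroy(cb->device, p->obj);
      free(p);
   }
}
```

With this shape, a meta operation registers each object it creates
during recording and never has to free it explicitly.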
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The hardware can't do sampling from raster depth/stencil textures and
1D images are always raster, even if the user requested optimal tiling.
Using an image as the source of a blit is a transfer source operation,
so we can't expose that either, as blitting involves sampling.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In this case the hardware seems to copy the bits that actually fit
in the destination instead of clamping to the maximum value allowed
by the bit size of the destination components like Vulkan expects.
Fix this by adding code to clamp the color results to the bit size
of the destination components.
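The clamping itself can be sketched as below, assuming component sizes
between 1 and 32 bits. The helper names are hypothetical; in the driver
the equivalent logic would live in the blit shader rather than in CPU
code. For example, writing 300 to an 8-bit unsigned component should
produce 255, whereas keeping only the bits that fit would produce 44.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an unsigned value to the range of a `bits`-wide component. */
static uint32_t clamp_uint_to_bits(uint32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 32);
   uint32_t max = bits == 32 ? UINT32_MAX : (1u << bits) - 1u;
   return value > max ? max : value;
}

/* Clamp a signed value to the range of a `bits`-wide component. */
static int32_t clamp_sint_to_bits(int32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 32);
   if (bits == 32)
      return value;
   int32_t max = (int32_t)((1u << (bits - 1)) - 1u);
   int32_t min = -max - 1;
   if (value > max)
      return max;
   if (value < min)
      return min;
   return value;
}
```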
It should be noted that this is a general issue with the hardware,
and while we can fix it here for blits done by the driver, user
shaders writing outside the range of the destination bit size will
have the same issue and we probably don't want to add code to clamp
every single render target write in every shader with integer format.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The pipeline stages have a reference to the NIR code produced from
the SPIR-V shader modules, but they never destroy it.
It should also be noted that our coordinate shader stage was sharing
the NIR with the vertex shader stage, which is kind of tricky to handle
and probably very error prone. Just make sure each pipeline stage
owns its NIR shader and that we always free it when the stage is
destroyed.
Also, for the case of NIR modules created by the driver internally,
we always need to clone them, since the driver will destroy the NIR
as soon as it is done creating pipelines with it. We could also not
clone it and let the pipeline stage take ownership of the NIR code for
NIR modules, but that would be inconsistent with how ownership works for
SPIR-V modules.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This needs to be updated every time we bind a new pipeline, but we can
bind a pipeline and not have an actual job yet, so we want to postpone
this until we actually need to emit CFG_BITS, during the pre-draw
setup.
Also, rename the update helper to be about the job rather than the command
buffer, since it is updating state that we track per job.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were asserting that we had a valid subpass index, but we can have
meta operations that run outside a render pass, such as for blitting.
If we allow this, then we also need to account for the fact that
pipelines can be bound outside a render pass too.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
While very limited in scope, this might be the most efficient way to blit
when applicable. In fact, we might also want to use this for the image copy
commands when possible instead of the TLB.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This is similar to the scenario where we have a submit without
any command buffers: even if we don't have any actual GPU work to do,
we still might need to signal fences/semaphores and possibly wait on
previous jobs to finish, so we need to submit something to the kernel
to get all that done right.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The design for queries in Vulkan requires that some commands execute
on the GPU as part of a command buffer. Unfortunately, V3D doesn't
really have support for this, which means that we need to execute them
on the CPU while still making it look as if they happened
inside the command buffer from the point of view of the user, which
adds a certain amount of hassle.
The above means that in some cases we need to do CPU waits for certain
parts of the command buffer to execute so we can then run the CPU
code. For example, we need to wait before executing a query reset
just in case the GPU is using the queries, and we have to do a CPU wait
for previous GPU jobs to complete before copying query results if the
user has asked us to do that. In the future, we may want to have a
submission thread instead so we don't block the main thread in these
scenarios.
Because we now need to execute some tasks on the CPU as part of a
command buffer, this introduces the concept of job types: there is one
type for all GPU jobs, and then one type for each kind of job
that needs to execute on the CPU. CPU jobs are executed by the queue
in order just like GPU jobs, only that they are exclusively CPU tasks.
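As an illustration, the job type dispatch could look roughly like the
sketch below. The enum values, structs and the simulated GPU wait are all
hypothetical, and the sketch simplifies things by making every CPU job
wait for the GPU, which the real code only does when required.

```c
#include <stddef.h>

/* One type for all GPU jobs, one per kind of CPU job (names assumed). */
enum job_type {
   JOB_TYPE_GPU,
   JOB_TYPE_CPU_RESET_QUERIES,
   JOB_TYPE_CPU_COPY_QUERY_RESULTS,
};

struct job {
   enum job_type type;
   int order;                /* position in which the queue ran the job */
};

struct queue {
   int gpu_jobs_in_flight;   /* simulated count of pending GPU jobs */
   int next_order;
};

/* Stand-in for a CPU wait on previously submitted GPU work. */
static void queue_wait_for_gpu(struct queue *q)
{
   q->gpu_jobs_in_flight = 0;
}

/* Jobs run strictly in order, whether they target the GPU or the CPU. */
static void queue_submit(struct queue *q, struct job *jobs, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      switch (jobs[i].type) {
      case JOB_TYPE_GPU:
         q->gpu_jobs_in_flight++;   /* would be a kernel submit */
         break;
      case JOB_TYPE_CPU_RESET_QUERIES:
      case JOB_TYPE_CPU_COPY_QUERY_RESULTS:
         queue_wait_for_gpu(q);     /* previous GPU work must finish */
         /* ... the CPU task itself would run here ... */
         break;
      }
      jobs[i].order = q->next_order++;
   }
}
```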
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Most of our state doesn't carry over across jobs, so it needs to be re-emitted.
For example, if we have two render passes running back to back using the
same pipeline, the application could decide to only bind the vertex buffer
and/or the pipeline just once, but as soon as we record the second render
pass and create a new job for it we will need to re-emit the shader state
record for it just because it is a new job.
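A small sketch of the idea, assuming dirty state is tracked as a bitmask
that is fully set whenever a new job starts. The flag names, the struct
and the emit counter are hypothetical and only model the behavior
described above.

```c
#include <stdint.h>

/* Hypothetical dirty flags. */
#define DIRTY_PIPELINE       (1u << 0)
#define DIRTY_VERTEX_BUFFER  (1u << 1)
#define DIRTY_ALL            (~0u)

struct cmd_state {
   uint32_t dirty;
   int shader_record_emits;   /* how many times we emitted the record */
};

static void bind_pipeline(struct cmd_state *s)
{
   s->dirty |= DIRTY_PIPELINE;
}

/* State does not carry over across jobs, so everything becomes dirty. */
static void start_job(struct cmd_state *s)
{
   s->dirty = DIRTY_ALL;
}

/* Pre-draw setup: emit only what is dirty, then clear the flags. */
static void draw(struct cmd_state *s)
{
   if (s->dirty & (DIRTY_PIPELINE | DIRTY_VERTEX_BUFFER))
      s->shader_record_emits++;
   s->dirty = 0;
}
```

In this model a second draw within the same job emits nothing, while
the first draw in a new job re-emits the record even though the
application never rebound the pipeline.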
We could probably only do this for jobs inside a render pass, since those
are the only ones that actually draw something and need to care about
dirty state. However, there is no harm in doing this for all jobs, since
jobs that don't draw never look at the dirty state.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were enabling VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT for
any format valid for texturing, but for example, right now we don't
support linear filtering on any depth format.
This is needed to get some hundreds of tests like this:
dEQP-VK.pipeline.sampler.view_type.1d.format.r32g32_sfloat.mag_filter.linear
properly skipped (those were all Crashes with the simulator, and
almost all Fails with the real device).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
When testing if we could merge the new subpass into the previous one,
we were taking the subpass index from the state (which isn't updated
to the new subpass until a bit later, when the job for the new subpass
has been settled). This means that we were doing the merge checks with
the previous subpass, not the current one.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
There are some texture operations (like querying the number of mipmap
levels) that don't require a sampler; in fact, the sampler should be
ignored for them. We need to take this into account when combining the
indexes. nir_tex_instr_src_index returns a negative value to
identify that case, but as we are using a uint32_t to pack both values
(for convenience, as it is easy to pack/unpack as the hash table key),
we just use a uint value big enough to never be a valid sampler id.
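A sketch of such a packing scheme, assuming a 16/16 bit split and a
sentinel of 0xffff for "no sampler". Both the split and the sentinel
value are assumptions made for illustration, as are the helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel: too large to be a valid sampler index. */
#define NO_SAMPLER_IDX 0xffffu

/* Pack a texture index and an optional sampler index into one key.
 * A negative sampler_idx means the operation has no sampler.
 */
static uint32_t pack_tex_sampler_key(uint32_t texture_idx, int sampler_idx)
{
   uint32_t sampler =
      sampler_idx < 0 ? NO_SAMPLER_IDX : (uint32_t)sampler_idx;
   assert(texture_idx < 0xffffu);
   assert(sampler <= NO_SAMPLER_IDX);
   return (texture_idx << 16) | sampler;
}

static uint32_t key_texture_idx(uint32_t key)
{
   return key >> 16;
}

static uint32_t key_sampler_idx(uint32_t key)
{
   return key & 0xffffu;
}
```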
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This requires that we emit a specific draw command and that we emit
the base instance if not zero right before the instanced draw call.
Notice that we were already doing this for instanced indexed draw
calls, so here we are only adding this for non-indexed draw calls.
We also need to flag whether the vertex shader reads the base instance
in the shader record (which it will if it uses gl_InstanceIndex,
as that is lowered in Vulkan to base_instance + instance_id).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Vulkan lowers gl_InstanceIndex to load_base_instance +
load_instance_id, so we need to implement loading the base instance in
the compiler.
The base instance is set by the BASE_VERTEX_BASE_INSTANCE command
right before the instanced draw call and it is included in the VPM
payload together with the InstanceID and VertexID if this is requested
by the shader record.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
OpenGL doesn't have the concept of separate textures and samplers, so
texture and sampler indexes have the same value. The v3d compiler relies
on this assumption, so, for example, the texture info in the v3d key
includes values that need both the texture format and the sampler
to be filled in (like the return_size).
One option would be to adapt the v3d compiler to handle both, but then
we would also need to adapt the lowerings it uses, like nir_lower_tex,
which make the same assumption.
We deal with this in the Vulkan driver by reassigning the texture and
sampler indices to a combined one. We add a hash table to map the
texture idx and sampler idx pair to this combined idx, and a
simple array for the opposite mapping. In the driver we work with the
separate indices to fill in the data, while the v3d compiler works
with the combined one.
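The two-way mapping can be sketched as follows. To keep the example
short it replaces the hash table with a linear search over the reverse
array, which gives the same results but not the same lookup cost. The
names, the fixed capacity and the 16/16 bit key layout are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COMBINED 16

struct combined_map {
   /* Reverse map: combined idx -> packed (texture idx, sampler idx). */
   uint32_t keys[MAX_COMBINED];
   uint32_t count;
};

/* Returns the combined index for the pair, assigning a new one on first
 * use. Returns -1 if the map is full.
 */
static int get_combined_index(struct combined_map *map,
                              uint32_t texture_idx, uint32_t sampler_idx)
{
   uint32_t key = (texture_idx << 16) | (sampler_idx & 0xffffu);

   /* Stand-in for the hash table lookup. */
   for (uint32_t i = 0; i < map->count; i++) {
      if (map->keys[i] == key)
         return (int)i;
   }

   if (map->count == MAX_COMBINED)
      return -1;

   map->keys[map->count] = key;
   return (int)map->count++;
}

/* Driver side: recover the separate indices from the combined one. */
static void get_separate_indices(const struct combined_map *map,
                                 uint32_t combined_idx,
                                 uint32_t *texture_idx,
                                 uint32_t *sampler_idx)
{
   assert(combined_idx < map->count);
   *texture_idx = map->keys[combined_idx] >> 16;
   *sampler_idx = map->keys[combined_idx] & 0xffffu;
}
```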
As mentioned, this is needed to properly fill up the texture return
size, so as we are here, we fix that. This gets tests like the
following working:
dEQP-VK.glsl.texture_gather.basic.2d.depth32f.base_level.level_2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were using the subpass render target index to index into the framebuffer,
which is not correct, since the framebuffer is defined for the render pass.
We should use the attachment index instead.
Fixes:
dEQP-VK.renderpass.suballocation.attachment_allocation.roll.{40,48}
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were using the subpass render target index to index into the framebuffer,
which is not correct, since the framebuffer is defined for the render pass.
We should use the attachment index instead, which we were already computing
but, by mistake, not actually using for indexing.
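The difference between the two indices can be illustrated with the
sketch below, where image views are represented by plain integers. The
struct layouts and the helper name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ATTACHMENTS 8
#define MAX_RENDER_TARGETS 4

/* The framebuffer holds one image view per render pass attachment. */
struct framebuffer {
   uint32_t attachment_count;
   int image_views[MAX_ATTACHMENTS];
};

/* Each subpass render target refers to a render pass attachment. */
struct subpass {
   uint32_t color_count;
   uint32_t color_attachments[MAX_RENDER_TARGETS];
};

static int subpass_color_image_view(const struct framebuffer *fb,
                                    const struct subpass *subpass,
                                    uint32_t rt_idx)
{
   assert(rt_idx < subpass->color_count);

   /* Translate the render target index to an attachment index first. */
   uint32_t attachment_idx = subpass->color_attachments[rt_idx];
   assert(attachment_idx < fb->attachment_count);

   /* Indexing with rt_idx here would be the bug described above. */
   return fb->image_views[attachment_idx];
}
```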
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We must update our check for whether the render area is tile-aligned for
each subpass, since the hardware will update tile sizes for each RCL.
Fixes:
dEQP-VK.renderpass.suballocation.attachment_allocation.roll.8
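A sketch of what a per-subpass alignment check might look like. The
exact definition of "tile-aligned" used here (start on a tile boundary,
end on a tile boundary or at the framebuffer edge) is an assumption, as
are the struct and function names.

```c
#include <stdbool.h>
#include <stdint.h>

struct rect {
   uint32_t x, y, width, height;
};

/* Must be re-evaluated per subpass, since tile size can change. */
static bool render_area_is_tile_aligned(const struct rect *area,
                                        uint32_t fb_width,
                                        uint32_t fb_height,
                                        uint32_t tile_width,
                                        uint32_t tile_height)
{
   uint32_t end_x = area->x + area->width;
   uint32_t end_y = area->y + area->height;

   bool start_ok = area->x % tile_width == 0 &&
                   area->y % tile_height == 0;
   bool end_x_ok = end_x % tile_width == 0 || end_x >= fb_width;
   bool end_y_ok = end_y % tile_height == 0 || end_y >= fb_height;

   return start_ok && end_x_ok && end_y_ok;
}
```

The same render area can be aligned for one tile size and unaligned
for another, which is why caching the result across subpasses is wrong.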
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
While this was already being achieved by the scissor rect set on the
pipeline, we still want to limit the render area so we reduce the tile
coverage of the pass as much as possible and avoid unnecessary
tile load and store operations.
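To make the benefit concrete, the sketch below counts how many tiles a
render area touches. The helper and its struct are hypothetical and the
tile size is just an example value.

```c
#include <stdint.h>

struct area {
   uint32_t x, y, width, height;
};

/* Number of tiles touched by an area for a given tile size. */
static uint32_t tiles_covered(const struct area *a,
                              uint32_t tile_width, uint32_t tile_height)
{
   if (a->width == 0 || a->height == 0)
      return 0;

   uint32_t first_x = a->x / tile_width;
   uint32_t last_x = (a->x + a->width - 1) / tile_width;
   uint32_t first_y = a->y / tile_height;
   uint32_t last_y = (a->y + a->height - 1) / tile_height;

   return (last_x - first_x + 1) * (last_y - first_y + 1);
}
```

For a 256x256 framebuffer with 64x64 tiles, a full-size render area
touches 16 tiles, while a 64x64 area at offset (64, 64) touches only
one, so only that tile needs load and store operations.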
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This is the same as the subpass start version, only that it won't
emit subpass clears. This is necessary when resuming a subpass
from a partial clear to make sure we don't try to clear subpass
attachments again.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Since a meta partial clear starts a new render pass, we need to store
all state that can be changed with vkCmdBeginRenderPass.
Also, since the meta clear pipeline sets dynamic state, we also
have to restore that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>