We can use this to specify the maximum vertex index that can
be accessed, which the hardware will use to detect and prevent
out-of-bounds accesses to vertex buffers.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29425>
Before VK_KHR_maintenance5 point size is undefined unless the
shader explicitly writes it, but this extension changes this and
expects a default point size of 1.0 if none has been written.
We accomplish this by emitting a POINT_SIZE packet with the
default point size the first time we draw with a POINT primitive
in the job. If the shaders used in the draw call doesn't write
point size then the hardware will take the point size from the
state set by the packet. If the shader does write to point size
then the value written in the shader will be used instead.
Passes all tests we support in:
dEQP-VK.rasterization.primitive_size.default_size.points.*
when forcing maintenance5 enabled.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29413>
Note that when it is dynamic, it goes to the codepath of having
enabled raster_enabled at the pipeline, even if at the end it will be
disabled. The fragment shader compilation, and the stage keys, depends
on rasterization being enabled or not. As mentioned, if the state is
dynamic, it assumes that the rasterization is enabled.
That would work, as then the rasterization could be discarded at the
CFG_BITS package, by the command buffer at draw time. We just have a
(discarded) shader slightly more complex that it would have been with
rasterization enabled.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28980>
Since VK_EXT_extended_dynamic_state2
We just move all related with depth bias to the command buffer. There
is not good reason to compute and save it at the pipeline.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28980>
This commit introduces a significant change when we emit STENCIL_CFG,
with any dynamic state: we stop to use cl_emit_with_prepacked, and use
directly cl_emit. The reason is that now most of the STENCIL_CFG
parameters are dynamic, any improvement of using
cl_emit_with_prepacked is minimized. Also gets the code simpler, and
avoid the need to be extra careful with the fact that
cl_emit_with_prepaked doesn't override values.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
As with CmdSetViewport, we need to provide a custom implementation
because we want to call and save the outcome of viewport_compute_xform
when the viewport is set, not during emission.
We can just call v3dv_CmdSetViewport, as that one is already calling
vk_common_SetViewportWithCount.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Mostly equal to vkCmdBindVertexBuffers, but adding strides, that with
VK_EXT_extended_dynamic_state become dynamic, and setting pSizes.
It is worth to note that at this moment we don't use
CmdBindVertexBuffers2 pSizes.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Specifically to use the common vk_dynamic_graphics_state.
The advantage of using the common struct is not only reducing the size
of our custom one, but also using common helpers (like all those cmd
buffer setters), and a lot of the logic that in the future will be
used for other extensions.
Some notes:
* We still keep dirty flags, for things like PIPELINE,
DESCRIPTOR_SETS, etc. Other driver do the same. FWIW, this is also
an improvement, as before we were mixing those with the per-spec
Vulkan dynamic info.
* For the port viewport/scissor we still keep some data on a custom
structure, as we cache the translate/scale info that is derived
from scissor/viewport, but used in three different places.
For that we also maintain a custom implementation of
CmdSetViewport, that computes translate/scale, and call the common
implementation.
* We make the same for color_write_enables. The vulkan runtime saves
it as a 8-bit bitset, with a bit per attachment. But when combining
with color_write_mask you need a 32bit with 4 bits set per
attachment. To avoid recompute it during emission, we also cache
the color_write_enables, using the runtime just to track the dirty
status.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27609>
Even if the pipeline is the same.
The followin sequence, used on
dEQP-VK.dynamic_state.*.double_static_bind tests, is valid:
1. Bind pipeline with some static state.
2. Set state command for that static state (to a bad value).
3. Bind the same pipeline again.
4. Draw.
So on 3 we need to ensure to load again the pipeline static state.
Fixes: dEQP-VK.dynamic_state.*.double_static_bind
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28897>
With the simultaneous use flag we can reuse the same command
buffer multiple times. That means, for example, that we can
have an instance of a job running in the GPU while we are
submitting another one for execution to a queue.
This scenario is problematic with dynamic rendering and job
suspension because suspended jobs need to be patched with the
resume address at queue submit time, and thus, if we have another
instance of the same job currently executing in the GPU we could
stomp its resume address, which could be different.
To fix this, at queue submission time, when we detect a suspending
job in a command buffer with the simultaneous use flag, we clone the
job and create its own copy of the BCL so we can patch the resume
address into it safely without conflicting with any other instance
of the job that may be running.
We need to flag these clones as having their own BCL since
we would have to free it when the job is destroyed, unlike other
clones that don't own any resources of their own. Also, because
this job is created at queue submit time, it won't be in the
execution list of the command buffer, so it won't be automatically
destroyed with it, so we need to add it to the command buffer
as a private object.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28521>
We had these pointing to the original job instead of pointing
to the cloned job. This can be confusing, particularly, if we
then emit commands that include references to new BOs into the
cloned jobs, since we would then try to insert these BOs in the
original jobs instead of the clones, which was the situation
we had when we implemented resume address patching with dynamic
rendering.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28521>
This will clone the job but it won't automatically put it in the
job list of a command buffer. This will come in handy to handle
the required job cloning for suspending jobs with the command buffer
reuse flag in a follow-up patch.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28521>
With dynamic rendering secondary command buffers can start subpasses
so we need this. Outside dynamic rendering secondary command buffers
won't be calling here since they are restricted to record commands
within a subpass.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
If a secondary command buffer recording a dynamic pass has the
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT flag
then the rendering information for it should come from a
VkCommandBufferInheritanceRenderingInfo struct in the pNext
chain instead of the usual render pass information in the
VkCommandBufferInheritanceInfo struct. We take the information
from the new struct and build a render pass description from it
assuming a setup without a framebuffer (which is optional for
regular render passes too).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
This was used only in secondary CL command buffers so it made
sense but with dynamic rendering we are going to also have
regular CLs also in secondaries (since secondaries can now
record full dynamic rendering passes), so renaming this to
INCOMPLETE makes more sense, since this is really what they
refer to: parts of CLs that are intended to be merged into
other primaries through branching.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Dynamic rendering allows the client to suspend recording of a
render pass and have it continued in a different command buffer.
When a suspended command buffer is submitted to a queue, the
resuming command buffer must be te next one in submission order.
This means we need to be able to "merge" or "stitch" together
these command buffers at submit time.
To accomplish this, when we suspend a command buffer we emit
a BRANCH instruction to finish it. Then at submit time, when
we know the resuming job, we patch the BRANCH address with the
address of the resuming binning list (bcl). This is very similar
to how we execute secondary command buffers inside a render pass.
Also, only the last resuming job should flush the binning lists
in the bcl since we won't have processed the full binning command
list until we have execute the last linked job in the resume
list.
Since all jobs and command buffers in the suspend/resume chain
must be part of the same dynamic render pass, we only need to
produce and emit the render command list (rcl) once.
Since the way we implement stitching is that we branch from the
suspending job into the resuming one, the first job suspending
will link into all the resuming jobs necessary to complete the
chain, therefore, after the stitching is complete, we only want
to submit the first job in the suspend/resume chain, and thus,
we only produce and emit the rcl for this one job.
Notice as well that suspending only affects the last job
recording a dynamic rendering pass (the one that needs the branch
so we can resume execution with another job in another command
buffer).
Resuming affects all jobs in the dynamic render pass, since
we won't produce RCLs for them (as only the originating job
on the suspend/resume chain will emit the RCL).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
With this we are able to run basic dynamic render passes, however,
we are still missing a few things like support for secondary
render passes, suspend/resume, etc that will be adding in follow-up
patches.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
Since the plan is to leverage our render pass infrastructure, we
also need to setup a framebuffer from the rendering info provided
with dynamic rendering.
We allocate the framebuffer lazily, only once, if a dynamic render
pass is used. To do this, we make it so it can hold the maximum
number of attachments possible with our hardware.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
It is allowed for a shader to enable the multiview extension
even if the draw call in which it is used doesn't use multidraw.
This allows the shader to still use gl_ViewIndex, which will
always be 0 in that scenario.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27978>
This is based on a lowering that we are already using in the OpenGL
driver.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28171>
We need to allocate "shared size" bytes for each workgroup but
we were incorrectly multiplying by the number of workgroups in
each supergroup instead, which would typically cause us to allocate
less memory than actually required.
The reason this issue was not visible until now is that the kernel
driver is using a large page alignment on all BO allocations and
this causes us to "waste" a lot of memory after each allocation.
Incidentally, this wasted memory ensured that out of bounds
accesses would not cause issues since they would typically land
in unused memory regions in between aligned allocations, however,
experimenting with reduced memory aligments raised the issue,
which manifested with the UE4 Shooter demo as a GPU hang caused
by corrupted state from out of bounds memory writes to CS
shared memory.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27675>
Both OpenGL and Vulkan drivers share the same V3D_CSD definitions.
Therefore, move it to a common place instead of duplicating.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26448>
The indirect CSD user extension allows the creation of a CSD
job linked to a CPU job. When we submit the CPU job, the CPU job
will run when the indirect CSD dependency is completed, map the
indirect buffer to read the CSD dispatch parameters and reconfigure
the CSD job accordingly.
Using the indirect CSD user extension, allows us to use the multisync
user extension to synchronize this type of job, which is not currently
possible from user-space without stalling.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26448>
We can use the actual bpp of each color attachment to compute real
tile memory requirements, which may allow us to choose a larger tile
size configuration than in V3D 4.2 in certain scenarios.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25450>
The pipeline layout lifetime tests in CTS allocate some descriptors
and bind them to the command buffer without actually ever writing
valid resources to them since they never actually execute the command
buffers, so we want to be careful at bind time and not assume the
resources exist.
Fixes crashes in dEQP-VK.api.pipeline.pipeline_layout.lifetime.*
Fixes: 95f881adbd ('v3dv: add support for sampling simple 2D linear textures')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25172>
V3D can't sample linear images (other than 1D), however, some applications
will require this to work. Particularly, our swapchain images may need to be
linear (for display), so sampling from them won't work.
This change detects the case where we are binding a descriptor which attempts
to sample from a simple 2D linear texture, transparently creates a tiled
copy of the image and rewrites the descriptor to refer to the tiled image
instead. This will be slow but will allow some applications that require this
to work (i.e. some aspects of Android's user interface).
As of this patch, this only supports sampling linear images with a single
miplevel and layer from single-plane images in non-arrayed descriptors. We
could handle other cases too with a bit more work though.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9712
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25048>
KHR_maintenance1 allows clients to specify a negative
height, but not a negative width, so assert on that and
simplify the computations for the horizontal dimension.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23489>
They are exactly the same, so it's safe to do the replace
Also gen OS_TIMEOUT_INFINITE var with rusticl_mesa_bindings_rs by OS_ prefix and
include "util/os_time.h" in rusticl/rusticl_mesa_bindings.h
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23401>