When calling GetPhysicalDeviceProperties2 we were ignoring and logging
the structures for extensions not supported. But for the case of
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PCI_BUS_INFO_PROPERTIES_EXT we
already know that we are not going to support it, so let's just do
nothing (not even logging) when passed.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7497>
V3D doesn't provide any means to acquire timestamps from the GPU
so we have to implement these in the CPU.
v2: enable timestampComputeAndGraphics and set timestampPeriod (Piñeiro)
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7373>
This is another case of a feature that is implemented in the compiler
and that only required that we set the shader key properly from the
pipeline state, which we were already doing.
I verified we pass the tests in dEQP-VK.pipeline.multisample.alpha_to_one.*
(we only support 4x multisampling, so we can only pass a single test there),
however, the tests seem to have a bug by which they always pass, even if
the driver doesn't actually implement alpha to one correctly. I submitted
a fix to Khronos and verified that we also pass the fixed tests (and that
we failed them if we don't actually set te shader key correctly).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7336>
For us this is mostly handled in the compiler by a NIR lowering so
for the Vulkan driver we only need to make sure that we program our
shader key correctly from the pipeline state, which we were already
doing.
It doesn't look like CTS has any coverage for this yet so it has only
been smoke tested, but it seems to be working correctly, as expected.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7313>
If we are blitting to tile boundaries we don't need to emit
tile loads. The exception to this is the case where we are
blitting only a subset of the pixel components in the image
(which we do for single aspect blits of D24S8), since in that
case we need to preserve the components we are not writing.
There is a corner case where some times we create framebuffers
that alias subregions of a larger image. In that case the edge
tiles are not padded and we can't skip the loads.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7247>
So far V3DV_ENABLE_DEFAULT_PIPELINE_CACHE allowed to configure
pipeline cache to avoid any caching using a pipeline cache.
With this change we can be more detailed. Then envvar is not anymore a
boolean. Allowed values:
* "off": no pipeline cache at all. PipelineCache objects behaves as
no-op objects.
* "no-default-cache": user PipelineCache caches nir/variants, but we
don't provide a default cache in case the user doesn't provide a
PipelineCache object, neither for internal pipelines.
* "full" (default): we provide a default PipelineCache, used when
the user doesn't provide one when creating a Pipeline, and for
internal Pipelines.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Subpass color clear pipelines are those used to emit partial attachment
clears as draw calls inside the render pass currently bound by the
application in the command buffer, leading to a huge performance improvement
compared to the case where we emit them in their own render pass.
Unfortunately, because the pipeline references the render pass
object in which it is used and the render pass object is owned by the
application (and can be destroyed at any point), we can't cache these
pipelines (unless we implement a refcounting mechanism or other
similar strategy).
Performance impact looks negligible based on experiments with vkQuake3,
probably because the underlying pipeline cache is preventing the
redundant shader recompiles.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The lowering will get all the interpolateAt() functions from GLSL lowered to
the corresponding intrinsics we have just implemented in the compiler backend,
which was the last piece we needed to enable the feature.
This gets us to pass all the relevant tests in:
dEQP-VK.pipeline.multisample_interpolation.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
as we have just set proper values for point granularity etc, we can
enable largePoints. With this change tests like this:
dEQP-VK.rasterization.primitive_size.points.point_size_*
goes from Skip to Pass.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
As we are here, we also tweak some line-related limits, as some use
the same value that for point, and in order to use the enum we added
recently at common/v3d_limits.h
Fixes the following test:
dEQP-VK.glsl.builtin_var.simple.pointcoord
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Asking the simulator the total memory it is using, instead of sysinfo
(that returned the host system memory).
Fixes the following CTS tests when using the simulator:
dEQP-VK.memory.allocation.basic.percent_1.forward.count_12
dEQP-VK.memory.allocation.basic.percent_1.reverse.count_12
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
As we understand that texture accesses should be aligned to the UIF
block size.
Fixes several of the CTS tests under this pattern:
dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.*.offset_nonzero
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_texel_buffer.*.offset_nonzero
Note: for those tests, using a lower value (64) was enough to get them
working, but again, we understand that the real alignment is the UIF
block size.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
That it would be used as fallback. Three advantages:
* Having a cache for user operations even if the user doesn't
provide it.
* Having a cache for internal operations. v3dv_meta_copy creates
pipelines for some copy path, so it is interesting to have them
cached.
* Testing: so now the pipeline cache is tested by more CTS tests.
As any other pipeline cache, it can be disabled with the
V3DV_ENABLE_PIPELINE_CACHE. It was suggested that would make sense to
have a specific envvar for the default pipeline cache, but for now
just one envvar is enough.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Heavily based on anv nir caching. One of the bigger difference is that
we don't create the nir shader using a ralloc_context local to the
main compile graphics method. On anv, after compiling the shader, they
discard the nir shader. We need it as we could need it to build shader
variants later.
As anv, we introduce a environment variable to disable the cache:
V3DV_ENABLE_PIPELINE_CACHE
By default is enabled. The main purpose for this envvar is debugging,
in order to provide a easy way to discard a bug on the cache.
It is pending to serialize/deserialize the NIR shaders as part of
GetPipelineCacheData and PipelineCacheCreate. We also plan is to cache
too shader variants. We would do that on following patches.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Since vkCmdClearAttachments executes inside a render pass, we would
benefit from converting it to a draw within the current subpass job to
improve batching and avoid expensive tile load/store operations.
This can dramatically improve performance for applications using this
command, however, we can only use this if we are clearing the base
layers of framebuffer attachments, since otherwise we would need to
use layered rendering, which we don't support yet.
This improves vkQuake3 performance dramatically (almost 100%
performance improvement at 1080p), which calls this twice per frame.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For some reason, CTS expects E5B9G9R9 and B10G11R11 with
transparent black border clamping produce alpha 1 instead of 0.
Since border color takes precedence over the texture state swizzle,
the only way to fix this is to lower the texture swizzle in the shader
to set alpha to 1.
Fixes:
dEQP-VK.pipeline.sampler.view_type.*b10g11r11*clamp_to_border_transparent_black
dEQP-VK.pipeline.sampler.view_type.*e5b9g9r9*.clamp_to_border_transparent_black
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In OpenGL, unnormalized coordinates are implicit based on the sampler
type (rectangle textures), so the compiler can set the flag when needed.
In Vulkan, however, this is configured explicitly in the sampler object,
so the compiler won't set it and we need to do it manually when we are
writing the P1 uniform.
Fixes:
dEQP-VK.pipeline.sampler.exact_sampling.*.unnormalized_coords
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Instead of asserting that users don't try to create images that
would require 4GB+ of memory, error out with the corresponding
OOM error when the user tries to actually allocate the memory
for the image.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Only free the underlying BO when the exported memory object is freed
to avoid multiple frees of the same memory.
The only exception is winsys BOs where we import a BO created in the
display device into the render device. In this case, we only have one
memory object referencing the BO and we want to destroy it with that
memory object.
Fixes:
dEQP-VK.api.external.memory.dma_buf.*
dEQP-VK.api.external.memory.opaque_fd.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Heavily based on the already existing for the v3d OpenGL driver, but
without references, and with some extra OOM checks (Vulkan CTS has
several OOM tests).
With this commit v3dv_bo_alloc and v3dv_bo_free became frontends to
the bo_cache. The former tries to get a BO from the cache if possible,
and the latter stores the BO on the cache if possible. The former also
adds a new parameter to point if the BO to allocate is private.
As v3d we are only caching private BOs, those created by the driver
for internal use (like CLs, tile_alloc, etc). They are the ones with
the highest change of being reused (for example, CL BOs are always
4KB, so they can always be reused). User-created BOs can have any
size, including some very large ones for buffers and images, which
makes them far less likely to be reused and would add a lot of memory
pressure if we decided to cache them.
In any case, in practice, we found that we could get a performance
improvement by caching also user-created BOs, but that would need more
care and an analysis to decide which ones makes sense. Would also
require to change how the cached BOs are stored by size. Right now
there are an array of list_head, that doesn't work well with big
BOs. If done, that would be handled on a separate commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Both the API user and the driver may attempt to map a BO, possibly
only partially and using different ranges. This is a problem because
we only have a single map per BO. Fix this by making sure that when
a BO is mapped, we always map its entire range. This way if a BO
has been mapped before, we know that map is still valid no matter the
region we need to access now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Asserting on them makes it easier to identify applications and tests that
try to use unimplemented features.
Also, there are some APIs that relate to optional features we don't
(or can't) support, such as sparse, so for these we just provide
the trivial implementation.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This allows us to remove some individual bos for the image and
sampler, used to store the SAMPLER_STATE and TEXTURE_SHADER_STATE. Now
they are prepacked on static memory as part of the vulkan object
struct.
This commit introduces small descriptor structs, used to define what
the bo subregion would contain. It is used mostly to compute offsets
to that specific data, and define the size needed. Having said so, it
would be possible to replace them with some kind of flag (like anv) or
just compute the offset based on the context.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
There are basically two types of scenarios to consider:
- Secondary command buffers that run inside a render pass.
- Secondary command buffers that run outside a render pass.
For the former we want to record their commands into a binning command
list that we can branch to when executed into a primary command
buffer. This means this kind of command buffers don't spawn new jobs,
just the default one where they record the binning commands which
won't include the frame setup, which will be provided by the primary
they will be executed in.
For the latter we don't require anything special, we just record as
many jobs as we need as usual and link that job list from the primary
job list when executed.
This handles most scenarios except:
- vkCmdWaitForEvents
- VkCmdClearAttachments
Both of these can spawn new jobs inside a render pass, which is not
what we want for secondary command buffers. We will address this is
follow-up patches.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>