We don't support them by hw, so we would need to get them
lowered. This fix some crashes while using renderdoc with UE4 shooter
demo traces.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8582>
Instead, wait for the specific queries being reset to
not be in use by the GPU.
This takes query pool resets in the UE4 Shooter demo from
50-60ms down to 0.5-2ms.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8554>
I saw this while inspecting CL dumps from the UE Shooter demo,
where they disable Z writes for occlusion queries. The hardware
is probably doing this internally, but it doesn't hurt
to do this explicitly and make CL traces consistent with intended
behavior.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8571>
If we have dirty descriptor set state we have to update our uniform
data to reference the new resources such as addresses for textures
or UBOs. This is known to have a high CPU cost, so we want to limit
this as much as we can.
It is a common rendering pattern in applications to render many objects
using the same pipeline, but modifying the descriptor sets bound to update
textures, UBOs, etc. In this scenario, we would be incurring in unnecessary
uniform stream updates for stages that don't access descriptor sets at all.
This change makes it so we track which shader stages in a pipeline
use descriptor set state and skips updating uniform streams for them
when dirty descriptor set state is the only reason requiring us to
generate new uniform streams for a draw call.
v2: reuse shader stage information from the pipeline set layouts
to track shader stages that use descriptor state.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8555>
Instead of checking whether the source and destination are the same,
we should check if the underlying BOs are the same, since we may
be suballocating resources from the same allocation and the kernel
will fail to execute jobs if the BO list has duplicated entries.
Fixes aborts with Unreal Engine due to failed TFU jobs.
Fixes: 30f1fc25ce ('v3dv: implement TFU blits')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8098>
dep_valgrind gives you -I/usr/include/valgrind (or whatever) so if
valgrind/ wasn't in the search path anyway, these includes would fail.
Found in CI when adding valgrind to the build images.
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7936>
Replace mesa's slightly different container_of() with one more aligned
to the linux kernel's version which takes a type as the 2nd param. This
avoids warnings like:
freedreno_context.c:396:44: warning: variable 'batch' is uninitialized when used within its own initialization [-Wuninitialized]
At the same time, we can add additional build-time type-checking asserts
Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7941>
This adds an enum to the load tile buffer that forces the alpha channel
to be set to 1. This will be required later to load RGBX formats.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7816>
The filter is only relevant to handle blits that invole scaling, but
our TFU path doesn't handle any kind of scaling so we can safely
ignore it.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
The TFU path only activates for blits that are really copies
(no linear filtering, no scaling, same pixel format, etc.), and
we do it slice by slice, so we can easily handle mirroring of the
Z coordinate for 3D images by reversing the order of the layers
as we copy them.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
Same as with other TFU paths, we only handle exact copies without
conversion, so we can rewrite the format to use a compatible TFU format
based on its texel size, which allows us to use this path with more
formats.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7845>
Just like we do for image copies, since we are not doing any pixel
format conversions, we can translate the image format to a compatible
format that is supported by the TFU.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7809>
We were always using baseArrayLayer from the image subresource, but
for 3D images we should use the Z offset of the region.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7809>
We will be using the new parameter in an upcomig change. The TFU
unit has a limited list of supported formats, so for cases where
we don't want to do any pixel format conversions and we are just
copying raw image data, we want to be able to rewrite the underlying
image format to use a compatible format. We will be using this
in a follow-up patch that adds a TFU path for image copies. For this
purpose, we also, move the function definition up in the file
so it is available for that upcoming TFU path without having to
put its prototype earlier in the file.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7809>
When copying multiple regions that have the same image subresource we are
effectively copying various rects across the same layer range, so we can
batch together all the rects to copy for each layer in a single job.
This allows us to significantly reduce CPU overhead when recording the
command, as we need to produce less jobs and allocate less descriptor
sets. It also offers smaller gains in execution time due to the reduced
job count.
A stress test where we copy 10 subrects of an image in a loop 100 time,
choosing regions that will involve the texel buffer path, we get these
results:
| Recording Time | Execution Time |
----------|----------------|----------------|
master | 3.021s | 0.112s |
----------|----------------|----------------|
patch | 0.163s | 0.080s |
----------|----------------|----------------|
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7782>
Halt is like a return for the entire shader or exit() if you prefer to
think of it that way. Once an invocation hits a halt, it's 100% dead.
Any writes to output variables which happened before the halt do,
however, still apply.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356>
We were allocating twice the size we need for this array. This was
probably caused by a copy and paste error from the GL driver which
grows this dynamically as BOs are added to the job.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7733>
Caused to return early wrongly on CmdPushConstants with some tests
using several calls to that method. As we are here we are also
replacing the (void *) casting at the memcpy below.
Fixes: e1c8041cde ("v3dv: try harder to skip emission of redundant state")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7718>
Writing uniform streams is performance sensitive so we should try our
best to avoid writing new uniforms if they have not changed. Particularly,
if only the vertex buffers have changed, we should not write new uniforms.
This improves performance in vkQuake2 by about 11.15%.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7683>
We are already ensuring that we only copy the appropriate pixel
rect via the scissor and viewport state, so there is no need to
do this check in the shader.
Using a stress test with 100 buffer to image copies of a single
layered image with 10 miplevels recorded into a command buffer and
measuring the time it gets to execute the command buffer we get
these results:
| Execution Time |
----------|----------------|
master | 0.142s |
----------|----------------|
patch | 0.071s |
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7671>
We still need a fallback for the case where the application makes
WSI allocations without a surface (Zink), but for the general case,
this is the right way to do this, as it would ensure that we use
the same display connection that was used to create the surface.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7631>
Used as reference Hyujun's commit
5d3fdbc52b, that does the same for
turnip.
This commit also replaces in several cases alloc for zalloc, and adds
checks on more Destroy methods if the object to be free is NULL or
not. Most of them were needed to avoid crashes/weird behaviour due
trying to use un-initialized data. Note that now that vk_object_free
iterates over a array, making it more against un-initialized or just
NULL data.
Additionally, using zalloc we can also remove some memset to 0. In
fact we needed to remove them, as if not, they would override the
vk_object_base object to 0 (the alternative would me doing a memset
computing a pointer offset, but that's is not needed as we can just
use zalloc).
v2:
* Call memset(0) on reused descriptor sets when calling
ResetDescriptorPool, not when reallocating them (Iago)
* Add null check when calling DestroyImageView (detected by a full CTS run)
v3: Fixed rebase conflicts after last meta copy/clear changes
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7627>
Following a suggestion from Alejandro, since playout is a word on its own
and can be confusing. It also makes it more consistent with other
variable names that use an underscore.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7651>
This avoids redundant per-layer operations that are the same across
layers or that only need to do once. Namely:
- The sampler for the blit source is the same for all layers.
- The decision about whether we need to load TLB contents or not only
needs to be done once.
- Some command buffer state such as the pipeline, the viewport and the
scissor is the same for all layers and should only be bound once.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7651>