Remove the allocate_tile_state_now parameter from v3dv_job_start_frame().
So v3dv_job_allocate_tile_state() is explicitly called after
job_emit_binning_flush() as we know the value of job->draw_count instead
of using always 0.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40554>
Replace the inline tile_alloc/TSDA sizing in v3dv_job_allocate_tile_state()
with a call to the new v3d_tile_alloc_sizes() helper. This switches from
64B to 128B initial tile alloc blocks (avoiding overflow for simple draws)
and from a flat 512KB headroom to a draw-proportional formula.
Set tile_allocation_initial_block_size and tile_allocation_block_size
in all TILE_BINNING_MODE_CFG emissions and update the
TILE_LIST_INITIAL_BLOCK_SIZE packets to match.
Benchmarked on RPi5 (V3D 7.1) with GfxBench Vulkan Aztec Ruins at
1920x1040. Average tile_alloc BO size dropped 75% (535 KB to 132 KB)
with 20% fewer OOM events (521 to 417) and no FPS regression.
This avoids exhausting GPU memory when multiple blit or fill jobs
are batched in the same command buffer, with a huge reduction of
the memory footprint avoiding the 512 KB of the tile_alloc per batched
job.
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40554>
Replace printf and nir_print_shaders by proper mesa_logX and
nir_log_shaderX functions, that provides better features (like logging
to a file, setting the logging verbosity, etc) and works better with
Android.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40434>
The extension is implemented in shared Vulkan/WSI code and
not driver specific. The underlying kms driver needs to
support HDR metadata signalling on the drm connector, which
vc4 kms does for VideoCore 5 and later since April 2021.
Successfully tested on RaspberryPi 4/400 with a HDR-10
capable HDMI monitor.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40696>
These extensions are implemented in shared Vulkan/WSI code and
not driver specific. A Vulkan driver just needs to support
VK_KHR_timeline_semaphore, which v3dv already supports via
emulated timeline semaphores since April 2022.
Successfully tested on RaspberryPi 4/400.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40696>
We have 4 image intrinsic variants now. This enum is useful for
nir_rewrite_image_intrinsic() and it will be used by other NIR passes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40709>
The variable doesn't store a granularity specific to CLE buffers. It
stores the granularity that the OS imposes on buffer allocations (that
is, the OS page size). Therefore, rename the variable to best reflect
its meaning.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
When a resolve attachment is created with VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT,
the render pass may use a view format that differs from the image creation
format (e.g. view=R16G16_SINT on an image created as B8G8R8A8_SRGB).
cmd_buffer_emit_resolve() was calling v3dv_CmdResolveImage2() which only
receives images but not the view format. This means that blit_shader()
will use the wrong format, causing miss-renderings.
So instead of using directly v3dv_CmdResolveImage2(), let's have an
intermediate function that receives both images and view formats to do
the resolve.
This fixes dEQP-VK.image.mutable.* failures.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40234>
Split the monolithic v3dv_private.h (~2600 lines) into self-contained
sub-headers so each .c file only includes what it needs:
v3dv_common.h, v3dv_device.h, v3dv_image.h, v3dv_pass.h,
v3dv_query.h, v3dv_pipeline.h, v3dv_descriptor_set.h,
v3dv_cmd_buffer.h, v3dv_version_dispatch.h
As part of this commit we remove v3dv_private.h.
We keep v3dvx_private.h as it is, because the gain would be really
small (a lot of really small sub-headers).
In addition to keep things more tidy, we made a quick performance
check. We measured how many files are re-compiled and the performance
difference when touching one of the headers, compared with keeping
just one monolithic header.
Header touch (incremental) Split Monolithic Speedup
-------------------------- ----- ---------- -------
v3dv_image.h 2369 (24f) 2436 (33f) 1.03x
v3dv_query.h 2357 (20f) 2436 (33f) 1.03x
v3dv_pass.h 2352 (20f) 2436 (33f) 1.04x
v3dv_cmd_buffer.h 2354 (20f) 2436 (33f) 1.03x
v3dv_descriptor_set.h 2436 (33f) 2436 (33f) 1.00x
v3dv_pipeline.h 2437 (33f) 2436 (33f) 1.00x
v3dv_device.h 2418 (31f) 2436 (33f) 1.01x
v3dv_common.h 2419 (33f) 2436 (33f) 1.01x
v3dv_version_dispatch.h 2371 (26f) 2436 (33f) 1.03x
Header touch (incremental) Split Monolithic Speedup
-------------------------- ---------- ---------- -------
v3dv_image.h 2377 (24f) 2443 (33f) 1.03x
v3dv_query.h 2346 (20f) 2443 (33f) 1.04x
v3dv_pass.h 2360 (20f) 2443 (33f) 1.04x
v3dv_cmd_buffer.h 2351 (20f) 2443 (33f) 1.04x
v3dv_descriptor_set.h 2438 (33f) 2443 (33f) 1.00x
v3dv_pipeline.h 2429 (33f) 2443 (33f) 1.01x
v3dv_device.h 2418 (31f) 2443 (33f) 1.01x
v3dv_common.h 2432 (33f) 2443 (33f) 1.00x
v3dv_version_dispatch.h 2373 (26f) 2443 (33f) 1.03x
The bigger gain is on the files recompiled for some headers (going
from 33 down to 20 in some cases). The performance gain is not so
relevant though.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40169>
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
The runtime builds a final pipeline state with pointers to structures
coming from the associated pipelines libraries.
So far it has considered that the viewMask was part of a structure
together with the rest of the renderpass information. This information
can be specified in pre-raster, fragment & color-output state groups
and it was assumed would be consistent for all 3. And the runtime
currently takes the pointer to the structure from the last pipeline
library (color output).
Some coming spec/cts will clarify that the viewMask only needs to be
specified for pre-raster & fragment groups, making the value in the
color-output group untrustworthy.
This change creates a new state structure to hold the viewMask on its
own so it is only gather on pre-raster & fragment groups.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Aitor Camacho <aitor@lunarg.com> (kosmickrisp)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (turnip)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv)
Reviewed-by: Frank Binns <frank.binns@imgtec.com> (powervr)
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (panvk)
Royaled-yes-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39940>
Replace manual string parsing for V3DV_ENABLE_PIPELINE_CACHE
in instance creation with parse_debug_string and a dedicated
debug_control table.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40202>
Refactor pipeline creation path to use the vk_graphics_pipeline_state
structures provided by runtime instead of raw Vulkan CreateInfo structs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39834>
The Vulkan spec states:
"If logicOpEnable is VK_TRUE, then a logical operation selected by
logicOp is applied between each color attachment and the
fragment’s corresponding output value, and blending of all
attachments is treated as if it were disabled. Any attachments
using color formats for which logical operations are not supported
simply pass through the color values unmodified."
pack_blend() was only checking blendEnable from the attachment state,
causing hardware blending to be applied even when logic ops were enabled.
This is the v3dv equivalent of the RADV fix in commit c172f6ef01
("radv: fix disabling logic op for srgb/float formats when blending
is enabled").
Fixes: dEQP-VK.pipeline.monolithic.logic_op_na_formats.*_blend
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40025>
On V3D 4.2 (Raspberry Pi 4), there is a hardware bug where the binner
can trigger a GPU reset in some situations where primitives are
discarded, such as due to primitive restarts.
The way to avoid this is to force the binner to do always something, by
emitting the proper CL. In this case we decided to always set point
size, as it is a very simple and fast operation.
This fixes resets caused by
dEQP-VK.pipeline.monolithic.input_assembly.primitive_restart.*.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39826>
Use v3dv_job_apply_barrier_state to consume pending barriers when
executing secondary command buffers. This ensures we only serialize
against relevant stages, addressing FIXME.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39278>
Vulkan spec requires binding flags to be matched with the binding with
the same index, however currently bindings are sorted with flags not
properly sorted, which leads to bindings and flags mismatch.
Resolve this by adding optional flags info to the parameters of
vk_create_sorted_bindings(), and refactoring panvk/pvr (which really
pair bindings and flags instead of only iterating flags) to use sorted
flags.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>
Move tfu_supports_tex_format() and
get_internal_type_bpp_for_output_format() from v3dvx_private.h
to v3dvx_format_table.h.
Move v3dv_format_plane and v3dv_format struct from v3dv_private.h
to v3dv_format_table.h.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38732>
Just changing the intrinsic for load_push_constant is wrong, as nothing
guarantees they will have the same indices in the future.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38759>
Replace the duplicated swapchain image detection pattern across all
Vulkan drivers with the new wsi_common_is_swapchain_image() helper.
Since the swapchain handle can be extracted from VkImageCreateInfo's
pNext chain inside wsi_common_create_swapchain_image(), remove the
now-redundant VkSwapchainKHR parameter from that function.
This removes the #ifdef guards for Android/WSI platforms from each
driver, as the helper now handles this uniformly.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38541>
Weird that CTS did not catch that ...
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
Fixes: 11195eb8de ("vulkan: Add KHR_swapchain_maintenance1 promotions.")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38728>
We have nir_lower_sysvals_to_varyings() so we can just have that lower
it for the drivers who don't want a sysval. Most have to support the
sysval version anyway for various lowering so making them all have to
support both is pretty annoying.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38562>