When converting the index buffer from 4-bytes to 2-bytes, we use the
uploader for the job. Since commit b3133e250e we do an uploader alloc
ref, which releases the uploader buffer if there is no enough space,
creating a new one.
The problem happens when we also need this buffer because it is the one
containing the index buffer to convert. This happens, for instance, if
we need to convert the primitives because they are not supported (e.g.,
converting quads to triangles), as this is done
also using the uploader.
The solution is to ensure the uploader's buffer has an extra reference
so when released, it is not destroyed. This can easily achieved by
calling first pipe_buffer_map_range(), which is required to access the
buffer, and it increases the references.
This fixes `spec@!opengl 1.1@longprim`.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40642>
The extension is implemented in shared Vulkan/WSI code and
not driver specific. The underlying kms driver needs to
support HDR metadata signalling on the drm connector, which
vc4 kms does for VideoCore 5 and later since April 2021.
Successfully tested on RaspberryPi 4/400 with a HDR-10
capable HDMI monitor.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40696>
These extensions are implemented in shared Vulkan/WSI code and
not driver specific. A Vulkan driver just needs to support
VK_KHR_timeline_semaphore, which v3dv already supports via
emulated timeline semaphores since April 2022.
Successfully tested on RaspberryPi 4/400.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40696>
We have 4 image intrinsic variants now. This enum is useful for
nir_rewrite_image_intrinsic() and it will be used by other NIR passes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40709>
The variable doesn't store a granularity specific to CLE buffers. It
stores the granularity that the OS imposes on buffer allocations (that
is, the OS page size). Therefore, rename the variable to best reflect
its meaning.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
When a resolve attachment is created with VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT,
the render pass may use a view format that differs from the image creation
format (e.g. view=R16G16_SINT on an image created as B8G8R8A8_SRGB).
cmd_buffer_emit_resolve() was calling v3dv_CmdResolveImage2() which only
receives images but not the view format. This means that blit_shader()
will use the wrong format, causing miss-renderings.
So instead of using directly v3dv_CmdResolveImage2(), let's have an
intermediate function that receives both images and view formats to do
the resolve.
This fixes dEQP-VK.image.mutable.* failures.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40234>
Split the monolithic v3dv_private.h (~2600 lines) into self-contained
sub-headers so each .c file only includes what it needs:
v3dv_common.h, v3dv_device.h, v3dv_image.h, v3dv_pass.h,
v3dv_query.h, v3dv_pipeline.h, v3dv_descriptor_set.h,
v3dv_cmd_buffer.h, v3dv_version_dispatch.h
As part of this commit we remove v3dv_private.h.
We keep v3dvx_private.h as it is, because the gain would be really
small (a lot of really small sub-headers).
In addition to keep things more tidy, we made a quick performance
check. We measured how many files are re-compiled and the performance
difference when touching one of the headers, compared with keeping
just one monolithic header.
Header touch (incremental) Split Monolithic Speedup
-------------------------- ----- ---------- -------
v3dv_image.h 2369 (24f) 2436 (33f) 1.03x
v3dv_query.h 2357 (20f) 2436 (33f) 1.03x
v3dv_pass.h 2352 (20f) 2436 (33f) 1.04x
v3dv_cmd_buffer.h 2354 (20f) 2436 (33f) 1.03x
v3dv_descriptor_set.h 2436 (33f) 2436 (33f) 1.00x
v3dv_pipeline.h 2437 (33f) 2436 (33f) 1.00x
v3dv_device.h 2418 (31f) 2436 (33f) 1.01x
v3dv_common.h 2419 (33f) 2436 (33f) 1.01x
v3dv_version_dispatch.h 2371 (26f) 2436 (33f) 1.03x
Header touch (incremental) Split Monolithic Speedup
-------------------------- ---------- ---------- -------
v3dv_image.h 2377 (24f) 2443 (33f) 1.03x
v3dv_query.h 2346 (20f) 2443 (33f) 1.04x
v3dv_pass.h 2360 (20f) 2443 (33f) 1.04x
v3dv_cmd_buffer.h 2351 (20f) 2443 (33f) 1.04x
v3dv_descriptor_set.h 2438 (33f) 2443 (33f) 1.00x
v3dv_pipeline.h 2429 (33f) 2443 (33f) 1.01x
v3dv_device.h 2418 (31f) 2443 (33f) 1.01x
v3dv_common.h 2432 (33f) 2443 (33f) 1.00x
v3dv_version_dispatch.h 2373 (26f) 2443 (33f) 1.03x
The bigger gain is on the files recompiled for some headers (going
from 33 down to 20 in some cases). The performance gain is not so
relevant though.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40169>
The V3D 7.1 TFU ICFG register restructured the IFORMAT field to 3 bits
(25:23) vs 4 bits on V3D 4.2. The defines were still using the V3D 4.2
encoding (11-15) which overflows the 3-bit field. Fix values to the
correct 3-7 range.
This was working by accident because the overflow bits land in the
SVTWID field, which is not used for the affected tiling formats.
Also rename SAND_128 to SAND since V3D 7.1 has a single SAND input
format; the tile width is now controlled by SVTWID.
Fixes: 146ceadcf4 ("v3dv: add support for TFU jobs in v71")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40540>
_mesa_sha1_format has a few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
The old calculation depended on the sample count, and gave subpar
results for 8x MSAA with standard sample locations. The new calculation
is based on the Intel pass, with some changing of the constants so that
the sample count is always proportional to alpha for 2xMSAA and 4xMSAA
and the addition of rotating the sample mask based on the pixel.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39335>
The runtime builds a final pipeline state with pointers to structures
coming from the associated pipelines libraries.
So far it has considered that the viewMask was part of a structure
together with the rest of the renderpass information. This information
can be specified in pre-raster, fragment & color-output state groups
and it was assumed would be consistent for all 3. And the runtime
currently takes the pointer to the structure from the last pipeline
library (color output).
Some coming spec/cts will clarify that the viewMask only needs to be
specified for pre-raster & fragment groups, making the value in the
color-output group untrustworthy.
This change creates a new state structure to hold the viewMask on its
own so it is only gather on pre-raster & fragment groups.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Aitor Camacho <aitor@lunarg.com> (kosmickrisp)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (turnip)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv)
Reviewed-by: Frank Binns <frank.binns@imgtec.com> (powervr)
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (panvk)
Royaled-yes-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39940>
On V3D 4.2, txf instructions with an out of bounds LOD do not
return robust values (zero) as required by robustImageAccess2.
This commit introduces a NIR lowering pass that explicitly checks
if the LOD is within bounds. If the LOD is out of bounds,
the texture coordinate is replaced with an out of bounds value
to force the hardware to return the robust value.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39430>
Replace manual string parsing for V3DV_ENABLE_PIPELINE_CACHE
in instance creation with parse_debug_string and a dedicated
debug_control table.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40202>
The samples=2 variant also flakes, matching the RPi4 pattern which
covers all sample counts. Broaden the entry to match all variants.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
So when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
Previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the new written data.
This is the same bug that was fixed in v3d in the previous commit.
Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl() any pending write job would
store to the new BO instead of the old one, corrupting the new
written data.
This is adressed by flushing all pending write jobs affecting the
resource before replacing its BO.
This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previos GPU clear. Test are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
Most of the work was already done for the Vulkan driver.
The main difference to handle is that OpenGL request to ignore sample
mask when the framebuffer is non-multisampled, while Vulkan applies it
always.
This also fixes KHR-GL31.frag_coord_conventions.multisample.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40059>
Refactor pipeline creation path to use the vk_graphics_pipeline_state
structures provided by runtime instead of raw Vulkan CreateInfo structs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39834>
The Vulkan spec states:
"If logicOpEnable is VK_TRUE, then a logical operation selected by
logicOp is applied between each color attachment and the
fragment’s corresponding output value, and blending of all
attachments is treated as if it were disabled. Any attachments
using color formats for which logical operations are not supported
simply pass through the color values unmodified."
pack_blend() was only checking blendEnable from the attachment state,
causing hardware blending to be applied even when logic ops were enabled.
This is the v3dv equivalent of the RADV fix in commit c172f6ef01
("radv: fix disabling logic op for srgb/float formats when blending
is enabled").
Fixes: dEQP-VK.pipeline.monolithic.logic_op_na_formats.*_blend
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40025>