The runtime builds a final pipeline state with pointers to structures
coming from the associated pipeline libraries.
So far it has assumed that the viewMask was part of a structure
together with the rest of the renderpass information. This information
can be specified in the pre-raster, fragment & color-output state groups
and was assumed to be consistent across all 3. The runtime currently
takes the pointer to the structure from the last pipeline library
(color output).
An upcoming spec/CTS change will clarify that the viewMask only needs
to be specified for the pre-raster & fragment groups, making the value
in the color-output group untrustworthy.
This change creates a new state structure to hold the viewMask on its
own so it is only gathered from the pre-raster & fragment groups.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (radv)
Reviewed-by: Aitor Camacho <aitor@lunarg.com> (kosmickrisp)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (turnip)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3dv)
Reviewed-by: Frank Binns <frank.binns@imgtec.com> (powervr)
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (panvk)
Royaled-yes-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39940>
On V3D 4.2, txf instructions with an out of bounds LOD do not
return robust values (zero) as required by robustImageAccess2.
This commit introduces a NIR lowering pass that explicitly checks
if the LOD is within bounds. If the LOD is out of bounds,
the texture coordinate is replaced with an out of bounds value
to force the hardware to return the robust value.
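A minimal sketch of the check in plain C, for illustration only. It
models the logic and is not the NIR pass itself: the function name, its
parameters, and the use of the base-level width as the out-of-bounds
coordinate are assumptions.

```c
#include <stdint.h>

/* Hypothetical model of the lowering: returns the x coordinate that
 * would be handed to the hardware for a txf at the given LOD. */
static int32_t
robust_txf_coord(int32_t x, int32_t base_width, int32_t lod,
                 int32_t num_levels)
{
   /* An LOD outside [0, num_levels) is out of bounds. */
   int lod_oob = lod < 0 || lod >= num_levels;

   /* base_width is past the last valid texel of every mip level, so
    * the hardware's own coordinate check yields the robust value. */
   return lod_oob ? base_width : x;
}
```

With an in-bounds LOD the coordinate passes through untouched, so the
common case only pays for the comparison and the select.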
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39430>
Replace manual string parsing for V3DV_ENABLE_PIPELINE_CACHE
in instance creation with parse_debug_string and a dedicated
debug_control table.
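The table-driven approach can be sketched as below. This is a
self-contained stand-in, not the Mesa helper: the struct, the parser,
the flag names and the option strings are all hypothetical and only
illustrate the pattern.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a table entry mapping an option name to a flag. */
struct option_entry {
   const char *name;
   uint64_t flag;
};

/* Hypothetical flags; the real option names may differ. */
enum {
   CACHE_NO_DEFAULT = 1 << 0,
   CACHE_NO_META    = 1 << 1,
};

/* NULL-terminated table, so the parser needs no separate length. */
static const struct option_entry cache_options[] = {
   { "no-default-cache", CACHE_NO_DEFAULT },
   { "no-meta-cache",    CACHE_NO_META },
   { NULL, 0 },
};

/* Returns the OR of the flags of every comma-separated token in str
 * that matches a table entry. A NULL string yields 0. */
static uint64_t
parse_options(const char *str, const struct option_entry *table)
{
   uint64_t flags = 0;

   if (str == NULL)
      return 0;

   while (*str) {
      size_t len = strcspn(str, ",");

      for (const struct option_entry *e = table; e->name; e++) {
         if (strlen(e->name) == len && strncmp(str, e->name, len) == 0)
            flags |= e->flag;
      }

      str += len;
      if (*str == ',')
         str++;
   }

   return flags;
}
```

Adding an option then only requires a new table row rather than another
string comparison branch.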
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40202>
The samples=2 variant also flakes, matching the RPi4 pattern which
covers all sample counts. Broaden the entry to match all variants.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
v3d_tlb_blit_fast includes the blit onto a pending job that writes
to the source resource. The TLB data is already unpacked according to
the job's RT format, so storing it with a different RT format performs
a channel reinterpretation rather than a raw byte copy, corrupting the
data.
For example, when copying from RGB10_A2UI to RG16UI with glCopyImageSubData,
the copy_image path remaps both formats to R16G16_UNORM for a raw
32-bit copy. The fast TLB blit found the pending clear job
(RGB10_A2UI, 4 channels: 10-10-10-2) and stored its TLB data as RG16UI
(2 channels: 16-16), writing the unpacked 10-bit R and G channel values
into 16-bit fields instead of preserving the raw packed bits.
The previous internal_type/bpp check was insufficient: both RGB10_A2UI
and RG16UI share internal_type=16UI and the source bpp (64) exceeds
the destination bpp (32), but their channel layouts are different.
Add a check that the job's source surface RT format matches the blit
destination RT format before allowing the fast path.
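A reduced sketch of the resulting condition, assuming a simplified
surface description. The struct, its fields and the function name are
hypothetical, and the first test only loosely models the pre-existing
internal_type/bpp check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced description of a render target surface. */
struct rt_surface {
   uint32_t internal_type; /* e.g. 16UI */
   uint32_t bpp;           /* internal bits per pixel */
   uint32_t rt_format;     /* channel layout, e.g. RGB10_A2UI or RG16UI */
};

static bool
can_use_fast_tlb_blit(const struct rt_surface *job_src,
                      const struct rt_surface *blit_dst)
{
   /* Loose model of the older check: same internal type and a
    * source that is at least as wide as the destination. */
   if (job_src->internal_type != blit_dst->internal_type ||
       job_src->bpp < blit_dst->bpp)
      return false;

   /* Added check: the TLB contents are unpacked per RT format, so
    * storing them with another layout would reinterpret channels. */
   return job_src->rt_format == blit_dst->rt_format;
}
```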
Fixes: 66de8b4b5c ("v3d: add a faster TLB blit path")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40200>
The DISCARD_WHOLE_RESOURCE path in vc4_map_usage_prep() replaces the
resource's BO with vc4_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in vc4_submit_setup_rcl_surface(), any pending
write job would store to the new BO instead of the old one, corrupting
the newly written data.
This is the same bug that was fixed in v3d in the previous commit.
Fixes: 18ccda7b86 ("vc4: When asked to discard-map a whole resource, discard it.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
The DISCARD_WHOLE_RESOURCE path in v3d_map_usage_prep() replaces the
resource's BO with v3d_resource_bo_alloc(). As the RCL resolves
rsc->bo at job submit in emit_rcl(), any pending write job would
store to the new BO instead of the old one, corrupting the newly
written data.
This is addressed by flushing all pending write jobs affecting the
resource before replacing its BO.
This fixes multiple tests where data copied to a renderbuffer was
overwritten by a previous GPU clear. The tests are from the subgroup:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.*
Fixes: 45bb8f2957 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40180>
Most of the work was already done for the Vulkan driver.
The main difference to handle is that OpenGL requires the sample mask
to be ignored when the framebuffer is non-multisampled, while Vulkan
always applies it.
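The difference can be sketched as follows. The function and parameter
names are hypothetical and do not mirror the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns the sample mask to program. */
static uint32_t
effective_sample_mask(bool is_opengl, uint32_t fb_samples,
                      uint32_t sample_mask)
{
   /* OpenGL: with a non-multisampled framebuffer the mask is
    * ignored, which is modelled here as "all samples enabled". */
   if (is_opengl && fb_samples <= 1)
      return ~0u;

   /* Vulkan, or a multisampled OpenGL framebuffer: apply as-is. */
   return sample_mask;
}
```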
This also fixes KHR-GL31.frag_coord_conventions.multisample.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40059>
Refactor pipeline creation path to use the vk_graphics_pipeline_state
structures provided by runtime instead of raw Vulkan CreateInfo structs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39834>
The Vulkan spec states:
"If logicOpEnable is VK_TRUE, then a logical operation selected by
logicOp is applied between each color attachment and the
fragment’s corresponding output value, and blending of all
attachments is treated as if it were disabled. Any attachments
using color formats for which logical operations are not supported
simply pass through the color values unmodified."
pack_blend() was only checking blendEnable from the attachment state,
causing hardware blending to be applied even when logic ops were enabled.
This is the v3dv equivalent of the RADV fix in commit c172f6ef01
("radv: fix disabling logic op for srgb/float formats when blending
is enabled").
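The rule from the spec quote reduces to a single condition. The sketch
below uses a hypothetical helper name rather than the actual
pack_blend() code.

```c
#include <stdbool.h>

/* Hypothetical helper: hardware blending for an attachment is only
 * enabled when logic ops are off, since logicOpEnable makes blending
 * behave as if it were disabled for all attachments. */
static bool
hw_blend_enabled(bool logic_op_enable, bool attachment_blend_enable)
{
   return !logic_op_enable && attachment_blend_enable;
}
```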
Fixes: dEQP-VK.pipeline.monolithic.logic_op_na_formats.*_blend
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40025>
These tests were previously skipped because they contain dynamic loops
in the VS, which can cause GPU resets on VC4. However, (1) the only
tests that cause GPU resets are the ones that have divergent loops and
(2) now, the compiler is able to fail shader linking when it finds
divergent loops.
Therefore, allow tests with non-divergent loops to run on the CI and
add tests with divergent loops to the fail list.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39768>
On V3D 4.2 (Raspberry Pi 4), there is a hardware bug where the binner
can trigger a GPU reset in some situations where primitives are
discarded, such as due to primitive restarts.
The way to avoid this is to force the binner to always do something by
emitting the proper CL. In this case we decided to always set the point
size, as it is a very simple and fast operation.
This fixes resets caused by
dEQP-VK.pipeline.monolithic.input_assembly.primitive_restart.*.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39826>
Unfortunately, the RPi firmware does not contain the latest available
kernel deployed in Raspberry Pi OS.
So for now, we create a custom version by manually packaging the
kernel distributed in the official OS.
It is worth noting that Raspberry Pi OS does not contain a 32-bit
kernel for rpi4 or rpi5 (it does for rpi3). Instead, they use the 64-bit
kernel with a 32-bit userland.
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Eric Engestrom <eric@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39864>
Explain why the driver uses demote instead of an immediate jump to the
end of the shader for OpTerminate, noting that the jump approach showed
no performance gains.
Reference: !38381
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39703>
Some ALU instructions will likely end up being copy propagated in the
backend, which means they would not have any cost. This helps the
scheduler make better decisions for the new open-coded patterns
produced in NIR for extracts (i.e. unpack_2x16) with MR#39511.
With this (together with previous patches) we manage to produce similar
shader-db results as with the unpack_2x16 NIR extract opcodes that
MR#39511 will drop.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39687>
We need this to produce optimal code in the backend for sequences
like this:
    32 %10 = ushr %5.x, %9 (0x10)
    16 %14 = u2u16 %10
    32 %17 = f2f32 %14
With such code, our copy propagation pass will drop the u2u16 and
with this patch we will be able to drop the ushr too.
This pattern can show up for VK_KHR_16bit_storage when we successfully
vectorize 16-bit loads into 32-bit loads, but will become a lot more
common after MR#39511 lands, since that would also affect things like
16-bit TMU loads, which are more common.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39687>
We only really use sub-32-bit integers in conversions, so we can skip
clearing the upper bits when we produce them by converting from larger
types (leaving these bits undefined) and only clear them when we convert
from them to larger types. The latter is needed because we don't have
native opcodes for these conversions that would only access the relevant
bits, at least on Pi4. Also, document the cases where we could do better
for Pi5.
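A small model of the idea for the unsigned 16-bit case, assuming values
live in 32-bit registers. The function names are hypothetical and this
is not the backend code.

```c
#include <stdint.h>

/* Narrowing 32 -> 16: nothing is emitted; the upper 16 bits of the
 * register are simply left undefined (here: whatever was there). */
static uint32_t
narrow_u32_to_u16(uint32_t reg)
{
   return reg;
}

/* Widening 16 -> 32: this is the only place the upper bits are
 * cleared, since the consumer would otherwise read the garbage. */
static uint32_t
widen_u16_to_u32(uint32_t reg)
{
   return reg & 0xffffu;
}
```

A value that is narrowed and never widened again costs no masking
instruction at all.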
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39687>
As rpi5 can work with either 16k or 4k pages, instead of hardcoding the
pagesize just query the kernel.
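One common way to query the page size from userspace is POSIX
sysconf(), sketched below. Whether this change uses that call is an
assumption, as are the helper name and the 4k fallback.

```c
#include <unistd.h>

/* Hypothetical helper: asks the system for its page size instead of
 * hardcoding it, falling back to 4k if the query fails. */
static long
query_page_size(void)
{
   long size = sysconf(_SC_PAGESIZE);

   return size > 0 ? size : 4096;
}
```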
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39555>
The implicit_unmap tests complete in ~18s each on my A740, so I think they
should be fine to remove from all devices' skips files -- the problem was
hitting swap in parallel.
This reshuffles some test groups, making new xfails show up. The changes
are particularly notable in virgl, where virglrenderer gets wedged at some
point and arbitrary sets of tests after that fail.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39568>
This basic assertion keeps the static analyzer from complaining that the
data memory could be NULL when we copy data from there later.
This fixes the static analyzer warning "null pointer passed to 2nd
parameter expecting 'nonnull'".
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39577>
The generation version for the V3D XML package was marked as 3.3, but
we have since removed all the code supporting this generation, and the
generations we support now are from 4.2 onwards.
So bump up the generation version.
Fixes: 9c4829473a ("broadcom/cle: remove v33 and v41 from xml definition")
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39577>
Skip tests that are causing GPU resets to avoid issues with other tests
running simultaneously with them.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39637>