When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, currently a query compute program with workgroup size
0*1*1 will be emited, which is ridiculous and will be rejected by some
assertion in pvr_compute_generate_control_stream() .
As the operation should be noop when queryCount is 0, the functions can
and should just return in such cases.
Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
(cherry picked from commit 01ba4867fa)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The array index value is a signed integer but the compiler was using
the unsigned version of the clamp helper function meaning the value
was not been clamped to 0 when its value was < 0.
Fix the following deqp test cases when shaderImageGatherExtended is enabled
dEQP-VK.glsl.texture_gather.basic.2d_array.*
dEQP-VK.glsl.texture_gather.offset.*.2d_array.*
dEQP-VK.glsl.texture_gather.offset_dynamic.*.2d_array.*
dEQP-VK.glsl.texture_gather.offsets.*.2d_array.*
Fixes: 854563f0f8 ("pco: fully switch over to common smp emission code")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b80a5f9b7d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Use lower_tg4_offsets to take care of explicit offsets, and just swizzle
the texels in the order defined by textureGather*
Fixes: 46c9239c11 ("pvr, pco: initial texture gather support with gather sampler")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 56b8dc92a9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Increase past the minimum required by the Vulkan Spec to fix tests. This
was needed due to Zink requirements which splits
`maxPerStageDescriptorStorageBuffers` between atomic buffers and
`MaxShaderStorageBlocks`.
Fixes the following GLES conformance tests:
KHR-GLES31.core.compute_shader.resources-max
KHR-GLES31.core.draw_indirect.advanced-twoPass-Compute-arrays
KHR-GLES31.core.shader_image_load_store.advanced-sync-vertexArray
KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-cs
KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-fs
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-int
KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case1-cs
KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
dEQP-GLES31.functional.draw_indirect.compute_interop.combined.drawelements_compute_cmd_and_data_and_indices
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_overwrite
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_write
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_overwrite
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_write
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_atomic_multiple_write_read
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_multiple_write_read
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_overwrite
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_write
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_overwrite
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_write
Backport-to: 26.0
Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 35f57a2739)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.
Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
(cherry picked from commit af1669d9e2)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Because of a previous refactor, pco_last_igrp was incorrectly changed to return
the first entry in a linked list instead of the last. Update pco_last_igrp to
return the last entry in a linked list.
The following CTS tests now pass:
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_2_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_uniform_fragment
Fixes: 719ece42c0 ("pco: Switch back to util/list")
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 7428af29f6)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
Previously the output length of pvr_get_driver_build_sha() is changed to
BUILD_ID_EXPECTED_HASH_LENGTH, but the array defined to receive its
output, the driver_build_sha array inside struct pvr_instance, is
declared with BLAKE3_KEY_LEN, which is longer than
BUILD_ID_EXPECTED_HASH_LENGTH.
This leads to uninitialized memory being accessed when creating
pipelineCacheUUID value, and the pipelineCacheUUID value would become
random in each run, defecting the purpose of it.
Refactor the code copying the build ID to follow other drivers: changing
the parameter from the buffer to the instance pointer, inserting a
static assert thanks to being able to retrieve the destination buffer
length (as an array inside the instance structure) and using
copy_build_id_to_sha1() to do the final copy.
Fixes: 6a42493c94 ("pvr: Use BUILD_ID_EXPECTED_HASH_LENGTH")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit 9870c8d8c4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
When recording secondary command buffers with occlusion queries, the
get_vis_results flag could be set for some graphics sub_cmd's job.
Propagate this flag from secondary command buffer graphics sub_cmds to
primary command buffer sub_cmds to ensure occlusion queries in secondary
command buffers being correctly executed.
Fixes: 5c34be4340 ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit b8c5e47949)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
There's a dynarray field inside gfx sub_cmd called sub_query_indices,
which will contain pending query indices for gfx sub_cmds inside a
secondary command buffer. It's expected that when finishing such gfx
sub_cmds, the content of query_indices is going to be moved there.
However the `util_dynarray_append_dynarray()` call is called with wrong
parameter order, thus it's copying sub_query_indices to query_indices
and then immediately wiping query_indices, forgetting all query indices
in such case.
Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries
in secondary command buffers.
Fixes: 8c506c4b03 ("pvr: Use util_dynarray_append_dynarray()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 87f4122e11)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
The last sub_cmd in the command buffer could be a graphics one, and when
ending a graphics sub_cmd, the query_indices array will be checked to
know whether a occlusion query starts during this graphics sub_cmd.
Finalize the query_indices array after ending the last sub_cmd,
otherwise the check for query initiation may have a false negative
result.
Fixes the `dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.
random.seed6` test case.
Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 36f34a72c1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41269>
All B-series Rogue cores seem to have USC rounding mode as RTE instead
of RTZ.
Set the has_usc_alu_roundingmode_rne feature flag for them (currently
only BXS-4-64 has it set).
Verified via testing on BXM-4-64 (36.52.104.182) by fixing CTS tests
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp32.input_args.* ,
and via proprietary driver vulkaninfo result on BXE-2-32 (36.29.52.182),
BXE-4-32 (36.50.54.182) and BXM-4-64 (36.56.104.183) (checking
shaderRoundingModeRT?Float32 properties).
Fixes: 1db1038a61 ("pvr: add device info for BXM-4-64 (36.56.104.183)")
Fixes: e60e0c96ba ("pvr: add device info for BXE-2-32 (36.29.52.182)")
Fixes: 2743363a57 ("pvr: add device info for BXM-4-64 (36.52.104.182)")
Fixes: ea28791d40 ("pvr: add device info for BXE-4-32 (36.50.54.182)")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 9b44def4e9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The pvr_clear_vdm_state_get_size_in_dw() wrongly think instance count
inputs are needed when doing RTA clear for cores without the
gs_rta_support feature. However, the instance ID is exploited to output
the target layer ID, which isn't supported at all for cores w/o that
feature, so it looks that the condition is inverted. In addition, the
pvr_pack_clear_vdm_state() function seems to have similar logic deciding
whether to emit instance_count, and the logic is opposite to the logic
in pvr_clear_vdm_state_get_size_in_dw() for the part checking the
gs_rta_support feature.
Invert the condition to take instance ID inputs for cores with the
gs_rta_support feature instead of those without this feature.
Fixes: b59eb30e88 ("pvr: Fix cs corruption in pvr_pack_clear_vdm_state()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
(cherry picked from commit 3db93bbf34)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40979>
The dirty state of stencil ops is not checked when deciding whether to
rebuild the ISP state, although the values are part of the ISP state
(the 27:16 bits of ISPB word).
Add MESA_VK_DYNAMIC_DS_STENCIL_OP to the condition for rebuilding ISP
control registers.
Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.fragment_ops.stencil.zero_stencil_fail
Fixes: 88f1fad3f7 ("pvr: Use common pipeline & dynamic state frameworks")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit ee031d67b4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
The DDMADT instruction of PDS has out-of-bound test capability, which is
used for implementation of robust vertex input fetch.
According to the pseudocode in the comment block before the "LAST DDMAD"
mark in pvr_pipeline_pds.c, the check is between
`calculated_source_address + (burst_size << 2)` and `base_address +
buffer_size`, in which the `burst_size` seems to correspond to the BSIZE
field set in the low 32-bit of DDMAD(T) src3 and the `buffer_size`
corresponds to the MSIZE field set in the DDMADT-specific high 32-bit of
src3. As the calculated source address is just the base address adds the
multiplication result (the offset), the base address could be eliminated
from the check, results in the check between `offset + (BSIZE * 4)` and
`MSIZE` .
Naturally it's expected to just set the MSIZE field to the buffer size.
In addition, as the Vulkan spec says "Reads from a vertex input MAY
instead be bounds checked against a range rounded down to the nearest
multiple of the stride of its binding", the driver rounds down the
accessible buffer size before setting MSIZE to it.
However when running OpenGL ES 2.0 CTS, two problems are exhibited about
the setting of the size to check:
- dEQP-GLES2.functional.buffer.write.basic.array_stream_draw sets up a
VBO with 3 bytes per vertex (RGB colors and 1B per color) and 340
vertices (results in a buffer size of 1020 = 0x3fc). However as the
DMA request size, which is specified by BSIZE, is counted by dwords,
3 bytes are rounded up to 1 dword (which is 4 bytes). When the bound
check of the last vertex happens, the vertex's DMA start offset is
0x3f9, so the DDMADT check happens between 0x3fd (0x3f9 + 1 * 4) and
0x3fc, and indicates a check failure. This prevents the last vertex,
which is perfectly in-bound, from being properly fetched; this is
against the Vulkan specification, and needs to be fixed.
- dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.
buffer_0_32_float2_vec4_dynamic_draw_quads_1 sets up a VBO with a size
of 168 bytes, and tries to draw 6 vertices (each vertex consumes 2
floats (thus 8 bytes) of attribute) with a stride of 32 bytes using
this VBO. Zink then translates the VBO to a Vulkan vertex buffer bound
with size = 168B, stride = 32B. Here the optional rule about rounding
down buffer size happens in the current PowerVR driver, and the
checked bound is rounded down to 160B, which prevented the last
vertex's 8B attributes to be fetched. It looks like this kind of
situation is considered in the codepath without DDMADT, but omitted
for the codepath utilizing DDMADT for bound check.
So this patch tries to mimic the behavior of DDMADT when setting the
MSIZE field of it to prevent false out-of-bounds. It first calculates
the offset of the last valid vertex DMA, then adds the DMA request size
to it to form the final MSIZE value. With the code calculating the last
valid DMA offset considering the situation of fetching the attribute
from the space after the last whole multiple of stride, both problems
mentioned above are solved by this rework.
There're 99 GLES CTS testcases fixed by this change, and Vulkan CTS
shows no regression on `dEQP-VK.robustness.robustness1_vertex_access.*`
tests.
Fixes: 4873903b56 ("pvr: Enable PDS_DDMADT")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit 252904f3d1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
This memory padding is enforced by GetBufferMemoryRequirements2 and
might be then checked against to decide whether it's enough.
Move it to pvr_buffer.h for further assertions.
Backport-to: 25.3
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit d992474be9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Currently the size of single components inside one attribute is saved
and checked against when checking DMA capability. However, the vertex
attribute DMA happens for a whole attribute instead of individually for
its components, so checking against the component size is useless -- the
size of the whole attribute is what needs to be saved and checked.
Rename all component_size_in_bytes fields to attrib_size_in_bytes, and
save the size of the whole attribute inside them.
Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit aa8dad141c)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
The ddmadt_oob_buffer_size structure to be filled is named
`obb_buffer_size`, which is obviously a typo.
Change to `oob_buffer_size` to fix the typo.
Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
(cherry picked from commit caea72cffc)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Even if a linear image isn't created with usages declaring PBE writes,
the image might be exported and then re-imported with a usage that
allows rendering to.
Always align linear images' width for being written by PBE.
This fixes WSI creating surfaces with odd width, exporting them and
re-importing for rendering.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 765a9f4fd9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
The s0abs bit in the encoing of fred instruction is wrongly set to the
status of .neg modifier instead of .abs modifier.
Fix this copy-n-paste error.
Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.4
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.45
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.4
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.45
Fixes: 8ec174b3f9 ("pco: add support for various selection, complex, trig ops")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 54860bb4c7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Move options were bit or-ing from the wrong enum, causing undefined
behaviour when the number of intrinsics changed.
Replaced it with the values from the right nir_move_options enum that
were previously working. (Further refinement needed on these after
extensive testing.)
Fixes: f1b24267d2 ("pco: rework nir processing and passes")
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 721c1b8f65)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
When the first attachment is assigned to a tile buffer, the buffer
alloc mask was not been updated. This means when a second attachment
is added to the same tile buffer it will be assigned the same offset
as the first which will lead to incorrect behaviour.
Fixes for depq-vk:
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.complete_secondary_cmd_buff.dedicated_allocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.suballocation.attachment.4.568
dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.dedicated_allocation.attachment.4.568
Fixes: a7de9dae6b ("pvr: Add routine for filling out usc_mrt_setup from dynamic rendering state")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 96cfb1cb7f)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Do not assume that the application always provides images for backing
attachments. The app can provide a super set of attachments of which
only some are actually backed with images.
We want to filter-out attachments that aren't meaningful for rendering
or sampling, and create compiler resources only for relevant ones.
Fix assert in CTS:
pvr_arch_mrt.c:215: pvr_rogue_init_usc_mrt_setup: Assertion `att_format != VK_FORMAT_UNDEFINED' failed.
Seen in pipeline monolithic, for instance:
dEQP-VK.pipeline.monolithic.multisample.misc.dynamic_rendering.multi_renderpass.r8g8b8a8_unorm_r16g16b16a16_sfloat_r16g16b16a16_sint_d32_sfloat_s8_uint.random_127
Fixes: d549c1d045 ("pvr: add pipeline handling to use dynamic rendering info")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 5473ca3be3)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Expose the routine in preperation for a later commit.
Backport-to: 26.0
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 6b0fea938b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Use the valid/input coverage masks for tile buffer store coverage masks
when running single/multi-sampled fragment shaders respectively.
Fixes: 297a0c269a ("pvr, pco: tile buffer support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reported-by: Nick Hamilton <nick.hamilton@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 8eee60fa78)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40752>
Among all subcommands, only gfx subcommands are bound to a query pool,
other subcommands seem to need no special handling.
In addition, if a ResetQuery is done before BeginQuery, the last
subcommand will be a event one, which fails the current assert that
assumes it's a gfx one.
Change the assertion of the subcommand being a gfx one to an addition
check of whether the subcommand is a gfx one.
This fixes crash of Vulkan CTS 1.4.5.1 test
dEQP-VK.query_pool.discard.normal.no_depth.none.discard .
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 5a497316d4)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40359>
- RGBA8888_* is a preprocessor alias for R8G8B8A8_* in u_format.yaml.
- Both entries in the format tables collide on the same enum value, and
RGBA8888 overwrites R8G8B8A8.
- The fix was reverting to the version that was in the commit
39e949434c because there is a different format
was used that did not cause any collisions.
dEQP fixes:
dEQP-VK.api.info.format_properties.r8g8b8a8_sint
dEQP-VK.api.info.format_properties.r8g8b8a8_snorm
dEQP-VK.api.info.format_properties.r8g8b8a8_uint
dEQP-VK.api.info.format_properties.r8g8b8a8_unorm
Fixes: 9f740b26a6 ("pvr: Fix bugs in the format table")
Signed-off-by: Leon Perianu <leon.perianu@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
(cherry picked from commit 7c6dbb099a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
On the Rogue architecture add support for using a fragment passthrough
shader when there is no fragment shader present in a graphics
pipeline but the sample mask is required.
fix:
dEQP-VK.pipeline.monolithic.empty_fs.masked_samples
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Co-authored-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 14508b4c9a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Configure the EOT setup for SPM EOT programs so that the generated
programs load the tile buffer into the output buffer before doing
the emit
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit d1f2ad17dd)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
In subpasses preserve attachments are not used by the subpass but
their contents must be preserved throughout the subpass.
Add a list for the preserve attachments info specified by a subpass
and when determining a subpass attachments total uses check the
preserve attachments list and add it uses to the total.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0e01b9ef2d)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The struct will also be used for preserve attachments in the next
commit.
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit e18670347a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
When calculating the dwords per pixel the output registers should
always be taken into account in addition to the number of tile buffers.
Fixes incorrect scratch buffer space calculation when both output
registers and tile buffers are emitted by a render.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 3457f8083a ("pvr: Acquire scratch buffer on framebuffer creation.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit df445dc9b9)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
The subpass merging optimisation check for when subpasses are using
tile buffers was in the incorrect location.
The current check is in a function called from two places but only
the first of these should have been doing the optimisation check.
This was incorrectly affecting the number of renders that subpass
merging could avoid.
Partial fix for:
dEQP-VK.renderpass.*.attachment_allocation.input_output.71
Fixes: 10b6a0d567 ("pvr: Add support for generating render pass hw setup data.")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 0640ac7e3b)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
When creating frame buffers the alloc callbacks are used in the host
allocations, those same alloc callbacks need to be used when freeing
those allocations but are missing in some places causing the CTS to
report memory leaks in certain test cases.
Fixes: 146364ab9f ("pvr: add support for VK_KHR_dynamic_rendering")
fix:
dEQP-VK.api.object_management.alloc_callback_fail.framebuffer
dEQP-VK.api.object_management.single_alloc_callbacks.framebuffer
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
(cherry picked from commit 05ef9f01a7)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40092>
Its no longer an error for depth and stencil formats to have invalid
accumulator format.
Fixes the following tests:
* dEQP-VK.api.info.image_format_properties.2d.optimal.d16_unorm
* dEQP-VK.api.info.image_format_properties.2d.optimal.d24_unorm_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat
* dEQP-VK.api.info.image_format_properties.2d.optimal.d32_sfloat_s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.s8_uint
* dEQP-VK.api.info.image_format_properties.2d.optimal.x8_d24_unorm_pack32
Backport-to: 26.0
Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
(cherry picked from commit 58c7437d3a)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39828>
The samples per tile calculation was incorrect for sample count 4 and 8.
Fix:
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.depth.samples_4.*
dEQP-VK.pipeline.monolithic.multisample.std_sample_locations.draw.stencil.samples_4.*
Backport-to: 26.0
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39580>
(cherry picked from commit 9f9788330e)
Within the driver buffers are treated as 2D as sampling them as 1D
will run into HW restrictions on max size.
The compiler does the same however for atomic image ops the address
is manually calculated and doing this via the 2D path leads to
incorrect offsets.
The fix is to treat buffers as 1D for atomic ops which calculates
the correct offsets for the operations.
Fix deqp:
dEQP-VK.image.atomic_operations.add.buffer.*
dEQP-VK.image.atomic_operations.and.buffer.*
dEQP-VK.image.atomic_operations.compare_exchange.buffer.*
dEQP-VK.image.atomic_operations.dec.buffer.*
dEQP-VK.image.atomic_operations.exchange.buffer.*
dEQP-VK.image.atomic_operations.inc.buffer.*
dEQP-VK.image.atomic_operations.max.buffer.*
dEQP-VK.image.atomic_operations.min.buffer.*
dEQP-VK.image.atomic_operations.or.buffer.*
dEQP-VK.image.atomic_operations.sub.buffer.*
dEQP-VK.image.atomic_operations.xor.buffer.*
Fixes: 6dc5e1e109 ("pco: fully support Vulkan 1.2 image atomics")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39521>
(cherry picked from commit 079377c767)
The extension is optional in Vulkan 1.2 and is causing crashes in
multiple CTS tests.
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Backport-to: 26.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39351>
(cherry picked from commit 3aacc324bc)
The skip check should only be checking the format rather than the entire
packed word.
Fixes: 52ddc40a75 ("pco: restrict shadow sampler comparator clamping to unorm formats")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39428>
(cherry picked from commit c5b70dcb48)
Fixes: 6bda88bfdb ("pvr: copy WSI can_present_on_device function from PanVK")
Signed-off-by: Kitlith <kitlith@kitl.pw>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39415>
(cherry picked from commit b18b52e61d)
Jobs have been timing out, requiring 2-6 additional minutes beyond the
previous 30-minute limit. Increase the overall timeout from 30 to 45
minutes, with an additional 5-minute buffer for GitLab (50m total) to
provide margin for variance.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39432>
Only clamp shadow sampler comparators for "unsigned normalized
fixed-point format[s]" as per the Vulkan spec.
Fixes: 69a56d33de ("pco: Fix for shadow sampler comparison not clamping the compare value")
Reported-by: Georg Lehmann <dadschoorse@gmail.com>
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39389>
Vulkan spec requires binding flags to be matched with the binding with
the same index, however currently bindings are sorted with flags not
properly sorted, which leads to bindings and flags mismatch.
Resolve this by adding optional flags info to the parameters of
vk_create_sorted_bindings(), and refactoring panvk/pvr (which really
pair bindings and flags instead of only iterating flags) to use sorted
flags.
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Ryan Mckeever <ryan.mckeever@collabora.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38967>
PCO has global_atomic_{,_swap}pco intrinsics that are different from
generic 2x32 ones, but the 2x32 ones are emitted by
nir_lower_explicit_io.
Add code to lower the generic 2x32 intrinsics to the PCO variant.
Passed Vulkan CTS dEQP-VK.glsl.atomic_operations.* and fixes crash of
dEQP-VK.glsl.atomic_operations.*_reference .
Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39054>