The dynamic rendering codepath allows binding an attachment with a
depth+stencil format while only depth or only stencil is active. The
corresponding test should be disabled in such cases.
Ignore the attachment's depth or stencil according to the rendering
attachment info's is_depth and is_stencil variables.
Fixes the following CTS testcases:
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.static_enable.d24_unorm_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.static_enable.d32_sfloat_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.dynamic_enable.d24_unorm_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.dynamic_enable.d32_sfloat_s8_uint
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41054>
Currently deferred RTA clears cannot be handled for secondary command
buffers within render passes, because the code immediately configures
the transfer command for the deferred clear operation, but the specific
attachment image view isn't known when recording secondary command
buffers to be executed inside render passes.
Add code to record parameters for deferred RTA clears in secondary
command buffers when the attachment is unknown, and bind the recorded
clears to the attachment's image view when executing the secondary
command buffer inside a render pass.
Fixes many dynamic rendering random tests.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
The code that adds deferred RTA clear transfer commands checks whether
the newly allocated transfer command is NULL. However, the list_addtail
call comes before the check, which means that the check does not prevent
a NULL dereference.
Reorder the code to ensure that no NULL transfer command is ever added
to the deferred clear list.
In addition, pvr_transfer_cmd_alloc() has already set the command
buffer's error status when it returns NULL, so there is no need to set
it again.
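
The reordering can be sketched as follows. This is a minimal,
self-contained illustration and not the driver's code: `example_cmd`,
`example_list` and `example_cmd_alloc()` are hypothetical stand-ins for
the transfer command, the deferred clear list and
pvr_transfer_cmd_alloc().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the driver's types. */
struct example_cmd {
   struct example_cmd *next;
};

struct example_list {
   struct example_cmd *head;
   struct example_cmd *tail;
   size_t length;
};

/* Hypothetical allocator; returns NULL when `fail` is set, to model an
 * allocation failure. */
static struct example_cmd *example_cmd_alloc(bool fail)
{
   return fail ? NULL : calloc(1, sizeof(struct example_cmd));
}

/* Append only after the NULL check, so that a failed allocation never
 * reaches the list. Returns false on allocation failure. */
static bool example_add_deferred_clear(struct example_list *list, bool fail)
{
   struct example_cmd *cmd;

   assert(list != NULL);

   cmd = example_cmd_alloc(fail);
   if (!cmd)
      return false; /* The allocator is assumed to have recorded the error. */

   if (list->tail)
      list->tail->next = cmd;
   else
      list->head = cmd;
   list->tail = cmd;
   list->length++;

   return true;
}
```

With this ordering, a failed allocation leaves the list untouched and is
only reported to the caller.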
Fixes: 2eabbbe57d ("pvr: use linked list to back deferred clears")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
For 2D array views of 3D images, the layer of the view corresponds to
the depth (instead of the layer, which should always be 0) of the image.
Fix the code emitting deferred RTA clears to set the depth instead of
the layer of the image to clear.
Fixes the flakiness of
`dEQP-VK.renderpasses.renderpass*.remaining_array_layers.multi_layer_fb.*`.
Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
A deferred RTA clear happens after the current graphics subcommand is
executed, so it may overwrite the image rendered by that subcommand.
In addition, the active render targets do not need an "emulated" clear;
they can really be cleared by drawing rectangles.
Skip setting up deferred RTA clears for active render target layers, and
continue to do immediate clears for these layers.
Fixes a few dynamic rendering random CTS tests, but the issue should
also exist in legacy render pass RTA clears.
Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
While testing HW without gs_rta_support it was raised that this
change had been made in error. After retesting with the change
reverted the listed tests still pass.
This reverts commit d68344bffe.
Backport-to: 26.0
Reported-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
When maxPerStageResources is less than 128, it must be at least the sum
of maxPerStageDescriptorUniformBuffers,
maxPerStageDescriptorStorageBuffers, maxPerStageDescriptorSampledImages,
maxPerStageDescriptorStorageImages,
maxPerStageDescriptorInputAttachments and maxColorAttachments.
As maxPerStageDescriptorStorageBuffers was previously increased, the
value of maxPerStageResources should be increased too.
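
The rule above can be expressed as a small check. This is only a sketch:
`example_limits` is a hypothetical subset of the device limits, and the
values used with it are illustrative, not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the limits involved in the rule. */
struct example_limits {
   uint32_t max_per_stage_resources;
   uint32_t max_per_stage_uniform_buffers;
   uint32_t max_per_stage_storage_buffers;
   uint32_t max_per_stage_sampled_images;
   uint32_t max_per_stage_storage_images;
   uint32_t max_per_stage_input_attachments;
   uint32_t max_color_attachments;
};

/* Sum of the per-stage limits that maxPerStageResources has to cover. */
static uint32_t example_resource_sum(const struct example_limits *l)
{
   assert(l != NULL);

   return l->max_per_stage_uniform_buffers +
          l->max_per_stage_storage_buffers +
          l->max_per_stage_sampled_images +
          l->max_per_stage_storage_images +
          l->max_per_stage_input_attachments +
          l->max_color_attachments;
}

/* A value below 128 is only acceptable if it covers the sum. */
static bool example_per_stage_resources_valid(const struct example_limits *l)
{
   if (l->max_per_stage_resources >= 128)
      return true;

   return l->max_per_stage_resources >= example_resource_sum(l);
}
```

Raising one of the summed limits without raising
max_per_stage_resources can therefore turn a previously valid
configuration into an invalid one.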
This fixes regressions in two limit validation tests in the Vulkan CTS:
dEQP-VK.info.device_properties and
dEQP-VK.api.info.vulkan1p2_limits_validation.general.
Fixes: 35f57a2739 ("pvr: increase value of maxPerStageDescriptorStorageBuffers")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41270>
The last graphics job, which might write to the occlusion query result,
could still be running when vkCmdCopyQueryPoolResults is called.
Additionally wait for graphics jobs before copying the results.
Fixes: 24b1e3946c ("pvr: Add support to submit occlusion query sub cmds.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40884>
Entry points must be wrapped in the PVR_PER_ARCH macro, otherwise there
will be multiple definitions of the same symbol.
Fixes: dfddb3fe ("pvr: Add support for VK_KHR_pipeline_executable_properties")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41238>
When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, a query compute program with a workgroup size of
0*1*1 is currently emitted, which is invalid and is rejected by an
assertion in pvr_compute_generate_control_stream().
As the operation should be a no-op when queryCount is 0, the functions
can and should just return in such cases.
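
A minimal sketch of the early return, assuming a hypothetical
`example_dispatch` structure and helper that stand in for the query
compute program setup:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a compute dispatch. */
struct example_dispatch {
   uint32_t x, y, z;
};

/* Returns false without touching `out` when there is nothing to do,
 * so that no 0*1*1 dispatch is ever built. */
static bool example_build_query_dispatch(uint32_t query_count,
                                         struct example_dispatch *out)
{
   if (query_count == 0)
      return false;

   assert(out != NULL);
   out->x = query_count;
   out->y = 1;
   out->z = 1;

   return true;
}
```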
Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40911>
Enables the shaderImageGatherExtended feature and sets the
{min,max}TexelGatherOffset physical device properties.
The properties are queried via Zink and are expected to be non-zero.
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
Rather than always emitting and swizzling 16 components for raw samples,
scale it by the number actually needed as defined by the selected tg4
channel/components.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
The array index value is a signed integer, but the compiler was using
the unsigned version of the clamp helper function, meaning the value
was not being clamped to 0 when it was negative.
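
The difference can be shown in plain C. The two helpers below are
hypothetical and only illustrate signed versus unsigned clamping; they
are not the compiler's helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Signed clamp: negative indices end up at `lo`. */
static int32_t example_iclamp(int32_t v, int32_t lo, int32_t hi)
{
   assert(lo <= hi);
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Unsigned clamp: the same bits are treated as a huge positive value,
 * so a negative index ends up at `hi` instead. */
static uint32_t example_uclamp(uint32_t v, uint32_t lo, uint32_t hi)
{
   assert(lo <= hi);
   return v < lo ? lo : (v > hi ? hi : v);
}
```

For an 8-layer array, `example_iclamp(-1, 0, 7)` yields 0, while
`example_uclamp((uint32_t)-1, 0, 7)` yields 7, i.e. the last layer
rather than the first.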
Fixes the following dEQP test cases when shaderImageGatherExtended is enabled:
dEQP-VK.glsl.texture_gather.basic.2d_array.*
dEQP-VK.glsl.texture_gather.offset.*.2d_array.*
dEQP-VK.glsl.texture_gather.offset_dynamic.*.2d_array.*
dEQP-VK.glsl.texture_gather.offsets.*.2d_array.*
Fixes: 854563f0f8 ("pco: fully switch over to common smp emission code")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
Use lower_tg4_offsets to take care of explicit offsets, and just swizzle
the texels in the order defined by textureGather*.
Fixes: 46c9239c11 ("pvr, pco: initial texture gather support with gather sampler")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
Increase past the minimum required by the Vulkan spec to fix tests. This
is needed because Zink splits `maxPerStageDescriptorStorageBuffers`
between atomic buffers and `MaxShaderStorageBlocks`.
Fixes the following GLES conformance tests:
KHR-GLES31.core.compute_shader.resources-max
KHR-GLES31.core.draw_indirect.advanced-twoPass-Compute-arrays
KHR-GLES31.core.shader_image_load_store.advanced-sync-vertexArray
KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-cs
KHR-GLES31.core.shader_image_load_store.basic-allTargets-store-fs
KHR-GLES31.core.shader_storage_buffer_object.advanced-unsizedArrayLength-cs-int
KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case1-cs
KHR-GLES31.core.shader_storage_buffer_object.basic-stdLayout_UBO_SSBO-case2-cs
dEQP-GLES31.functional.draw_indirect.compute_interop.combined.drawelements_compute_cmd_and_data_and_indices
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_overwrite
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_alias_write
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_overwrite
dEQP-GLES31.functional.synchronization.in_invocation.ssbo_atomic_alias_write
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_atomic_multiple_write_read
dEQP-GLES31.functional.synchronization.inter_call.with_memory_barrier.ssbo_multiple_write_read
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_overwrite
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_alias_write
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_overwrite
dEQP-GLES31.functional.synchronization.inter_invocation.ssbo_atomic_alias_write
Backport-to: 26.0
Signed-off-by: Arjob Mukherjee <arjob.mukherjee@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41156>
Sampling coeffs with trilinear filtering will output 2x sets of data.
Whether bilinear or trilinear filtering is in use can't be determined
without checking state words, so unconditionally reserve 2x to avoid
clobbering output regs.
Fixes: 7df32ba09d ("pco: initial texture/sampler compiler support")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41051>
Because of a previous refactor, pco_last_igrp was incorrectly changed to return
the first entry in a linked list instead of the last. Update pco_last_igrp to
return the last entry in a linked list.
The following CTS tests now pass:
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_2_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_dynamic_fragment
dEQP-GLES3.functional.shaders.switch.conditional_fall_through_uniform_fragment
Fixes: 719ece42c0 ("pco: Switch back to util/list")
Signed-off-by: Duncan Brawley <duncan.brawley@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41125>
With SPV_KHR_constant_data, specializing arrays of constants is
allowed.
RustiCL changes are from Karol Herbst <kherbst@redhat.com>.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41046>
Previously the output length of pvr_get_driver_build_sha() was changed to
BUILD_ID_EXPECTED_HASH_LENGTH, but the array defined to receive its
output, the driver_build_sha array inside struct pvr_instance, is
declared with BLAKE3_KEY_LEN, which is longer than
BUILD_ID_EXPECTED_HASH_LENGTH.
This leads to uninitialized memory being accessed when creating the
pipelineCacheUUID value, so the pipelineCacheUUID value is random on
each run, defeating its purpose.
Refactor the code copying the build ID to follow other drivers: change
the parameter from the buffer to the instance pointer, add a static
assert (possible now that the destination buffer length can be
retrieved from the array inside the instance structure) and use
copy_build_id_to_sha1() to do the final copy.
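
The idea of checking the destination size at compile time can be
sketched as below. Everything here is hypothetical: the `EXAMPLE_*`
sizes are illustrative, `example_instance` only mirrors the shape of the
real structure, and zero-filling the tail is an assumption of this
sketch, not a statement about what copy_build_id_to_sha1() does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sizes only; the real values come from the Mesa headers. */
#define EXAMPLE_HASH_LENGTH       20
#define EXAMPLE_SHA_BUFFER_LENGTH 32

/* Hypothetical instance with a destination array longer than the hash. */
struct example_instance {
   uint8_t driver_build_sha[EXAMPLE_SHA_BUFFER_LENGTH];
};

static void example_set_build_sha(struct example_instance *instance,
                                  const uint8_t hash[EXAMPLE_HASH_LENGTH])
{
   /* Taking the instance lets the destination size be checked here. */
   static_assert(sizeof(instance->driver_build_sha) >= EXAMPLE_HASH_LENGTH,
                 "destination must be able to hold the build ID hash");

   /* Assumption: clear the whole array so no byte is left undefined. */
   memset(instance->driver_build_sha, 0, sizeof(instance->driver_build_sha));
   memcpy(instance->driver_build_sha, hash, EXAMPLE_HASH_LENGTH);
}
```

Every byte of the destination array then has a defined value, so
anything derived from it is stable across runs.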
Fixes: 6a42493c94 ("pvr: Use BUILD_ID_EXPECTED_HASH_LENGTH")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40673>
When recording secondary command buffers with occlusion queries, the
get_vis_results flag could be set for some graphics sub_cmd's job.
Propagate this flag from secondary command buffer graphics sub_cmds to
primary command buffer sub_cmds to ensure that occlusion queries in
secondary command buffers are executed correctly.
Fixes: 5c34be4340 ("pvr: Process secondary buffer queries in vkCmdExecuteCommands.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
There's a dynarray field inside gfx sub_cmd called sub_query_indices,
which will contain pending query indices for gfx sub_cmds inside a
secondary command buffer. It's expected that when finishing such gfx
sub_cmds, the content of query_indices is going to be moved there.
However, the `util_dynarray_append_dynarray()` call is made with the
wrong parameter order, so it copies sub_query_indices to query_indices
and then immediately wipes query_indices, losing all query indices
in such cases.
Fix the `util_dynarray_append_dynarray()` call to fix occlusion queries
in secondary command buffers.
Fixes: 8c506c4b03 ("pvr: Use util_dynarray_append_dynarray()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
The last sub_cmd in the command buffer could be a graphics one, and when
ending a graphics sub_cmd, the query_indices array is checked to
know whether an occlusion query starts during this graphics sub_cmd.
Finalize the query_indices array after ending the last sub_cmd,
otherwise the check for query initiation may give a false negative
result.
Fixes the
`dEQP-VK.renderpasses.dynamic_rendering.primary_cmd_buff.random.seed6`
test case.
Fixes: 2b1992a000 ("pvr: Implement vkCmdBeginQuery API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40854>
Partial revert of a22ad99bdd ("pvr: set device features/props/extensions to
Vulkan 1.0 minimums (unless implemented)"), as this optional feature is fully
implemented already.
Tested with:
dEQP-VK.*wide*
dEQP-VK.dynamic_state.monolithic.line_width.*
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40890>
Partial revert of a22ad99bdd ("pvr: set device features/props/extensions to
Vulkan 1.0 minimums (unless implemented)"), as this optional feature is fully
implemented already.
Tested with:
dEQP-VK.*depth_bias*
dEQP-VK.*bias_clamp*
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40890>
Partial revert of a22ad99bdd ("pvr: set device features/props/extensions to
Vulkan 1.0 minimums (unless implemented)"), as this optional feature is fully
implemented already.
Tested with:
dEQP-VK.draw.*_multi_draw
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40890>
Partial revert of a22ad99bdd ("pvr: set device features/props/extensions to
Vulkan 1.0 minimums (unless implemented)"), as this optional feature is fully
implemented already.
It also turns out that Vulkan CTS was already testing this feature even though
it wasn't being advertised as supported,
dEQP-VK.draw.renderpass.indexed_draw.draw_instanced_indexed_triangle_list being
an example of this.
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40890>
Implements the ops without needing the NIR lowering.
The sum and carry parts can later be combined into a single instruction.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Tested-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40607>
All B-series Rogue cores seem to have USC rounding mode as RTE instead
of RTZ.
Set the has_usc_alu_roundingmode_rne feature flag for them (currently
only BXS-4-64 has it set).
Verified via testing on BXM-4-64 (36.52.104.182) by fixing CTS tests
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp32.input_args.* ,
and via proprietary driver vulkaninfo result on BXE-2-32 (36.29.52.182),
BXE-4-32 (36.50.54.182) and BXM-4-64 (36.56.104.183) (checking
shaderRoundingModeRT?Float32 properties).
Fixes: 1db1038a61 ("pvr: add device info for BXM-4-64 (36.56.104.183)")
Fixes: e60e0c96ba ("pvr: add device info for BXE-2-32 (36.29.52.182)")
Fixes: 2743363a57 ("pvr: add device info for BXM-4-64 (36.52.104.182)")
Fixes: ea28791d40 ("pvr: add device info for BXE-4-32 (36.50.54.182)")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40794>
pvr_clear_vdm_state_get_size_in_dw() wrongly assumes that instance count
inputs are needed when doing an RTA clear on cores without the
gs_rta_support feature. However, the instance ID is used to output
the target layer ID, which isn't supported at all on cores without that
feature, so it appears that the condition is inverted. In addition, the
pvr_pack_clear_vdm_state() function seems to have similar logic for
deciding whether to emit instance_count, and for the part checking the
gs_rta_support feature that logic is the opposite of the logic in
pvr_clear_vdm_state_get_size_in_dw().
Invert the condition to take instance ID inputs for cores with the
gs_rta_support feature instead of those without this feature.
Fixes: b59eb30e88 ("pvr: Fix cs corruption in pvr_pack_clear_vdm_state()")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40831>
The dirty state of stencil ops is not checked when deciding whether to
rebuild the ISP state, although the values are part of the ISP state
(bits 27:16 of the ISPB word).
Add MESA_VK_DYNAMIC_DS_STENCIL_OP to the condition for rebuilding ISP
control registers.
Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.fragment_ops.stencil.zero_stencil_fail
Fixes: 88f1fad3f7 ("pvr: Use common pipeline & dynamic state frameworks")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40623>
When running GLES2 conformance tests with Zink on the PowerVR driver, I
found that the PowerVR driver shares with Apple AGX the same kind of odd
behavior of not ignoring the wrap mode for seamless cubes (see !21978
for the description of the quirk on AGX).
As GLES2 exposes non-seamless cubes, exposing non-seamless cube support
in the PowerVR driver seems to help a lot with these GLES2 tests.
Implementing full GLES 3 and relying on the workaround for AGX is
another option, but that is still a long way off.
Implementing non-seamless cubes seems to be as easy as setting a bit in
the sampler control word, so do it.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40660>
The DDMADT instruction of PDS has out-of-bound test capability, which is
used for implementation of robust vertex input fetch.
According to the pseudocode in the comment block before the "LAST DDMAD"
mark in pvr_pipeline_pds.c, the check is between
`calculated_source_address + (burst_size << 2)` and `base_address +
buffer_size`, in which `burst_size` seems to correspond to the BSIZE
field set in the low 32 bits of DDMAD(T) src3 and `buffer_size`
corresponds to the MSIZE field set in the DDMADT-specific high 32 bits
of src3. As the calculated source address is just the base address plus
the multiplication result (the offset), the base address can be
eliminated from the check, resulting in a check between
`offset + (BSIZE * 4)` and `MSIZE`.
Naturally it's expected to just set the MSIZE field to the buffer size.
In addition, as the Vulkan spec says "Reads from a vertex input MAY
instead be bounds checked against a range rounded down to the nearest
multiple of the stride of its binding", the driver rounds down the
accessible buffer size before setting MSIZE to it.
However, when running the OpenGL ES 2.0 CTS, two problems show up with
the size that is checked against:
- dEQP-GLES2.functional.buffer.write.basic.array_stream_draw sets up a
  VBO with 3 bytes per vertex (RGB colors with 1B per color) and 340
  vertices, resulting in a buffer size of 1020 = 0x3fc. However, as the
  DMA request size, which is specified by BSIZE, is counted in dwords,
  3 bytes are rounded up to 1 dword (which is 4 bytes). When the bound
  check of the last vertex happens, the vertex's DMA start offset is
  0x3f9, so the DDMADT check happens between 0x3fd (0x3f9 + 1 * 4) and
  0x3fc, and indicates a check failure. This prevents the last vertex,
  which is perfectly in-bounds, from being properly fetched; this is
  against the Vulkan specification, and needs to be fixed.
- dEQP-GLES2.functional.vertex_arrays.single_attribute.strides.buffer_0_32_float2_vec4_dynamic_draw_quads_1
  sets up a VBO with a size of 168 bytes, and tries to draw 6 vertices
  (each vertex consumes 2 floats, thus 8 bytes, of attribute data) with
  a stride of 32 bytes using this VBO. Zink then translates the VBO to a
  Vulkan vertex buffer bound with size = 168B, stride = 32B. Here the
  current PowerVR driver applies the optional rule about rounding down
  the buffer size, and the checked bound is rounded down to 160B, which
  prevents the last vertex's 8B of attributes from being fetched. It
  looks like this kind of situation is considered in the codepath
  without DDMADT, but omitted in the codepath utilizing DDMADT for the
  bound check.
So this patch mimics the behavior of DDMADT when setting its MSIZE
field, to prevent false out-of-bounds results. It first calculates
the offset of the last valid vertex DMA, then adds the DMA request size
to it to form the final MSIZE value. As the code calculating the last
valid DMA offset accounts for fetching the attribute from the space
after the last whole multiple of the stride, both problems
mentioned above are solved by this rework.
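
The calculation can be sketched as follows, using the two cases above as
worked examples. This is an illustration under stated assumptions, not
the driver's code: the helper names are hypothetical, the attribute is
assumed to start at offset 0 within each vertex, the stride is assumed
to be non-zero, and the check is assumed to pass when
`offset + (BSIZE * 4)` is equal to `MSIZE`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* DMA requests are counted in dwords, so round the attribute size up. */
static uint32_t example_dma_size_in_bytes(uint32_t attrib_size)
{
   return ((attrib_size + 3u) / 4u) * 4u;
}

/* Bound (in bytes) to check against, or 0 if no vertex fits at all. */
static uint32_t example_ddmadt_msize(uint32_t buffer_size,
                                     uint32_t stride,
                                     uint32_t attrib_size)
{
   uint32_t vertex_count;

   assert(stride > 0 && attrib_size > 0);

   if (buffer_size < attrib_size)
      return 0;

   /* Vertices whose attribute lies fully inside the buffer, including
    * one fetched from the space after the last whole multiple of the
    * stride. */
   vertex_count = (buffer_size - attrib_size) / stride + 1;

   /* Offset of the last valid DMA plus the (rounded up) request size. */
   return (vertex_count - 1) * stride +
          example_dma_size_in_bytes(attrib_size);
}

/* Model of the simplified check: offset + (BSIZE * 4) against MSIZE. */
static bool example_ddmadt_in_bounds(uint32_t offset,
                                     uint32_t attrib_size,
                                     uint32_t msize)
{
   return offset + example_dma_size_in_bytes(attrib_size) <= msize;
}
```

For the first case (size 1020, stride 3, attribute size 3) this gives a
bound of 1021, so the last vertex at offset 0x3f9 passes while a fetch
at offset 1020 still fails. For the second case (size 168, stride 32,
attribute size 8) the bound is 168, so the sixth vertex at offset 160
passes.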
This change fixes 99 GLES CTS test cases, and the Vulkan CTS shows no
regressions in the `dEQP-VK.robustness.robustness1_vertex_access.*`
tests.
Fixes: 4873903b56 ("pvr: Enable PDS_DDMADT")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
This memory padding is enforced by GetBufferMemoryRequirements2 and
may later be checked against to decide whether it is sufficient.
Move it to pvr_buffer.h so that it can be used in further assertions.
Backport-to: 25.3
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
Currently the size of single components inside one attribute is saved
and checked against when checking DMA capability. However, the vertex
attribute DMA happens for a whole attribute instead of individually for
its components, so checking against the component size is useless -- the
size of the whole attribute is what needs to be saved and checked.
Rename all component_size_in_bytes fields to attrib_size_in_bytes, and
save the size of the whole attribute inside them.
Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
The ddmadt_oob_buffer_size structure to be filled is named
`obb_buffer_size`, which is obviously a typo.
Change to `oob_buffer_size` to fix the typo.
Fixes: 8991e64641 ("pvr: Add a Vulkan driver for Imagination Technologies PowerVR Rogue GPUs")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Ella Stanforth <ella@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40528>
Even if a linear image isn't created with usages declaring PBE writes,
the image might be exported and then re-imported with a usage that
allows rendering to it.
Always align the width of linear images so that they can be written by
the PBE.
This fixes WSI creating surfaces with an odd width, exporting them and
re-importing them for rendering.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40250>
Currently the display FD is opened twice because pvr_winsys_create() is
called twice. However, the WSI (which will do modeset on the
display FD in case of VK_KHR_display) is registered against the winsys
created at PhysicalDevice enumeration time, and the display FD opened at
Device creation time will only be used for allocating dumb buffers
(which does not require master privilege).
Add a parameter to pvr_winsys_create() to indicate whether the master
privilege is desired on the display FD, and pass true only when creating
the winsys for PhysicalDevice initialization.
Fixes VK_KHR_display operation on the PowerVR driver, which has been
broken since the WSI code started to drop master in commit 870e233ca5
("vulkan/wsi/display: Avoid holding drm master for the device's fd.").
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/15161
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40640>
The s0abs bit in the encoding of the fred instruction is wrongly set to
the status of the .neg modifier instead of the .abs modifier.
Fix this copy-and-paste error.
Fixes GLCTS tests when running on top of Zink:
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.4
dEQP-GLES2.functional.shaders.random.trigonometric.vertex.45
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.4
dEQP-GLES2.functional.shaders.random.trigonometric.fragment.45
Fixes: 8ec174b3f9 ("pco: add support for various selection, complex, trig ops")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40611>
In pvr_pipeline_pds.c, there's a pseudocode snippet describing the
behavior of the DDMADT PDS instruction, which seems to have been copied
from some internal document about PDS behavior.
However, the pseudocode isn't properly indented; in particular, some
brackets are misaligned. This hinders reading of the pseudocode
and may even mislead the reader.
Re-indent the pseudocode following rules similar to those used for C
code in the driver.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40533>
Move options were bitwise OR-ing values from the wrong enum, causing
undefined behaviour when the number of intrinsics changed.
Replace them with the values from the correct nir_move_options enum
that were previously working. (Further refinement of these is needed
after extensive testing.)
Fixes: f1b24267d2 ("pco: rework nir processing and passes")
Signed-off-by: Radu Costas <radu.costas@imgtec.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40568>