We'll get three new opcodes to properly model float multiply-add.
ffma_old is temporary and will be deleted at the end of this series.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41165>
Most of the work for maintenance5 is already done in the common Vulkan
runtime. Now that the functions it requires are implemented in the pvr
driver and the blitting functions use the common helpers for acquiring
the subresource layer count, advertise VK_KHR_maintenance5.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
It's implemented like GetDeviceImageMemoryRequirements by creating a
temporary image.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
As the functionality of GetImageSubresourceLayout was already a
dedicated function, simply changing the parameters for the function call
is enough to implement GetImageSubresourceLayout2.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
As the implementation of GetRenderAreaGranularity does not reference the
render pass object, just move it to a new function and use it for both
functions.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
As index buffer bound checking isn't yet supported by the driver, the
newly added size parameter is just assert-checked.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
With VK_KHR_maintenance5, the layerCount member of
VkImageSubresourceLayers can be VK_REMAINING_ARRAY_LAYERS.
Change any direct access to this member to the common runtime function
vk_image_subresource_layer_count, which can handle the maint5 case.
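A minimal sketch of how such a helper can resolve the layer count. The
example_* names and structs are hypothetical stand-ins, not the runtime's
actual definitions; only the ~0U sentinel value matches
VK_REMAINING_ARRAY_LAYERS.

```c
#include <stdint.h>

/* Stand-in for VK_REMAINING_ARRAY_LAYERS, which Vulkan defines as ~0U. */
#define EXAMPLE_REMAINING_ARRAY_LAYERS (~0U)

/* Hypothetical, trimmed-down image and subresource structs. */
struct example_image {
   uint32_t array_layers;
};

struct example_subresource_layers {
   uint32_t base_array_layer;
   uint32_t layer_count;
};

/* Resolve the effective layer count, expanding the "remaining" sentinel. */
static uint32_t
example_subresource_layer_count(const struct example_image *image,
                                const struct example_subresource_layers *range)
{
   if (range->layer_count == EXAMPLE_REMAINING_ARRAY_LAYERS)
      return image->array_layers - range->base_array_layer;

   return range->layer_count;
}
```

Reading layer_count directly would treat the sentinel as a count of
roughly 4 billion layers, which is why every direct access needs to go
through a helper of this shape.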
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
VK_KHR_maintenance5 requires the driver to accept values beyond the
defined enumerants for physical-device-level functions. Although the
VkImageType enumeration hasn't received any new enumerants since
VK_VERSION_1_0, it's possible that it gets extended in the future.
Change the code to return a VK_ERROR_FORMAT_NOT_SUPPORTED instead of
using UNREACHABLE (which triggers an assert).
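The pattern can be sketched as below. The enums and the function are
hypothetical stand-ins for VkImageType, VkResult and the driver code,
used only to show the default case returning an error.

```c
/* Hypothetical stand-ins for the VkResult and VkImageType values involved. */
enum example_result {
   EXAMPLE_SUCCESS,
   EXAMPLE_ERROR_FORMAT_NOT_SUPPORTED,
};

enum example_image_type {
   EXAMPLE_IMAGE_TYPE_1D,
   EXAMPLE_IMAGE_TYPE_2D,
   EXAMPLE_IMAGE_TYPE_3D,
};

/* Takes a plain int so that values outside the enum can be passed safely. */
static enum example_result
example_image_type_dims(int image_type, unsigned *dims_out)
{
   switch (image_type) {
   case EXAMPLE_IMAGE_TYPE_1D:
      *dims_out = 1;
      return EXAMPLE_SUCCESS;
   case EXAMPLE_IMAGE_TYPE_2D:
      *dims_out = 2;
      return EXAMPLE_SUCCESS;
   case EXAMPLE_IMAGE_TYPE_3D:
      *dims_out = 3;
      return EXAMPLE_SUCCESS;
   default:
      /* Unknown enumerant: report it to the caller instead of asserting. */
      return EXAMPLE_ERROR_FORMAT_NOT_SUPPORTED;
   }
}
```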
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41569>
Global I/O intrinsics don't have an index offset, and can't directly be
mapped to descriptors.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41545>
Spills shared memory based on a fixed threshold, currently set to 75%.
This is to account for other usage of the common store.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41545>
These tests were fixed by 68cb76de5d ("pco: Fix encoding of branch to an empty
block").
Fixes: ef860bcaa1 ("pvr/ci: Add dEQP-VK testing for BXS-4-64 on TI AM68 SK")
Signed-off-by: Frank Binns <frank.binns@imgtec.com>
Acked-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41544>
The fix for this test was merged between the start and the merging of
the Vulkan CTS uprev MR.
Remove it from the fails list, as it is already fixed.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41571>
I've pulled in a pile of changes to reduce the overhead (runtime and
memory) when sharding for deqp-runner, along with a bunch of fixes for
KHR_display testing that we recently enabled, plus a few others that
affect our drivers.
The big new set of failures looks like it's from more complete coverage of
blitting between formats.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41243>
Suspending render pass jobs have more to preserve than render targets,
e.g. occlusion query related information and atomic / compute overlap
enablement information.
Preserve these too when suspending. When resuming, OR the boolean
properties together and assign the other preserved values. This ensures
the last resuming fragment job is compatible with all suspending
geometry jobs, as the fragment job is omitted for suspending render
passes.
The case where the suspending render pass and the resuming render pass
have different query pools is still not supported, and would be quite
difficult to support.
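The resume-time merge rule can be sketched as follows. The struct and
its field names are hypothetical, chosen only to show one boolean of
each kind mentioned above plus one non-boolean value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the state carried over from a suspending pass. */
struct example_suspended_state {
   bool uses_occlusion_query;
   bool uses_atomic_overlap;
   uint32_t other_setting;
};

/* Merge the preserved state into the job of the resuming render pass. */
static void
example_merge_on_resume(struct example_suspended_state *job,
                        const struct example_suspended_state *saved)
{
   /* Booleans are OR'ed: once any pass needed it, the job needs it. */
   job->uses_occlusion_query |= saved->uses_occlusion_query;
   job->uses_atomic_overlap |= saved->uses_atomic_overlap;

   /* Everything else is simply taken from the preserved state. */
   job->other_setting = saved->other_setting;
}
```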
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
As more than render target data needs to be kept for suspending
render passes, add a structure to organize it.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
As we're going to kick frag for suspending render passes to mitigate
frag job inconsistency between suspending render passes and resuming
render passes, deriving render target datasets based on the
geometry_terminate property will be incorrect.
Stop using geometry_terminate to decide whether to remember render
target datasets; use is_suspend directly instead.
In addition, is_resume is now used instead of checking whether
suspended render target datasets are available. This will help when
either the suspending render pass or the resuming render pass has
multiple graphics sub_cmds.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
When executing a secondary command buffer outside a renderpass, the
sub_cmds of that secondary command buffer are simply copied into the
primary command buffer. However, the 4 flags outside the type-specific
structures are not copied. Although the owned flag is intentionally set
to false, the other 3 flags should be preserved.
Copy these 3 flags when executing sub_cmds of a secondary command buffer
outside renderpasses.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41002>
The attachments field of the render pass state could be
MESA_VK_RP_ATTACHMENT_INFO_INVALID, which indicates that no attachment
information is valid. If this happens when initializing the fragment
state of a pipeline, neither a render pass nor a
VkPipelineRenderingCreateInfo structure is available; in this case,
the specification for that structure says colorAttachmentCount is
considered to be 0, so the loop iterating color attachments should not
run at all.
Skip iterating color attachments if the render pass has an attachments
field with value MESA_VK_RP_ATTACHMENT_INFO_INVALID.
This fixes a regression in the Vulkan CTS testcase
dEQP-VK.pipeline.monolithic.misc.no_rendering introduced by !40870, in
which MESA_VK_RP_ATTACHMENT_INFO_INVALID instead of 0 is set as the value
of the attachments field of the render pass state if neither a render
pass nor the VkPipelineRenderingCreateInfo structure is available.
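A sketch of the guard, under the assumption that the attachments field
is a bitmask with a reserved sentinel. The example_* names and the
sentinel value are hypothetical and do not mirror the runtime's
definitions.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no attachment information is valid". */
#define EXAMPLE_RP_ATTACHMENT_INFO_INVALID 0xffffu
#define EXAMPLE_RP_ATTACHMENT_COLOR_BIT(n) (1u << (n))

struct example_rp_state {
   uint32_t attachments;
   uint32_t color_attachment_count;
};

/* Count the color attachments the fragment state should be set up for. */
static uint32_t
example_count_color_attachments(const struct example_rp_state *rp)
{
   /* With the sentinel, colorAttachmentCount is treated as 0. */
   if (rp->attachments == EXAMPLE_RP_ATTACHMENT_INFO_INVALID)
      return 0;

   uint32_t used = 0;
   for (uint32_t i = 0; i < rp->color_attachment_count; i++) {
      if (rp->attachments & EXAMPLE_RP_ATTACHMENT_COLOR_BIT(i))
         used++;
   }

   return used;
}
```

Without the early return, the sentinel in this sketch would look like a
mask with every color bit set.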
Fixes: 1950b6c1a7 ("vulkan: mark RP attachments as invalid when no rendering create info")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41032>
Add a new helper function pvr_pbe_format_num_sample_components that maps a
pvr_transfer_pbe_pixel_src format to the number of components it actually
uses. Use pvr_pbe_format_num_sample_components in pvr_uscgen_tq_frag_load
to set params.sample_components before calling pco_emit_nir_smp, so the
instruction is emitted with the correct component count. This allows the
generation of a more optimal SMP instruction, avoiding the emission of
unused result components.
Signed-off-by: Caius Moldovan <caius.moldovan@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41393>
The common Mesa Vulkan WSI code checks some DRI options.
Add them to the option list of the PVR driver.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41197>
The DRI options list is formatted specially and clang-format cannot
handle it properly.
Disable clang-format for this snippet.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41197>
Add force_vk_vendor as the first option; force_vk_devicename will be
added later.
Signed-off-by: hmtheboy154 <buingoc67@gmail.com>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
[Icenowy: rebased on top of main]
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41197>
It's possible to use a shader that has ViewIndex input when multiview
isn't enabled. According to the Vulkan specification, when multiview
isn't enabled in a renderpass, the value of the ViewIndex input should
be 0.
However, the driver currently does not emit execution of the PDS code
that sets up the view index, which leaves a stale value in ViewIndex.
Set up the PDS code for setting the view index and emit the command
stream for executing that PDS code when the shader wants ViewIndex, even
if multiview isn't enabled.
Fixes: 9d48088428 ("pvr: add view index support for vertex shaders")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40972>
sed -i "s/nir_src_parent_instr/nir_src_use_instr/" `find ./ -type f`
sed -i "s/nir_src_parent_if/nir_src_use_if/" `find ./ -type f`
sed -i "s/nir_src_set_parent/nir_src_set_use/" `find ./ -type f`
There are two kinds of "parent" in relation to a src/def:
- the instruction where the def or src's def is defined
- the instruction which the src is a part of and where the def is used
Clarify that the parent here is where the src's def is used, not where
it's defined.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41344>
The dynamic rendering codepath allows binding an attachment with a
depth+stencil format with only depth or stencil active. The
corresponding test should be disabled in that case.
Ignore the attachment's depth or stencil according to the rendering
attachment info's is_depth and is_stencil variables.
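A minimal sketch of the resulting enable logic. Only is_depth and
is_stencil come from the description above; the struct, the format_has_*
fields and the function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical view of a bound depth/stencil rendering attachment. */
struct example_ds_attachment {
   bool format_has_depth;
   bool format_has_stencil;
   bool is_depth;   /* depth aspect is actually bound */
   bool is_stencil; /* stencil aspect is actually bound */
};

/* The depth test only runs if the depth aspect is really active. */
static bool
example_depth_test_enabled(const struct example_ds_attachment *att,
                           bool requested)
{
   return requested && att->format_has_depth && att->is_depth;
}

/* Likewise for the stencil test and the stencil aspect. */
static bool
example_stencil_test_enabled(const struct example_ds_attachment *att,
                             bool requested)
{
   return requested && att->format_has_stencil && att->is_stencil;
}
```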
Fixes the following CTS testcases:
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.static_enable.d24_unorm_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.static_enable.d32_sfloat_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.dynamic_enable.d24_unorm_s8_uint
dEQP-VK.pipeline.monolithic.stencil.no_stencil_att.dynamic_rendering.dynamic_enable.d32_sfloat_s8_uint
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41054>
Currently the code handling deferred RTA clears cannot handle them for
secondary command buffers within render passes, because the code
immediately configures the transfer command for the deferred clear
operation, but the specific attachment image view isn't known when
recording secondary command buffers to be executed inside render passes.
Add code to record parameters for deferred RTA clears in secondary
command buffers when the attachment is unknown, and bind the recorded
clears to the attachment's image view when executing the secondary
command buffer inside a render pass.
Fixes many dynamic rendering random tests.
Backport-to: 26.0
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
The code that adds deferred RTA clear transfer commands checks whether
the newly allocated transfer command is NULL. However the list_addtail
call is before the check, which means that the check does not prevent
NULL dereference.
Reorder the code to ensure no NULL transfer commands would ever be added
to the deferred clear list.
In addition, pvr_transfer_cmd_alloc() has already set the command
buffer's error status when it returns NULL, so it's not needed to set it
again.
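The ordering can be sketched with a simplified singly linked list. This
is not the driver's list implementation, and all example_* names are
hypothetical; the point is only that the NULL check comes before the
command is linked in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for a transfer command and the deferred clear list. */
struct example_cmd {
   struct example_cmd *next;
};

struct example_list {
   struct example_cmd *head;
   struct example_cmd *tail;
   unsigned length;
};

/* cmd is the result of an allocation that may have failed. */
static bool
example_add_deferred_clear(struct example_list *list, struct example_cmd *cmd)
{
   /* Check first: a failed allocation must never reach the list. */
   if (!cmd)
      return false;

   cmd->next = NULL;
   if (list->tail)
      list->tail->next = cmd;
   else
      list->head = cmd;
   list->tail = cmd;
   list->length++;

   return true;
}
```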
Fixes: 2eabbbe57d ("pvr: use linked list to back deferred clears")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
For 2D array views of 3D images, the layer of the view corresponds to
the depth (instead of the layer, which should be always 0) of the image.
Fix the code emitting deferred RTA clears to set the depth instead of
the layer of the image to clear.
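A sketch of the mapping, using a hypothetical region struct and helper
rather than the driver's transfer command layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of which slice of the image gets cleared. */
struct example_clear_region {
   uint32_t base_layer;
   uint32_t depth_offset;
};

/* Translate a layer of the view into a layer or depth slice of the image. */
static struct example_clear_region
example_region_for_view_layer(bool image_is_3d, uint32_t view_layer)
{
   struct example_clear_region region = { 0, 0 };

   if (image_is_3d)
      region.depth_offset = view_layer; /* image layer stays 0 */
   else
      region.base_layer = view_layer;

   return region;
}
```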
Fixes the flakiness of `dEQP-VK.renderpasses.renderpass*.remaining_array_layers.multi_layer_fb.*`.
Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
A deferred RTA clear happens after the current graphics subcommand is
executed, so it may overwrite the image rendered by that subcommand.
In addition, the active render targets do not need an "emulated" clear,
as they can be cleared for real by drawing rectangles.
Skip setting up deferred RTA clears for active render target layers, and
continue to do immediate clears for these layers.
Fixes a few dynamic rendering random CTS tests, but the issue should
also exist in legacy renderpass RTA clears.
Fixes: 95820584d0 ("pvr: Add deferred RTA clears for cores without gs_rta_support.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
While testing HW without gs_rta_support it was raised that this
change had been made in error. After retesting with the change
reverted the listed tests still pass.
This reverts commit d68344bffe.
Backport-to: 26.0
Reported-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
When maxPerStageResources is less than 128, it must be at least the sum
of maxPerStageDescriptorUniformBuffers,
maxPerStageDescriptorStorageBuffers, maxPerStageDescriptorSampledImages,
maxPerStageDescriptorStorageImages,
maxPerStageDescriptorInputAttachments and maxColorAttachments.
As maxPerStageDescriptorStorageBuffers was previously increased, the
value of maxPerStageResources should be increased too.
This fixes a regression in two limit validation tests in the Vulkan CTS:
dEQP-VK.info.device_properties
dEQP-VK.api.info.vulkan1p2_limits_validation.general
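The rule can be written as a small check. The struct and function are
hypothetical, and the numbers used with them are illustrative only, not
the driver's actual limits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the limits that feed into the rule. */
struct example_limits {
   uint32_t uniform_buffers;   /* maxPerStageDescriptorUniformBuffers */
   uint32_t storage_buffers;   /* maxPerStageDescriptorStorageBuffers */
   uint32_t sampled_images;    /* maxPerStageDescriptorSampledImages */
   uint32_t storage_images;    /* maxPerStageDescriptorStorageImages */
   uint32_t input_attachments; /* maxPerStageDescriptorInputAttachments */
   uint32_t color_attachments; /* maxColorAttachments */
};

/* A value below 128 is only valid if it covers the sum of the limits. */
static bool
example_per_stage_resources_valid(const struct example_limits *l,
                                  uint32_t max_per_stage_resources)
{
   const uint32_t sum = l->uniform_buffers + l->storage_buffers +
                        l->sampled_images + l->storage_images +
                        l->input_attachments + l->color_attachments;

   return max_per_stage_resources >= 128 || max_per_stage_resources >= sum;
}
```

Raising any one of the summed limits without also raising
maxPerStageResources can therefore turn a previously valid value into an
invalid one.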
Fixes: 35f57a2739 ("pvr: increase value of maxPerStageDescriptorStorageBuffers")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41270>
The last graphics job, which might write to the occlusion query result,
could still be running when vkCmdCopyQueryPoolResults is called.
Additionally wait for graphics jobs before copying the results.
Fixes: 24b1e3946c ("pvr: Add support to submit occlusion query sub cmds.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40884>
Entry points must be wrapped in the PVR_PER_ARCH macro else there
will be multiple definitions of the same symbol.
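A sketch of the idea behind per-architecture name mangling. The macros
below are hypothetical and take the architecture as an explicit
argument; the driver's PVR_PER_ARCH definition is not shown here and may
differ.

```c
/* Hypothetical token-pasting helpers; the extra level expands arguments. */
#define EXAMPLE_CONCAT_(a, b) a##_##b
#define EXAMPLE_CONCAT(a, b) EXAMPLE_CONCAT_(a, b)
#define EXAMPLE_PER_ARCH(arch, name) EXAMPLE_CONCAT(arch, name)

/* Expands to archa_get_value and archb_get_value: two distinct symbols. */
static int EXAMPLE_PER_ARCH(archa, get_value)(void)
{
   return 1;
}

static int EXAMPLE_PER_ARCH(archb, get_value)(void)
{
   return 2;
}
```

If both functions were instead named plain get_value, building the same
source once per architecture would define that symbol twice.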
Fixes: dfddb3fe ("pvr: Add support for VK_KHR_pipeline_executable_properties")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41238>
When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, a query compute program with workgroup size 0*1*1 is
currently emitted, which is meaningless and is rejected by an assertion
in pvr_compute_generate_control_stream().
As the operation should be a no-op when queryCount is 0, the functions
can and should just return in such cases.
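A sketch of the early return. The command buffer struct and function are
hypothetical, and the assumption that the workgroup size in x equals the
query count is made only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of what has been emitted into a command buffer. */
struct example_cmd_buffer {
   uint32_t dispatches;
   uint32_t last_workgroup_x;
};

/* Returns true if a compute program was emitted. */
static bool
example_emit_query_program(struct example_cmd_buffer *cb, uint32_t query_count)
{
   /* Nothing to do: never emit a 0*1*1 dispatch. */
   if (query_count == 0)
      return false;

   cb->last_workgroup_x = query_count;
   cb->dispatches++;

   return true;
}
```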
Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40911>
Enables the shaderImageGatherExtended feature and sets the
{min,max}TexelGatherOffset physical device properties.
The properties are queried via Zink and are expected to be non-zero.
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
Rather than always emitting and swizzling 16 components for raw samples,
scale it by the number actually needed as defined by the selected tg4
channel/components.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
The array index value is a signed integer, but the compiler was using
the unsigned version of the clamp helper function, meaning the value
was not being clamped to 0 when it was negative.
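The effect is easy to reproduce in plain C. The two helpers below are
hypothetical and are not the compiler's clamp helpers; they only show
how the same bit pattern clamps differently depending on signedness.

```c
#include <stdint.h>

/* Unsigned clamp: a negative input reinterpreted as unsigned is huge. */
static uint32_t
example_uclamp(uint32_t v, uint32_t lo, uint32_t hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}

/* Signed clamp: negative inputs compare below lo as intended. */
static int32_t
example_iclamp(int32_t v, int32_t lo, int32_t hi)
{
   return v < lo ? lo : (v > hi ? hi : v);
}
```

With 8 array layers, an index of -1 passed through the unsigned variant
ends up at layer 7, while the signed variant yields layer 0.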
Fixes the following dEQP test cases when shaderImageGatherExtended is enabled:
dEQP-VK.glsl.texture_gather.basic.2d_array.*
dEQP-VK.glsl.texture_gather.offset.*.2d_array.*
dEQP-VK.glsl.texture_gather.offset_dynamic.*.2d_array.*
dEQP-VK.glsl.texture_gather.offsets.*.2d_array.*
Fixes: 854563f0f8 ("pco: fully switch over to common smp emission code")
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>
Use lower_tg4_offsets to take care of explicit offsets, and just swizzle
the texels in the order defined by textureGather*.
Fixes: 46c9239c11 ("pvr, pco: initial texture gather support with gather sampler")
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40687>