While testing HW without gs_rta_support it was raised that this
change had been made in error. After retesting with the change
reverted the listed tests still pass.
This reverts commit d68344bffe.
Backport-to: 26.0
Reported-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Signed-off-by: Nick Hamilton <nick.hamilton@imgtec.com>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40838>
When maxPerStageResources is less than 128, it must be at least the sum
of maxPerStageDescriptorUniformBuffers,
maxPerStageDescriptorStorageBuffers, maxPerStageDescriptorSampledImages,
maxPerStageDescriptorStorageImages,
maxPerStageDescriptorInputAttachments and maxColorAttachments.
As maxPerStageDescriptorStorageBuffers is previously increased, the
value of maxPerStageResources should be increased too.
This fixes regression on two limit validation tests in the Vulkan CTS --
dEQP-VK.info.device_properties and dEQP-VK.api.info.
vulkan1p2_limits_validation.general .
Fixes: 35f57a2739 ("pvr: increase value of maxPerStageDescriptorStorageBuffers")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41270>
According to ubsan shifting an int32_t by 31 bits to the left is undefined
behavior. So just declare it as uint32_t.
Backport-to: 26.1
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41252>
Otherwise we'd get tracepoints out of logical order, which doesn't
matter for perfetto at the moment, but would matter with future
perf warnings.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41102>
The last graphics job, which might write to the occlusion query result,
could still be running when vkCmdCopyQueryPoolResults is called.
Additionally wait for graphics jobs before copying the results.
Fixes: 24b1e3946c ("pvr: Add support to submit occlusion query sub cmds.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Simon Perretta <simon.perretta@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40884>
Apparently CP_CCHE_INVALIDATE is just a plain register write underneath,
so it needs WFI before it, in order to invalidate at the right point.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41266>
Apparently CP_CCHE_INVALIDATE is just a plain register write underneath,
so it needs WFI before it, in order to invalidate at the right point.
```
CP_CCHE_INVALIDATE:
mov $addr, 0x9881
mov $data, 0x1
waitin
mov $01, $data
```
Fixes misrendering in Doom Eternal on A750.
Fixes: fb1c3f7f5d ("tu: Implement CCHE invalidation")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41266>
The intention of the original commit was to make all the shaders report
the same max_dispatch_width. When CS has multiple variants, this was
not happening as expected.
Fixes: 2acc2f18ea ("intel/compiler: report max dispatch width statistic")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41209>
When reloading live-out values along loop back-edges, we make sure to
reuse the original register. However, we failed to detect cases where
the spilled value got reloaded earlier for a src in a different
register. Fix this by reloading the value again in the original
register.
Fixes a RA validation failure in Windrose.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: fa22b0901a ("ir3/ra: Add specialized shared register RA/spilling")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41262>
Expose VK_KHR_shader_integer_dot_product 4x8-bit packed dot
products using native HW instructions v8dot and setnnmode.
QPU instruction count for sdot_4x8_iadd compute shader:
Before (scalar decomposition): 18 ALU cycles
After (setnnmode + v8dot): 3 ALU cycles (6x)
We advertise integerDotProduct4x8BitPacked*Accelerated for V3D 7.1+
Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
This new VIR optimization pass tracks the current NN signedness
mode per block and removes duplicate setnnmode instructions.
When consecutive dot products use the same signedness mode, the backend
emits one setnnmode per dot product. This pass removes the redundant
ones, keeping only the first.
Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
As nnmode register is read by v8dot instruction we need to add dependencies
between setnnmode instructions and v8dot via the nnmode register, so they
are scheduled correcty using last_nn_mode virtual register..
Add a last_nn_mode virtual register to the scheduler state and create:
- Write dependencies for all SETNNMODE variants
- Read dependencies for V8DOT.
This follows the same pattern as the existing MULTOP/UMUL24 rtop tracking.
Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
VIR instructions and nir_to_vir implementation of 4x8-bit dot products
using native HW accelerated ALU instructions.
setnnmode instructions are marked as having side effects.
Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
Add QPU instruction definitions, metadata, and encoding for V3D 7.1
v8dot product instruction and the setnnmode instruction that allows
defining the signedness (UU/SU/US/SS) of the v8dot operation.
Assisted-by: Claude Opus 4.6
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41255>
The IEEE754-2019 standard declaring the preceding sign "optional" when
converting NaN values to strings because the standard tries to not
regulate how sign bits in NaNs are interpreted.
In the real world, when using printf-series function to print a number
with type `float` on RISC-V, the sign of NaNs is wiped during the
conversion from `float` to `double` (defined as part of the default
argument promotions rule for variable arguments in the C spec).
Change the code to stop relying on isa_print() to print the negative
sign, instead parse it from the highest bit of value and manually print
it before "nan" string.
This fixes the `etnaviv_isa_disasm` unit test on RISC-V.
Suggested-by: Christian Gmeiner <cgmeiner@igalia.com>
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40887>
Entry points must be wrapped in the PVR_PER_ARCH macro else there
will be multiple definitions of the same symbol.
Fixes: dfddb3fe ("pvr: Add support for VK_KHR_pipeline_executable_properties")
Signed-off-by: Luigi Santivetti <luigi.santivetti@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41238>
When calling vkResetQueryPool() or vkCmdCopyQueryPoolResults() with a
queryCount of 0, currently a query compute program with workgroup size
0*1*1 will be emited, which is ridiculous and will be rejected by some
assertion in pvr_compute_generate_control_stream() .
As the operation should be noop when queryCount is 0, the functions can
and should just return in such cases.
Fixes: 0aa9f32b95 ("pvr: Implement vkCmdResetQueryPool API.")
Fixes: b6e8e1cf37 ("pvr: Implement vkCmdCopyQueryPoolResults API.")
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Reviewed-by: Nick Hamilton <nick.hamilton@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40911>