Currently, the information about the performance counters is duplicated
both in the kernel and in user space. Naturally, this leads to
inconsistency, as the user space might be updated and while the kernel
isn't.
Aiming to turn the kernel as the "single source of truth", use
DRM_IOCTL_V3D_GET_COUNTER, when available, to get performance counter
information.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29154>
Currently, the information about the performance counters is duplicated
both in the kernel and in user space. Naturally, this leads to
inconsistency, as the user space might be updated and the kernel isn't.
Aiming to turn the kernel as the "single source of truth", use
DRM_IOCTL_V3D_GET_COUNTER, when available, to get the performance
counter information.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29154>
Now, the kernel has the ability to inform about the maximum number of
performance counters of a V3D device. Let's add this information to the
`struct v3d_device_info` to use it when performing performance queries.
From now on, V3D_PERFCNT_NUM must not be used to retrieve the maximum
number of performance counters. We must use `devinfo->max_perfcnt`,
except on the case that the kernel doesn't support DRM_V3D_PARAM_MAX_PERF_COUNTERS.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29154>
The contents were previously copied with vk_copy_struct_guts(),
now that we use vk_set_physical_device_properties_struct, the sType
is needed.
Fixes: ("3c152a6e5dd venus: Use common physical device properties")
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29314>
If we have a loop with continue_or_break and no divergent exits, there is
no need for a loop header phi.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
This lets us use non-undef for the last operand, if necessary
(demonstrated in the test).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
These might be necessary if continue_or_break and divergent breaks are both used:
loop {
if (divergent) {
a = loop_invariant_sgpr
break
}
discard_if
}
b = phi a
If we break because discard_if makes exec empty but only did so in
previous iterations, then the phi should use "a" from those previous
iterations.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29121>
These new stats report the combined inputs and outputs of
graphics stages stages where applicable.
Task -> Mesh payload is not included.
This is useful for reporting the effects of any shader
optimizations which affect linking.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29209>
Not needed by actual driver functionality, but will be
used for reporting I/O stats.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29209>
To be closer to the RadeonSI helper. This doesn't change anything for
RADV because images are required to be bound at image view creation.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29286>
New pass that shares code with nir_opt_load_store_vectorize but
it only updates the alignment of load/store instructions.
It is useful before running other passes which may
potentially destroy that information (eg. by removing some
instructions from which the alignment may be deduced).
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29210>
When we start writing to an XFB buffer we need to synchronize with
any batches reading from it (because the data they need is about
to be overwritten). Do this by introducing a barrier in csf_launch_xfb.
This patch fixes a valhall failure in
KHR-GLES31.core.vertex_attrib_binding.advanced-iterations
Cc: mesa-stable
Signed-off-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29092>
The driver nowadays requires hardware version >= 4.2, but in the old
days it managed older versions.
Remove some leftovers remaining in the code.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29299>
So far we were using raw uint32_t for handling masks. But this has the
issue that it only allows to handle up to 32 elements; if we need to
handle more elements, the we need to upgrade to uint64_t.
And this happened inadvertently with commit 370f02bf02 ("gallium: Bump
PIPE_MAX_SHADER_IMAGES to 64"), where the number of elements to handled
were increased from 32 to 64, but we didn't upgrade the mask type.
To fix this, and avoid this happening again in the future, let's use
BITSET, which is designed to handle bitmasks, and can able to handle as
many elements as desired.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29299>
See 8af8dc97bc ("tc: do a GPU->CPU copy to initialize cpu_storage")
On zink, initializing cpu_storage that requires a GPU->CPU copy is
particularly expensive; in addition to a sync, the buffer_map call to
copy the GPU data submits a batch and has to wait for that batch.
Take the PIPE_USAGE_STREAM hint and disable using cpu_storage on
resources that "will be modified once and used at most a few times"
where the benefit of cpu_storage is outweighed by the heavy init time.
Signed-off-by: Juston Li <justonli@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29294>
They have been lowered in nir
v2: Keep the _amd versions
v3: Fix if's with removed ops
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29280>