_mesa_sha1_format has few remaining uses, so it's moved to build_id.c,
which is its last user.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
Ethos-U can support 16-bit tensors. So far the driver just assumed 8-bit
tensors.
There are a few cases where 32-bit tensors are supported, but exactly
which ones hasn't been determined, so just reject them for now.
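A minimal sketch of the width check described above. The helper name
tensor_bits_supported is hypothetical and is not taken from the driver;
only the accept/reject policy comes from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not the driver's real function): decides whether a
 * tensor element width, given in bits, is accepted for NPU offload. */
static bool
tensor_bits_supported(unsigned bits)
{
   /* Element widths are expected to be whole bytes. */
   assert(bits % 8 == 0);

   switch (bits) {
   case 8:
   case 16:
      return true;
   case 32:
      /* Supported by the hardware only in a few cases that have not been
       * determined yet, so reject for now. */
      return false;
   default:
      return false;
   }
}
```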
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40525>
with the unlowering pass, there is no longer a separate gl_LastFragData variable,
so this workaround just breaks color outputs
fixes dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40437>
Previously the driver decided when the backend should use
LD_VAR_BUF[_IMM] instructions based on the total number of varyings
read, falling back to LD_VAR[_IMM] + descriptors when the varying index
could overflow the immediate index in the instructions. That means that
even adding a single varying read could overflow the index and make
everything fall back to LD_VAR.
With this patch the backend decides when to use LD_VAR_BUF for each
varying load, reporting that decision to the driver. This helps with
index overflows because only the instructions that actually overflow the
immediate use the LD_VAR fallback, leaving all other instructions on the
fast path.
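The per-load decision can be sketched roughly as follows. All names and
the immediate limit (HYPOTHETICAL_MAX_IMM_INDEX) are assumptions made for
illustration; the real limit depends on the instruction encoding and is
not stated in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit for illustration only; not the real encoding limit. */
#define HYPOTHETICAL_MAX_IMM_INDEX 63u

enum varying_load_path {
   LOAD_LD_VAR_BUF, /* fast path, index fits the immediate */
   LOAD_LD_VAR,     /* fallback through descriptors */
};

/* Choose the path for a single varying load. */
static enum varying_load_path
choose_load_path(unsigned index)
{
   return index <= HYPOTHETICAL_MAX_IMM_INDEX ? LOAD_LD_VAR_BUF : LOAD_LD_VAR;
}

/* Decide every load independently and return how many loads need the
 * fallback, so the caller knows whether descriptors are required. */
static size_t
choose_load_paths(const unsigned *indices, size_t count,
                  enum varying_load_path *out)
{
   size_t fallbacks = 0;

   assert(indices != NULL && out != NULL);

   for (size_t i = 0; i < count; i++) {
      out[i] = choose_load_path(indices[i]);
      if (out[i] == LOAD_LD_VAR)
         fallbacks++;
   }
   return fallbacks;
}
```

With this shape, one overflowing index only moves that one load to the
fallback; the other loads keep using the fast path.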
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40515>
calc_blockdep always returned MAX_BLOCKDEP without checking if the
previous op writes to a buffer the current op reads from. This let
the NPU start reading before the previous write was done.
Add an overlap check between the previous OFM and the current IFM so we
set blockdep to 0 when they share the same buffer.
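A rough sketch of such an overlap check. The struct layout, the function
names and the value used for MAX_BLOCKDEP are assumptions for
illustration, not the driver's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BLOCKDEP 3 /* assumed value, for illustration only */

/* Hypothetical description of the memory a feature map occupies. */
struct buffer_range {
   unsigned region;  /* memory region index */
   uint64_t address; /* start offset inside the region */
   uint64_t size;    /* size in bytes */
};

/* True when both ranges are in the same region and their byte ranges
 * intersect. */
static bool
ranges_overlap(const struct buffer_range *a, const struct buffer_range *b)
{
   if (a->region != b->region)
      return false;
   return a->address < b->address + b->size &&
          b->address < a->address + a->size;
}

/* Return 0 (fully serialize) when the current op reads what the previous
 * op wrote; otherwise allow the maximum block dependency. */
static unsigned
calc_blockdep_sketch(const struct buffer_range *prev_ofm,
                     const struct buffer_range *cur_ifm)
{
   assert(cur_ifm != NULL);

   if (prev_ofm == NULL)
      return MAX_BLOCKDEP; /* no previous operation to wait for */

   return ranges_overlap(prev_ofm, cur_ifm) ? 0 : MAX_BLOCKDEP;
}
```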
Update ethos-imx93-fails.txt to remove the tests that now pass.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
Replace the two functions simplified_elementwise_add_sub_scale and
eltwise_emit_ofm_scaling with a single advanced_elementwise_add_sub_scale
that follows the ethos-u-vela naming. Remove the large block of
commented out Vela Python code.
No functional change.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
The upscale field was a bool which happened to work since true maps
to 1 which is NEAREST in the hardware. Change from bool to an enum
ethosu_upscale_mode so the intent is clear and we don't rely on the
bool-to-int mapping.
Also add a check in operation_supported so RESIZE only accepts 2x
upscaling since that's what the NPU can do with IFM_UPSCALE. Other
sizes fall back to CPU.
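As an illustration of the two changes above, a sketch under stated
assumptions: the enumerator names and the helper resize_is_2x_upscale are
hypothetical. Only the type name ethosu_upscale_mode and the fact that
NEAREST is 1 in the hardware come from this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Enumerator names are assumptions; NEAREST == 1 matches the hardware
 * value mentioned in the message. */
enum ethosu_upscale_mode {
   ETHOSU_UPSCALE_NONE = 0,
   ETHOSU_UPSCALE_NEAREST = 1,
};

/* Hypothetical support check: RESIZE is only offloaded when both
 * dimensions are scaled by exactly 2. */
static bool
resize_is_2x_upscale(unsigned in_w, unsigned in_h,
                     unsigned out_w, unsigned out_h)
{
   assert(in_w > 0 && in_h > 0);
   return out_w == 2 * in_w && out_h == 2 * in_h;
}
```

Spelling the mode out as an enum keeps the value written to the hardware
explicit instead of depending on true converting to 1.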
Keep the original zero_points from tensors in RESIZE and STRIDED_SLICE
instead of forcing them to 0 since the requantization needs them.
Fixes the RESIZE_NEAREST_NEIGHBOR operations in EfficientDet-Lite
models that use BiFPN with 2x nearest neighbor upsampling.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
fill_weights subtracted a single zero_point from all weights which
did not handle models with per-channel zero_points. Use the
per-channel zero_point for each output channel when available.
Also decouple the zero_points copy from the scales copy in the lower
pass so they are handled independently.
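A minimal sketch of the per-channel subtraction. The function name, the
weight layout ([output channel][weights per channel]) and the
"one entry means per-tensor" convention are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subtract the zero point from each uint8 weight.
 * num_zero_points == 1 means a single per-tensor zero point; otherwise
 * there must be one zero point per output channel. */
static void
fill_weights_sketch(const uint8_t *weights, int16_t *out,
                    size_t out_channels, size_t weights_per_channel,
                    const int32_t *zero_points, size_t num_zero_points)
{
   assert(num_zero_points == 1 || num_zero_points == out_channels);

   for (size_t oc = 0; oc < out_channels; oc++) {
      /* Pick the zero point for this output channel when available. */
      int32_t zp = num_zero_points > 1 ? zero_points[oc] : zero_points[0];

      for (size_t i = 0; i < weights_per_channel; i++) {
         size_t idx = oc * weights_per_channel + i;
         out[idx] = (int16_t)((int32_t)weights[idx] - zp);
      }
   }
}
```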
Suggested-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
This is for models whose coefficients have different quantization
parameters for each channel.
The NPU can handle per-channel scales as can be seen in
fill_scale_and_biases(), which already iterates per output channel.
Activation tensors (input/output) don't have per-channel quantization.
- Add scales/zero_points arrays to ethosu_kernel struct
- Copy per-channel scales from weight tensor in lower pass
- Use per-channel scale when computing conv_scale in coefs
- Allow per-channel quantization in operation_supported check
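The per-channel scale selection listed above could look roughly like
this. The function name is hypothetical, and the formula
ifm_scale * weight_scale / ofm_scale is the usual quantized-convolution
rescale, assumed here rather than taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Compute one convolution scale per output channel.
 * num_weight_scales == 1 means per-tensor quantization; otherwise there
 * must be one weight scale per output channel. */
static void
compute_conv_scales(double ifm_scale, double ofm_scale,
                    const double *weight_scales, size_t num_weight_scales,
                    size_t out_channels, double *conv_scales)
{
   assert(ofm_scale != 0.0);
   assert(num_weight_scales == 1 || num_weight_scales == out_channels);

   for (size_t oc = 0; oc < out_channels; oc++) {
      double w = num_weight_scales > 1 ? weight_scales[oc] : weight_scales[0];
      /* Activation scales stay per-tensor; only the weight scale varies. */
      conv_scales[oc] = ifm_scale * w / ofm_scale;
   }
}
```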
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39594>
This will surely lose performance in some cases, but it is a temporary
fix to align ourselves with how the Vulkan compiler works. We might be
able to use indirect varyings directly in the future depending on how we
handle their memory layout.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40391>
When the pipe_resource pointer returned by resource_create is NULL,
importing the handle into the underlying Vulkan driver is known to have
failed, and the import shouldn't continue.
Just return NULL in this case so that no later code has to check whether
pres is non-NULL.
This also fixes the missing non-NULL check for pres in the renderonly
code. The conversion of pipe_resource to zink_resource in the renderonly
codepath is removed as well, because a converted zink_resource is already
available above.
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40490>
This code path is usually used by lavapipe when importing dmabufs, not
for output.
The resulting size_required is then used to calculate the size
requirements for VkMemoryRequirements2 etc. Requiring a multiple of
LP_RASTER_BLOCK_SIZE (4) can eventually result in lavapipe rejecting
dmabuf imports.
An example is YUV420 at a resolution of 1680x1050 produced by GStreamer
1.28, e.g. from a screencast. In this case we currently compute a size
of 3235840, while other drivers like radv compute 3225600. The actual
size is 3227648, which satisfies the latter but not the former.
Removing the alignment brings lavapipe in line with other drivers.
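The two computed sizes can be reproduced with the arithmetic below. This
is a sketch under assumptions: a three-plane layout with a 2048-byte luma
stride and half-stride chroma planes is not stated in this message, but
it yields both figures. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a. */
static uint32_t
align_up(uint32_t v, uint32_t a)
{
   assert(a > 0);
   return (v + a - 1) / a * a;
}

/* Size of a 3-plane YUV420 image when each plane's row count is rounded
 * up to row_align. luma_stride is an assumed input. */
static uint64_t
yuv420_size(uint32_t luma_stride, uint32_t height, uint32_t row_align)
{
   uint32_t chroma_stride = luma_stride / 2;
   uint32_t chroma_height = (height + 1) / 2;

   uint64_t luma = (uint64_t)luma_stride * align_up(height, row_align);
   uint64_t chroma =
      (uint64_t)chroma_stride * align_up(chroma_height, row_align);

   return luma + 2 * chroma; /* Y plane plus U and V planes */
}
```

Under these assumptions, yuv420_size(2048, 1050, 4) gives 3235840 and
yuv420_size(2048, 1050, 1) gives 3225600, so a 3227648-byte buffer only
passes the check once the row alignment is dropped.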
Cc: mesa-stable
Signed-off-by: Robert Mader <robert.mader@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40424>
The preprocessor symbol we want is `PAN_ARCH`, not `MALI_ARCH`.
Fixes: a21ee564e2 ("pan/bi: Make texel buffers use Attribute Buffers")
Reviewed-by: Lars-Ivar Hesselberg Simonsen <lars-ivar.simonsen@arm.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40459>