When converting the index buffer from 4 bytes to 2 bytes, we use the
uploader for the job. Since commit b3133e250e we do an uploader alloc
ref, which releases the uploader buffer if there is not enough space,
creating a new one.
The problem happens when we also need this buffer because it is the one
containing the index buffer to convert. This happens, for instance, if
we need to convert the primitives because they are not supported (e.g.,
converting quads to triangles), as this is also done using the uploader.
The solution is to ensure the uploader's buffer has an extra reference
so when released, it is not destroyed. This can easily be achieved by
first calling pipe_buffer_map_range(), which is required to access the
buffer anyway and also increases its reference count.
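A minimal sketch of why the extra reference matters, using a hypothetical
refcounted buffer rather than the real Gallium types: the release done by
the uploader only destroys the buffer when it drops the last reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted buffer; not the Gallium type. */
struct sketch_buffer {
   int refcount;
   bool destroyed;
};

/* Take an extra reference (what mapping the buffer does in the fix). */
static inline void
sketch_buffer_ref(struct sketch_buffer *buf)
{
   buf->refcount++;
}

/* Drop a reference; the buffer is destroyed only when none remain. */
static inline void
sketch_buffer_unref(struct sketch_buffer *buf)
{
   assert(buf->refcount > 0);
   if (--buf->refcount == 0)
      buf->destroyed = true;
}
```

With one reference held by the uploader and one taken before the
allocation, the uploader's release leaves the buffer alive until the
second reference is dropped.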
This fixes `spec@!opengl 1.1@longprim`.
Fixes: b3133e250e ("gallium: add pipe_context::resource_release to eliminate buffer refcounting")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40642>
Ian Romanick reported some "undefined behaviour" warnings during some
unspecified tests, relating to the introduction of RGB[A]16_UNORM formats
in merge request
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38588
This is due to overflowing the 32-bit masks[], and then, during
assignment, the red/green/blue/alphaMask fields in struct gl_config when
using a 16 bpc format. IOW, the red/green/blue/alphaMask would not be
usable.
Suppress this warning by setting masks[] to zero for unorm16 formats,
just as was previously done for is_float16, i.e. fp16 formats.
16 bpc formats are only exposed for display on non-X11 WSI target
platforms like GBM+DRM, Wayland, surfaceless, and these platforms do
not use the info in red/green/blue/alphaMask at all, so the "undefined
behaviour" is meaningless.
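To illustrate the hazard, here is a sketch that assumes a channel mask is
built by shifting a per-channel bit pattern into a 32-bit value. The helper
name and the "return zero when it does not fit" policy are hypothetical,
not the code touched by this change.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit mask for a channel of `size` bits starting
 * at bit `shift`. Returns 0 when the channel does not fit in 32 bits,
 * instead of performing an out-of-range (undefined) shift. */
static inline uint32_t
channel_mask32(unsigned shift, unsigned size)
{
   if (size == 0 || shift + size > 32)
      return 0;

   uint32_t bits = size == 32 ? UINT32_MAX : (UINT32_C(1) << size) - 1;
   return bits << shift;
}
```

For a 64-bit RGBA16 layout, the third and fourth channels start at bits 32
and 48, so no 32-bit mask can describe them.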
Fixes: f2aaa9ce00 ("dri,gallium: Add support for RGB[A]16_UNORM display formats.")
Reported-by: Ian Romanick @idr
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40695>
From the OpenCL specification, `clCreateSubBuffer` should return:
`CL_MISALIGNED_SUB_BUFFER_OFFSET` if there are no devices in
`context` associated with `buffer` for which the `origin` field of
the `cl_buffer_region` structure passed in `buffer_create_info` is
aligned to the `CL_DEVICE_MEM_BASE_ADDR_ALIGN` value.
This was previously unhandled in the entrypoint, marked as TODO.
Add two functions to `Device` for querying the address alignment in
both bits and bytes, for convenience. Properly retrieving the
alignment value from the underlying device/screen is still marked as
TODO.
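The check described above can be sketched in C as follows. This is only an
illustration and not the entrypoint's actual code; the function names are
hypothetical. It relies on `CL_DEVICE_MEM_BASE_ADDR_ALIGN` being reported in
bits, so the value is converted to bytes before testing `origin`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert an alignment in bits to bytes. */
static inline uint32_t
align_bits_to_bytes(uint32_t bits)
{
   return bits / 8;
}

/* Returns true if `origin` is aligned for at least one device. When this
 * returns false, the caller would report CL_MISALIGNED_SUB_BUFFER_OFFSET. */
static inline bool
sub_buffer_origin_ok(size_t origin, const uint32_t *align_bits,
                     size_t num_devices)
{
   for (size_t i = 0; i < num_devices; i++) {
      uint32_t bytes = align_bits_to_bytes(align_bits[i]);
      if (bytes != 0 && origin % bytes == 0)
         return true;
   }
   return false;
}
```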
Signed-off-by: Ahmed Hesham <ahmed.hesham@arm.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40726>
Rename the 'hurd' dri_platform to 'pseudo-drm' to represent non-DRM
presentation platforms.
This platform is now also enabled when building zink and Turnip with the
KGSL backend, allowing zink to use Kopper.
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8634
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40302>
Fix a crash in image descriptor emission caused by stale image_mask bits.
Root cause:
- set_shader_images used a shift expression with count==64 when clearing
image_mask, which is undefined behavior in C.
- This could leave image_mask inconsistent with actual image bindings,
so panfrost_emit_images() might dereference NULL image resources.
Fixes:
- Use 64-bit-safe bit helpers for mask updates to avoid invalid shifts.
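A minimal sketch of the shift hazard and one way to avoid it. The helper
names below are hypothetical and are not necessarily the helpers used by
the fix; the point is that `1ull << 64` is undefined, so a count of 64 needs
its own case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask with the low `count` bits set, valid for
 * count in [0, 64]. A plain (1ull << count) - 1 is undefined at 64. */
static inline uint64_t
mask64_low_bits(unsigned count)
{
   assert(count <= 64);
   return count == 64 ? ~UINT64_C(0) : (UINT64_C(1) << count) - 1;
}

/* Hypothetical helper: clear `count` bits of `mask` starting at `start`. */
static inline uint64_t
mask64_clear_range(uint64_t mask, unsigned start, unsigned count)
{
   assert(start <= 64 && count <= 64 - start);
   if (count == 0)
      return mask; /* also avoids shifting by 64 when start == 64 */
   return mask & ~(mask64_low_bits(count) << start);
}
```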
Crash observed when running: OpenCL-CTS api/test_api
Backtrace:
#0 util_image_to_sampler_view (v->resource is NULL)
#1 panfrost_emit_images
#2 panfrost_update_shader_state
#3 panfrost_launch_grid_on_batch
#4 panfrost_launch_grid
Backport-to: *
Signed-off-by: Eric Guo <eric.guo@nxp.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40621>
We have 4 image intrinsic variants now. This enum is useful for
nir_rewrite_image_intrinsic() and it will be used by other NIR passes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40709>
With the per-submit mode, the driver was overwriting the existing file.
To fix that properly, add an RGP capture mode to add _frameXXX or
_submitXXX to the filenames.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40535>
Nvidia implements both the same way as AMD does, so it makes sense to
allow for code sharing here.
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40541>
R300_PACKET3_3D_CLEAR_HIZ encodes COUNT in 14 bits (COUNT[13:0]), so a
single packet can clear at most 0x3fff dwords.
Large depth surfaces on R5xx can require more HiZ dwords than that.
When we emitted a single packet, COUNT was truncated and part of HiZ RAM
remained uncleared, which could show up as HyperZ corruption.
Emit CLEAR_HIZ in chunks of R300_CLEAR_HIZ_COUNT_MAX and reserve enough
atom space for the worst-case packet count.
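The chunking arithmetic can be sketched as below. The constant and function
names are hypothetical; only the 0x3fff limit comes from the description
above, and the real code emits packets rather than filling an array.

```c
#include <stdint.h>

/* Assumed limit: a 14-bit COUNT field caps one packet at 0x3fff dwords. */
#define SKETCH_CLEAR_HIZ_COUNT_MAX 0x3fffu

/* Number of packets needed to clear `dwords` dwords (rounding up). */
static inline uint32_t
sketch_clear_hiz_packet_count(uint32_t dwords)
{
   return dwords / SKETCH_CLEAR_HIZ_COUNT_MAX +
          (dwords % SKETCH_CLEAR_HIZ_COUNT_MAX != 0);
}

/* Split `dwords` into per-packet counts. Writes at most `max_packets`
 * entries to `counts` and returns how many were written. */
static inline uint32_t
sketch_clear_hiz_split(uint32_t dwords, uint32_t *counts,
                       uint32_t max_packets)
{
   uint32_t n = 0;

   while (dwords > 0 && n < max_packets) {
      uint32_t count = dwords < SKETCH_CLEAR_HIZ_COUNT_MAX
                          ? dwords : SKETCH_CLEAR_HIZ_COUNT_MAX;
      counts[n++] = count;
      dwords -= count;
   }
   return n;
}
```

The packet count is what a worst-case atom size reservation would be based
on, multiplied by the size of one packet.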
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/360
Fixes: 12dcbd5954 ("r300g: enable Hyper-Z by default on r500")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40040>
Introduce r300_hyperz_pipe_count and use it in
r300_setup_hyperz_properties.
RV530 selects pipe topology from NUM_Z_PIPES, while other families use
NUM_GB_PIPES. Keeping this in one helper avoids duplicated family checks
and prepares follow-up HiZ clear sizing changes to reuse the same rule.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40040>
I have been running into crashes in this function when using blender.
Some of the entries in ice->state.framebuffer.base.cbufs[0] can
apparently have the texture field be null, which was causing a segfault
in this loop.
In my case, nr_cbufs was 3, and the first two cbufs entries had a null
texture and format set to PIPE_FORMAT_NONE. The last entry had format of
PIPE_FORMAT_R16G16_FLOAT and a non-null texture.
Adding this null check before attempting to dereference the texture
fixes the crash for me and allows blender to work normally.
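A sketch of the guard, using a hypothetical surface struct rather than the
driver's real types: slots without a texture are skipped before anything
is dereferenced.

```c
/* Hypothetical stand-in for a color buffer slot; not the driver's type. */
struct sketch_surface {
   const void *texture; /* NULL when the slot is unbound */
   int format;
};

/* Count the slots that actually have a texture bound. The loop body
 * stands in for work that would dereference the texture. */
static inline unsigned
sketch_count_bound_cbufs(const struct sketch_surface *cbufs,
                         unsigned nr_cbufs)
{
   unsigned bound = 0;

   for (unsigned i = 0; i < nr_cbufs; i++) {
      if (!cbufs[i].texture)
         continue; /* unbound slot: nothing to dereference */
      bound++;
   }
   return bound;
}
```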
Fixes: ca96f8517c ("iris: remove uses of pipe_surface as a pointer")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40688>
Abusing RADV_PERFTEST for experimental features doesn't make real
sense, and I think we should stop doing that.
The existing RADV_PERFTEST options like RADV_PERFTEST=transfer_queue
still exist but are marked as deprecated; they will be removed
in future Mesa releases.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40646>
It didn't save states properly. The only correct place to save them is
si_blitter_begin. Unfortunately, we can't skip saving and restoring
those states because we don't know in advance whether the rectangle path
will be used.
Cc: mesa-stable
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40634>
This should be faster because 2 triangles are inefficient on the diagonal,
generating helper invocations and potentially extra memory loads from dst
because tiles aren't fully covered.
Reviewed-by: Pierre-Eric
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40633>
Just a bit cleaner, and we can unify point size too.
Signed-off-by: Lorenzo Rossi <lorenzo.rossi@collabora.com>
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40677>
The state uploader was hardcoded to 4096 bytes, which doesn't fill the
full page on systems with 16KB pages. Use devinfo->page_size instead so
the uploader default matches the actual allocation granularity.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
The variable doesn't store a granularity specific to CLE buffers. It
stores the granularity that the OS imposes on buffer allocations (that
is, the OS page size). Therefore, rename the variable to better reflect
its meaning.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
Previously, each sampler view allocated a dedicated BO for its
TEXTURE_SHADER_STATE packet (~24 bytes), which got rounded up to a
full 4KB page. This wastes memory and inflates the per-job BO handle
count.
Use u_upload_alloc_ref() to sub-allocate texture shader state from the
shared state_uploader, matching the pattern already used by image views.
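To show why sub-allocation helps, here is a sketch of a bump allocator over
one shared buffer. It is a simplified, hypothetical model and not the
u_upload_mgr API: a real uploader also handles buffer references and
allocates a fresh buffer when the current one is full, while this sketch
just fails. The 32-byte alignment used in the note below is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a shared upload buffer. */
struct sketch_uploader {
   uint32_t size;   /* total bytes in the shared buffer */
   uint32_t offset; /* first free byte */
};

/* Carve `bytes` out of the shared buffer at a power-of-two `alignment`.
 * Returns false when the request does not fit. */
static inline bool
sketch_upload_alloc(struct sketch_uploader *u, uint32_t bytes,
                    uint32_t alignment, uint32_t *out_offset)
{
   uint32_t start = (u->offset + alignment - 1) & ~(alignment - 1);

   if (start > u->size || bytes > u->size - start)
      return false;

   *out_offset = start;
   u->offset = start + bytes;
   return true;
}
```

Under that assumed alignment, a 4096-byte buffer holds 128 states of 24
bytes, where dedicated BOs would have used 128 separate pages.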
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
From the documentation, the state uploader should be used inside the
driver for long-term state inside buffers, while the stream uploader
should be used by Gallium's internals. Considering that the image view
texture shader state can be considered long-lived state data, use
`state_uploader` instead of `uploader` for consistency.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40496>
PyTorch Conv2d without explicit bias produces a NULL bias_tensor
in the Gallium pipe_ml_operation. Guard against NULL dereferences
in two places:
- ethosu_lower.c: pass NULL to fill_coefs when bias_tensor is NULL
- ethosu_coefs.c: treat missing biases as zero
Fixes crashes when running Conv2d models without bias through the
Ethos-U NPU backend.
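A minimal sketch of the "missing biases are zero" rule, with a hypothetical
helper name and an assumed 32-bit bias type; it is not the code in
ethosu_coefs.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bias for `channel`, or 0 when the operation has
 * no bias tensor at all (biases == NULL). */
static inline int32_t
sketch_bias_or_zero(const int32_t *biases, size_t channel)
{
   return biases ? biases[channel] : 0;
}
```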
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
Add ethosu_ml_subgraph_deserialize() which reconstructs a subgraph
from a serialized byte buffer. Parses the header (cmdstream size,
coefs size, io size, tensors size), restores the tensor array,
cmdstream, and coefficient buffers.
DRM buffer object creation is deferred to prepare_for_submission()
which is called lazily on first invoke.
Wire pctx->ml_subgraph_deserialize in ethosu_create_context().
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
Add ml_subgraph_deserialize() to pipe_context for reconstructing
a previously-serialized ML subgraph at runtime. This complements
ml_subgraph_serialize() on pipe_ml_device and allows the runtime
to load pre-compiled subgraphs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40578>
Move target-specific fields (is_u65, ifm_ublock, ofm_ublock,
max_concurrent_blocks, sram_size) from ethosu_screen into
ethosu_ml_device. This decouples the compilation phase from the DRM
file descriptor and pipe_screen, allowing ahead-of-time compilation
where the target NPU is not present on the compilation host.
The ethosu_device_screen() helper is retained only for runtime paths
that need the DRM fd (buffer allocation, job submission, destroy).
Compilation code now accesses hardware parameters through
ethosu_ml_device() cast of pipe_ml_device, which can be created
either from a DRM-backed screen or standalone via
ethosu_ml_device_create() with a target string like "65-256".
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40647>