Commit 534a04d557 optimized agx_resource_from_handle() to lazily
defer assignment of a kms-ro renderonly_scanout object to an imported
resource until its kms winsys handle is actually queried by a caller
via agx_resource_get_handle(), to avoid unnecessary import into the
DCP display controller. Only resources with bind flag PIPE_BIND_SCANOUT
will get a renderonly_scanout object assigned during such queries.
Problem: This prevents Mesa GBM's gbm_bo_import() function from properly
importing dmabufs for direct scanout use by some Wayland compositors,
e.g., GNOME mutter.
gbm_bo_import() of dmabuf fds (GBM_BO_IMPORT_FD / GBM_BO_IMPORT_FD_MODIFIER),
even with the GBM_BO_USE_SCANOUT flag, does not mark the imported bo with the
PIPE_BIND_SCANOUT bind flag before internally assigning its KMS winsys
handle via screen->resource_get_handle() -> agx_resource_get_handle(),
so that query fails silently. As a result, gbm_bo_import() appears
to return a successfully created gbm_bo with all the proper properties,
but gbm_bo_get_handle() and gbm_bo_get_handle_for_plane() return
invalid handles. These invalid handles cause drmAddFbXXX ioctl calls to
fail, which in turn breaks direct scanout of wl_buffers.
Setting PIPE_BIND_SCANOUT for a resource in agx_resource_from_handle()
may retain the optimization and makes gbm_bo_get_handle[_for_plane]()
work. This fixes direct scanout of fullscreen wl_surface / wl_buffers
under at least GNOME mutter 48.
Fixes: 534a04d557 ("asahi: Flip kmsro around to allocate on the GPU")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37538>
Only if required. I somehow misunderstood that those would need to be
independent too, not just the vertex slots.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8dee4813b0 ("brw: add ability to compute VUE map for separate tcs/tes")
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37251>
Some versions of math.h export rsqrtf() while others don't, so
compilation was failing on the versions that do.
I have not found an easy way to detect whether rsqrtf() is supported, and
as this is only used in llvmpipe tests it is not worth changing the
Meson files to detect it.
So just rename the Mesa function to _rsqrtf(), which fixes the
build for both math.h versions.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13797
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12934
Reviewed-by: Roland Scheidegger <roland.scheidegger@broadcom.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37515>
r600 does not implement pipe_screen.resource_get_param, so
dri2_resource_get_param just returns false here.
eglExportDMABUFImageQueryMESA has been changed to support
multi-plane resources, so some emulated multi-plane formats
get here and return NULL, which makes subsequent queries
that use this return value crash.
Fixes: f416a52960 ("egl: refine dma buf export to support multi plane")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13921
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37503>
v2:
- AF seems to be log2 based, so convert accordingly
v3:
- actually expose AF
v4:
- REALLY expose AF
- remove log2 modifier assert
- use log2 modifier
- rename AF field
- advertise AF support in features
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37189>
Add support for hardware-accelerated transform feedback using the TFB
command register to control capture state.
The hardware state is tracked through an enum distinguishing between
idle (no hardware state established), active (hardware currently
capturing), and paused (hardware stopped).
Hardware commands are emitted based on state transitions:
- ENABLE when moving from idle to active
- RESUME when transitioning from paused to active
- DISABLE when stopping capture
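The transitions listed above can be sketched as a small state machine. This is only an illustration: the enum and function names are hypothetical and not the driver's identifiers, and it assumes that stopping capture leaves the state in "paused" and that a flush drops it back to "idle".

```c
/* Hypothetical names, for illustration only. */
enum tfb_state { TFB_IDLE, TFB_ACTIVE, TFB_PAUSED };
enum tfb_cmd { TFB_CMD_NONE, TFB_CMD_ENABLE, TFB_CMD_RESUME, TFB_CMD_DISABLE };

/* Start (or restart) capture and return the command that must be emitted. */
static enum tfb_cmd
tfb_start(enum tfb_state *state)
{
   switch (*state) {
   case TFB_IDLE:
      /* No hardware state established yet. */
      *state = TFB_ACTIVE;
      return TFB_CMD_ENABLE;
   case TFB_PAUSED:
      /* Hardware state exists but capture was stopped. */
      *state = TFB_ACTIVE;
      return TFB_CMD_RESUME;
   default:
      /* Already capturing, nothing to emit. */
      return TFB_CMD_NONE;
   }
}

/* Stop capture. Assumption: the state becomes "paused" afterwards. */
static enum tfb_cmd
tfb_stop(enum tfb_state *state)
{
   if (*state != TFB_ACTIVE)
      return TFB_CMD_NONE;

   *state = TFB_PAUSED;
   return TFB_CMD_DISABLE;
}

/* Assumption: a flush discards the hardware state, so tracking restarts at idle. */
static void
tfb_flush(enum tfb_state *state)
{
   *state = TFB_IDLE;
}
```

With this shape, the first tfb_start() after a flush yields ENABLE again rather than RESUME.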
Transform feedback buffer setup uses the existing dirty state
mechanism through ETNA_DIRTY_STREAMOUT_BUFS, while command emission uses
the new ETNA_DIRTY_STREAMOUT_CMD flag. Buffer descriptors are computed by
mapping vertex shader transform feedback outputs to fragment shader input
registers, as required by the hardware.
A 64-byte context buffer is allocated per context to maintain hardware
state isolation between applications using transform feedback
simultaneously. The hardware state persists across pause and resume
cycles within a command stream but resets during flushes since transform
feedback state does not survive command buffer boundaries.
The implementation enables the full transform feedback capability with
support for 4 buffers and up to 64 separate or interleaved components,
replacing the previous debug-only stub implementation.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add infrastructure for stream output by implementing the required Gallium
interface functions for creating, destroying, and binding stream output targets.
This lays the groundwork for transform feedback support in etnaviv.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add support for transform feedback primitive counting queries using the
hardware TFB query mechanism. The implementation uses dedicated query
registers (VIVS_TFB_QUERY_BUFFER and VIVS_TFB_QUERY_COMMAND) to track
the number of primitives written during transform feedback operations.
The hardware automatically accumulates primitive counts and stores the
final result at offset 0 of the query buffer, eliminating the need for
manual accumulation.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Extend the supports(..) function signature in acc sample providers
to accept an etna_context parameter, enabling GPU feature validation
during query type support checks.
This change prepares the infrastructure for query providers to make
context-aware decisions based on available GPU capabilities.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
Add native hardware support for rasterizer_discard on GPU cores that
support the HWTFB (Hardware Transform Feedback) feature. This moves
rasterizer discard handling from software clipping to dedicated
hardware state.
Passes all dEQP-GLES3.functional.rasterizer_discard.* with HWTFB.
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37320>
The OpenCL spec indicates that functions which modify `cl_kernel` are
not thread-safe, allowing us to handle those functions with standard
mutability.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37354>
Some applications are not ready to handle multi-plane
modifiers.
Users who want this feature can use AMD_DEBUG=export_modifier
to enable it again.
Fixes: 0a266f0256 ("radeonsi: really support eglExportDMABUFImageQueryMESA")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13917
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37433>
Needing the positions both in the scene for rasterization (in fixed point)
and in the fs (as floats) is a bit awkward; for now just put them in the fs key.
Otherwise pretty straightforward.
Reviewed-by: Michal Krol <michal.krol@broadcom.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37181>
mesa/st flips y coordinates if fbo orientation is Y_0_BOTTOM (essentially
user fbos), and all 3 gallium drivers supporting the feature then
unconditionally reverse this flip. llvmpipe wants to support this as well,
and it would have to do the flip too, and it's actually problematic for
lavapipe, since then lavapipe would have to flip as well, which means that
we'd lose the ability to set y positions to 0 (as the flip with the 4 bit
values does 16-val), and vulkan requires the minimum to be 0.
Hence, reverse this and flip when fbo orientation is Y_0_TOP. I don't actually
pretend to know if this is correct or if just no flipping should occur, but at
least this is consistent with how default sample locations are reported by mesa
via glGetMultisamplefv (which does y flip with the values it gets via
pipe->get_sample_position() if it's a winsys fb).
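To illustrate why the 16-val flip loses position 0, here is a minimal sketch. It assumes sample positions are 4-bit fixed-point values in the range 0..15 (1/16 pixel units); the helper names are made up for this example.

```c
#include <stdbool.h>

/* Assumption: positions are 4-bit fixed point, 0..15 in 1/16 pixel units. */
#define SAMPLE_POS_MAX 15u

/* The flip described above: 16 - val. */
static unsigned
flip_sample_y(unsigned y)
{
   return 16u - y;
}

/* True if the value still fits into 4 bits. */
static bool
fits_sample_pos(unsigned v)
{
   return v <= SAMPLE_POS_MAX;
}

/* True if some representable input flips to exactly 0. */
static bool
zero_reachable_after_flip(void)
{
   for (unsigned y = 0; y <= SAMPLE_POS_MAX; y++) {
      if (flip_sample_y(y) == 0)
         return true;
   }
   return false;
}
```

Inputs 1..15 map to 15..1, an input of 0 maps to 16 (which no longer fits), and no representable input maps to 0.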
Reviewed-by: Michal Krol <michal.krol@broadcom.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37181>
There's nothing to be done in lavapipe, all handled by llvmpipe.
There's a couple new failures in zink with 6/8 samples, but they are all the
same as already happening with 2/4 samples, so nothing specific to 8 samples.
Reviewed-by: Michal Krol <michal.krol@broadcom.com>
Reviewed-by: Brian Paul <brian.paul@broadcom.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37181>
The scissor planes we were setting up did not actually correspond to scissor
with widths / heights being a multiple of full pixels, rather they had an
excess width / height of (nearly) half a pixel (at the x0 and y0 edges).
(Note it's not an actual scissor as in graphics APIs terms, it's the
intersection of fb/viewport/scissor.)
Without multisampling that was still fine (since we always test at pixel
center), but vk cts complained, for some reason only when using 8 samples
(not announced by lavapipe yet; no idea why it doesn't fail when using 4
samples), in tests such as
dEQP-VK.renderpass2.depth_stencil_resolve.image_2d_17_1.samples_8.s8_uint.depth_zero_stencil_zero_testing_stencil_samplemask,
which uses scissor and noted that the results for some pixels outside
the rendering area don't contain the clear color. And actually an llvmpipe
test was failing as well.
There is in fact no need for separate adjustments for msaa at all, as long
as we ensure that x0, y0, x1, y1 all are exactly on their respective plane
edges (inclusive for x0/y0, exclusive for x1/y1). So the logic is mostly
reverted to what it was before this was adjusted for msaa (albeit the original
code then had an excess adjustment of nearly a full pixel at the x0 and y0
edges which is probably why it didn't work for msaa).
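A rough sketch of the edge rule described above, not the actual plane setup code: the struct, the function name and the 8-bit subpixel precision are assumptions made for illustration. With x0/y0 inclusive and x1/y1 exclusive, and all four edges sitting exactly on pixel boundaries, the same test works for any sample position inside a pixel.

```c
#include <stdbool.h>

/* Illustrative subpixel precision, an assumption for this sketch. */
#define SUBPIXEL_BITS 8
#define FIXED_ONE (1 << SUBPIXEL_BITS)

/* Rectangle in whole pixels; x1/y1 are exclusive. */
struct rect {
   int x0, y0, x1, y1;
};

/* fx/fy are a sample position in fixed point (pixel * FIXED_ONE + offset). */
static bool
sample_inside(const struct rect *r, int fx, int fy)
{
   return fx >= r->x0 * FIXED_ONE && fx < r->x1 * FIXED_ONE &&
          fy >= r->y0 * FIXED_ONE && fy < r->y1 * FIXED_ONE;
}
```

A sample in the right part of the pixel just left of x0 is rejected here, whereas an excess of nearly half a pixel at the x0 edge would let it through.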
Fixes a couple cases of clip-and-scissor-blit tests in llvmpipe and various
other drivers hitting this indirectly. Interestingly though not quite all cases
are fixed and even more odd is that not exactly the same cases are fixed for
all drivers, so maybe there's more to it (need to respect bottom_edge_rule or
something similar?)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37181>
The temporary resource created for render_to_single_sampled was never freed,
hence asan complaining.
Let's fix this before announcing support for msaa 8x in lavapipe, which
otherwise causes more failures there.
(As a side note, no matter what I try I can't get asan complaining about
this locally, so just relying on ci here.)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37181>
We are using a different llvm type depending on whether the cpu supports
f16c instructions or not.
The reasoning behind this was that we really couldn't do anything with f16
values and had to cast them to some int type anyway, plus IIRC originally
this actually predates llvm even supporting a half type in the first place
(or if it did, at the very least it was not able to do anything useful with
it).
There are now bugs with lavapipe when the cpu doesn't support f16c, since while
we don't expose f16 capabilities in this case, we can still hit f16 conversion
functions for the likes of unpack2x16float and quantizeToF16, and we're just
straight calling fpext/fptrunc functions, not touching our own code for half
conversion (I believe our own code might still be faster as llvm de-vectorizes
it if it's not supported by the cpu, but don't quote me on that - could depend
on llvm version, and also for trunc the rounding is actually different since
our own functions implement rounding according to d3d10 requirements (mostly
used for f16 render targets)).
This only seems to be a problem for vulkan, not GL, since glsl has its own
lowering pass if the half float packing instructions aren't supported by the
driver.
Ideally we'd fix this by just always using llvm half type for f16, however
still not all llvm backends can handle it.
So instead do some hacky bitcasts around the fpext/fptrunc calls with f16,
which works on x86 even when not supporting f16c. Other llvm backends not
really supporting halfs will still crash there as before (albeit it should
be a "cleaner" crash as the IR is now correct...), but at least keeps them
running for more ordinary things such as f16 texture sampling / render
targets (which they wouldn't if we'd use llvm half type everywhere).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13807
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13865
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37344>
previously this only checked to see if dst was bound, but that is not
the only condition in which clears may be flushed, and triggering a clear
flush while blitting will not set image layouts, which means that a renderpass
could be illegally triggered on an UNDEFINED image (even though it wouldn't be used)
instead, do a much more thorough check to determine whether clears can actually be
stored with the expectation that they will otherwise be flushed
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37467>
Allow users to set an environment variable to influence JM context slot
priorities.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>
A Panfrost JM context can be created by leveraging the new Panfrost 1.5
KM IOCTLs and translating Mesa pipe resource priority levels into
Panfrost-specific ones.
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Ashley Smith <ashley.smith@collabora.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37075>