Event::set_status only called the cbs for the passed in status, but we
need it to call all cbs up to the passed in status. This simplifies error
handling.
Cc: mesa-stable
Reviewed-by: @LingMan
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30215>
initializing the winsys from a /dev/dri/cardX node (as discovered by
gbm) doesn't work, as the kernel abi expects a render node
thus, the winsys needs to open the card's rendernode and use that
everywhere except when importing buffers, where it has to explicitly
export from the card node and import to the rendernode
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30224>
This reproduces panvk logic of showing how much memory is available
while taking into account the address space limits we have.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Valhall can allow up to 48-bit of address space, we should reflect this
here to allow more memory to be mapped in the same address space when
possible.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
Before this, CL buffers would not be attached to subsequent batches.
This fix spurious fails when running OpenCL CTS test_basic astype and
likely others.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30088>
nouveau uses the OS page size which is almost always 4096. The next
patch will make this properly queried but this version is back-portable.
Fixes: 58181b7bbc ("nvk: Bump the sparse alignment requirement on buffers to 64K")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30138>
We skip loading the tile buffer from memory if the job has flagged
a clear (job->clears) for the buffer, however, this only tracks
clears emitted via the TLB. In some cases we may need to fallback
to clearing with a draw call, in which case we also want to skip
the load.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30205>
We need to call iris_wait_syncobj() on both release and debug builds,
so take it out of the assert() call. Only assert the result.
With this patch, gnome-session finally works for me. Also steam.
Fixes: 665d30b544 ("iris: Wait for drm_xe_exec_queue to be idle before destroying it")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30195>
this way we can use shader-db report.py to analyze shader-db changes for zink
with the vk driver of our choosing. requires corresponding report.py relaxations
to be useful, the idea was to change zink a bit and change report.py a bit and
meet in the middle with something useful for arbitrary vulkan drivers.
the output isn't too pretty but it works,
synthetic example on nvk:
Instruction count HURT: shaders/glmark/1-1.shader_test vertex: 42 -> 43 (2.38%)
total Instruction count in shared programs: 7135 -> 7136 (0.01%)
Instruction count in affected programs: 42 -> 43 (2.38%)
helped: 0
HURT: 1
total Code Size in shared programs: 114160 -> 114160 (0.00%)
Code Size in affected programs: 0 -> 0
helped: 0
HURT: 0
total Number of GPRs in shared programs: 2677 -> 2677 (0.00%)
Number of GPRs in affected programs: 0 -> 0
helped: 0
HURT: 0
total SLM Size in shared programs: 0 -> 0
SLM Size in affected programs: 0 -> 0
helped: 0
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30194>
the newline is implicit for util debug messages. this fixes shaderdb output
being half blank lines and brings zink in line with hardware mesa drivers.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30194>
Use common db_alignment to calculate dpb_size for DPB_MAX_RES,
DPB_DYNAMIC_TIER_1 and DPB_DYNAMIC_TIER_2. This makes the db_pitch
in sync with all DPB types.
Remove the VCN5 hack of using 256 for H264 as 64 works.
Remove redundant codes for width and height as they were calculated
at the beginning in calc_dpb_size().
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30186>
Some apps (like firefox and chromium) seem to like to create textures
with internal format GL_RGBA8 but then upload GL_BGRA to them. If the
app were a bit more clever it could hit the memcpy path. But fallback
sw conversion (convert_ubytes()) is slow. If we are going to have to
make this extra copy, do it on the gpu.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
The blitter cannot do the needed swizzle gymnastics for blitting to
luminance and/or alpha formats. So fallback to the 3d path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
A sequence like:
1) set_framebuffer_state(A)
2) clears and/or draws
3) set_framebuffer_state(B)
4) context_flush(ASYNC)
would result in the fence for batch B being returned, instead of batch A.
Resulting in a fence that signals too early.
Instead, in context_flush() find the most recently modified batch, so
that in the example above batch A would be flushed.
There is one wrinkle, that we want to avoid a dependency loop. So the
'last_batch' can actually be a batch that depends on the most recently
modified batch. But in either case, the batch that is used for the
returned fence is one that directly or indirectly depends on all other
batches associated with the context.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
These are for CPU access, so no point in having additional staging blit
to convert from/to linear. Depth formats are an exception, as normally
they cannot be linear.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
Now that there is a util_blitter_clear_depth_stencil() helper, we can
use that to implement stencil blits in the fallback path.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
The other callers were already doing the prepatory stencil clear before
calling util_blitter_stencil_fallback(). Doing the clear in the driver
avoids recursively entering u_blitter if the driver uses
util_blitter_clear_depth_stencil() to implement ->clear_depth_stencil().
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30064>
This makes panfrost_bo_alloc not assert anymore and propagate the
failure again.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
This fix OpenCL CTS "multiple_device_context/test_multiples" failures.
Also improve create_context/destroy error management a bit.
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30150>
This was an obnoxious bit of cheating we had in the gl4.6 driver that I added
literally the morning I passed gl4.6 cts, just to fix my last gl4.6 cts test.
It had an expiration date.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
Add OpenCL kernels implementing the tessellation algorithm on device. This is an
OpenCL C port of the D3D11 reference tessellator, originally written by
Microsoft in C++. There are significant differences compared to the CPU based
reference implementation:
* significant simplifications and clean up. The reference code did a lot of
things in weird ways that would be inefficient on the GPU. I did a *lot* of
work here to get good AGX assembly generated for the tessellation kernels ...
the first attempts were quite bad! Notably, everything is carefully written to
ensure that all private memory access is optimized out in NIR; the resulting
kernels do not use scratch and do not spill on G13.
* prefix sum variants. To implement geom+tess efficiently, we need to first
calculate the count of indices generated by the tessellator, then prefix sum
that, then tessellate using the prefix sum results writing into 1 large index
buffer for a single indirect draw. This isn't too bad, we already have most of
the logic and the guts of the prefix sum kernel is shared with geometry
shaders.
* VDM generation variant. To implement tess alone, it's fastest to generate a
hardware Index List word for each patch, adding an appropriate 32-bit index
bias to the dynamically allocated U16 index buffers. Then from the CPU, we
have the illusion of a single draw to Stream Link with Return to. This
requires packing hardware control words from the tessellator kernel.
Fortunately, we have GenXML available so we just use agx_pack like we would in
the driver.
Along the way, we pick up indirect tess support (this follows on naturally),
which gets rid of the other bit of tessellation-related cheating. Implementing
this requires reworking our internal agx_launch data structures, but that has
the nice side effect of speeding up GS invocations too (by fixing the workgroup
size).
Don't get me wrong. tessellator.cl is the single most unhinged file of my
career, featuring GenXML-based pack macros fed by dynamic memory allocation fed
by the inscrutable tessellation algorithm.
But it works *really* well.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>