Doesn't do any fancy cross-process labeling like @shadeslayer's patches
but helps with all your intra-process labeling needs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
I would like to reuse pan_pool for persistent uploads (shaders and CSOs)
in Gallium. In theory u_upload_mgr is more appropriate, but pan_pool is
already a knockoff u_upload_mgr, so might as well finish the job.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
LOCAL_STORAGE.zw is workgroup local memory, whereas LOCAL_STORAGE.xy is
thread local memory. Likewise PC_SP.zw is the stack pointer, which is
initialized to (LOCAL_STORAGE.zw + offset) but is modifiable by the
shader. Panfrost doesn't modify the s tack pointer, and the register
allocation logic assumes a zero offset, so let's always spill to thread
local memory = LOCAL_STORAGE.xy, as was intended by Italo's cleanup.
This is visible on any shader that spills. Compute shaders aren't
advertised yet, so WLS will be null, causing a fault like the following
(reproduced on Mali T860 with the glyphy trace):
[15634.148873] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x0000000000000000
Reason: TODO
raw fault status: 0x70003C2
decoded fault status: SLAVE FAULT
exception type 0xC2: TRANSLATION_FAULT_LEVEL2
access type 0x3: WRITE
source id 0x700
[15634.658170] panfrost ff9a0000.gpu: gpu sched timeout, js=0, config=0x3300, status=0x8, head=0x31d4540,
tail=0x31d4540, sched_job=00000000e8101b2e
Fixes: 6a12ea02fe ("pan/mdg: properly encode/decode ldst instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
BOs in the cache are chronological, so we try oldest BOs first. That
means if we find the oldest BO is busy, likely every BO is busy, and we
should bail early. This dramatically reduces the useless cycles spent in
bo_wait.
I studied the BO cache of the following drivers, all of which handle
this correctly: iris, lima, etnaviv, freedreno, vc4, v3d, v3dv.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10794>
Imported resources will not necessarily have their line stride aligned
on 64 bytes, and things prove to work just fine even on Bifrost, so
let's relax the condition and drop the comment stating that Bifrost
needs pixel lines to be aligned on 64 bytes.
Reported-by: Icecream95 <ixn@disroot.org>
Suggested-by: Icecream95 <ixn@disroot.org>
Fixes: 051d62cf04 ("panfrost: Add a pan_image_layout_init() helper")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10423>
Rockchip SoCs RK3566 and RK3568 have a Gondul with
one shader core and two execution engines, with product ID 0x7402.
Quoting the datasheet:
Mali-G52 1-Core-2EE
* Support 1600Mpix/s fill rate when 800MHz clock frequency
* Support 38.4GLOPs when 800MHz clock frequency
To distinguish it from other variants of G52, we agreed
to call this "G52L", L is for Little.
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10771>
If we only read 8 bytes from a UBO, we need to use LD_UNIFORM.64 as
opposed to LD_UNIFORM.128. In addition to probably being faster, this
fixes a buffer overrun manifesting as MMU faults with source ID
0x500/0x600/0x700, visible in WebGL Aquarium.
This is essentially the same patch as 616394cf31 ("pan/mdg: Use
appropriate sizes for global loads/stores"), only this is for UBOs where
that was for SSBOs.
Before enabling PACKED_UNIFORMS, this bug was not visible since we could
guarantee the UBO size was a multiple of 16. We no longer have that
invariant, and in rare cases the last 8 bytes of the last 16-byte slot
of a mapped uniform buffer would overrun the BO and trigger a fault,
even if the result is unused.
Fixes: 24d7c413fe ("panfrost: Enable packed uniforms.")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10772>
Extend the VERTEX_INSTANCE_OFFSETS sysval to pass
BaseVertex/BaseInstance information to the shader.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10417>
Some fields are overlapping, so let's just split those structs to
avoid confusion.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10417>
This is undefined behaviour in the hardware but perfectly legal NIR, so
lower to an if ladder predicated on the lane ID (the blob's preferred
strategy).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10022>
Hook-up support for native blits. We keep using u_blitter by default
for now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
Right now the lib is just used for TB preloads. Let's make it generic
to support actual blits.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
That's not necessarily true for actual blits, so let's pass the min/max
coords to pan_blitter_emit_viewport().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
The type has already been selected when forging the key, no need to do
it again pan_blitter_emit_bifrost_blend().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
There's clearly no need to pass RTs/ZS views around, pass only what we
really need.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
We will emit one viewport and attach it to several DCDs for multi-layer
blits. Make that possible by adjusting the prototype and rename the
function along the way.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
The texture descriptors will be emitted once and re-used in several
DCDs when blitting more than one layer. Rename the functions and make
them return a GPU pointer instead of filling the DCD directly.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
It doesn't make sense to have it done in pan_preload_emit_varying().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
The sampler descriptors will be emitted once and re-used in
several DCDs for the multi-layer blit case. We will also need
to select the filter. Let's adjust the pan_preload_emit_*_sampler()
functions to support that and rename them along the way.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
It should speed up a bit hash calculation and allow us to add extra
info without changing the structure size.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
first should be initialized to true if we want to get rid of the leading
';' in the shader name.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10548>
Remove the PAN_MESA_DEBUG=fp16 flag that was hiding it.
Skip two buggy dEQP tests. See linked discussion. We'll need to make
sure this gets sorted out before submitting conformance, but I don't see
a test with a fix in the pipeline as valid reason to hold back valid
code.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
Needed for constant folding to be effective. But don't copyprop into
instructions already reading from FAU, that will just end up adding more
moves!
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>
We don't vectorize transcendentals, since those are scalar only in
hardware. Also don't vectorize a few places where impedance mismatches
between NIR and the hardware make handling vectors infeasible for now.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9239>