Fix designator order for `pan_pool_ref` fields by matching declaration
order and avoid an error by the C++ compiler.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11064>
If there is a preload job needing tiling, but no other jobs, then
first_tiler will be set but not tiler_dep.
Fixes faults when two depth-only (stencil is reloaded) clears are done
in a row.
panfrost ffa30000.gpu: Unhandled Page fault in AS1 at VA 0x0000000044870000
Reason: TODO
raw fault status: 0x49002C1
decoded fault status: SLAVE FAULT
exception type 0xC1: TRANSLATION_FAULT_LEVEL1
access type 0x2: READ
source id 0x490
panfrost ffa30000.gpu: gpu sched timeout, js=0, config=0x3301, status=0x8, head=0x608a300, tail=0x608a300, sched_job=f5b0862d
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11104>
Configure a panfrost performance structure with tables of categories and
counters for the current product id. An array for storing counter values
read from the GPU is also managed by this structure. A generic read
function can be used to retrieve the value of a counter from the conter
values array.
v2: Generate tables instead of calling register functions.
v3: Simplify counter read function and `pan_gen_perf.py` write method.
v4: Accumulate counter values from all cores.
v5: Wrap `STATIC_ASSERT`s within unused functions.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10844>
We previously allocated only 16MB, but this isn't always enough. Now
that we have growable (heap) on recent kernels, there's not much reason
to try to shrink this allocation.
Fixes OUT_OF_MEMORY fault on furmark trace.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10938>
This allows failing fast (optionally still tracing, if set with
PAN_MESA_DEBUG=trace) when a GPU fault is introduced. This is better
behaviour for both use cases:
1. When debugging a known fault, setting this mode together with trace
will stop the driver as soon as a buggy command stream is submitted,
and the offending stream will be the last trace file.
2. When running test suites (particularly in CI), setting this mode
will detect faults and crash, causing the pipeline to fail fast as
opposed to incorrectly marking the run green if the test happens to
pass despite the faults and slow downs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10938>
If the varying descriptors will always be the same for a given shader
variant (certainly true if none of separable shaders, transform
feedback, or point sprites are used), we only need to link once. Now
that pan_pool supports both owned and unowned modes, we have the
flexibility to reuse the code path for both allocation strategies.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10954>
Refactor all the linking code with the following objectives:
* Remove linking magic (especially around XFB)
* Cleaner code (obviously)
* Less stage coupling (in case someone ever implements geom/tess)
* Decouple ATTRIBUTE from ATTRIBUTE_BUFFER to enable optimizations
The main hack remaining is doing precision linking here to workaround
linking previously used.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>
The other part is |'d together. Do the happy path at compile time, and
reserve the draw time fixup for a v5 silicon issue.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10869>
These were already grouped nicely, just never got used. I also added
coverage writes to the list of reasons not to use early-z for
completeness. At the moment this doesn't do anything since this is a
Midgard flag and we haven't hooked up coverage writes on Midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10869>
Blit shaders are small and the average app doesn't use many of them, so
try to pack in a single 4k BO. Saves 60k in a lot of simple apps.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
Doesn't do any fancy cross-process labeling like @shadeslayer's patches
but helps with all your intra-process labeling needs.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
I would like to reuse pan_pool for persistent uploads (shaders and CSOs)
in Gallium. In theory u_upload_mgr is more appropriate, but pan_pool is
already a knockoff u_upload_mgr, so might as well finish the job.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
LOCAL_STORAGE.zw is workgroup local memory, whereas LOCAL_STORAGE.xy is
thread local memory. Likewise PC_SP.zw is the stack pointer, which is
initialized to (LOCAL_STORAGE.zw + offset) but is modifiable by the
shader. Panfrost doesn't modify the s tack pointer, and the register
allocation logic assumes a zero offset, so let's always spill to thread
local memory = LOCAL_STORAGE.xy, as was intended by Italo's cleanup.
This is visible on any shader that spills. Compute shaders aren't
advertised yet, so WLS will be null, causing a fault like the following
(reproduced on Mali T860 with the glyphy trace):
[15634.148873] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x0000000000000000
Reason: TODO
raw fault status: 0x70003C2
decoded fault status: SLAVE FAULT
exception type 0xC2: TRANSLATION_FAULT_LEVEL2
access type 0x3: WRITE
source id 0x700
[15634.658170] panfrost ff9a0000.gpu: gpu sched timeout, js=0, config=0x3300, status=0x8, head=0x31d4540,
tail=0x31d4540, sched_job=00000000e8101b2e
Fixes: 6a12ea02fe ("pan/mdg: properly encode/decode ldst instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10866>
BOs in the cache are chronological, so we try oldest BOs first. That
means if we find the oldest BO is busy, likely every BO is busy, and we
should bail early. This dramatically reduces the useless cycles spent in
bo_wait.
I studied the BO cache of the following drivers, all of which handle
this correctly: iris, lima, etnaviv, freedreno, vc4, v3d, v3dv.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10794>
Imported resources will not necessarily have their line stride aligned
on 64 bytes, and things prove to work just fine even on Bifrost, so
let's relax the condition and drop the comment stating that Bifrost
needs pixel lines to be aligned on 64 bytes.
Reported-by: Icecream95 <ixn@disroot.org>
Suggested-by: Icecream95 <ixn@disroot.org>
Fixes: 051d62cf04 ("panfrost: Add a pan_image_layout_init() helper")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10423>
Rockchip SoCs RK3566 and RK3568 have a Gondul with
one shader core and two execution engines, with product ID 0x7402.
Quoting the datasheet:
Mali-G52 1-Core-2EE
* Support 1600Mpix/s fill rate when 800MHz clock frequency
* Support 38.4GLOPs when 800MHz clock frequency
To distinguish it from other variants of G52, we agreed
to call this "G52L", L is for Little.
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10771>
If we only read 8 bytes from a UBO, we need to use LD_UNIFORM.64 as
opposed to LD_UNIFORM.128. In addition to probably being faster, this
fixes a buffer overrun manifesting as MMU faults with source ID
0x500/0x600/0x700, visible in WebGL Aquarium.
This is essentially the same patch as 616394cf31 ("pan/mdg: Use
appropriate sizes for global loads/stores"), only this is for UBOs where
that was for SSBOs.
Before enabling PACKED_UNIFORMS, this bug was not visible since we could
guarantee the UBO size was a multiple of 16. We no longer have that
invariant, and in rare cases the last 8 bytes of the last 16-byte slot
of a mapped uniform buffer would overrun the BO and trigger a fault,
even if the result is unused.
Fixes: 24d7c413fe ("panfrost: Enable packed uniforms.")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10772>