../src/asahi/lib/decode.c:933:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
933 | (void *)c->vertex_attachments;
| ^
../src/asahi/lib/decode.c:941:7: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
941 | (void *)c->fragment_attachments;
etc
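
For illustration only, here is a minimal sketch of the usual way to silence
this class of warning. It assumes the fields are 64-bit addresses stored as
uint64_t. The struct and helper names below are hypothetical stand-ins, not
the real decode.c code. Going through uintptr_t makes the width change
explicit, so the final cast to a pointer is between same-sized types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoded command struct; the real field
 * types in decode.c are assumed here, not quoted. */
struct example_cmd {
   uint64_t vertex_attachments;
};

/* Convert through uintptr_t first, so the cast to a pointer is between
 * same-sized types and -Wint-to-pointer-cast stays quiet. */
static inline void *
u64_to_ptr(uint64_t v)
{
   /* Catch addresses that do not fit in a native pointer (32-bit hosts). */
   assert(v <= UINTPTR_MAX);
   return (void *)(uintptr_t)v;
}

/* Inverse helper, used to store a host pointer in a 64-bit field. */
static inline uint64_t
ptr_to_u64(const void *p)
{
   return (uint64_t)(uintptr_t)p;
}
```

With helpers like these, a cast such as (void *)c->vertex_attachments would
become u64_to_ptr(c->vertex_attachments).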
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>
The asahi kernel driver is a pure-explicit-sync driver and userspace is
required to handle implicit sync itself, by importing/exporting fences
in shared dma-bufs. Mesa handles this in its own native or guest
context, but dma-buf fences are not shared between the guest and the
host, so this breaks implicit sync across the VM boundary.
To make this work, explicitly pass a resource list to the host so it can
perform the implicit sync dance, like we do in agx_batch.c. This
essentially turns the virtgpu protocol into an implicit sync protocol
(like many other legacy GPU drivers), which makes sense here since we
don't particularly have the primitives to pass through and manage "host"
syncobjs that we'd need to do it at that level.
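
As a rough sketch of what "passing a resource list" could look like, the
snippet below builds a per-submit list of resources with read/write flags.
Every name, the flag values, and the fixed capacity are hypothetical and
chosen for illustration. This is not the actual virtgpu wire format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access flags: how the submission uses each resource. */
#define EXAMPLE_RES_READ  (1u << 0)
#define EXAMPLE_RES_WRITE (1u << 1)

/* Hypothetical capacity, just to keep the sketch allocation-free. */
#define EXAMPLE_MAX_RES 64

/* Hypothetical per-resource entry sent along with a submission. */
struct example_sync_res {
   uint32_t res_id; /* host-visible resource handle */
   uint32_t flags;  /* EXAMPLE_RES_* */
};

struct example_submit {
   uint32_t res_count;
   struct example_sync_res res[EXAMPLE_MAX_RES];
};

/* Record that the submit touches res_id, merging flags if it is already
 * listed. Returns false if the list is full. */
static bool
example_submit_add_res(struct example_submit *s, uint32_t res_id,
                       uint32_t flags)
{
   assert(s != NULL);

   for (uint32_t i = 0; i < s->res_count; i++) {
      if (s->res[i].res_id == res_id) {
         s->res[i].flags |= flags;
         return true;
      }
   }

   if (s->res_count == EXAMPLE_MAX_RES)
      return false;

   s->res[s->res_count].res_id = res_id;
   s->res[s->res_count].flags = flags;
   s->res_count++;
   return true;
}
```

Given such a list, the host side would presumably wait on the fences already
attached to each listed dma-buf before running the job, then attach the job's
completion fence to each of them afterwards.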
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>
V3D can use these too.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
This fixes the compute blitter with compression in the general case, and then
flips the switch since the compute blitter is faster / less buggy than the
traditional path.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
I don't know what Apple calls these, so we're using the name "explicit
coordinates".
AGX has load/store instructions for register <---> tilebuffer ---> storage
images. Usually these are used in the fragment shader and end-of-tile shader to
implement colour attachments, with implicitly specified coordinates based on the
shader stage. However they can also be used in compute shaders with explicitly
specified coordinates ("imageblocks" in Apple parlance). Model this in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
for preambles and for peephole selection.
total instructions in shared programs: 2159359 -> 2114124 (-2.09%)
instructions in affected programs: 359763 -> 314528 (-12.57%)
helped: 814
HURT: 6
Instructions are helped.
total alu in shared programs: 1685059 -> 1670200 (-0.88%)
alu in affected programs: 217210 -> 202351 (-6.84%)
helped: 589
HURT: 45
Alu are helped.
total fscib in shared programs: 1681202 -> 1666324 (-0.88%)
fscib in affected programs: 217477 -> 202599 (-6.84%)
helped: 590
HURT: 45
Fscib are helped.
total ic in shared programs: 460856 -> 455502 (-1.16%)
ic in affected programs: 41350 -> 35996 (-12.95%)
helped: 174
HURT: 8
Ic are helped.
total bytes in shared programs: 14302484 -> 14053982 (-1.74%)
bytes in affected programs: 2380614 -> 2132112 (-10.44%)
helped: 814
HURT: 7
Bytes are helped.
total regs in shared programs: 662302 -> 656517 (-0.87%)
regs in affected programs: 26979 -> 21194 (-21.44%)
helped: 432
HURT: 9
Regs are helped.
total uniforms in shared programs: 1651909 -> 1687077 (2.13%)
uniforms in affected programs: 95383 -> 130551 (36.87%)
helped: 17
HURT: 783
Uniforms are HURT.
total threads in shared programs: 20324608 -> 20326592 (<.01%)
threads in affected programs: 16192 -> 18176 (12.25%)
helped: 17
HURT: 3
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
Incompatible changes:
- Make VM layout more flexible to allow for SVM with rusticl
(eventually, hopefully)
Compatible changes:
- Expose soft fault state to userspace as a flag
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
honeykrisp wants to do this explicitly so we don't need prologs for TES. the gl
driver uses TES prologs implicitly for the same effect, but that's ...
suboptimal.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30382>
the calculation of workgroup reductions was wrong, giving nondeterministic
results when prefix summing >= 1024 items. fixes misrendering in
terraintessellation on honeykrisp.
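
For checking results, a CPU reference of the expected behaviour can help. The
sketch below is not the driver's shader code. It assumes a workgroup size of
1024 and uses hypothetical names. It scans the input one workgroup-sized chunk
at a time and carries each chunk's reduction into the next. That carry is the
step that has to be right once there are 1024 or more items.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_WG_SIZE 1024 /* assumed workgroup size */

/* In-place inclusive prefix sum, done chunk by chunk the way a single
 * workgroup would walk an array longer than its thread count. */
static void
example_prefix_sum(uint32_t *data, size_t n)
{
   assert(data != NULL || n == 0);

   uint32_t carry = 0; /* total of all previous chunks */

   for (size_t base = 0; base < n; base += EXAMPLE_WG_SIZE) {
      size_t end = base + EXAMPLE_WG_SIZE < n ? base + EXAMPLE_WG_SIZE : n;
      uint32_t running = 0;

      for (size_t i = base; i < end; i++) {
         /* Scan within the chunk... */
         running += data[i];
         /* ...offset by the reduction of the earlier chunks. */
         data[i] = carry + running;
      }

      /* Workgroup reduction: this chunk's total feeds the next chunk. */
      carry += running;
   }
}
```

Comparing GPU output against a reference like this, on inputs that cross the
1024-item boundary, should make a wrong or nondeterministic carry visible.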
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30382>