This fixes the compute blitter with compression in the general case, and then
flips the switch since the compute blitter is faster / less buggy than the
traditional path.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
I don't know what Apple calls these, so we're using the name "explicit
coordinates".
AGX has load/store instructions for register <---> tilebuffer ---> storage
images. Usually these are used in the fragment shader and end-of-tile shader to
implement colour attachments, with implicitly specified coordinates based on the
shader stage. However, they can also be used in compute shaders with explicitly
specified coordinates ("imageblocks" in Apple parlance). Model this in NIR.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
for preambles and for peephole selection.
total instructions in shared programs: 2159359 -> 2114124 (-2.09%)
instructions in affected programs: 359763 -> 314528 (-12.57%)
helped: 814
HURT: 6
Instructions are helped.
total alu in shared programs: 1685059 -> 1670200 (-0.88%)
alu in affected programs: 217210 -> 202351 (-6.84%)
helped: 589
HURT: 45
Alu are helped.
total fscib in shared programs: 1681202 -> 1666324 (-0.88%)
fscib in affected programs: 217477 -> 202599 (-6.84%)
helped: 590
HURT: 45
Fscib are helped.
total ic in shared programs: 460856 -> 455502 (-1.16%)
ic in affected programs: 41350 -> 35996 (-12.95%)
helped: 174
HURT: 8
Ic are helped.
total bytes in shared programs: 14302484 -> 14053982 (-1.74%)
bytes in affected programs: 2380614 -> 2132112 (-10.44%)
helped: 814
HURT: 7
Bytes are helped.
total regs in shared programs: 662302 -> 656517 (-0.87%)
regs in affected programs: 26979 -> 21194 (-21.44%)
helped: 432
HURT: 9
Regs are helped.
total uniforms in shared programs: 1651909 -> 1687077 (2.13%)
uniforms in affected programs: 95383 -> 130551 (36.87%)
helped: 17
HURT: 783
Uniforms are HURT.
total threads in shared programs: 20324608 -> 20326592 (<.01%)
threads in affected programs: 16192 -> 18176 (12.25%)
helped: 17
HURT: 3
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
Incompatible changes:
- Make VM layout more flexible to allow for SVM with rusticl
(eventually, hopefully)
Compatible changes:
- Expose soft fault state to userspace as a flag
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such a value's size against the given threshold
since those values are handled through 16-bit half-registers. But those
values can still use natural 8-bit size and alignment for storing inside
scratch memory.
nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.
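A minimal sketch of the split described above, not the actual Mesa code.
struct simple_type and all three function names are hypothetical stand-ins
(the real pass works on glsl_type), and the strict ">" threshold comparison
is an assumption. One function reports the size used for the threshold
check, with 8-bit values widened to 16 bits, and the other reports the
natural size and alignment used for the scratch layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a variable's type; not glsl_type. */
struct simple_type {
   unsigned bit_size;   /* 8, 16, 32 or 64 */
   unsigned components; /* total number of scalar elements */
};

/* Natural size and alignment in bytes, as used for the scratch layout. */
static void
natural_size_align(const struct simple_type *t, unsigned *size,
                   unsigned *align)
{
   assert(t->bit_size >= 8 && t->components > 0);

   unsigned bytes = t->bit_size / 8;
   *size = bytes * t->components;
   *align = bytes;
}

/* Size used only for the threshold check: 8-bit values count as 16-bit
 * because they would occupy 16-bit half-registers.
 */
static void
var_size_align_8bit_as_16bit(const struct simple_type *t, unsigned *size,
                             unsigned *align)
{
   struct simple_type widened = *t;

   if (widened.bit_size == 8)
      widened.bit_size = 16;

   natural_size_align(&widened, size, align);
}

/* Assumed policy: lower to scratch when the register-based size is strictly
 * larger than the threshold.
 */
static bool
exceeds_threshold(const struct simple_type *t, unsigned threshold_bytes)
{
   unsigned size, align;

   var_size_align_8bit_as_16bit(t, &size, &align);
   return size > threshold_bytes;
}
```

With these definitions, an array of 40 8-bit values counts as 80 bytes
against the threshold but still occupies only 40 bytes of scratch with
1-byte alignment.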
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
Honeykrisp is a Vulkan 1.3 driver for Apple GPUs. It currently supports M1 and
M2; future hardware support is planned. It passed CTS a few months ago and with
two exceptions [1] should still pass now.
Compared to the May snapshot that passed conformance [2], this adds a bunch of
new features, most notably:
* Geometry shaders
* Tessellation shaders
* Transform feedback
* Pipeline statistics queries
* Robustness2
* Host image copy
Theoretically, we now support everything DXVK requires for D3D11 with full
FL11_1. To quote Rob Herring:
    How's performance? Great, because I haven't tested it.
This driver is NOT ready for end users... YET. Stay tuned, it won't be long now
:}
I would like to reiterate: Honeykrisp is not yet ready for end users. Please
read [3].
Regardless, as the kernel UAPI is not yet stable, this driver will refuse to
probe without out-of-tree Mesa patches. This is the same situation as our GL
driver.
On the Mesa side, the biggest todo before the release is improving
performance. Right now, I expect WineD3D with our GL4.6 driver to give better
performance. This isn't fundamental, just needs time ... our GL driver is 3
years old and honeykrisp is 3 months old.
On the non-Mesa side, there's still a lot of movement around krun and FEX
packaging before this becomes broadly useful for x86 games.
At any rate, now that I've finished up geometry and tessellation, I'm hopefully
done rewriting the whole driver every 2 weeks. So I think this is settled enough
that it makes sense to upstream this now instead of building up a gigantic
monster commit in a private branch.
[1] Pipeline robustness and pipeline statistics are included in this tree but
need bug fixes in the CTS to pass. This is being handled internally in
Khronos. These features may be disabled to get a conformant driver.
[2] https://rosenzweig.io/blog/vk13-on-the-m1-in-1-month.html
[3] https://dont-ship.it/
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30382>