for preambles and for peephole selection.
total instructions in shared programs: 2159359 -> 2114124 (-2.09%)
instructions in affected programs: 359763 -> 314528 (-12.57%)
helped: 814
HURT: 6
Instructions are helped.
total alu in shared programs: 1685059 -> 1670200 (-0.88%)
alu in affected programs: 217210 -> 202351 (-6.84%)
helped: 589
HURT: 45
Alu are helped.
total fscib in shared programs: 1681202 -> 1666324 (-0.88%)
fscib in affected programs: 217477 -> 202599 (-6.84%)
helped: 590
HURT: 45
Fscib are helped.
total ic in shared programs: 460856 -> 455502 (-1.16%)
ic in affected programs: 41350 -> 35996 (-12.95%)
helped: 174
HURT: 8
Ic are helped.
total bytes in shared programs: 14302484 -> 14053982 (-1.74%)
bytes in affected programs: 2380614 -> 2132112 (-10.44%)
helped: 814
HURT: 7
Bytes are helped.
total regs in shared programs: 662302 -> 656517 (-0.87%)
regs in affected programs: 26979 -> 21194 (-21.44%)
helped: 432
HURT: 9
Regs are helped.
total uniforms in shared programs: 1651909 -> 1687077 (2.13%)
uniforms in affected programs: 95383 -> 130551 (36.87%)
helped: 17
HURT: 783
Uniforms are HURT.
total threads in shared programs: 20324608 -> 20326592 (<.01%)
threads in affected programs: 16192 -> 18176 (12.25%)
helped: 17
HURT: 3
Threads are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
Incompatible changes:
- Make VM layout more flexible to allow for SVM with rusticl
(eventually, hopefully)
Compatible changes:
- Expose soft fault state to userspace as a flag
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30633>
honeykrisp wants to do this explicitly so we don't need prologs for TES. the gl
driver uses TES prologs implicitly for the same effect, but that's ...
suboptimal.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30382>
the calculation of workgroup reductions was wrong, giving nondeterministic
results when prefix summing >= 1024 items. fixes misrendering in
terraintessellation on honeykrisp.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30382>
Instead of having a hardcoded list of endian-independent format aliases
in the header, generate them from the format definitions.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29649>
This was an obnoxious bit of cheating we had in the gl4.6 driver that I added
literally the morning I passed gl4.6 cts, just to fix my last gl4.6 cts test.
It had an expiration date.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
Add OpenCL kernels implementing the tessellation algorithm on device. This is an
OpenCL C port of the D3D11 reference tessellator, originally written by
Microsoft in C++. There are significant differences compared to the CPU based
reference implementation:
* significant simplifications and clean up. The reference code did a lot of
things in weird ways that would be inefficient on the GPU. I did a *lot* of
work here to get good AGX assembly generated for the tessellation kernels ...
the first attempts were quite bad! Notably, everything is carefully written to
ensure that all private memory access is optimized out in NIR; the resulting
kernels do not use scratch and do not spill on G13.
* prefix sum variants. To implement geom+tess efficiently, we need to first
calculate the count of indices generated by the tessellator, then prefix sum
that, then tessellate using the prefix sum results writing into 1 large index
buffer for a single indirect draw. This isn't too bad, we already have most of
the logic and the guts of the prefix sum kernel is shared with geometry
shaders.
* VDM generation variant. To implement tess alone, it's fastest to generate a
hardware Index List word for each patch, adding an appropriate 32-bit index
bias to the dynamically allocated U16 index buffers. Then from the CPU, we
have the illusion of a single draw to Stream Link with Return to. This
requires packing hardware control words from the tessellator kernel.
Fortunately, we have GenXML available so we just use agx_pack like we would in
the driver.
Along the way, we pick up indirect tess support (this follows on naturally),
which gets rid of the other bit of tessellation-related cheating. Implementing
this requires reworking our internal agx_launch data structures, but that has
the nice side effect of speeding up GS invocations too (by fixing the workgroup
size).
Don't get me wrong. tessellator.cl is the single most unhinged file of my
career, featuring GenXML-based pack macros fed by dynamic memory allocation fed
by the inscrutable tessellation algorithm.
But it works *really* well.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
`src/asahi/lib/agx_device.c` includes `git_sha1.h` since
0be124b77e ("asahi: Deserialize libagx when opening device"),
although it only started making use of it once ece3896d5b
("asahi: add broken bits of unstable Linux UAPI") was merged.
Regardless, without this meson change, this leads to a race condition in
builds, where `git_sha1.h` might be built (if ever) after `agx_device.c`.
Fixes: 0be124b77e ("asahi: Deserialize libagx when opening device")
Fixes: ece3896d5b ("asahi: add broken bits of unstable Linux UAPI")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29750>
We need a heuristic for handling GS side effects in the least surprising way
possible. Upgrade our previous heuristic to a better one, moving more side
effects into the prepass from the rast shader. This is technically an optimization but mitigates VDM timeouts in absurd Vulkan CTS cases.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29742>
Different APIs have different robustness requirements for VBOs. Add a knob to
select the desired robustness so we can implement rba2 in honeykrisp.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29742>
Rebasing around this patch has been a significant burden for development.
Staging patches to asahi/mesa helps somewhat but 1. it's still really
frustrating to have this much divergence with upstream, and 2. ideally we
wouldn't have to do that.
The kernel upstreaming is stalled for various reasons. This patch adds
compile-only code to speak the unstable Linux UAPI for the SOLE purpose of
reducing my rebase pain... NOT to actually work.
It is NOT for users OR distro maintainers. asahi will refuse to probe on
upstream Mesa to protect against regressions. The uapi is NOT STABLE and
upstream Mesa CANNOT be used with it. Attempting to bypass this WILL give you a
broken system.
This patch employs several layers of deterrents against system-breaking
enablement. With a lot of warning text at the relevant sites. Hopefully that is
good enough to prevent people from breaking systems. And if people brazenly
ignore all of the above ... they get to pick up the pieces.
You have been warned.
---
There is significant prior art for Mesa including downstream kernel uapi
supports in-tree:
* powervr (downstream android driver)
* turnip (downstream kgsl android driver)
* asahi ... ironically (prop macOS kernel driver)
* maybe vc4?
Linux is only special because of distros shipping tagged Mesa releases. The
several layers of guards here guarantee that no tagged Mesa release would
possibly probe even on an asahi downstream kernel. A distro would need a
significant scary patch to make it probe. If/when it breaks, that's on them
and they pick up the pieces.
I make a stability guarantee ONLY for Fedora Asahi Remix -- where we push
packages for both a downstream kernel and Mesa in tandem, while we patiently
wait for upstreaming -- and that is *it*. It will be a nice future when this all
works upstream, but unfortunately we're not there yet.
Acked by Dave [1] and Sima [2]
[1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29620#note_2444189
[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29620#note_2445155
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Co-developed-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Asahi Lina <lina@asahilina.net>
Co-developed-by: Sergio Lopez <slp@sinrega.org>
Signed-off-by: Sergio Lopez <slp@sinrega.org>
Co-developed-by: i509VCB <git@i509.me>
Signed-off-by: i509VCB <git@i509.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29620>