mesa/src/asahi/lib
Alyssa Rosenzweig d26ae4f455 asahi,libagx: tessellate on device
Add OpenCL kernels implementing the tessellation algorithm on device. This is an
OpenCL C port of the D3D11 reference tessellator, originally written by
Microsoft in C++. There are significant differences compared to the CPU-based
reference implementation:

* significant simplifications and cleanup. The reference code did a lot of
  things in weird ways that would be inefficient on the GPU. I did a *lot* of
  work here to get good AGX assembly generated for the tessellation kernels ...
  the first attempts were quite bad! Notably, everything is carefully written to
  ensure that all private memory access is optimized out in NIR; the resulting
  kernels do not use scratch and do not spill on G13.

* prefix sum variants. To implement geom+tess efficiently, we need to first
  calculate the count of indices generated by the tessellator, then prefix sum
  that, then tessellate using the prefix sum results writing into 1 large index
  buffer for a single indirect draw. This isn't too bad: we already have most of
  the logic, and the guts of the prefix sum kernel are shared with geometry
  shaders.

* VDM generation variant. To implement tess alone, it's fastest to generate a
  hardware Index List word for each patch, adding an appropriate 32-bit index
  bias to the dynamically allocated U16 index buffers. Then, from the CPU, we
  have the illusion of a single draw that we can Stream Link to with Return.
  This requires packing hardware control words from the tessellator kernel.
  Fortunately, we have GenXML available so we just use agx_pack like we would in
  the driver.
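
The count, prefix sum, then write flow described for the prefix sum variants
can be sketched on the CPU in plain C. This is only an illustration under
assumptions, not the code in tessellator.cl: the function names
(exclusive_prefix_sum, write_patch_indices) are hypothetical, the per-patch
counts are taken as given, and the "tessellation" just writes the patch id as
a placeholder index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: exclusive prefix sum over per-patch index counts.
 * offsets[i] becomes counts[0] + ... + counts[i - 1], i.e. where patch i
 * starts in the shared index buffer. Returns the total index count.
 */
static uint32_t
exclusive_prefix_sum(const uint32_t *counts, uint32_t *offsets, size_t n)
{
   uint32_t running = 0;

   for (size_t i = 0; i < n; ++i) {
      offsets[i] = running;
      running += counts[i];
   }

   return running;
}

/*
 * Hypothetical stand-in for the tessellate pass: patch p writes counts[p]
 * entries starting at offsets[p]. Real code would emit actual indices; here
 * each entry is just tagged with the patch id.
 */
static void
write_patch_indices(uint32_t *index_buffer, size_t capacity,
                    const uint32_t *counts, const uint32_t *offsets, size_t n)
{
   for (size_t p = 0; p < n; ++p) {
      /* Each patch must stay inside the single large buffer. */
      assert((size_t)offsets[p] + counts[p] <= capacity);

      for (uint32_t i = 0; i < counts[p]; ++i)
         index_buffer[offsets[p] + i] = (uint32_t)p;
   }
}
```

Because every patch gets a disjoint range, the write pass needs no
synchronization between patches, and the returned total is the index count for
the single indirect draw.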

Along the way, we pick up indirect tess support (this follows on naturally),
which gets rid of the other bit of tessellation-related cheating. Implementing
this requires reworking our internal agx_launch data structures, but that has
the nice side effect of speeding up GS invocations too (by fixing the workgroup
size).

Don't get me wrong. tessellator.cl is the single most unhinged file of my
career, featuring GenXML-based pack macros fed by dynamic memory allocation fed
by the inscrutable tessellation algorithm.

But it works *really* well.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
shaders asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
tests asahi: add missing tib alignment check 2024-01-10 08:44:38 -04:00
agx_bg_eot.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_bg_eot.h asahi: rename meta -> bg/eot 2024-05-16 13:25:56 -04:00
agx_bo.c asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_bo.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_border.c asahi: Implement custom border colours 2023-02-04 10:37:02 -05:00
agx_device.c asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_device.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_device_virtio.c asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_device_virtio.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
agx_formats.c asahi: add missing rgba4 format 2024-05-14 04:57:26 +00:00
agx_formats.h asahi: clean up format table renderability 2024-02-14 21:02:32 +00:00
agx_helpers.h asahi: fix vbo clamp with stride=0 2024-06-16 12:15:22 -04:00
agx_iokit.h asahi/lib: use #pragma once 2024-02-14 21:02:32 +00:00
agx_linker.c asahi: don't ralloc in agx_fast_link 2024-05-14 04:57:27 +00:00
agx_linker.h asahi: implement rba2 semantics for vbo 2024-06-16 12:15:22 -04:00
agx_nir_format_helpers.h asahi/lib: use #pragma once 2024-02-14 21:02:32 +00:00
agx_nir_lower_alpha.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_nir_lower_gs.c asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
agx_nir_lower_gs.h asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
agx_nir_lower_ia.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_nir_lower_msaa.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
agx_nir_lower_sample_intrinsics.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
agx_nir_lower_tess.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
agx_nir_lower_texture.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_nir_lower_tilebuffer.c agx: handle discard with force early tests 2024-06-07 16:57:03 +00:00
agx_nir_lower_uvs.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_nir_lower_vbo.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
agx_nir_lower_vbo.h asahi: implement rba2 semantics for vbo 2024-06-16 12:15:22 -04:00
agx_nir_passes.h asahi: add AGX_TEXTURE_FLAG_CLAMP_TO_0 flag 2024-06-07 16:57:03 +00:00
agx_nir_prolog_epilog.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
agx_ppp.h asahi: split frag shader words 2024-05-16 13:25:56 -04:00
agx_scratch.c asahi: precompile helper program 2024-02-14 21:02:32 +00:00
agx_scratch.h asahi/lib: use #pragma once 2024-02-14 21:02:32 +00:00
agx_tilebuffer.c asahi: pack tilebuffer usc word ahead-of-time 2024-05-14 04:57:26 +00:00
agx_tilebuffer.h asahi: add flag controlling sample mask without MSAA 2024-06-07 16:57:03 +00:00
agx_usc.h asahi: don't allocate for USC words 2024-05-16 13:25:56 -04:00
agx_uvs.h asahi: extend varying linking for tri fan weirdness 2024-05-14 04:57:27 +00:00
asahi_proto.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
decode.c asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
decode.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
dyld_interpose.h asahi: Clang-format the subtree 2022-12-27 22:46:29 +00:00
meson.build asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
pool.c asahi: Convert to SPDX headers 2023-03-28 05:14:00 +00:00
pool.h asahi: split out genxml/ directory 2024-02-14 21:02:32 +00:00
unstable_asahi_drm.h asahi: add broken bits of unstable Linux UAPI 2024-06-14 15:44:30 +00:00
wrap.c asahi/decode: Decode multiple macOS commands 2023-12-09 10:56:17 -04:00