mesa/src/asahi/lib/shaders
Alyssa Rosenzweig d26ae4f455 asahi,libagx: tessellate on device
Add OpenCL kernels implementing the tessellation algorithm on device. This is an
OpenCL C port of the D3D11 reference tessellator, originally written by
Microsoft in C++. There are significant differences compared to the CPU-based
reference implementation:

* significant simplifications and cleanup. The reference code did a lot of
  things in weird ways that would be inefficient on the GPU. I did a *lot* of
  work here to get good AGX assembly generated for the tessellation kernels ...
  the first attempts were quite bad! Notably, everything is carefully written to
  ensure that all private memory access is optimized out in NIR; the resulting
  kernels do not use scratch and do not spill on G13.

* prefix sum variants. To implement geom+tess efficiently, we need to first
  calculate the count of indices generated by the tessellator, then prefix sum
  that, then tessellate using the prefix sum results writing into 1 large index
  buffer for a single indirect draw. This isn't too bad, we already have most of
  the logic and the guts of the prefix sum kernel is shared with geometry
  shaders.

* VDM generation variant. To implement tess alone, it's fastest to generate a
  hardware Index List word for each patch, adding an appropriate 32-bit index
  bias to the dynamically allocated U16 index buffers. Then, from the CPU, we
  have the illusion of a single draw, which we reach with a Stream Link with
  Return. This requires packing hardware control words from the tessellator
  kernel.
  Fortunately, we have GenXML available so we just use agx_pack like we would in
  the driver.
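
The count, prefix sum, then write flow described for the prefix sum variants
can be sketched on the host in plain C. This is only an illustration of the
general technique, not the code in tessellator.cl: the function names, the
serial loops, and the placeholder index values are all assumptions made for
the example, and the real kernels run in parallel on the GPU.

```c
#include <assert.h>
#include <stdint.h>

/* Exclusive prefix sum over per-patch index counts (hypothetical helper).
 * offsets[i] receives the sum of counts[0..i-1]; the return value is the
 * total number of indices, i.e. the size needed for the shared buffer. */
static uint32_t
exclusive_prefix_sum(const uint32_t *counts, uint32_t *offsets, uint32_t n)
{
   uint32_t total = 0;

   for (uint32_t i = 0; i < n; ++i) {
      offsets[i] = total;
      total += counts[i];
   }

   return total;
}

/* Stand-in for the tessellation pass (hypothetical): each patch writes its
 * indices at its own offset in the single large index buffer. The values
 * written are placeholders that encode (patch, local index) so the layout
 * is easy to inspect; a real tessellator would emit vertex indices. */
static void
write_patch_indices(uint32_t *index_buffer, uint32_t capacity,
                    uint32_t offset, uint32_t count, uint32_t patch)
{
   /* The prefix sum guarantees patches never overlap or overflow. */
   assert(offset + count <= capacity);

   for (uint32_t i = 0; i < count; ++i)
      index_buffer[offset + i] = (patch << 16) | i;
}
```

With per-patch counts of {3, 0, 6, 3}, the offsets come out as {0, 3, 3, 9}
and the total is 12, so a single draw can consume one contiguous 12-entry
index buffer regardless of how many patches contributed to it.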

Along the way, we pick up indirect tess support (this follows on naturally),
which gets rid of the other bit of tessellation-related cheating. Implementing
this requires reworking our internal agx_launch data structures, but that has
the nice side effect of speeding up GS invocations too (by fixing the workgroup
size).

Don't get me wrong. tessellator.cl is the single most unhinged file of my
career, featuring GenXML-based pack macros fed by dynamic memory allocation fed
by the inscrutable tessellation algorithm.

But it works *really* well.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30051>
2024-07-15 20:09:00 +00:00
..
geometry.cl asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
geometry.h asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
helper.cl asahi: Implement scratch allocation 2024-02-14 21:02:29 +00:00
helper.h asahi: scratch: Add feature to debug core IDs 2024-02-14 21:02:30 +00:00
libagx.h libagx: fix uint8_t definition 2024-06-16 10:10:33 -04:00
query.cl libagx: generalize query copies 2024-06-16 12:15:22 -04:00
query.h libagx: generalize query copies 2024-06-16 12:15:22 -04:00
tessellation.cl asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
tessellator.cl asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
tessellator.h asahi,libagx: tessellate on device 2024-07-15 20:09:00 +00:00
texture.cl asahi: simplify image atomic lowering 2024-05-14 04:57:26 +00:00