mesa/src/freedreno/computerator
Eric Anholt 1f44053301 freedreno+turnip: Upload large shader constants as a UBO.
Right now if the shader indirects on some large constant array, we see NIR
load_consts (usually from the const file) of its contents into general
registers, then indirection on the GPRs.  This often results in register
allocation failures, as it's easy to go beyond the ~256 dwords of
registers per invocation.

By moving the large constants to a UBO, we can load an arbitrary number of
them.  They also can be theoretically moved to the constant reg file (~2k
dwords), though you're unlikely to hit this path without an indirect load
on your large constant, and we don't yet let UBO indirect loads get moved
to constant regs.

This possibly won't work out right if we have 16-bit load_constants, but
without other MRs in flight we won't see 16-bit temps to be lowered to
this.

This allows 2 kerbal-space-program shaders to compile that previously
would fail, and fixes the new dEQP-VK and -GLES2 tests I wrote that
dynamically index a 40-element temporary array of float/vec2/vec3/vec4
with constant element initializers.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2789
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5810>
2020-11-16 13:55:41 -08:00
examples freedreno/computer: add script to test widening/narrowing 2020-04-28 20:06:49 +00:00
a6xx.c freedreno/ir3: Simpify the immediates from an array of vec4 to array of dwords. 2020-08-05 23:06:55 +00:00
ir3_asm.c freedreno+turnip: Upload large shader constants as a UBO. 2020-11-16 13:55:41 -08:00
ir3_asm.h freedreno/ir3: Move ir3 assembler to backend compiler 2020-04-25 00:03:43 +00:00
main.c freedreno/computerator: Use a render node 2020-09-02 14:53:44 +00:00
main.h freedreno/registers: split header build into subdirs 2020-08-03 19:46:49 +00:00
meson.build freedreno/ir3: refactor out helper to compile shader from asm 2020-06-19 13:16:57 +00:00
README.rst freedreno/computerator: add computerator 2020-02-24 21:31:53 +00:00

Overview
========

Computerator is a tool for launching compute shaders written in assembly.
The main purpose is to have an easy way to experiment with instructions
without dealing with the entire compiler stack (which makes it difficult
to control the order of instructions, the registers chosen, etc.).
Compute shaders were chosen simply because they require far less state
setup.

Headers
-------

The shader assembly can be prefixed with headers to control state setup:

* ``@localsize X, Y, Z`` - configures local workgroup size
* ``@buf SZ`` - configures an SSBO of the specified size (in dwords).
  The order of the ``@buf`` headers determines the index, i.e. the first
  ``@buf`` header is ``g[0]``, the second ``g[1]``, and so on
* ``@const(cN.c)`` - configures a const vec4 starting at the specified
  const register, e.g. ``@const(c1.x) 1.0, 2.0, 3.0, 4.0`` will populate
  ``c1.xyzw`` with ``vec4(1.0, 2.0, 3.0, 4.0)``
* ``@invocationid(rN.c)`` - populates a vec3 starting at the specified
  register with the local invocation-id
* ``@wgid(rN.c)`` - populates a vec3 starting at the specified register
  with the workgroup-id (must be a high-reg, i.e. ``r48.x`` and above)
* ``@numwg(cN.c)`` - populates a vec3 starting at the specified const
  register with the number of workgroups
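
The ``@buf`` ordering rule can be illustrated with a small Python sketch.
This is not how computerator parses its input; ``buf_table`` is a
hypothetical helper that only mirrors the rule described above (the Nth
``@buf`` header becomes ``g[N]``) and converts the dword size to bytes,
assuming 4 bytes per dword and ``;`` comments as in the example shader.

```python
def buf_table(asm_text):
    """Return (index, size_dwords, size_bytes) for each @buf header, in order.

    Hypothetical helper for illustration only; not part of computerator.
    """
    table = []
    for line in asm_text.splitlines():
        # Drop trailing ';' comments, as used in the README example.
        tokens = line.split(";", 1)[0].split()
        if len(tokens) >= 2 and tokens[0] == "@buf":
            dwords = int(tokens[1])
            # The index is the position among the @buf headers seen so far.
            table.append((len(table), dwords, dwords * 4))
    return table


shader = """\
@localsize 32, 1, 1
@buf 32  ; g[0]
@buf 64  ; g[1]
"""

for index, dwords, nbytes in buf_table(shader):
    print(f"g[{index}]: {dwords} dwords ({nbytes} bytes)")
```

With the two headers in the sample, the first buffer is reported as
``g[0]`` (32 dwords) and the second as ``g[1]`` (64 dwords).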

Example
-------

```
@localsize 32, 1, 1
@buf 32  ; g[0]
@const(c0.x)  0.0, 0.0, 0.0, 0.0
@const(c1.x)  1.0, 2.0, 3.0, 4.0
@wgid(r48.x)        ; r48.xyz
@invocationid(r0.x) ; r0.xyz
@numwg(c2.x)        ; c2.xyz
mov.u32u32 r0.y, r0.x
(rpt5)nop
stib.untyped.1d.u32.1 g[0] + r0.y, r0.x
end
nop
```

Usage
-----

```
cat myshader.asm | ./computerator --disasm --groups=4,4,4
```
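
As a rough sanity check on dispatch size, the following Python sketch
computes how many shader invocations a run would launch. It assumes that
``--groups=X,Y,Z`` gives the number of workgroups per dimension, as in a
conventional compute dispatch; the text above does not spell this out, so
treat it as an assumption. ``total_invocations`` is a hypothetical helper,
not part of the tool.

```python
from math import prod


def total_invocations(localsize, groups):
    """Invocations per workgroup times number of workgroups.

    Assumes `groups` is the workgroup count per dimension (an assumption
    about --groups, not something the README states).
    """
    return prod(localsize) * prod(groups)


# @localsize 32, 1, 1 combined with --groups=4,4,4
print(total_invocations((32, 1, 1), (4, 4, 4)))  # 32 * 64 = 2048
```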