mesa/src/freedreno/computerator
Danylo Piliaiev 5d377f435b freedreno/a6xx: Add EARLYPREAMBLE flag to all a6xx_sp_xs_ctrl_reg0
Each shader stage has its own "early preamble" flag.

Early preamble is likely an optimization to hide some of latency
when loading UBOs into consts in the preamble.

Early preamble has the following limitations:
- Only shared, a1, and consts regs could be used
  (accessing other regs would result in GPU fault);
- No cat5/cat6, only stc/ldc variants are working;
- Values writen to shared regs are not accessible by the rest
  of the shader;
- Instructions before shps are also considered to be a part of
  early preamble.

Note, for all shaders from d3d11 games blob produced preambles
compatible with early preamble mode.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15901>
2022-05-18 11:17:47 +00:00
..
examples freedreno/a6xx: Add EARLYPREAMBLE flag to all a6xx_sp_xs_ctrl_reg0 2022-05-18 11:17:47 +00:00
a4xx.c ir3: Refactor ir3_compiler_create() to take an options struct 2022-03-17 12:15:45 +00:00
a6xx.c freedreno/a6xx: Add EARLYPREAMBLE flag to all a6xx_sp_xs_ctrl_reg0 2022-05-18 11:17:47 +00:00
ir3_asm.c freedreno/computerator: Mark shader bo for dumping 2021-12-20 19:47:35 +00:00
ir3_asm.h freedreno/computerator: Re-indent 2021-04-17 15:38:56 +00:00
main.c freedreno/drm: Add fd_device_open() helper 2022-03-25 02:03:30 +00:00
main.h a4xx/computerator: add initial backend 2021-09-10 01:20:22 +00:00
meson.build a4xx/computerator: add initial backend 2021-09-10 01:20:22 +00:00
README.rst freedreno/computerator: pass iova of buffer to const register 2021-06-25 15:39:51 +00:00

Overview
========

Computerator is a tool to launch compute shaders, written in assembly.
The main purpose is to have an easy way to experiment with instructions
without dealing with the entire compiler stack (which makes controlling
the order of instructions, the registers chosen, etc, difficult).  The
choice of compute shaders is simply because there is far less state
setup required.

Headers
-------

The shader assembly can be prefixed with headers to control state setup:

* ``@localsize X, Y, Z`` - configures local workgroup size
* ``@buf SZ (cN.c)`` - configures an SSBO of the specified size (in dwords).
  The order of the ``@buf`` headers determines the index, ie the first
  ``@buf`` header is ``g[0]``, the second ``g[1]``, and so on.
  The iova of the buffer is written as a vec2 to ``cN.c``
* ``@const(cN.c)`` configures a const vec4 starting at specified
  const register, ie ``@const(c1.x) 1.0, 2.0, 3.0, 4.0`` will populate
  ``c1.xyzw`` with ``vec4(1.0, 2.0, 3.0, 4.0)``
* ``@invocationid(rN.c)`` will populate a vec3 starting at the specified
  register with the local invocation-id
* ``@wgid(rN.c)`` will populate a vec3 starting at the specified register
  with the workgroup-id (must be a high-reg, ie. ``r48.x`` and above)
* ``@numwg(cN.c)`` will populate a vec3 starting at the specified const
  register

Example
-------

```
@localsize 32, 1, 1
@buf 32  ; g[0]
@const(c0.x)  0.0, 0.0, 0.0, 0.0
@const(c1.x)  1.0, 2.0, 3.0, 4.0
@wgid(r48.x)        ; r48.xyz
@invocationid(r0.x) ; r0.xyz
@numwg(c2.x)        ; c2.xyz
mov.u32u32 r0.y, r0.x
(rpt5)nop
stib.untyped.1d.u32.1 g[0] + r0.y, r0.x
end
nop
```

Usage
-----

```
cat myshader.asm | ./computerator --disasm --groups=4,4,4
```