Commit graph

84 commits

Author SHA1 Message Date
Danylo Piliaiev
7e10a175c7 freedreno/computerator: Fix remaining issues with A7XX
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
2023-09-05 16:19:29 +00:00
Danylo Piliaiev
a70e04b0c0 freedreno: Add a list of raw magic regs
The set of magic regs is different between generations and even
sub-gens. Adding a new one and/or emitting one on specific generation
takes much more code than necessary. Doing this in a single place is
much nicer.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23217>
2023-09-05 16:19:29 +00:00
Emma Anholt
f6ea7c3a99 freedreno/devices: Move fibers_per_sp to the common info struct.
We'll need it for pvt mem on other GPUs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
2023-08-08 18:51:58 +00:00
Emma Anholt
e3274e9e1b freedreno/ir3: Move pvtmem per-fiber size alignment to the compiler.
Instead of having tu and each fd backend do it.  This will help me make
some shared code on freedreno for pre-6xx pvtmem support.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24358>
2023-08-08 18:51:58 +00:00
Yonggang Luo
3b731d92d9 freedreno: decouple compiler and vulkan driver from gallium
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23438>
2023-08-03 07:29:36 +00:00
Danylo Piliaiev
271ba74766 freedreno/regs: Properly document a7xx CP_EVENT_WRITE, CP_WAIT_TIMESTAMP
Event write is changes so much in a7xx that it makes sense to
create a new event CP_EVENT_WRITE7.

All credits to Connor Abbott for finding out what different flags
in these commands are doing.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23881>
2023-07-12 13:33:28 +00:00
Danylo Piliaiev
1dc044764d freedreno/regs: Clarify polling on a7xx for CP_WAIT_REG_MEM/CP_COND_WRITE5
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23881>
2023-07-12 13:33:28 +00:00
Danylo Piliaiev
9f43bc73da freedreno/computerator: Add support for a7xx
Not everything works correctly, e.g. stib seems flakey while stg
seems alright.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22148>
2023-03-30 23:40:48 +00:00
Danylo Piliaiev
f32eb48095 freedreno/computerator: Templatize a6xx backend
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22148>
2023-03-30 23:40:48 +00:00
Danylo Piliaiev
48ad485d1c freedreno/computerator: Convert to C++
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22148>
2023-03-30 23:40:48 +00:00
Danylo Piliaiev
6826a0ab14 freedreno/computerator: C++ proofing
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22148>
2023-03-30 23:40:48 +00:00
Rob Clark
48b5164356 freedreno/drm: Return fence from submit flush
This moves away from embedding the submit fence inside the pipe fence,
which lets us start refcnt'ing the fence.  This will enable several
cleanups and improvements:

1. Get rid of fd_bo_fence, and just have fd_bo hold pending fd_fence
   refs instead, which will be needed for cpu_prep implementation of
   sub-allocated buffers.
2. For merged submits, we can just return a new reference to an
   existing fence.

Note that this temporarily defeats submit-merging, which will be
fixed (and improved) in a following commit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20263>
2022-12-17 19:14:12 +00:00
Rob Clark
c1a621813b freedreno/drm: Combine fd_fence and fd_submit_fence
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20263>
2022-12-17 19:14:12 +00:00
Connor Abbott
3ca90405e8 freedreno/a6xx: Document buffer-specific tex const fields
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20105>
2022-12-14 16:19:47 +00:00
Marek Olšák
c9ca8abe4f Change all debug_assert calls to assert
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17403>
2022-07-10 00:50:35 +00:00
Danylo Piliaiev
5d377f435b freedreno/a6xx: Add EARLYPREAMBLE flag to all a6xx_sp_xs_ctrl_reg0
Each shader stage has its own "early preamble" flag.

Early preamble is likely an optimization to hide some of latency
when loading UBOs into consts in the preamble.

Early preamble has the following limitations:
- Only shared, a1, and consts regs could be used
  (accessing other regs would result in GPU fault);
- No cat5/cat6, only stc/ldc variants are working;
- Values writen to shared regs are not accessible by the rest
  of the shader;
- Instructions before shps are also considered to be a part of
  early preamble.

Note, for all shaders from d3d11 games blob produced preambles
compatible with early preamble mode.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15901>
2022-05-18 11:17:47 +00:00
Rob Clark
9ea36968d3 freedreno/drm: Add fd_device_open() helper
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
2022-03-25 02:03:30 +00:00
Connor Abbott
221a912b8c ir3: Refactor ir3_compiler_create() to take an options struct
This will let us add more options without creating too much churn.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13148>
2022-03-17 12:15:45 +00:00
Connor Abbott
00be8c4619 freedreno: Replace A6XX_IBO with A6XX_TEX_CONST
Since these were reverse-engineered, it's become clear that IBO
descriptors are just a subset of texture descriptors, and bindless reads
of readonly images actually use isam on the IBO descriptor, further
confirming that the two are always compatible, even if not all of the
texture fields exist for IBOs. It's pointless to have a separate type
for IBOs, and just leads to things getting out-of-sync unnecessarily
which has already happened. Just remove it and use TEX_CONST insted.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15114>
2022-02-28 23:33:22 +00:00
Rob Clark
9766a5721d freedreno/computerator: Mark shader bo for dumping
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14231>
2021-12-20 19:47:35 +00:00
Danylo Piliaiev
e63ffc2f04 freedreno,tu: Limit the amount of instructions preloaded into icache
Inferring from blob's cmdstream the size of shader instruction
cache for:
- a630 is 64
- a650 is 128
- a660 is 128

On a650 and a660 gpu could hang if we exceed the limit. Though
it is not reproducible with computerator or a single amber
test. Also while blob limits the size to 128 - Turnip still
hangs with it but does not hang with the limit of 127.

On a630 there seem to be no hang when limit is exceeded.

Fixes the hang of compute shader in Alien Isolation on a650/a660.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14044>
2021-12-07 13:48:35 +00:00
Ilia Mirkin
a95a9f0cc6 freedreno/a4xx: include guesses from a3xx for some of the constid's
The ones that are untested are left as comments. The ones that rename
values were tested manually.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13806>
2021-11-16 05:08:26 +00:00
Danylo Piliaiev
3afdc3ab2c freedreno/computerator: Support A660 gpu
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13640>
2021-11-03 16:32:19 +00:00
Rob Clark
5948ff4826 freedreno/computerator: Fix mergedregs
This was getting set *after* ir3_shader_assemble, which was too late.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13426>
2021-10-19 16:04:42 +00:00
Rob Clark
2a0a9b189a freedreno/computerator/a4xx: Fix enum mismatch warning
Fixes: fb5deb2b4a ("a4xx/computerator: add initial backend")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12923>
2021-09-18 20:24:49 +00:00
Ilia Mirkin
fb5deb2b4a a4xx/computerator: add initial backend
This backend provides very basic a4xx support. It's enough to run
kernels with explicit stg/etc ops, but not with stgb/ldgb type access.
There is no perfcounter support hooked up yet either.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12784>
2021-09-10 01:20:22 +00:00
Connor Abbott
1963a61faa freedreno/computerator: Add support for pvtmem
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11876>
2021-09-01 19:26:41 +00:00
Rob Clark
7806843866 freedreno/all: Introduce fd_dev_id
Move away from using gpu_id as the primary means to identify which
adreno we are running on, as future GPUs (starting with 7c3) stop
providing a gpu_id as a new naming scheme is introduced.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
2021-08-06 18:51:50 +00:00
Rob Clark
4b2afd11cc freedreno/computerator: Add script to probe FLUT values
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8705>
2021-07-13 14:40:30 +00:00
Connor Abbott
56dc84b95c freedreno/computerator: Fix local_size typo
Fixes: cbc68c79a5 ("freedreno: Add local_size to ir3_shader_variant")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11622>
2021-06-28 16:06:23 +00:00
Danylo Piliaiev
fdc0f489e0 ir3: add ldg.a,stg.a which allow complex in-place offset calculation
The full form for ldg.a/stg.a offset is:
 g[reg_address + reg_offset << (imm_shift + 2) + imm_offset << 2]

where imm_shift is in [0, 3] and imm_offset is in [0, 3]

a6xx blob was found to produce a bit simplier offset calculations
for TES/TCS shaders in GTA V:

 [c002000a_03c14215] ldg.a.f32 r2.z, g[r1.y+((r2.z+1)<<2)], 3;
 [c0020004_01c14609] ldg.a.f32 r1.x, g[r1.y+((r1.x+3)<<2)], 1;

Our new syntax:
 stg.a.u32 g[r2.x+(r1.x+1)<<2], r5.x, 1
 stg.a.u32 g[r2.x+r1.x<<4+3<<2], r5.x, 1
 ldg.a.f32 r1.w, g[r1.y+(r1.w+1)<<2], 3
 ldg.a.f32 r1.w, g[r1.y+r1.w<<5+2<<2], 3

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11431>
2021-06-25 15:39:51 +00:00
Danylo Piliaiev
ba1c989348 freedreno/computerator: pass iova of buffer to const register
The syntax is:
  @buf 32 (c2.x)
The "(c2.x)" is optional.

This makes possible to test stg, ldg, and global atomics.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11431>
2021-06-25 15:39:51 +00:00
Caio Marcelo de Oliveira Filho
c8a7bd0dc8 nir: Rename WORK_GROUP (and similar) to WORKGROUP
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
2021-06-07 22:34:42 +00:00
Rob Clark
b447db41fc freedreno/tools: Fix async flush vs fdperf/computerator
They need to wait on the ready fence to ensure the submit has been
flushed to the kernel.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10626>
2021-05-05 20:32:31 +00:00
Rob Clark
aafcd8aacb freedreno: Re-work fd_submit fence interface
Move everything into a struct assocated with the pipe_fence_handle, so
that the drm layer can fill in the seqn/fd fences directly.

This will give us a comvenient place to insert a util_queue_fence in the
next commit.

While we're at it, extract the uint32_t fence (previously called
'timestamp' in place, a kgsl legacy) into a struct that encapsulates
both the kernel fence and the userspace fence.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark
8ab227c373 freedreno/drm: Cleanup bo cpu_prep flags
Also add some STATIC_ASSERT()

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Rob Clark
7f0abd9048 freedreno/drm: Cleanup bo allocation flags
Most of them were actually unused.  The memory type (KMEM vs SMI) only
applied to very old a2xx era devices that had a small/fast stacked
memory (SMI) vs normal memory (KMEM).  And the cache flags are ignored
(ie. everything is writecombine), but we can add new cache flags later
when they actually do something.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10444>
2021-04-28 15:36:42 +00:00
Danylo Piliaiev
9402d5a6b5 ir3: make possible to specify branchstack up to 64
On a6xx/a5xx there is such dependency between branchstack bitfield
and the amount of nested ifs, which could be seen with blob:

IFs   BRANCHSTACK
0	0
1	1
2	2
3	2
4	3
5	3
6	4
...
59	30
60	31
61	31
62	32
63	32
64	32

Remove open-coded branchstack for a5xx compute along the way.

Fixes tests:
 dEQP-VK.spirv_assembly.instruction.compute.float16.opvectorshuffle.344
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_vert
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.444_geom
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.244_tessc
 dEQP-VK.spirv_assembly.instruction.graphics.float16.opvectorshuffle.344_frag

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9859>
2021-04-21 11:57:07 +00:00
Rob Clark
3894bc9664 freedreno/computerator: Re-indent
clang-format -fallback-style=none --style=file -i src/freedreno/computerator/*.[ch]

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10293>
2021-04-17 15:38:56 +00:00
Connor Abbott
c68ea960a7 ir3, tu: Add compiler flag for robust UBO behavior
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
2021-04-15 16:05:11 +02:00
Danylo Piliaiev
64aaa4afc3 turnip: enable infinities for f16 math and document the register
When float16 is enabled this will allow to pass a number of
float16 tests.

When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set - all operations which
generate +-infinity generate +-MAX_HALF_FLOAT.

Fixes some tests from:
 dEQP-VK.spirv_assembly.instruction.*.float16.*
 dEQP-VK.spirv_assembly.instruction.*.float_controls.fp16.*

E.g.:
 dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert
 dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage
 dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
2021-04-01 17:51:07 +00:00
Connor Abbott
d8a2abe348 freedreno/computerator: Add script for finding reg file size
This helps with finding the various parameters introduced in the last
commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott
d274649799 freedreno/computerator: Use threadsize calculated by ir3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott
cbc68c79a5 freedreno: Add local_size to ir3_shader_variant
We want to use the local_size when available to calculate the threadsize
in ir3, and we need it to work with e.g. computerator where we don't
have a nir shader. Add a local_size field and use that in computerator
instead of of a separate structure that's inaccessable to core ir3.

Also set a dummy local_size in the tests to avoid a divide-by-zero.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9498>
2021-03-22 18:03:16 +00:00
Connor Abbott
ee1f140fd9 freedreno/a6xx: Cleanup SP_XS_CTRL_REG0 definitions
The registers were actually different per-stage even though we used the
same type, which resulted in a bunch of incorrectly programmed fields
and confusion. Move the stage-specific values to the registers
themselves, which makes things much less confusing and makes it possible
to set "mergedregs" correctly.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>
2021-03-11 20:58:39 +00:00
Connor Abbott
1d8bf2d0bf freedreno/computerator: Fix thrsz type
And use it for the other thread size field, too

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>
2021-03-11 20:58:39 +00:00
Connor Abbott
7b7532b806 freedreno/computerator: Add branching example
Mainly to be able to test label resolution without having to replace a
shader.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00
Connor Abbott
534658f79b freedreno/computerator: Fix example assembly
Use the new bindless cat6 syntax for a6xx.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>
2021-03-10 16:23:04 +00:00
Jonathan Marek
ec54166a2b freedreno/a6xx: set SP_PERFCTR_ENABLE in computerator
Set this register to have properly working SP perfcntrs in computerator.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8423>
2021-02-19 04:04:03 +00:00
Jonathan Marek
b94c652afe freedreno/a6xx: always use reg64 for address registers (no LO/HI)
Reduce noise in a6xx.xml by removing LO/HI versions of address registers.

Also fix type="address" registers in register packing (use bit size instead
of checking for "waddress" to use qword)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8423>
2021-02-19 04:04:02 +00:00