Commit graph

23 commits

Author SHA1 Message Date
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO lowering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
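A minimal sketch of the VPM output layout described in 5d578c27ce (one 32-bit general header, one 32-bit header slot per output vertex, then the vertex data). All names here are hypothetical and not taken from the driver. It also assumes that offsets are counted in 32-bit slots, that a header slot is reserved for every one of max_vertices, and that each vertex uses a fixed number of data slots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants: sizes in 32-bit VPM slots. */
enum {
        GS_GENERAL_HEADER_SLOTS = 1, /* one 32-bit general header */
        GS_VERTEX_HEADER_SLOTS = 1,  /* one 32-bit header per output vertex */
};

/* Slot holding the per-vertex header of output vertex `vertex`. */
static inline uint32_t
gs_vertex_header_slot(uint32_t vertex)
{
        return GS_GENERAL_HEADER_SLOTS + vertex * GS_VERTEX_HEADER_SLOTS;
}

/* First data slot of output vertex `vertex`.
 * Assumption: the data area starts after all max_vertices headers and
 * every vertex occupies slots_per_vertex slots.
 */
static inline uint32_t
gs_vertex_data_slot(uint32_t max_vertices, uint32_t slots_per_vertex,
                    uint32_t vertex)
{
        assert(vertex < max_vertices);
        return GS_GENERAL_HEADER_SLOTS +
               max_vertices * GS_VERTEX_HEADER_SLOTS +
               vertex * slots_per_vertex;
}
```

With 3 output vertices of 4 slots each, the vertex headers sit in slots 1 to 3 and the data for vertex 0 starts at slot 4 under these assumptions.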
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
e7e501efce v3d: rename vertex shader key (num)_fs_inputs fields
Until now this made sense because we always paired vertex shaders
with fragment shaders, but as soon as we implement geometry and
tessellation shaders that will no longer be the case, so rename
this to (num_)used_outputs.

v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-31 08:46:35 +00:00
Andreas Bergmeier
f92290a8d9 broadcom: Move v3d_get_device_info to common
Once it lives in common, the implementation can also be used for Vulkan.
2019-07-17 20:02:34 +00:00
Eric Anholt
8a2d91e124 v3d: Detect the correct number of QPUs and use it to fix the spill size.
We were missing a * 4 even if the particular hardware matched our
assumption.
2019-04-12 15:59:31 -07:00
Eric Anholt
89b7df552b v3d: Add and use a define for the number of channels in a QPU invocation.
A shader invocation always executes 16 channels together, so we often end
up multiplying things by this magic 16 number.  Give it a name.
2019-04-12 15:58:28 -07:00
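As a rough illustration of 89b7df552b, the sketch below names the channel count instead of scattering a literal 16 through size calculations. The macro and function names are hypothetical, not necessarily the ones the commit introduced, and the only hardware fact assumed is the one stated in the message: an invocation runs 16 channels together.

```c
#include <stdint.h>

/* Hypothetical name for the number of channels in a QPU invocation. */
#define QPU_CHANNELS 16

/* Bytes needed to hold `dwords_per_channel` 32-bit values for every
 * channel of one shader invocation.
 */
static inline uint32_t
per_invocation_bytes(uint32_t dwords_per_channel)
{
        return dwords_per_channel * (uint32_t)sizeof(uint32_t) * QPU_CHANNELS;
}
```

One 32-bit value per channel then takes 64 bytes per invocation, and the multiplication reads as a channel count rather than a magic number.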
Eric Anholt
62360e92ec v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but I'm seeing GPU hangs in
the CTS at 8k and 16k.  4k at least lets us get one 4k display working.

Cc: mesa-stable@lists.freedesktop.org
2019-04-04 17:30:35 -07:00
Emil Velikov
385843ac3c vc4: Declare the last cpu pointer as being modified in NEON asm.
The earlier commit addressed only 7 of the 8 instances.

v2: Rebase patch back to master (by anholt)

Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com>
Cc: Eric Anholt <eric@anholt.net>
Fixes: 300d3ae8b1 ("vc4: Declare the cpu pointers as being modified in NEON asm.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-29 16:00:25 -08:00
Carsten Haitzler (Rasterman)
300d3ae8b1 vc4: Declare the cpu pointers as being modified in NEON asm.
Otherwise, the compiler is free to reuse the register containing the input
for another call and assume that the value hasn't been modified.  Fixes
crashes on texture upload/download with current gcc.

We now have to have a temporary for the cpu2 value, since outputs must be
lvalues.

(commit message by anholt)

Fixes: 4d30024238 ("vc4: Use NEON to speed up utile loads on Pi2.")
2019-01-28 16:45:45 -08:00
Carsten Haitzler (Rasterman)
522f688471 vc4: Use named parameters for the NEON inline asm.
This makes the asm code more intelligible and clarifies the functional
change in the next commit.

(commit message and commit squashing by anholt)
2019-01-28 16:40:46 -08:00
Eric Anholt
060575bea8 v3d: Drop maximum number of texture units down to 16.
This is the GLES 3.2 minmax, and also what the closed source driver does.
Avoids hitting OOMs in the CTS's
dEQP-GLES3.functional.texture.units.all_units.only_cube.1.
2019-01-27 08:30:03 -08:00
Eric Anholt
3e743d8cd8 v3d: Avoid duplicating limits defines between gallium and v3d core.
We don't want to pull the compiler into every include in the gallium
driver, so just make a new little header to store the limits.
2019-01-27 08:30:03 -08:00
Eric Anholt
87b251a940 v3d: Add a "precompile" debug flag for shader-db.
I've been using my apitrace-based shader-db so far, but it's slow
(apitrace decompression), intrusive (apitrace windows spamming the
screen), and doesn't have much coverage.  The original shader-db provides
a lot more coverage and compiles faster, at the expense of not having the
actual runtime variant key.  As v3d has a lot less runtime variation than
vc4 did, this tradeoff makes more sense.
2018-12-29 13:52:09 -08:00
Eric Anholt
7c56b7a6ea v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.
2018-12-19 10:27:26 -08:00
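The non-NEON path mentioned in 7c56b7a6ea can be pictured as a plain line-by-line copy. The sketch below is not the driver's code: the function name is hypothetical, and it assumes a utile is 64 bytes stored contiguously on the GPU side, so a stride of 32 gives two lines per utile.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: one utile is 64 contiguous bytes in GPU memory. */
#define UTILE_BYTES 64

/* Copy one utile from GPU layout into a CPU image, one line at a time.
 * gpu_stride is the byte length of a utile line; cpu_stride is the
 * distance in bytes between lines in the destination image.
 */
static inline void
utile_load_generic(void *cpu, uint32_t cpu_stride,
                   const void *gpu, uint32_t gpu_stride)
{
        assert(gpu_stride != 0 && UTILE_BYTES % gpu_stride == 0);

        uint8_t *dst = cpu;
        const uint8_t *src = gpu;

        for (uint32_t offset = 0; offset < UTILE_BYTES; offset += gpu_stride) {
                memcpy(dst, src + offset, gpu_stride);
                dst += cpu_stride;
        }
}
```

Because the loop only depends on gpu_stride dividing the utile size, the same path covers strides of 8, 16 and 32 without any stride-specific assembly.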
Eric Anholt
f6a0f4f41e vc4: Move the utile load/store functions to a header for reuse by v3d.
These implementations of whole-utile load/stores would be the same for
v3d, though the layouts of blocks of utiles has changed.
2018-12-19 10:27:26 -08:00
Eric Anholt
1561e4984e v3d: Emit the VCM_CACHE_SIZE packet.
This is needed to ensure that we don't get blocked waiting for VPM space
with bin/render overlapping.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
103f21b13d v3d: Add a separate flag for CLIF ABI output versus human-readable CLs.
A few of the upcoming changes would make the V3D_DEBUG=cl output less
readable, so let's make proper CLIF file production be under a separate
V3D_DEBUG=clif flag.
2018-07-30 14:29:01 -07:00
Eric Anholt
07b243674f v3d: Add missing always_flush debug flag.
The #define existed and was checked in the driver.
2018-06-19 09:42:20 -07:00
jenny.q.cao
ff7521c9ba android: change include "cutils/log.h" to "log/log.h" on Android API >=26
Including "cutils/log.h" produces a compile warning on Android 8 (API version 26):
warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h" [-W#warnings]
Include "log/log.h" instead on Android 8 or later to avoid this warning.

Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-14 08:08:31 +03:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2.
2018-01-27 18:57:58 +11:00
Eric Anholt
fb4face86a broadcom/vc5: Introduce v3dx_macros.h and v3dx_pack.h headers.
This will be used by vc5 for prefixing functions and including the pack
header in v3d-version-dependent code, following the model of anv.
2018-01-12 21:51:40 -08:00
Eric Anholt
59257c35eb broadcom: Introduce a v3d_debug.h header for vc5 and broadcom Vulkan.
Unlike vc4, where the compiler and gallium driver live together, for vc5
the compiler will live up in the shared broadcom directory, and need
access to the debug flags.  Define a set of debug flags and helpers there,
so it can be shared between compiler, vc5, and vulkan.
2017-10-10 11:42:04 -07:00
Eric Anholt
427bbbb99c broadcom: Introduce a header for talking about chip revisions.
This will be used by the VC5 driver and various shared VC4/VC5 tooling,
like the XML decoder.
2017-07-13 11:28:28 -07:00