Commit graph

42 commits

Author SHA1 Message Date
Jose Maria Casanova Crespo
90f966e05f v3dv/v3d: Fix copyright holder to Raspberry Pi Ltd
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15057>
2022-02-18 11:50:07 +01:00
Iago Toral Quiroga
b9f9474577 v3dv: implement double-buffer mode
Double buffer mode splits the tile buffer size in half so we can
start processing the next tile while the current one is being
stored to memory. This mode is available only if MSAA is not enabled
and can, in theory, improve performance by reducing tile store
overhead, however, it comes at the cost of reducing the tile size,
which also causes some overhead of its own.

Testing shows that this helps some cases (i.e the Vulkan Quake
ports) but hurts others (i.e. Unreal Engine 4), so for the time
being we don't enable this by default but we allow to enable it
selectively by using V3D_DEBUG.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14551>
2022-01-14 10:57:26 +00:00
Alejandro Piñeiro
cbf0d83eac v3d,v3dv: move TFU register definition to a common header
We are using the same definitions for both OpenGL and Vulkan, so let's
move it to common.

As we are here we are also adding versioning on the TFU register
definition. Those are basically register bit places, so really likely
to change between versions.

Adding 33 as it is the first version they got defined.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13832>
2021-11-17 11:04:31 +01:00
Iago Toral Quiroga
f384c763fc v3d,v3dv: move tile size calculation to a common helper
We had this code replicated in 3 places across both drivers.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13790>
2021-11-15 11:40:39 +00:00
Alejandro Piñeiro
0e5601e6f3 broadcom/common: remove unused debug helper
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13071>
2021-09-28 22:30:29 +00:00
Juan A. Suarez Romero
9c158fcc70 broadcom: add cl_nobin debug option
Dumps the command list, excluding the binary resources.

v2 (Juan):
 - Make this option independent from `cl`

v3 (Iago):
 - Rename option name
 - Fix style issues
 - Do not print BO ranges

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12803>
2021-09-13 08:51:54 +00:00
Juan A. Suarez Romero
d220d8cb51 broadcom/compiler: add V3D_DEBUG_NO_LOOP_UNROLL debug option
Disables loop unrolling.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12803>
2021-09-13 08:51:54 +00:00
Iago Toral Quiroga
c964e5f099 v3d,v3dv: add options to force 32-bit or 16-bit TMU precision
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12303>
2021-08-11 05:57:10 +00:00
Juan A. Suarez Romero
82d84a35a1 v3d: handle debug options with debug_named_value
Switch from using debug_control structure to debug_named_value
structure.

The main nice feature is that it provides a "help" option, so using
"V3D_DEBUG=help" will print all the debug options with a brief
description.

Useful to avoid going to https://docs.mesa3d.org/envvars.html everytime
we need to know the available options.

v1:
 - Modify a couple of debug option documentation (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12036>
2021-07-23 15:25:48 +00:00
Iago Toral Quiroga
bf175fbb6a broadcom/util: don't use compute supergroup packing with subgroups
When using subgroups there are additional restrictions to consider,
so for now we keep it simple and disable supergroup packing in that
scenario.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11620>
2021-06-29 08:43:06 +02:00
Alejandro Piñeiro
d198e26a1e broadcom/common: move v3d_tiling to common
We initially just copied on v3dv, just in case we needed to modify
it. One year later the code is exactly the same, so let's move it to
common.

This fix an additional issue, as we were not using NEON when building
v3d_tiling.c for v3dv.

v2:
   * Add "#include util/u_box.h" at v3d_tiling.h, so we can't avoid
     the need to include it on other places. (Juan and Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11121>
2021-06-04 13:00:40 +02:00
Iago Toral Quiroga
f7ce44b6e5 v3dv: define V3D_MAX_BUFFER_RANGE
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10970>
2021-05-26 07:18:19 +00:00
Iago Toral Quiroga
3ce249e65e broadcom/common: move CSD supergroup sizing to a common helper
We want to use this in GL too.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>
2021-05-04 15:53:23 +00:00
Iago Toral Quiroga
e6f8202749 v3dv: serialize pipeline compilation when debugging shaders
It is possible to compile pipelines in multiple threads, but when we
are dumping debug information for shaders, we want all the outputs
serialized so we can make sense of it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8913>
2021-02-08 13:00:16 +00:00
Iago Toral Quiroga
44dcc4c24d v3d/common: use spaces instead of TABs
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8913>
2021-02-08 13:00:16 +00:00
Alejandro Piñeiro
ff02458aa8 v3d/limits: add line width and point size limits
They will be the same for the OpenGL and Vulkan driver, so let's put
it on the commit limits header.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Alejandro Piñeiro
60145629a2 v3dv: initial CreateGraphicsPipeline/DestroyPipeline implementation
The basic to get the spirv built to nir, including calling some common
nir passes. Pending deep review if all those are needed or if we miss
some, but for that it would be better to be able to run existing
tests.

Enough to get assembly generated for simple tests.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:26 +00:00
Alejandro Piñeiro
8de380d26a broadcom/compiler: add V3D_DEBUG_RA option
To ask to debug a registr allocation failure
(V3D_DEBUG_REGISTER_ALLOCATION seemed too long to me).

When a fallback register allocation algorithm was added, if the
register allocation fails, it only dumpg the current vir with the
register pressure info with the failed fallback. But if we want do
debug the problem, we would be interested on both.

Additionally, it was strange that we got the full vir dump with the
failure even if no debug option was set.

Additionally we add shaderdb like stats for those failures, to make
easier to compare one and the other.

v2: keep a small warning message in case both register allocation
    algorithms fails (Neil)

Reviewed-by: Neil Roberts <nroberts@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6999>
2020-10-07 20:21:17 +00:00
Alejandro Piñeiro
bcb8dd7432 broadcom/common: increase V3D_MAX_TEXTURE_SAMPLERS, add specific OpenGL limit
This is needed due Vulkan because by spec (31.1. Limit Requirements)
the minimum value for the following limits are the following ones:
  maxPerStageDescriptorSampledImages 16
  maxPerStageDescriptorStorageImages  4
  maxPerStageDescriptorInputAttachments 4

And we are using v3d textures for all of them, so current limit would
not be enough for some cases.

Note that as the current comment explains there is not exactly a HW
limit for it, so we could bump to 32 for example, but let's just be
conservative and ask the minimum required.

It is worth to note that we needed to maintain the same value for the
OpenGL case, as it gets a register allocation failure on some GL
cases. We tried to fix that with small changes on the nir scheduler,
but we found that it would require some non-trivial effort to get it
done (that eventually we would need to).

Fixes tests like:
dEQP-VK.binding_model.descriptorset_random.sets16.constant.ubolimitlow.sbolimitlow.imglimitlow.noiub.uab.comp.noia.0

v2: keep the previous limit for Opengl (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6999>
2020-10-07 20:21:17 +00:00
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO owering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
e7e501efce v3d: rename vertex shader key (num)_fs_inputs fields
Until now this made sense because we always paired vertex shaders
with fragment shaders, but as soon as we implement geometry and
tessellation shaders that will no longer be the case, so rename
this to (num_)used_outputs.

v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-10-31 08:46:35 +00:00
Andreas Bergmeier
f92290a8d9 broadcom: Move v3d_get_device_info to common
In common we can use implementation for Vulkan.
2019-07-17 20:02:34 +00:00
Eric Anholt
8a2d91e124 v3d: Detect the correct number of QPUs and use it to fix the spill size.
We were missing a * 4 even if the particular hardware matched our
assumption.
2019-04-12 15:59:31 -07:00
Eric Anholt
89b7df552b v3d: Add and use a define for the number of channels in a QPU invocation.
A shader invocation always executes 16 channels together, so we often end
up multiplying things by this magic 16 number.  Give it a name.
2019-04-12 15:58:28 -07:00
Eric Anholt
62360e92ec v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but it I'm seeing GPU hangs in
the CTS at 8k and 16k.  4k at least lets us get one 4k display working.

Cc: mesa-stable@lists.freedesktop.org
2019-04-04 17:30:35 -07:00
Emil Velikov
385843ac3c vc4: Declare the last cpu pointer as being modified in NEON asm.
Earlier commit addressed 7 of the 8 instances available.

v2: Rebase patch back to master (by anholt)

Cc: Carsten Haitzler (Rasterman) <raster@rasterman.com>
Cc: Eric Anholt <eric@anholt.net>
Fixes: 300d3ae8b1 ("vc4: Declare the cpu pointers as being modified in NEON asm.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-01-29 16:00:25 -08:00
Carsten Haitzler (Rasterman)
300d3ae8b1 vc4: Declare the cpu pointers as being modified in NEON asm.
Otherwise, the compiler is free to reuse the register containing the input
for another call and assume that the value hasn't been modified.  Fixes
crashes on texture upload/download with current gcc.

We now have to have a temporary for the cpu2 value, since outputs must be
lvalues.

(commit message by anholt)

Fixes: 4d30024238 ("vc4: Use NEON to speed up utile loads on Pi2.")
2019-01-28 16:45:45 -08:00
Carsten Haitzler (Rasterman)
522f688471 vc4: Use named parameters for the NEON inline asm.
This makes the asm code more intelligible and clarifies the functional
change in the next commit.

(commit message and commit squashing by anholt)
2019-01-28 16:40:46 -08:00
Eric Anholt
060575bea8 v3d: Drop maximum number of texture units down to 16.
This is the GLES 3.2 minmax, and also what the closed source driver does.
Avoids hitting OOMs in the CTS's
dEQP-GLES3.functional.texture.units.all_units.only_cube.1.
2019-01-27 08:30:03 -08:00
Eric Anholt
3e743d8cd8 v3d: Avoid duplicating limits defines between gallium and v3d core.
We don't want to pull the compiler into every include in the gallium
driver, so just make a new little header to store the limits.
2019-01-27 08:30:03 -08:00
Eric Anholt
87b251a940 v3d: Add a "precompile" debug flag for shader-db.
I've been using my apitrace-based shader-db so far, but it's slow
(apitrace decompression), intrusive (apitrace windows spamming the
screen), and doesn't have much coverage.  The original shader-db provides
a lot more coverage and compiles faster, at the expense of not having the
actual runtime variant key.  As v3d has a lot less runtime variation than
vc4 did, this tradeoff makes more sense.
2018-12-29 13:52:09 -08:00
Eric Anholt
7c56b7a6ea v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.
2018-12-19 10:27:26 -08:00
Eric Anholt
f6a0f4f41e vc4: Move the utile load/store functions to a header for reuse by v3d.
These implementations of whole-utile load/stores would be the same for
v3d, though the layouts of blocks of utiles has changed.
2018-12-19 10:27:26 -08:00
Eric Anholt
1561e4984e v3d: Emit the VCM_CACHE_SIZE packet.
This is needed to ensure that we don't get blocked waiting for VPM space
with bin/render overlapping.

Cc: "18.2" <mesa-stable@lists.freedesktop.org>
2018-08-06 13:03:23 -07:00
Eric Anholt
103f21b13d v3d: Add a separate flag for CLIF ABI output versus human-readable CLs.
A few of the upcoming changes would make the V3D_DEBUG=cl output less
readable, so let's make proper CLIF file production be under a separate
V3D_DEBUG=clif flag.
2018-07-30 14:29:01 -07:00
Eric Anholt
07b243674f v3d: Add missing always_flush debug flag.
The #define existed and was checked in the driver.
2018-06-19 09:42:20 -07:00
jenny.q.cao
ff7521c9ba android: change include "cutils/log.h" to "log/log.h" on Android API >=26
There is a compile warning from Android 8 (API version 26) from "include cutils/log.h"
warning: "Deprecated: don't include cutils/log.h, use either android/log.h or log/log.h"-W#warnings,
Change to include "log/log.h" on Android 8 or later major version to avoid this warning

Signed-off-by: jenny.q.cao <jenny.q.cao@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-05-14 08:08:31 +03:00
Eric Anholt
96d3e8f134 broadcom/vc5: Add XML for V3D 4.2. 2018-01-27 18:57:58 +11:00
Eric Anholt
fb4face86a broadcom/vc5: Introduce v3dx_macros.h and v3dx_pack.h headers.
This will be used by vc5 for prefixing functions and including the pack
header in v3d-version-dependent code, following the model of anv.
2018-01-12 21:51:40 -08:00
Eric Anholt
59257c35eb broadcom: Introduce a v3d_debug.h header for vc5 and broadcom Vulkan.
Unlike vc4, where the compiler and gallium driver live together, for vc5
the compiler will live up in the shared broadcom directory, and need
access to the debug flags.  Define a set of debug flags and helpers there,
so it can be shared between compiler, vc5, and vulkan.
2017-10-10 11:42:04 -07:00
Eric Anholt
427bbbb99c broadcom: Introduce a header for talking about chip revisions.
This will be used by the VC5 driver and various shared VC4/VC5 tooling,
like the XML decoder.
2017-07-13 11:28:28 -07:00