Commit graph

544 commits

Author SHA1 Message Date
Icecream95
314ba5e174 nir: Add a face_sysval argument to nir_lower_two_sided_color
This is needed for handling drivers that use an input for loading the
face, for example Panfrost with Midgard GPUs.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Urja Rannikko <urjaman@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5915>
2020-07-17 14:50:26 +00:00
Neil Roberts
97f8ec321b v3d/compiler: Lower geometry output store base into offset src
When generating the VPM write instruction for geometry shader outputs,
emit_store_output_gs ends up adding the base and offset arguments
together with an ADD instruction. The addition was done at the VIR level
after scheduling so it always ends up right next to the corresponding
stvpm instruction. Most of the time the offset is constant but nothing
does any constant folding at the VIR level.

This patch makes it instead fold the addition into the offset at the NIR
level in v3d_nir_lower_io so that the NIR-level constant folding can get
rid of the addition most of the time.

v2: Use nir_iadd_imm to simplify the code. (Eric Anholt)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5825>
2020-07-16 08:48:06 +02:00
Neil Roberts
deefebc55b v3d/compiler: Fix sorting the gs and fs inputs
ntq_setup_fs_inputs and ntq_setup_gs_inputs sort the inputs according to
the driver location. This input array is then used to calculate the VPM
offset for the outputs in the previous stage. However, it wasn’t taking
into account variables that are packed into a single varying slot. In
that case they would have the same driver_location and are
distinguished by location_frac.

This patch makes it additionally sort by location_frac when the driver
locations are equal. This can happen when the compiler packs varyings
that are sized less than vec4. Without this fix, when the VPM is used to
transmit data free-form between the stages (such as VS->GS) then it
would end up writing to inconsistent locations.

Fixes dEQP tests such as:
dEQP-GLES31.functional.primitive_bounding_box.lines.global_state.
vertex_geometry_fragment.default_framebuffer_bbox_equal

Fixes: 5d578c27ce ("v3d: add initial compiler plumbing for geometry shaders")

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5787>
2020-07-08 07:39:47 +00:00
Neil Roberts
b921d17aa6 broadcom/qpu: set VC5_QPU_RADDR_A out of the switch at _pack_branch
Detected after mesa added Wimplicit-fallthrough project wide.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5769>
2020-07-07 21:40:16 +02:00
Neil Roberts
ee4d51f8b2 v3d: Add a lowering pass for line smoothing
When line smoothing is enabled, the driver now increases the width of
the line so that it can add some semi-transparent pixels to either side
of the line. A lowering pass is added which modifies the alpha component
of every write to fragment output 0 so that if the fragment is outside
the width of the line then the alpha is reduced. It additionally
discards fragments that are completely invisible. It might seem bad to
use discard on a tiled renderer but the assumption is that any bad
effects from using discard will also happen anyway because of enabling
alpha blending.

v2: Disable the line smoothing pass entirely when the framebuffer
    contains an integer colour output or one with no alpha channel.
    Calculate the coverage once upfront and store in a global variable
    instead of calculating each time an output write is modified. Also
    do the conditional discard once upfront.
v3: Don’t check whether the output buffer has an alpha channel. Only
    look at output 0. Use aa_line_width intrinsic instead of calculating
    the real line width in the shader. Clamp the coverage as part of the
    global variable, not per output write.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
2020-07-06 21:59:16 +00:00
Neil Roberts
207da33a86 v3d: Handle the line width intrinsics
Adds new QUNIFORMs to store the line widths.

v2: Also handle the aa_line_width intrinsic

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
2020-07-06 21:59:16 +00:00
Neil Roberts
2c4616368b v3d: Implement the line coord intrinsic
The line coord intrinsic is loaded from the implicit varying stored in
the same slot as the point coord when drawing lines.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5624>
2020-07-06 21:59:16 +00:00
Alejandro Piñeiro
f8946bd705 v3d/tex: handle correctly coordinates for cube/cubearrays images
When fetching for cube maps, we need to interpret them as 2d texture
arrays, being the third coordinate the index for the face.

Fixes Vulkan CTS tests like the following using v3dv:

dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_image.fragment.single_descriptor.cube_base_mip
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_image.compute.multiple_descriptor_sets.multiple_contiguous_descriptors.cube_array_base_mip

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5675>
2020-07-03 08:14:57 +00:00
Iago Toral Quiroga
8456ff75b3 v3d/compiler: fix image size for 1D arrays
Reviewed by: Alejandro Piñeiro <apinheiro@igalia.com>

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5692>
2020-07-01 10:01:46 +00:00
Eric Anholt
f55a308c75 v3d: Enable PIPE_CAP_TGSI_TEXCOORD.
Dave wants to drop the !TEXCOORD path from NIR, and it's easy enough to
do.  Untested.

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2952>
2020-06-29 09:07:21 -07:00
Iago Toral Quiroga
653dff949e v3d/compiler: fix spill offset
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Fixes: 97566efe5c ("v3d: Rematerialize MOVs of uniforms instead of spilling them.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5664>
2020-06-29 14:21:25 +02:00
Alejandro Piñeiro
583d7d3d8d v3d: moving v3d simulator to src/broadcom
So it could be used by both the OpenGL and the Vulkan driver.

In addition to the move, some small changes were needed to be made on
the API. For example, the simulator was receiving v3d_screen on
initialization, and that code setted v3d_screen->sim_file. Now it
returns the new sim_file created.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5666>
2020-06-27 00:06:58 +00:00
Iago Toral Quiroga
4845f184d7 v3d/compiler: don't rewrite unused temporaries to point to NOP register
This was assuming that unused temporaries are written but never read,
since the NOP register can only be used as a destination register,
but we can end up here also for temporaries that are read once but
never written.

This was found with a graphicsfuzz test that has a switch with
cases that have unreachable discards. In that test, NIR genrates
code like this:

decl_reg vec3 32 r19
...
r20 = mov r19.z
r21 = mov r19.y
r22 = mov r19.x

Where r19.xyz would generate 3 temporary registers that are read but
never written, so we would rewrite them to point to the NOP register
as QPU instruction sources, which is not allowed and would hit an
assert that expect magic reads to be from [r0,r5] only.

Fixes:
dEQP-VK.graphicsfuzz.unreachable-switch-case-with-discards

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5645>
2020-06-26 08:57:32 +00:00
Neil Roberts
3b1c511b09 v3d: Use stvpmd for non-uniform offsets in GS
The offset for the VPM write for storing outputs from the geometry
shader isn’t necessarily uniform across all the lanes. This can happen
if some of the lanes don’t emit some of the vertices. In that case the
offset for the subsequent vertices will be different in each lane. In
that case we need to use the stvpmd instruction instead of stvpmv
because it will scatter the values out.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3150

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5621>
2020-06-26 09:36:15 +02:00
Neil Roberts
dab8a9169c v3d: Add missing macro for stvpmd instruction
stvpmd is like stvpmv but it scatters the output. It can be used with
non-dynamically uniform offsets.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5621>
2020-06-26 09:36:15 +02:00
Neil Roberts
053df9bd8f v3d: Let scheduler know GS doesn’t have shared I/O memory
Unlike the vertex shaders, the memory for inputs and outputs is stored
in separate segments so the scheduler doesn’t need to serialise them.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Neil Roberts
ed29b576cb nir/scheduler: Add an option to specify what stages share memory for I/O
The scheduler has code to handle hardware that shares the same memory
for inputs and outputs. Seeing as the specific stages that need this is
probably hardware-dependent, this patch makes it a configurable option
instead of hard-coding it to everything but fragment shaders.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Neil Roberts
0a18c935e1 v3d: Remove unused member of v3d_compile
It looks like gs_input_sizes was added when GS shaders were implemented
but it was never used anywhere.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5561>
2020-06-22 08:23:06 +02:00
Rob Clark
c148dbe07e v3d: don't use intr->num_components for non-vectorized intrinsics
Squashed-in-fix-from: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
2020-06-16 02:48:18 +00:00
Timothy Arceri
04dbf709ed nir: add callback to nir_remove_dead_variables()
This allows us to do API specific checks before removing variable
without filling nir_remove_dead_variables() with API specific code.

In the following patches we will use this to support the removal
of dead uniforms in GLSL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
2020-06-03 02:22:23 +00:00
Dylan Baker
a8e2d79e02 meson: use gnu_symbol_visibility argument
This uses a meson builtin to handle -fvisibility=hidden. This is nice
because we don't need to track which languages are used, if C++ is
suddenly added meson just does the right thing.

Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
2020-06-01 18:59:18 +00:00
Alejandro Piñeiro
f7fcbe9830 v3d/tex: use TMUSLOD register if possible
TMUSLOD register is the same that TMUS but having the same effect that
setting disable_autolod on the TMU configuration parameter 2.

So using that register is potentially more efficient, as in several
cases we would be able to skip writing P2.

One case where we can't use it is for texture cube maps, as we need to
use TMUSCM.

v2: don't put a comment in the middle of the conditions (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
2020-05-11 23:52:46 +00:00
Alejandro Piñeiro
c3af695bb0 v3d/tex: set up default values for Configuration Parameter 1 if possible
Texture access has three configuration parameters, P0 (texture), P1
(sampler) and P2(lookup). P1 and P2 are optional, but if P2 is needed
(like for example to set the offset for texelFetchOffset), then you
need to set P1.

But until now when setting up P1 we were asking the driver to fill up
the address with the shader state. But in that case we can just fill
that address with the default value NULL.

So let's avoid asking the driver to fill that default values, and do
it directly on the compiler. This is a good-to-have on OpenGL, and
likely would be needed on Vulkan.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
2020-05-11 23:52:46 +00:00
Alejandro Piñeiro
50c2c76ea3 v3d/tex: only look up the 2nd texture gather offset for 1d non-arrays
Commit 1bc71e8b65 already did that for
the 3rd offset, but it also needs to do it for the 2nd (to handle 1d
array).

Fixes assertion failures with Vulkan CTS tests using 1darray
targets. Seems that there isn't too many 1darray tests on OpenGL CTS,
and OpenGL-ES don't support 1d arrays, but the same problem could
arise eventually on OpenGL.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4962>
2020-05-11 23:52:46 +00:00
Eric Anholt
e9add0c501 drm-shim: Let the driver choose to overwrite the first render node.
When I was writing drm-shim, I was focused on the v3d kmsro case -- use my
intel device as the kmsro display device and add on a simulator-based v3d
device that we could render with.  But for the noop backends we use for
shader-db, it's a lot more useful to just overwrite the first render node
in the system so that you don't have to pass a -d <how many render nodes I
already have in my system> argument.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4664>
2020-04-23 17:54:54 +00:00
Alejandro Piñeiro
ad460c5dd6 v3d: support for textureQueryLOD
Fixes all the ARB_texture_query_lod piglit tests, and needed to get
the Vulkan CTS textureQueryLOD passing with the ongoing Vulkan driver.

Note that LOD Query bit flag became only available on V42 of the hw,
but the v3d40_tex is using V41 as reference. In order to avoid setting
up the infrastructure to support both v41 and v42, we manually set the
bit if the device version is the correct one.

We also fix how the ARB_texture_query_lod (so EXT_texture_query_lod)
is exposed. Before this commit it was always exposed (wrongly as it
was not really supported). Now it is exposed for devinfo.ver >= 42.

v2: move _need_sampler helper to nir.h (Eric Anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
2020-04-22 23:43:23 +02:00
Alejandro Piñeiro
41bfd0812b v3d/packet: fixing TMU_Config_Parameter_2 definition
v41 interchanged the size and start values for the Padding, and it
seems that v42 inherited it when adding the LOD Query bit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
2020-04-22 23:39:41 +02:00
Alejandro Piñeiro
9967c26ae6 v3d/tex: Configuration Parameter 1 can be only skipped if P2 can be skipped too
Configuration Parameter packets 1 and 2 are pointed as optional, but
it is not clearly stated if you can skip only P1 when P2 is needed.

In the practice, it seems that the situation P0 - non-P1 - P2 can
causes problems, and at least on the simulator, it seems that sampler
info are attempted to be accessed. So let's just be conservative, and
only skip P1 configuration if we can skip P2 configuration too.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
2020-04-22 23:39:34 +02:00
Alejandro Piñeiro
d0b644d9f9 v3d/tex: don't configure tmu config 1 if not needed
TMU configuration parameter 1 configures the sampler for the texture
operation. But there are some texture operations that doesn't need a
sampler. Skipping the configuration could provide a small perf
improvement on OpenGL. On the incoming Vulkan driver, would allow us
to avoid to set up an unneeded sampler.

Note that we still need to add the sampler configuration parameter if
the output is a 32bit, as it is on the sampler where we configure that
info.

Also, note that for images this is done comparing against a unpacked
p1 default. But in order to do that it is needed to go through the
code that fills up the unpacked p1. We can skip that too.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
2020-04-22 23:38:18 +02:00
Lionel Landwerlin
c3e305616c drm-shim: return device platform as specified
v2: Embed the libdrm dependency inside the drm-shim dependency

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Eric Anholt <eric@anholt.net> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4429>
2020-04-03 21:14:18 +00:00
Eric Engestrom
79af30768d meson: inline inc_common
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
2020-03-28 21:36:54 +01:00
Rob Clark
36aed70b59 util/ra: spiff out select_reg_callback
Add a parameter so the callback can know which node it is selecting a
register for.  And remove the graph parameter, as it is unused by
existing users, and somewhat unnecessary (ie. the callback data could
be used instead).

And add a comment so $future_me remembers how this works.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
2020-03-10 16:01:39 +00:00
Eric Anholt
12cf484d02 v3d: Ask the state tracker to lower image accesses off of derefs.
This saves a bunch of hassle in handling derefs in the backend, and would
be needed for reasonable handling of dynamic indexing of image arrays.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3728>
2020-02-24 18:25:02 +00:00
Jose Maria Casanova Crespo
68bb26af63 broadcom: Fix implicit declaration of ffs for Android build
Include util/bitscan.h to ensure ffs is available when there is no
glibc like in Android.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1983
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2554>
2020-02-06 18:31:13 +01:00
Eric Anholt
8d07d66180 glsl,nir: Switch the enum representing shader image formats to PIPE_FORMAT.
This means you can directly use format utils on it without having to have
your own GL enum to number-of-components switch statement (or whatever) in
your vulkan backend.

Thanks to imirkin for fixing up the nouveau driver (and a couple of core
details).

This fixes the computed qualifiers for EXT_shader_image_load_store's
non-integer sizeNxM qualifiers, which we don't have tests for.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> (v3d)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3355>
2020-02-05 10:31:14 -08:00
Anthony Pesch
f77369086c util/hash_table: update users to use new optimal integer hash functions
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
2020-01-23 17:06:57 +00:00
Jason Ekstrand
d3737002ee nir/lower_atomics_to_ssbo: Also lower barriers
This is more correct for a pass which is supposed to completely lower
away atomic counters.  It also lets us stop supporting atomic counter
barriers in most of the drivers.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
e40b11bbcb nir: Rename nir_intrinsic_barrier to control_barrier
This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Jason Ekstrand
60097cc840 nir: Add a new memory_barrier_tcs_patch intrinsic
Right now, it's implemented as a no-op for everyone.  For most drivers,
it's a switch case in the NIR -> whatever which just breaks.  For ir3,
they already have code to delete tessellation barriers so we just add a
case to also delete memory_barrier_tcs_patch.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
2020-01-13 17:23:47 +00:00
Iago Toral Quiroga
6c7a2b69f8 v3d: handle writes to gl_Layer from geometry shaders
When geometry shaders write a value to gl_Layer that doesn't correspond to
an existing layer in the target framebuffer the rendering behavior is
undefined according to the spec, however, there are CTS tests that trigger
this scenario on purpose, probably to ensure that nothing terrible happens.

For V3D, this situation is problematic because the binner uses the layer
index to select the offset to write into the tile state data, and we only
allocate tile state for MAX2(num_layers, 1), so we want to make sure we
don't produce values that would lead to out of bounds writes. The simulator
has an assert to catch this, although we haven't observed issues in actual
hardware it is probably best to play safe.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a6b318ef52 v3d: predicate geometry shader outputs inside non-uniform control flow
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
a07d70c54b v3d: we always have at least one output segment
If we program an output size of 0 the simulator asserts. This was
not a problem until now because our VS would always have to
emit fixed function outputs, however, now that it can be paired
with a GS we can end up with a VS shader that no longer emits
any outputs.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76fc8c8bb1 v3d: compute appropriate VPM memory configuration for geometry shader workloads
Geometry shaders can output many vertices and thus have higher VPM memory
pressure as a result. It is possible that too wide geometry shader dispatches
exceed the maximum available VPM output allocated, in which case we need
to reduce the dispatch width until we can fit the VPM memory requirements.
Supported dispatch widths for geometry shaders are 16, 8, 4, 1.

There is a limit in the number of VPM output sectors that can be used by a
geometry shader that we can meet by lowering the dispatch width at compile
time, however, at draw time we need to revisit this number and, together with
other elements that can contribute to total VPM memory requirements, decide
on a configuration that can fit the program into the available VPM memory.
Ideally, we also want to aim for not using more than half of the available
memory so we that we can run a pair of bin and render programs in parallel.

v2: fixed language in comment and typo in commit log. (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
76f4c83815 v3d: add 1-way SIMD packing definition
According to the documentation, the 1-way dispatch width is only supported
with geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
4f5fbd6490 v3d: implement geometry shader instancing
v2:
 - Remove unused field uses_iid from v3d_gs_prog_data (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
0934bd4460 v3d: fix packet descriptions for geometry and tessellation shaders
Every code address starts at bit 3 (addresses must be 64-bit aligned),
with the first 3 bits used to specify threading and NaN propagation
parameters for the shader program.

We generally skip "reserved" bits, however, doing this when the
reserved field is the last in a struct and it is large enough can
make us compute incorrect (smaller) struct sizes which can
lead to corrupt CLs. In particular, the "Tess/Geom Common Params"
struct has a reserved field at the end that is 8-bit, so if we
don't include this we compute a packet size that is 1 byte smaller
than it shold, making the next packet we emit start 1 byte
earlier and therefore leading to incorrect CL data from that point
forward.

The name of one of the fields was not correct.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO owering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
f63750accf v3d: remove unused variable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
d6b0786a38 v3d: add debug assert
While lowering vpm outputs we look for the NIR variables matching
particular store output instructions and we expect to find a match,
so assert on that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00