Commit graph

118535 commits

Author SHA1 Message Date
Iago Toral Quiroga
5d578c27ce v3d: add initial compiler plumbing for geometry shaders
Most of the relevant work happens in the v3d_nir_lower_io. Since
geometry shaders can write any number of output vertices, this pass
injects a few variables into the shader code to keep track of things
like the number of vertices emitted or the offsets into the VPM
of the current vertex output, etc. This is also where we handle
EmitVertex() and EmitPrimitive() intrinsics.

The geometry shader VPM output layout has a specific structure
with a 32-bit general header, then another 32-bit header slot for
each output vertex, and finally the actual vertex data.

When vertex shaders are paired with geometry shaders we also need
to consider the following:
  - Only geometry shaders emit fixed function outputs.
  - The coordinate shader used for the vertex stage during binning must
    not drop varyings other than those used by transform feedback, since
    these may be read by the binning GS.

v2:
 - Use MAX3 instead of a chain of MAX2 (Alejandro).
 - Make all loop variables unsigned in ntq_setup_gs_inputs (Alejandro)
 - Update comment in IO owering so it includes the GS stage (Alejandro)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
f63750accf v3d: remove unused variable
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
52cbef0039 v3d: enable debug options for geometry shader dumps
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
d6b0786a38 v3d: add debug assert
While lowering vpm outputs we look for the NIR variables matching
particular store output instructions and we expect to find a match,
so assert on that.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Iago Toral Quiroga
6e68f74395 v3d: add missing plumbing for VPM load instructions
We will need to use LDVPMG_IN specifically to read VPM inputs
in geometry shaders.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2019-12-16 08:42:37 +01:00
Eric Anholt
f58ef5d481 turnip: Lower usub_borrow.
Fixes dEQP-VK.glsl.builtin.function.integer.usubborrow.uvec2_mediump_fragment.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2986>
2019-12-16 04:52:09 +00:00
Caio Marcelo de Oliveira Filho
c06ba83589 intel/fs: Lower 64-bit MOVs after lower_load_payload()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
2019-12-14 21:12:21 +00:00
Bas Nieuwenhuizen
b53856aca3 amd/common: Always use addrlib for HTILE tc-compat.
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Bas Nieuwenhuizen
e197fb1c2f amd/common: Fix tcCompatible degradation on Stoney.
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
2019-12-14 20:39:29 +00:00
Denis Pauk
6bf14e9c47 docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr
Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Matt Turner <mattst88@gmail.com>
2019-12-14 20:02:10 +00:00
Denis Pauk
3acc15f4f0 gallium/swr: Enable support bptc format.
Reuse Code from:
f69bc797e1 gallium/auxiliary: Add helper support for bptc format compress/decompress

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>
2019-12-14 20:02:10 +00:00
Rob Clark
1bf3837395 freedreno/a6xx: fix OUT_REG() vs growable cmdstream
BEGIN_RING() could decide we can't fit the next packet in the current
cmdstream segment, and grow a new segment.  So we need to grab ring->cur
*after* BEGIN_RING(), otherwise we are writing cmdstream past the end of
the previous segment.

Fixes: bdd98b892f ("freedreno: New struct packing macros")
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-14 09:12:39 -08:00
Erico Nunes
ce52b49348 lima: split draw calls on 64k vertices
The Mali400 only supports draws with up to 64k vertices per command.
To handle this, break the draw_vbo call into multiple commands.
Indexed drawing is left to a separate code path.
This implementation was ported from vc4_draw_vbo.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
6d46d0e82b vc4: move the draw splitting routine to shared code
This can also be useful for other hardware which has similar limitations
on vertex count per single draw.
The Mali400 has a similar limitation and can reuse this.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
2d7be5f01f lima: refactor indexed draw indices upload
As of this commit this is just a refactor in preparation to enable
support for more than 64k vertices.
To support splitting the draw_vbo call, indices shouldn't be re-uploaded
every time.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
270c282a43 lima: allocate separate bo to store varyings
The current strategy using the suballocator with fixed size doesn't
scale and causes some programs with large number of vertices (like some
glmark2 scenes) to crash.
Change it to dynamically allocate a separate bo to accomodate for
arbitrary number of vertices.
This also fixes the buffer read/write flags for gp.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Erico Nunes
8bf2b5db78 gallium/util: add alignment parameter to util_upload_index_buffer
At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
2019-12-14 07:44:43 +01:00
Kenneth Graunke
9fb45c5bbd drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version
This gets it running on i965 with Mesa master.  (The game won't start
without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4
for shaders.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
2019-12-13 17:58:42 -08:00
Timothy Arceri
7564c5fc6d st/glsl_to_nir: fix SSO validation regression
Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-13 23:09:57 +00:00
Alyssa Rosenzweig
46f0b9ecc5 ci: Remove T760/T860 from CI temporarily
I feel really bad about this but this one test is flaking. I don't want
to do a mass revert (and bisection is extremely difficult with
nondeterministic/Heisenbugs), but it's Friday night and master needs to
pass. This commit should be reverted asap (once the flake is solved)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 22:52:39 +00:00
Rafael Antognolli
59de5d9b6a iris: Implement WA for push constants.
v2: Apply WA to gen11+ instead of gen12+ (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-12-13 14:15:04 -08:00
Andreas Baierl
8adeeaa7f2 lima/parser: Add texture descriptor parser
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Andreas Baierl
5456916309 lima/parser: Add RSW parsing
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Andreas Baierl
31ed081ca3 lima/parser: Some fixes and cleanups
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
2019-12-13 22:02:03 +00:00
Rafael Antognolli
6a3b8811ea vulkan/overlay: Update docs.
Add mention to overlay control socket.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
56ccea58ae vulkan/overlay: Add basic overlay control script.
This can be used to start/stop statistics capturing from the command
line.

v3:
 - Install script (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
a94fa1da93 vulkan/overlay: Add a command to start capturing data to a file.
By default, if an output_file is specified, the overlay layer will start
capturing data immediately. After this commit, when a control socket is
used, the capture starts disabled by default, and is only enabled when a
command ":capture=1;" is received.

when the capture is enabled, we might have already accumulated some
stats. To avoid capturing such noise, we discard and reset the fps and
stats, updating the display and capturing only data from that point on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
606dff1b73 vulkan/overlay: Add support for a control socket.
Add support for socket from which the overlay layer can receive
commands. This control socket can be useful to allow setting options
once the application is already running. For instance, triggering the
capture of fps data at a certain point.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
e87d7fea8a vulkan/overlay: Add a control socket.
v2: Use a socket instead of named pipe.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Rafael Antognolli
ef5266ebd5 util/os_socket: Add socket related functions.
v3:
 - Add os_socket.c/h into Makefile.sources (Lionel)
 - Add empty non-linux implementation to public functions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-12-13 20:53:44 +00:00
Eric Engestrom
c327245257 anv: drop unused #include
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:42:40 +00:00
Eric Engestrom
1a837e803b util/simple_mtx: don't set the canary when it can't be checked
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-12-13 20:20:21 +00:00
Eric Engestrom
d600b19640 intel/compiler: replace 0 pointer with NULL
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:16:20 +00:00
Eric Engestrom
8074f68b3b intel/compiler: add ASSERTED annotation to avoid "unused variable" warning
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-12-13 20:16:20 +00:00
Kenneth Graunke
91efae4f80 iris: Alphabetize source files after iris_perf.c was added 2019-12-13 11:03:13 -08:00
Rob Clark
3b8feefd9c freedreno/ir3: add iterator macros
So many open coded list iterators were getting annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Rob Clark
ad92aa36ac freedreno/ir3: add scheduler traces
Add some infrastructure to trace scheduler decisions.  The next patch
will add some more traces, just splitting this out to reduce clutter.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Rob Clark
dd34ccb2c5 freedreno/ir3: add last-baryf shaderdb stat
Sometimes sched changes that are a win in terms of instruction count
and/or register pressure, are worse in real life, due to keeping varying
storage locked for too long.  Add a shader-db stat to give this more
visibility.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-12-13 09:25:40 -08:00
Alejandro Piñeiro
2865d79a33 nir/opt_peephole_select: remove unused variables
To avoid "unused variable" warnings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-12-13 17:14:58 +01:00
Alyssa Rosenzweig
7c972eba40 panfrost: Report GPU name in es2_info
We can prettify the ID.

Closes #2093

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
09a2c74cfd panfrost: Add panfrost_model_name helper
This gives us a string representation of a GPU ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
a215289176 panfrost: Move property queries to _encoder
We'll want these in non-Gallium devices.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
102789886c panfrost: Move nir_undef_to_zero to Midgard compiler
Nothing Gallium about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
ddbbb2db48 pandecode: Add cast
Fixes minor coverity warning about the format specifier.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
4f7fddbd71 panfrost: Pass size to panfrost_batch_get_scratchpad
We'll compute the size with the new scratchpad helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
bc887e8281 panfrost: Calculate maximum stack_size per batch
We'll need this so we can allocate a stack for the batch large enough
for all the jobs within it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
a337bf319c pan/midgard: Handle misc. cppcheck warnings
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
f204791cd6 pan/midgard: Remove unused ld/st packing hepers
Identified by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
709d8c29cd panfrost: Handle minor cppcheck issues
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00
Alyssa Rosenzweig
b0e915b4e6 panfrost: Emit SFBD/MFBD after a batch, instead of before
The size of the scratchpad (as well as some tiler details) depend on the
contents of the batch, so we need to wait to defer filling out the FBD
until after all draws are queued.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-12-13 10:26:35 -05:00