Commit graph

38510 commits

Author SHA1 Message Date
Alyssa Rosenzweig
5e2c3d40bd panfrost/midgard: Implement UBO reads
UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
f28e9e868b panfrost: Handle disabled/empty UBOs
Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
bd2fc60a8a panfrost: Identify "uniform buffer count" bits
We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
856e03902b panfrost: Upload UBOs
Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
4c6d751274 panfrost: Allow for dynamic UBO count
We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
5d60be4e24 panfrost: Report UBO count
We look at the highest set bit in the UBO enable mask to work out the
maximum indexable UBO, i.e. the UBO count we need to report to the
hardware.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
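
As an illustration of the "highest set bit" rule described in the commit above, here is a minimal sketch in C. The function name `sketch_ubo_count_from_mask` and the 32-bit mask width are assumptions made for this example; they are not taken from the panfrost source.

```c
#include <stdint.h>

/* Hypothetical helper (not the driver's code): derive the UBO count to
 * report from a bitmask of enabled UBO slots. The highest set bit is the
 * maximum indexable UBO, so the count is that bit's index plus one. */
static unsigned sketch_ubo_count_from_mask(uint32_t enabled_mask)
{
    unsigned count = 0;

    /* Shift right until no bits remain; the number of shifts equals the
     * index of the highest set bit plus one. */
    while (enabled_mask) {
        count++;
        enabled_mask >>= 1;
    }

    return count;
}
```

Note that a mask with gaps, such as 0x5 (slots 0 and 2 enabled), still yields a count of 3, which is consistent with the separate commit about keeping a dummy in place for disabled or empty UBOs.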
Alyssa Rosenzweig
ca2caf01df panfrost: Constant buffer refactor
We refactor panfrost_constant_buffer to mirror v3d's constant buffer
handling, to enable UBOs as well as a single set of uniforms.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:57:40 -07:00
Alyssa Rosenzweig
f35f373850 panfrost: Replace varyings for point sprites
This doesn't handle Y-flipping, but it's good enough to render the stars
in Neverball.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:56:22 -07:00
Alyssa Rosenzweig
be03060066 panfrost: Track point sprites in fragment shader key
In preparation for lowering point sprites, track them like we track
alpha testing state.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-24 12:56:16 -07:00
Daniel Schürmann
0daeb1d127 amd/common: lower bitfield_extract to ubfe/ibfe.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
48a75e7af0 amd/common: lower bitfield_insert to bfm & bitfield_select
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
165b7f3a44 nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.

Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
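
The "five least significant bits" rule from the commit above can be sketched in plain C as follows. The `sketch_` function names are hypothetical, and the handling of zero-width and out-of-range fields reflects one reading of the SM5 bfi/ubfe/ibfe descriptions rather than code taken from NIR.

```c
#include <stdint.h>

/* Build a mask of `bits` ones starting at `offset`; only the low five
 * bits of each operand are used. */
static uint32_t sketch_bfm(uint32_t bits, uint32_t offset)
{
    bits &= 31;
    offset &= 31;
    return ((1u << bits) - 1u) << offset;
}

/* Unsigned bitfield extract: zero-extends the selected field. */
static uint32_t sketch_ubfe(uint32_t base, uint32_t offset, uint32_t bits)
{
    offset &= 31;
    bits &= 31;

    if (bits == 0)
        return 0;
    if (offset + bits < 32)
        return (base << (32 - bits - offset)) >> (32 - bits);
    return base >> offset;
}

/* Signed bitfield extract: sign-extends the selected field. This relies
 * on arithmetic right shift of negative values and on wrapping
 * unsigned-to-signed conversion, both implementation-defined in C but
 * provided by mainstream compilers. */
static int32_t sketch_ibfe(int32_t base, uint32_t offset, uint32_t bits)
{
    offset &= 31;
    bits &= 31;

    if (bits == 0)
        return 0;
    if (offset + bits < 32) {
        /* Move the field to the top of the word, then shift it back down
         * so its top bit is replicated. */
        uint32_t top = (uint32_t)base << (32 - bits - offset);
        return (int32_t)top >> (32 - bits);
    }
    return base >> offset;
}
```

Because both operands are masked, a requested width of 32 behaves like a width of 0 in this sketch.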
Andreas Baierl
fa6ea16a8d lima/ppir: Add fsat op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Andreas Baierl
f1d89bbc2f lima/ppir: Add fneg op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Andreas Baierl
512397058d lima/ppir: Add fabs op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Andreas Baierl
0cb9ce12fd lima/ppir: lower ffma in ppir
Since we cannot handle ffma in ppir, lower it on nir level already.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 11:57:57 +00:00
Timur Kristóf
3b6d787e40 iris: move sysvals to their own constant buffer
This commit moves the sysvals to a separate, new constant buffer
at the end (before the shader constants). It also allows us to
remove the special handling we had for cbuf0, and enables all
constant buffers to support user-specified resources and user
buffers.

v2: (by Kenneth Graunke)
- Rebase on the previous patch to fix system value uploading.
- Fix disk cache num_cbufs calculation
- Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs
- Change upload_sysvals to assert that num_cbufs > 0 when
  num_system_values > 0.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-23 18:33:23 +02:00
Kenneth Graunke
ebc8c20b3e iris: Mark cbuf0 as not needing uploading every single time
I neglected to mark cbuf0_needs_upload = false after uploading it.
The obvious fix regressed user clip plane tests, because of a second
bug: we also forgot to mark that they may need re-uploading when
changing shader programs (which may have more or less system values).

Thanks to Timur Kristóf for catching the original issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
2019-06-23 18:32:11 +02:00
Kenneth Graunke
262787b9bc iris: Drop bo != NULL check from blorp 48b invalidate function.
There is always a BO.
2019-06-21 20:50:42 -05:00
Kenneth Graunke
5da37a826b Revert "iris: Don't check VF address high bits when there is no buffer."
This reverts commit db8f57a5cb.

This is bonkers.  There will always be a BO.
2019-06-21 20:50:42 -05:00
Eric Anholt
01d0bad9ef freedreno: Remove silly return from ir3_optimize_nir().
We only ever return the shader we were passed in (but internally
modified).

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
23a7feda63 freedreno: Stop reporting max_const in shader-db.
We end up uploading constlen regardless, so max_const would only get
you slightly improved granularity in const usage in comparison.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
ee2e1e85d4 freedreno: Include binning shaders in shader-db.
We want to see if we've improved our binning VS output, as well as the
render VS.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Alyssa Rosenzweig
a6bef350ed panfrost: Fix unused variable warning
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:06:49 -07:00
Boris Brezillon
5f81669d88 panfrost: Remove the panfrost_driver abstraction
The non-DRM backend is gone. Let's get rid of the panfrost_driver
abstraction and call the panfrost_drm_xxx() functions directly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:01:49 -07:00
Boris Brezillon
e8257f3de8 panfrost: Remove the perf counters interface
The DRM backend has a dummy implementation and the non-DRM backend is
gone, so let's remove this perf counter interface.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:01:12 -07:00
Tomeu Vizoso
0bcbccf887 panfrost: ci: Fix parsing of crashed tests
Without this fix, LAVA isn't parsing crashes as failed tests, because
the shell logging is interspersed within the fake deqp output.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig
d38ac21297 panfrost: Conditionally submit fragment job
If there are no tiling jobs and no clears, there is no need to submit a
fragment job (relevant for transform feedback).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig
cd5d618b5c panfrost: Implement rasterizer discard
D'aww, look how cute that is now that scoreboarding is set up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:31 -07:00
Alyssa Rosenzweig
26c5a145a7 panfrost: Track buffer initialization
We want to know if a given slice of a buffer is initialized at a
particular point in the execution of the program. This is accomplished
easily enough -- start out uninitialized and upon an operation writing
to the buffer, mark it initialized.

The motivation is to optimize away expensive operations (like wallpaper
blits) when reading from an uninitialized buffer; since it's
uninitialized, the results of these operations are undefined, and it's
legal to take the fast path ^_^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:09 -07:00
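
The tracking scheme in the commit above could look roughly like the following sketch. The struct layout, the slice limit, and the function names are all assumptions for illustration; the commit message does not show the actual data structures.

```c
#include <stdbool.h>

#define SKETCH_MAX_SLICES 16 /* arbitrary limit for this sketch */

/* Hypothetical resource with one "initialized" flag per slice. A
 * zero-initialized resource starts with every slice uninitialized. */
struct sketch_resource {
    bool slice_initialized[SKETCH_MAX_SLICES];
};

/* Call whenever an operation writes to the given slice. */
static void sketch_mark_written(struct sketch_resource *rsrc, unsigned slice)
{
    rsrc->slice_initialized[slice] = true;
}

/* An expensive preserving operation (such as a wallpaper blit) is only
 * needed if the slice holds defined contents; otherwise the result is
 * undefined anyway and the fast path is legal. */
static bool sketch_needs_wallpaper(const struct sketch_resource *rsrc,
                                   unsigned slice)
{
    return rsrc->slice_initialized[slice];
}
```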
Alyssa Rosenzweig
f0854745fd panfrost: Implement command stream scoreboarding
This is a rather complex change, adding a lot of code but ideally
cleaning up quite a bit as we go.

Within a batch (single frame), there are multiple distinct Mali job
types: SET_VALUE, VERTEX, TILER, FRAGMENT for the few that we emit right
now (eventually more for compute and geometry shaders). Each hardware
job has a mali_job_descriptor_header, which contains three fields of
interest: job index, a dependencies list, and a next job pointer.

The next job pointer in each job is used to form a linked list of
submitted jobs. Easy enough.

The job index and dependencies list, however, are used to form a
dependency graph (a DAG, where each hardware job is a node and each
dependency is a directed edge). Internally, this sets up a scoreboarding
data structure for the hardware to dispatch jobs in parallel, enabling
(for example) vertex shaders from different draws to execute in parallel
while there are strict dependencies between tiling the geometry of a
draw and running that vertex shader.

For a while, we got by with an incredible series of total hacks,
manually coding indices, lists, and dependencies. That worked for a
moment, but combinatorial kaboom kicked in and it became an
unmaintainable mess of spaghetti code.

We can do better. This commit explicitly handles the scoreboarding by
providing high-level manipulation for jobs. Rather than a command like
"set dependency #2 to index 17", we can express quite naturally "add a
dependency from job T on job V". Instead of some open-coded logic to
copy a draw pointer into a delicate context array, we now have an
elegant exposed API to simply "queue a job of type XYZ".

The design is influenced by both our current requirements (standard ES2
draws and u_blitter) as well as the need for more complex scheduling in
the future. For instance, blits can be optimized to use only a tiler
job, without a vertex job first (since the screen-space vertices are
known ahead-of-time) -- causing tiler-only jobs. Likewise, when using
transform feedback with rasterizer discard enabled, vertex jobs are
created (to run vertex shaders) with no corresponding tiler job. Both of
these cases break the original model and could not be expressed with the
open-coded logic. More generally, this will make it easier to add
support for compute shaders, geometry shaders, and fused jobs (an
optimization available on Bifrost).

Incidentally, this moves quite a bit of state from the driver context to
the batch, which helps with Rohan's refactor to eventually permit
pipelining across framebuffers (one important outstanding optimization
for FBO-heavy workloads).

v2: Add comment explaining the meaning of "primary batch" as suggested
by Tomeu (trivial - not reviewed).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
2019-06-21 09:35:02 -07:00
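
To make the "queue a job" and "add a dependency from job T on job V" operations from the commit above concrete, here is a minimal sketch. Every name is hypothetical, and the two dependency slots per job and the 1-based index with 0 meaning "no dependency" are assumptions of this example, not details stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_job_type {
    SKETCH_JOB_SET_VALUE,
    SKETCH_JOB_VERTEX,
    SKETCH_JOB_TILER,
    SKETCH_JOB_FRAGMENT,
};

/* Stand-in for a job header: an index, a dependency list, and a next
 * pointer forming the linked list of submitted jobs. */
struct sketch_job {
    enum sketch_job_type type;
    uint16_t index;          /* 1-based; 0 is reserved for "none" */
    uint16_t deps[2];        /* indices of jobs this job waits on */
    struct sketch_job *next;
};

/* Per-batch state; zero-initialize before use. */
struct sketch_batch {
    struct sketch_job *first, *last;
    uint16_t job_count;
};

/* Queue a job of the given type: assign the next index and append it to
 * the batch's linked list. */
static void sketch_queue_job(struct sketch_batch *batch,
                             struct sketch_job *job,
                             enum sketch_job_type type)
{
    job->type = type;
    job->index = ++batch->job_count;
    job->deps[0] = job->deps[1] = 0;
    job->next = NULL;

    if (batch->last)
        batch->last->next = job;
    else
        batch->first = job;
    batch->last = job;
}

/* Add an edge "depender waits on dependee" to the DAG. Returns 1 on
 * success, 0 if both dependency slots are already in use. */
static int sketch_add_dependency(struct sketch_job *depender,
                                 const struct sketch_job *dependee)
{
    for (size_t i = 0; i < 2; i++) {
        if (depender->deps[i] == 0) {
            depender->deps[i] = dependee->index;
            return 1;
        }
    }
    return 0;
}
```

With an interface of this shape, a tiler-only blit simply queues a tiler job with no dependencies, and a rasterizer-discard draw queues a vertex job that no tiler job depends on.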
Jason Ekstrand
13f0c278c5 i965,iris: Move guardband calculations to a common location
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Mauro Rossi
60c581b57d android: virgl: fix libmesa_winsys_virgl_common build and dependencies
Fixes the following building errors and resolves Bug 110922
Fixes gallium_dri target missing symbols at linking.

external/mesa/src/gallium/winsys/virgl/drm/Android.mk:
error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
external/mesa/src/gallium/winsys/virgl/vtest/Android.mk:
error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
build/core/main.mk:728: error: exiting from previous errors.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
#include "virgl_resource_cache.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: b18f09a ("virgl: Introduce virgl_resource_cache")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2019-06-21 15:53:29 +02:00
Mauro Rossi
cf389ba895 android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependencies
Fix android building errors in winsys/amdgpu and radv
due to 'amdgfxregs.h' not found.

Changelog:
amd/common - generated $(intermediated)/common path is added to exports
winsys/amdgpu - libmesa_amd_common static dependency is added
radv - correct generated $(intermediated)/common path is added to includes

Fixes: f480b8a ("amd/common: use generated register header")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-06-21 15:53:23 +02:00
Eric Engestrom
6a9dd62882 drisw: move build logic to build systems
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 11:35:39 +00:00
Tomeu Vizoso
1cbe2ad394 panfrost: ci: Exclude two more flip-flops from results
These three tests pass on RK3399, but fail on RK3288:

dEQP-GLES2.functional.shaders.matrix.div.const_lowp_mat2_mat2_vertex
dEQP-GLES2.functional.shaders.operator.unary_operator.pre_increment_effect.highp_ivec4_vertex
dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3

They reliably pass when run individually, but reliably fail when run in
a full CI run.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-06-21 10:45:12 +02:00
Gert Wollny
ef4429d9c5 gallium/st: Add Gallium hud to swrast drivers
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-06-21 08:54:57 +02:00
Iago Toral Quiroga
4d8f82946b v3d: flush jobs writing to vertex buffers used in the current draw call
This can happen when any of our vertex buffers was written by a previous
transform feedback draw.

Fixes the following piglit tests:
spec/ext_transform_feedback/position-render-bufferbase
spec/ext_transform_feedback/position-render-bufferbase-discard
spec/ext_transform_feedback/position-render-bufferoffset
spec/ext_transform_feedback/position-render-bufferoffset-discard
spec/ext_transform_feedback/position-render-bufferrange
spec/ext_transform_feedback/position-render-bufferrange-discard

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Iago Toral Quiroga
eb44dcc219 v3d: flush jobs reading from transform feedback output buffers
If we are about to write to a transform feedback buffer, we should
make sure that we flush any prior work that intended to read from
any of these buffers.

Fixes piglit test:
spec/ext_transform_feedback/immediate-reuse

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Iago Toral Quiroga
42572f2f7d v3d: add a helper to check if transform feedback is enabled
v2: We should be safe assuming that bind_vs != NULL (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-21 08:06:13 +02:00
Dave Airlie
00a56acc23 llvmpipe: make remove_shader_variant static.
this isn't used outside this file.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-06-21 10:27:57 +10:00
Tomeu Vizoso
2743e34f20 panfrost: ci: Update expectations
These tests have been fixed recently.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-06-20 20:57:41 +02:00
Alyssa Rosenzweig
195e297a92 panfrost/midgard: Broadcast swizzle
Fixes regression in shaders using ball/etc by explicitly passing through
the number of channels in the NIR op and broadcasting the last
components of the channel appropriately, as the Midgard ops are all vec4
implicitly but NIR can be vec2/3.

v2: Don't also regress every other swizzle in Equestria.

v3: Don't regress the swizzles at Canterlot High either.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-06-20 20:52:04 +02:00
Kenneth Graunke
31de802e7e iris: Use stream uploader for shader draw parameters.
Most vertex data lives in user VBOs in IRIS_MEMZONE_OTHER, which
typically have high bits set to 0xffff.  The shader draw parameters were
being uploaded in IRIS_MEMZONE_DYNAMIC, which have high bits set to 0x2.
This was causing a lot of ping-ponging of high bits, leading to
unnecessary VF cache flushing.

Cuts 7.2% of the flushes in the Civilization VI demo on Kabylake GT2.
2019-06-20 13:32:16 -05:00
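
The "ping-ponging of high bits" in the commit above can be pictured with a small sketch. It assumes 48-bit GPU addresses whose "high bits" are bits 32 to 47, and that a flush is needed whenever those bits differ from the previously seen value; the struct and function names are hypothetical and not taken from iris.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracking state: the high address bits seen most recently. */
struct sketch_vf_state {
    uint16_t last_high_bits;
};

/* Returns true if binding a buffer at gpu_address would require a VF
 * cache flush under the assumed rule, and records the new high bits. */
static bool sketch_vf_needs_flush(struct sketch_vf_state *state,
                                  uint64_t gpu_address)
{
    uint16_t high = (uint16_t)(gpu_address >> 32);

    if (high == state->last_high_bits)
        return false;

    state->last_high_bits = high;
    return true;
}
```

Alternating between addresses in two zones with different high bits triggers a flush on every switch, whereas keeping related uploads in the same zone avoids it.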
Kenneth Graunke
db8f57a5cb iris: Don't check VF address high bits when there is no buffer.
If there is no buffer, then it doesn't matter.  Leave the old stale
high bits in place (for next time) and don't bother invalidating.

Cuts 5.6% of the flushes in the Civilization VI demo on Kabylake GT2.
2019-06-20 13:32:16 -05:00
Kenneth Graunke
ecc500398f iris: Drop RT flushes from depth stencil clearing flushes.
These write depth and stencil, not color, so there's no need
to flush the render target.
2019-06-20 13:32:16 -05:00
Kenneth Graunke
1d63af0f2c iris: Don't bother with PIPE_CONTROLs for CPU writes and no history
If a buffer has no usage history, we don't have any read only cache
invalidates to do.  If we've written it with the CPU, we don't need
to flush the render cache.  The only bit remaining is the CS stall
from iris_flush_bits_for_history.  We can just skip the PIPE_CONTROL
in this case.

This is pretty common - an app creates a buffer, fills it with data,
and then binds it for some purpose.

Cuts 36% of the flushes in Manhattan 3.0 on Kabylake GT2.
2019-06-20 13:32:16 -05:00
Kenneth Graunke
dfff6e10b4 iris: Only do an RT flush for transfer maps if using copy_region.
If we wrote the data via the CPU, there's no point in doing a render
target flush.  If using BLORP, we do want a render target flush so the
data lands.
2019-06-20 13:32:15 -05:00
Kenneth Graunke
c4c17ab3ec iris: Use iris_flush_bits_for_history in iris_transfer_flush_region
Instead of using the combined iris_flush_and_dirty_for_history, use
iris_flush_bits_for_history directly - we were already using the split
out iris_dirty_for_history.  There's no need to dirty twice, and we can
avoid the looping altogether for non-buffers.
2019-06-20 13:32:15 -05:00
Kenneth Graunke
6890340c31 iris: Avoid double flushing in iris_transfer_flush_region when copying.
My intention was to have iris_copy_region not do flushing, and leave
that up to the callers.  iris_resource_copy_region needs to do this,
but iris_transfer_flush_region was already doing it.  The net result
was that we were doing it twice for transfers.

So, move the flushing from iris_copy_region to iris_resource_copy_region
so that it only happens in the callers as I intended.
2019-06-20 13:32:15 -05:00