Commit graph

117072 commits

Author SHA1 Message Date
Daniel Schürmann
48a75e7af0 amd/common: lower bitfield_insert to bfm & bitfield_select
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
a8b0b6e52b nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select.
bitfield_select is defined as:
bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)
matching the behavior of AMD's BFI instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
1403c3a7bf nir/algebraic: Use unsigned comparison when lowering bitfield insert/extract
This lets us use the optimization pattern
(('ult', 31, ('iand', b, 31)), False) to remove the
bcsel instruction for code originating in D3D shaders.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
4eeb49ea71 nir/algebraic: Remove unnecessary iand of [iu]bfe and bfm sources
The [iu]bfe and bfm instructions are defined to only use the five
least significant bits.
This optimizes a common pattern from D3D -> SPIR-V translation.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
165b7f3a44 nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.

Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann
a74f256c58 nir/algebraic: add optimization pattern for ('ult', a, ('and', b, a)) and friends.
These optimizations are based on the fact that
'and(a,b) <= umin(a,b)'.
For AMD, this series moves the optimization from LLVM to NIR,
so currently no vkpipeline-db changes here.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-06-24 18:42:20 +02:00
Andreas Baierl
fa6ea16a8d lima/ppir: Add fsat op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Andreas Baierl
f1d89bbc2f lima/ppir: Add fneg op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Andreas Baierl
512397058d lima/ppir: Add fabs op
Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 16:41:33 +02:00
Eric Engestrom
2d2e824fae util: support "y" and "n" in env_var_as_boolean()
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
2019-06-24 12:49:13 +00:00
Andreas Baierl
0cb9ce12fd lima/ppir: lower ffma in ppir
Since we cannot handle ffma in ppir, lower it on nir level already.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-06-24 11:57:57 +00:00
Samuel Pitoiset
946193ae00 radv: add support for VK_AMD_buffer_marker
This simple extension might be useful for debugging purposes.
GAPID has support for it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-24 10:50:54 +02:00
Tapani Pälli
ff77b0415b meson: error out if platforms contains empty string
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110939
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-06-24 08:40:18 +03:00
Nataraj Deshpande
d94fca5420 anv: Add HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED in vk_format
When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform
gralloc module will select a format based on the usage flags provided by
the camera device and the other endpoint of the stream.

The patch fixes crash in vulkan when the test is run with camera stream
set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED.

Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering
on chromebook with camera HAL3.

v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take
    AHARDWAREBUFFER_USAGE_CAMERA_MASK in to account (Gurchetan)

Fixes: f1654fa7e3 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <nataraj.deshpande@intel.com>
Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-24 08:28:18 +03:00
Timur Kristóf
3b6d787e40 iris: move sysvals to their own constant buffer
This commit moves the sysvals to a separate, new constant buffer
at the end (before the shader constants). It also allows us to
remove the special handling we had for cbuf0, and enables all
constant buffers to support user-specified resources and user
buffers.

v2: (by Kenneth Graunke)
- Rebase on the previous patch to fix system value uploading.
- Fix disk cache num_cbufs calculation
- Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs
- Change upload_sysvals to assert that num_cbufs > 0 when
  num_system_values > 0.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-06-23 18:33:23 +02:00
Kenneth Graunke
ebc8c20b3e iris: Mark cbuf0 as not needing uploading every single time
I neglected to mark cbuf0_needs_upload = false after uploading it.
The obvious fix regressed user clip plane tests, because of a second
bug: we also forgot to mark that they may need re-uploading when
changing shader programs (which may have more or less system values).

Thanks to Timur Kristóf for catching the original issue.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
2019-06-23 18:32:11 +02:00
Eric Engestrom
188dbb1679 Revert "egl: drop empty eglfallbacks.c" and "egl: move fallback calls to eglapi.c"
This reverts commits cc4b68a801 and
b27fb3eaca.

These caused a bunch of EGLSync tests to crash when they were previously
failing.

I have a hunch the tests are doing something wrong, like using
extensions without checking for they support, but until the issue is
investigated I'm just reverting these commits.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-06-22 21:59:06 +01:00
Eric Engestrom
cc4b68a801 egl: drop empty eglfallbacks.c
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-22 15:17:42 +00:00
Eric Engestrom
b27fb3eaca egl: move fallback calls to eglapi.c
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-22 15:17:42 +00:00
Eric Engestrom
262b767023 egl: drop _eglReturnFalse() fallbacks
v2: drop them altogether, they should never get called in the
    first place (Emil)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-22 15:17:42 +00:00
Eric Engestrom
82487ede62 egl: remove unnecessary eglGetProcAddress() fallback
No need to add a function that returns `false` only to be cast into
a pointer, we can just use the existing `return NULL` :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2019-06-22 15:17:42 +00:00
Eric Engestrom
30ecd86947 egl: remove NULL assignments after calloc()
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-22 15:17:42 +00:00
Eric Engestrom
64c7c05b71 egl: move bad_param check further up
This way other functions added in these entrypoints don't need to check
anything.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-22 15:17:42 +00:00
Kenneth Graunke
262787b9bc iris: Drop bo != NULL check from blorp 48b invalidate function.
There is always a BO.
2019-06-21 20:50:42 -05:00
Kenneth Graunke
5da37a826b Revert "iris: Don't check VF address high bits when there is no buffer."
This reverts commit db8f57a5cb.

This is bonkers.  There will always be a BO.
2019-06-21 20:50:42 -05:00
Eric Anholt
4449572c47 freedreno: Only upload UBO pointers for UBOs that haven't been lowered.
total constlen in shared programs: 2485933 -> 2462236 (-0.95%)

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
01d0bad9ef freedreno: Remove silly return from ir3_optimize_nir().
We only ever return the shader we were passed in (but internally
modified).

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
56842d33d5 freedreno: Fix up end range of unaligned UBO loads.
We need the constants uploaded to cover the NIR offset plus the size,
not the aligned-down start of our upload range plus the size.  Fixes
mistaken UBO analysis with mat3 loads.

Fixes: 893425a607 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
5e7c96b95d freedreno: Fix UBO load range detection on booleans.
NIR 1-bit bool dests will have a bit size of 1, and thus a calculated
"bytes" of 0.  load_ubo is always loading from dwords in the source.

Fixes: 893425a607 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
23a7feda63 freedreno: Stop reporting max_const in shader-db.
We end up uploading constlen regardless, so max_const would only get
you slightly improved granularity in const usage in comparison.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Eric Anholt
ee2e1e85d4 freedreno: Include binning shaders in shader-db.
We want to see if we've improved our binning VS output, as well as the
render VS.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-21 17:14:43 -07:00
Marek Olšák
8ab9f3a857 include: update GL headers from the registry
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2019-06-21 19:00:52 -04:00
Alyssa Rosenzweig
a6bef350ed panfrost: Fix unused variable warning
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:06:49 -07:00
Boris Brezillon
5f81669d88 panfrost: Remove the panfrost_driver abstraction
The non-DRM backend is gone. Let's get rid of the panfrost_driver
abstraction and call the panfrost_drm_xxx() functions directly.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:01:49 -07:00
Boris Brezillon
e8257f3de8 panfrost: Remove the perf counters interface
The DRM backend has a dummy implementation and the non-DRM backend is
gone, so let's remove this perf counter interface.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 13:01:12 -07:00
Tomeu Vizoso
0bcbccf887 panfrost: ci: Fix parsing of crashed tests
Without this fix, LAVA isn't parsing crashes as failed tests, because
the shell logging is interspersed within the fake deqp output.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig
d38ac21297 panfrost: Conditionally submit fragment job
If there are no tiling jobs and no clears, there is no need to submit a
fragment job (relevant for transform feedback).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:35 -07:00
Alyssa Rosenzweig
cd5d618b5c panfrost: Implement rasterizer discard
D'aww, look how cute that is now that scoreboarding is setup.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:31 -07:00
Alyssa Rosenzweig
26c5a145a7 panfrost: Track buffer initialization
We want to know if a given slice of a buffer is initialized at a
particular point in the execution of the program. This is accomplished
easily enough -- start out uninitialized and upon an operation writing
to the buffer, mark it initialized.

The motivation is to optimize away expensive operations (like wallpaper
blits) when reading from an uninitialized buffer; since it's
uninitialized, the results of these operations are undefined, and it's
legal to take the fast path ^_^

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-21 09:35:09 -07:00
Alyssa Rosenzweig
f0854745fd panfrost: Implement command stream scoreboarding
This is a rather complex change, adding a lot of code but ideally
cleaning up quite a bit as we go.

Within a batch (single frame), there are multiple distinct Mali job
types: SET_VALUE, VERTEX, TILER, FRAGMENT for the few that we emit right
now (eventually more for compute and geometry shaders). Each hardware
job has a mali_job_descriptor_header, which contains three fields of
interest: job index, a dependencies list, and a next job pointer.

The next job pointer in each job is used to form a linked list of
submitted jobs. Easy enough.

The job index and dependencies list, however, are used to form a
dependency graph (a DAG, where each hardware job is a node and each
dependency is a directed edge). Internally, this sets up a scoreboarding
data structure for the hardware to dispatch jobs in parallel, enabling
(for example) vertex shaders from different draws to execute in parallel
while there are strict dependencies between tiling the geometry of a
draw and running that vertex shader.

For a while, we got by with an incredible series of total hacks,
manually coding indices, lists, and dependencies. That worked for a
moment, but combinatorial kaboom kicked in and it became an
unmaintainable mess of spaghetti code.

We can do better. This commit explicitly handles the scoreboarding by
providing high-level manipulation for jobs. Rather than a command like
"set dependency #2 to index 17", we can express quite naturally "add a
dependency from job T on job V". Instead of some open-coded logic to
copy a draw pointer into a delicate context array, we now have an
elegant exposed API to simple "queue a job of type XYZ".

The design is influenced by both our current requirements (standard ES2
draws and u_blitter) as well as the need for more complex scheduling in
the future. For instance, blits can be optimized to use only a tiler
job, without a vertex job first (since the screen-space vertices are
known ahead-of-time) -- causing tiler-only jobs. Likewise, when using
transform feedback with rasterizer discard enabled, vertex jobs are
created (to run vertex shaders) with no corresponding tiler job. Both of
these cases break the original model and could not be expressed with the
open-coded logic. More generally, this will make it easier to add
support for compute shaders, geometry shaders, and fused jobs (an
optimization available on Bifrost).

Incidentally, this moves quite a bit of state from the driver context to
the batch, which helps with Rohan's refactor to eventually permit
pipelining across framebuffers (one important outstanding optimization
for FBO-heavy workloads).

v2: Add comment explaining the meaning of "primary batch" as suggested
by Tomeu (trivial - not reviewed).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
2019-06-21 09:35:02 -07:00
Anuj Phogat
e334a595e4 intel/icl: Add new ICL PCI-IDs
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 08:38:08 -07:00
Jason Ekstrand
1a9e5b9094 anv: Implement "pop-free" clipping
This is the preferred clipping mode since it doesn't mean your points
disappear the moment part of the point crosses over the edge of the
viewport and that lines have weird endpoints at viewport edges.  We've
just never bothered to hook it up until now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Jason Ekstrand
4a757d6c31 anv: Enable the guardband clip test
In workloads where there is a lot of geometry drawn that crosses over
the edge of the viewport, this should substantially improve clipper
performance.  Not really sure why it's taken 3 years to turn it on but
we never got around to it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Jason Ekstrand
13f0c278c5 i965,iris: Move guardband calculations to a common location
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-06-21 14:18:59 +00:00
Mauro Rossi
60c581b57d android: virgl: fix libmesa_winsys_virgil_common build and dependencies
Fixes the following building errors and resolves Bug 110922
Fixes gallium_dri target missing symbols at linking.

external/mesa/src/gallium/winsys/virgl/drm/Android.mk:
error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
external/mesa/src/gallium/winsys/virgl/vtest/Android.mk:
error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
...
build/core/main.mk:728: error: exiting from previous errors.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32:
external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10:
fatal error: 'virgl_resource_cache.h' file not found
#include "virgl_resource_cache.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: b18f09a ("virgl: Introduce virgl_resource_cache")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
2019-06-21 15:53:29 +02:00
Mauro Rossi
cf389ba895 android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependecies
Fix android building errors in winsys/amdgpu and radv
due to 'amdgfxregs.h' not found.

Changelog:
amd/common - generated $(intermediated)/common path is added to exports
winsys/amdgpu - libmesa_amd_common static dependency is added
radv - correct generated $(intermediated)/common path is added to includes

Fixes: f480b8a ("amd/common: use generated register header")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2019-06-21 15:53:23 +02:00
Samuel Pitoiset
9bf47fefe0 radv: add support for VK_KHR_depth_stencil_resolve
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 14:50:38 +02:00
Samuel Pitoiset
e67fc11c26 radv: pass sample locations for transitions before depth/stencil resolves
HTILE decompressions need the user sample locations if specified
in the current subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 14:50:35 +02:00
Samuel Pitoiset
396da5c029 radv: clear the depth/stencil resolve attachment if necessary
The driver might need to clear one aspect of the depth/stencil
resolve attachment before performing the resolve itself.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 14:50:33 +02:00
Samuel Pitoiset
c7872237bf radv: decompress HTILE if the resolve src image is compressed
It's required to decompress HTILE before resolving with the
compute path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 14:50:27 +02:00