117057 commits

Author SHA1 Message Date
Nanley Chery
0a2a9a4a5b iris: Allocate main and aux surfaces together
On Gen12, the CCS buffer address doesn't have to be referenced in state
packets. In the case of a stencil buffer with CCS, the kernel won't know
the location of the CCS unless an extra call is made to pin its address.
To avoid this extra call, make the CCS part of the main surface.

v2. Update comment above bo_size. (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
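As a rough illustration of the idea in 0a2a9a4a5b above, placing the aux (CCS) data in the same buffer as the main surface comes down to choosing an aligned offset after the main data and sizing the buffer to cover both. The sketch below is an assumption about the general shape of that arithmetic; `combined_layout`, `layout_main_and_aux` and `align_u64` are hypothetical names, not the iris driver's own code.

```c
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a nonzero power of two. */
static uint64_t align_u64(uint64_t v, uint64_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical layout of one buffer holding a main surface followed by
 * its aux surface. */
struct combined_layout {
   uint64_t aux_offset;  /* where the aux data starts inside the buffer */
   uint64_t bo_size;     /* total size to allocate for both surfaces */
};

static struct combined_layout
layout_main_and_aux(uint64_t main_size, uint64_t aux_size, uint64_t aux_align)
{
   struct combined_layout l;

   /* The aux surface starts at the first suitably aligned offset
    * past the end of the main surface. */
   l.aux_offset = align_u64(main_size, aux_align);
   l.bo_size = l.aux_offset + aux_size;
   return l;
}
```

With a single allocation the kernel only has to track one buffer, which is the property the commit message relies on to avoid the extra pinning call.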
Nanley Chery
ff5bc81b51 iris: Determine aux offsets within configure_aux
If a resource has a modifier, the main and aux surfaces will share a BO.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
f0ed86c6c6 iris: Bail resource creation upon aux creation error
The functions used during aux buffer configuration and creation only
return false for exceptional errors. Don't proceed with surface creation
in those cases.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Nanley Chery
8b62e3d978 iris: Drop iris_resource::aux::extra_aux::bo
The primary and secondary aux buffers are always allocated in the same
BO.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 14:46:15 -07:00
Duncan Hopkins
bb8e6994cc zink: pass line width from rast_state to gfx_pipeline_state.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-29 20:38:26 +00:00
Jason Ekstrand
52aa7f3e05 anv: Reduce the minimum number of relocations
The original value of 256 assumed a batch buffer, which is likely to
have a large number of relocations. However, pipeline objects on Gen7
will have at most 6 relocations (one per shader stage and one for the
workaround BO), so this wastes a lot of space per pipeline.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-29 20:27:52 +00:00
Jason Ekstrand
a3153162a9 anv: Delay allocation of relocation lists
The old relocation list code always allocated 256 relocations and a hash
set up-front without knowing whether or not we really need them.  In
particular, in the softpin case, this is two fairly large allocations
that we don't need to be making.  Also, for pipeline objects on haswell
where we don't have softpin, we don't need relocations unless scratch is
used so this is extra data per-pipeline.  Instead, we should do it
on-demand.  This shaves 3.5% off of a cpu-limited example running with
the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-29 20:27:52 +00:00
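The on-demand allocation described in a3153162a9 above can be sketched as a list that allocates nothing at init time and only reserves storage when the first entry is added. This is a minimal sketch under stated assumptions: `reloc_list` and its functions are hypothetical stand-ins rather than the anv data structures, the hash set is omitted, and the initial capacity of 8 is an arbitrary illustrative value.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a relocation list; not the anv structure. */
struct reloc_list {
   uint64_t *entries;   /* NULL until the first relocation is added */
   uint32_t  count;
   uint32_t  capacity;
};

static void reloc_list_init(struct reloc_list *list)
{
   /* No allocation here: a list that never receives a relocation
    * (e.g. the softpin case) costs no heap memory. */
   list->entries = NULL;
   list->count = 0;
   list->capacity = 0;
}

static bool reloc_list_add(struct reloc_list *list, uint64_t offset)
{
   if (list->count == list->capacity) {
      /* First use allocates a small array; later growth doubles it.
       * The starting size of 8 is an assumption for illustration. */
      uint32_t new_cap = list->capacity ? list->capacity * 2 : 8;
      uint64_t *p = realloc(list->entries, new_cap * sizeof(*p));
      if (p == NULL)
         return false;
      list->entries = p;
      list->capacity = new_cap;
   }
   list->entries[list->count++] = offset;
   return true;
}

static void reloc_list_finish(struct reloc_list *list)
{
   free(list->entries);   /* free(NULL) is a no-op */
   reloc_list_init(list);
}
```

The trade-off is one extra branch on every add in exchange for skipping the up-front allocations entirely when no relocations are needed.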
Plamena Manolova
4fe2317601 anv: Implement new way for setting streamout buffers.
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:21:20 +00:00
Plamena Manolova
0f610e17bc iris: Implement new way for setting streamout buffers.
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:20:25 +00:00
Plamena Manolova
665b81e29a genxml: Add 3DSTATE_SO_BUFFER_INDEX_* instructions
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:19:58 +00:00
Rob Clark
ff6e148a3d freedreno/a6xx: add a618 support
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:34 -07:00
Rob Clark
afd224fac3 freedreno/a6xx: cleanup magic registers
Extract out values for the handful of unknown registers which have
different values across different a6xx models, to simplify adding
support for new a6xx's.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:31 -07:00
Rob Clark
1fdc259bfc freedreno/a6xx: remove some left over dead code
These registers don't exist, just remnants of initial port from a5xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-10-29 09:19:27 -07:00
Plamena Manolova
f9ad73cdfd anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures.
Add depth bounds testing to the list of supported
physical device features.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 16:05:33 +00:00
Plamena Manolova
e6c8750278 genxml: Change 3DSTATE_DEPTH_BOUNDS bias.
The bias for the 3DSTATE_DEPTH_BOUNDS instruction
should be 2 not 1.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 16:05:33 +00:00
Michel Dänzer
2a38fc1027 gitlab-ci: Only run the pipeline if any files affecting it have changed
E.g. documentation-only changes cannot affect the outcome of the
pipeline, so don't waste resources on running it.

The thing we need to be careful about here is that the container stage
jobs must always run if any later stage jobs using the corresponding
docker images run. We're currently using the same .ci-run-policy
template for all jobs, so this is trivially true.

v2:
* Add bin/ and common.py (Eric Engestrom)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> # v1
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 15:09:56 +00:00
Krzysztof Raszkowski
163d5fde06 gallium/swr: Enable GL_ARB_gpu_shader5: multiple streams
Added support for geometry shader multiple streams (part of
GL_ARB_gpu_shader5 extension).

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
2019-10-29 14:50:02 +00:00
Alyssa Rosenzweig
44971b84b7 panfrost: Remove unused definitions in mali-job.h
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Alyssa Rosenzweig
fa14cdf6e4 panfrost: Cleanup _shader_upper -> shader
I don't believe this is actually a tagged pointer; warn if it is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-10-29 13:02:53 +00:00
Eric Engestrom
b4f508ab59 meson: define _GNU_SOURCE on FreeBSD
_mesa_strtod() needs this to use strtod_l(), which behaves correctly
wrt `,` vs `.` decimal separator.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2008
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 12:12:58 +00:00
Lionel Landwerlin
1a2246a5e0 intel/perf: update ICL configurations
A few equations/programming changes for ICL.

v2: Fix a couple of issues in naming and floating/integer operations (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-29 13:00:26 +02:00
Alexandros Frantzis
1257d06ba7 gitlab-ci: Update required libdrm version
Commit 9edcce2a32 bumped the required libdrm-amdgpu version to
2.4.100. Update the version we use in our CI scripts to avoid CI
build failures.

Also bump the debian image name for this change to take effect.
Note that amdgpu is only built with the debian-buster image,
so only this image requires an update.

Fixes: 9edcce2a ("ac: get tcc_harvested from the kernel")
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-10-29 09:50:09 +00:00
Eric Engestrom
690d359b6f travis: fix scons build after deprecation warning
Fixes: 54053bc8d0 ("scons: Print a deprecation warning about using scons on not windows")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
2019-10-29 09:25:40 +00:00
Caio Marcelo de Oliveira Filho
e2155158e9 anv: Fix output of INTEL_DEBUG=bat for chained batches
The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those.  Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.

Fixes: 32ffd90002 ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 19:34:54 -07:00
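The head/tail distinction in e2155158e9 above can be illustrated with a toy ring buffer in which new elements are appended at the head, so the oldest element sits at the tail. The `toy_vector` type and its functions below are hypothetical and only mimic the behavior the commit message describes; they are not the Mesa `u_vector` API.

```c
#include <stddef.h>

#define TOY_CAPACITY 8   /* must be a power of two */

/* Toy ring buffer, loosely modeled on the behavior described above. */
struct toy_vector {
   int      data[TOY_CAPACITY];
   unsigned head;   /* counts elements ever added; newest is head - 1 */
   unsigned tail;   /* index of the oldest element still stored */
};

/* Append at the head; returns NULL when the buffer is full. */
static int *toy_vector_add(struct toy_vector *v)
{
   if (v->head - v->tail == TOY_CAPACITY)
      return NULL;
   return &v->data[v->head++ & (TOY_CAPACITY - 1)];
}

/* Oldest element (the first one added), or NULL if empty. */
static int *toy_vector_tail(struct toy_vector *v)
{
   if (v->head == v->tail)
      return NULL;
   return &v->data[v->tail & (TOY_CAPACITY - 1)];
}

/* Newest element (the last one added), or NULL if empty. */
static int *toy_vector_head(struct toy_vector *v)
{
   if (v->head == v->tail)
      return NULL;
   return &v->data[(v->head - 1) & (TOY_CAPACITY - 1)];
}
```

After adding three elements in order, `toy_vector_tail()` returns the first and `toy_vector_head()` the third, which is why a printer that must start from the first chained batch reads from the tail.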
Marek Olšák
f9fe86e02a winsys/amdgpu: use the new GPU reset query
2019-10-28 21:38:01 -04:00
Marek Olšák
9edcce2a32 ac: get tcc_harvested from the kernel
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 21:38:01 -04:00
Marek Olšák
4d1e43badb radeonsi: initialize shader compilers in threads on demand
It takes a noticeable amount of time with piglit.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-28 21:36:18 -04:00
Marek Olšák
1380db9fa8 radeonsi: don't print diagnostic LLVM remarks and notes
We don't use them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-28 21:36:18 -04:00
Timur Kristóf
c52ebbcea4 aco: Introduce vgpr_limit to keep track of available VGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Timur Kristóf
d59f702e26 aco: Implement subgroup shuffle in GFX10 wave64 mode.
Previously subgroup shuffle was implemented using the bpermute
instruction, which only works across half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that still allows subgroup
shuffle to be implemented relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
c2eebfe3ea aco: Remove dead code in reduction lowering.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
3865448012 aco: Fix reductions on GFX10.
Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan
with all reduction operations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Eric Engestrom
cd04b63c00 loader: default to iris for all future PCI IDs
The existing "fallback" code didn't actually do anything, so this
removes it, and instead we just always fall back to `iris` for future
PCI IDs.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 23:21:39 +00:00
Eric Engestrom
ea8116908c anv: add a couple printflike() annotations
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-28 23:17:16 +00:00
Erik Faye-Lund
21b7f79a76 st/mesa: lower global vars to local after lowering clip
When this code was merged, this wasn't necessary because the
state-tracker would do it later anyway. But this recently got changed,
without changing the code that depended on this.

Arguably, this was a mistake in the lowering pass to begin with. Either
way, let's fix it by not assuming that the lowering code gets called
later when it's not needed.

This fixed user-defined clip-planes in Zink.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: eaffdad108 ("st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-28 21:17:40 +00:00
Sagar Ghuge
3ac688b0c2 iris: Create resource with aux_usage MCS_CCS
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:02 -07:00
Sagar Ghuge
366fcbf2d8 intel/isl: Support lossless compression with multisamples
GEN12 adds the ability to losslessly compress each sample plane in a
multisampled buffer that uses MCS compression.

v2: Remove unnecessary assertion (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
758a6a3a00 iris: Get correct resource aux usage for copy
Add a case for MCS_CCS so that we get the correct aux usage during a
copy operation.

v2: Fix commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
e80bca6895 intel/blorp: Use isl_aux_usage_has_mcs instead of comparing
Depending on MCS_CCS or MCS we can emit blorp blit shaders.

As we support both MCS_CCS and MCS, it makes sense to use the
isl_aux_usage_has_mcs function.

v2: Fix commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
d156632374 iris: Define MCS_CCS state transitions and usages
v2: 1) Fix assertion check (Nanley Chery)
    2) Correct commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
2cd849cf17 iris: Initialize CCS to fast clear while using with MCS
v2: Explain Bspec quotes properly (Nanley Chery)

v3: 1) Fix comment format (Nanley Chery)
    2) Fix typo in comment (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Sagar Ghuge
2f0fbe06e6 intel/isl: Don't reconfigure aux surfaces for MCS
If aux for MCS is already configured, don't configure again.

v2: Fix missing period in commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2019-10-28 14:02:01 -07:00
Erik Faye-Lund
810fc75dab zink: emulate optional depth-formats
The Vulkan spec says that an implementation has to support one of
VK_FORMAT_X8_D24_UNORM_PACK32 and VK_FORMAT_D32_SFLOAT, as well as
one of VK_FORMAT_D24_UNORM_S8_UINT and VK_FORMAT_D32_SFLOAT_S8_UINT.

So let's keep track of which one of each pair is supported, and emulate
one on top of the other one.

This won't give the exact result for comparisons, or when mapping and
unmapping the resources. But it's better than flat out failing to create
the resource, and we can fix the map/unmap issue later if needed.

Tested-by: Duncan Hopkins <duncan@thefoundry.co.uk>
2019-10-28 17:57:49 +00:00
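The pairing rule in 810fc75dab above can be sketched as a small lookup: if the requested depth format is not supported, substitute the other member of its pair. The enum and helper names below are hypothetical stand-ins chosen to keep the example self-contained; a real implementation would use the `VkFormat` values and query support from the device instead of a hard-coded table.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the four Vulkan depth formats named above. */
enum depth_format {
   FMT_X8_D24_UNORM_PACK32,
   FMT_D32_SFLOAT,
   FMT_D24_UNORM_S8_UINT,
   FMT_D32_SFLOAT_S8_UINT,
   FMT_COUNT
};

/* Support flags, indexed by enum depth_format. How they are filled in
 * is outside the scope of this sketch. */
struct depth_support {
   bool supported[FMT_COUNT];
};

/* The other member of the pair that f belongs to. */
static enum depth_format depth_format_partner(enum depth_format f)
{
   switch (f) {
   case FMT_X8_D24_UNORM_PACK32: return FMT_D32_SFLOAT;
   case FMT_D32_SFLOAT:          return FMT_X8_D24_UNORM_PACK32;
   case FMT_D24_UNORM_S8_UINT:   return FMT_D32_SFLOAT_S8_UINT;
   case FMT_D32_SFLOAT_S8_UINT:  return FMT_D24_UNORM_S8_UINT;
   default:                      return f;
   }
}

/* Use the requested format if supported, otherwise emulate it on top
 * of its partner. At least one of each pair is assumed to be supported. */
static enum depth_format
pick_depth_format(const struct depth_support *s, enum depth_format wanted)
{
   return s->supported[wanted] ? wanted : depth_format_partner(wanted);
}
```

As the commit message notes, the substitute format has different precision, so comparisons and map/unmap results are not exact.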
Erik Faye-Lund
e6ea350fb0 zink: error if VK_KHR_maintenance1 isn't supported
While we're at it, remove the VK_-prefix from the extension bool; all
extensions have this so it's kinda superfluous.
2019-10-28 17:57:49 +00:00
Nanley Chery
d298740a1c iris: Disallow incomplete resource creation
If a modifier specifies an aux, it must be created.

Fixes: 75a3947af4 ("iris/resource: Fall back to no aux if creation fails")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
f2fc5dece9 iris: Don't leak the resource for unsupported modifier
Make sure the res struct is free'd before returning.

Fixes: 2dce0e94a3 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
7a619b5c75 iris: Enable HIZ_CCS sampling
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
8e7644e48f intel/blorp: Satisfy clear color rules for HIZ_CCS
Store the converted depth value into two dwords. Avoids regressing the
piglit test "fbo-depth-array depth-clear", when HIZ_CCS sampling is
enabled in a later commit.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
0aa308f420 intel: Fix and use HIZ_CCS write through mode
Write through to the CCS if the surface is used as a texture and can be
sampled by the HW with CCS.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
fee4dbcb4d iris: Start using blorp_can_hiz_clear_depth()
Check that the alignment requirements for HIZ_CCS are satisfied by using
this function.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00