Commit graph

108540 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
0aaf47f7cd anv: Initialize depth_bounds_test_enable when not explicitly set
This was causing uninitialized value to end up propagated to the
3DSTATE_DEPTH_BOUNDS packet, leading to asserts on packet
building due to the value being greater than 1.

Fixes: 939ddccb7a ("anv: Add support for depth bounds testing.")
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2019-11-13 10:13:27 -08:00
Alyssa Rosenzweig
771d23584a pan/midgard: Remove util/ra support
It's now unused, in favour of LCRA.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
e343f2ceb9 pan/midgard: Integrate LCRA
Pretty routine, we do have a hack to force swizzle alignment for !32-bit
for until we implement !32-bit the right way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
66ad64d73d pan/midgard: Implement linearly-constrained register allocation
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Alyssa Rosenzweig
fd81916ee5 pan/midgard: Add blend shader selection bits for MRT
This is less complicated than previously thought. Note we have no way of
specifying the work register count for blend shaders; it must be
strictly less than the work register count of the corresponding fragment
shader (which is fine since we force the fragment shader to report a
count of 16 with a blend shader as a major hack until we get register
pressure down for blend shaders).

TODO: pandecode the flags.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-13 15:27:56 +00:00
Christian Gmeiner
e101af8671 drm-shim: fix EOF case
Close input end of the pipe after data was written. Without this
fix I have seen a hang in sysfs_uevent_get(.., "OF_FULLNAME")
when key was not found.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-13 12:39:14 +00:00
Tapani Pälli
b12911c88e util/android: fix android build errors
Fixes: 9020f519 ("util/u_endian: Add error checks")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2078
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-13 12:31:31 +02:00
Erik Faye-Lund
4c1cef68cf zink: move drawing separate source
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree.
2019-11-13 09:14:05 +01:00
Erik Faye-Lund
589e8651e6 zink: move blitting to separate source
This code is kinda stand-alone, and it makes it a bit easier to find the
right source in the source-tree
2019-11-13 09:14:05 +01:00
Erik Faye-Lund
1605a0c8f2 zink: move filter-helper to separate helper-header
This will help code-reuse a bit in the next commit.
2019-11-13 09:12:36 +01:00
Erik Faye-Lund
36f3902213 zink: move format-checking to separate source
This code is more or less stand-alone, and this keeps the formats array
a bit more encapsulated.
2019-11-13 09:12:36 +01:00
Rob Clark
0f33c255d3 freedreno/ir3: remove unused parameter
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-12 13:57:52 -08:00
Rob Clark
df7a88dca3 freedreno/ir3: legalize cleanups
We can clear the "needs" flags once we emit a flag.  And also, don't
open-code the opcode name.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
b22617fb57 freedreno/ir3: fix gpu hang with pre-fs-tex-fetch
For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x,
even if it is not used in the shader (ie. only varying use is for tex
coords).  But if, for example, gl_FragCoord is used, it could get
assigned on top of bary_ij, resulting in a GPU hang.

The solution to this is two-fold: (1) the inputs/outputs rework has the
benefit of making RA realize bary_ij is a vec2, even if there are no
split/collect instructions (due to no varying fetches in the shader
itself).  And (2) extend the live ranges of meta:input instructions to
the first non-input, to prevent RA from assigning the same register to
multiple inputs.

Backport note: because of (1) above, a better solution for 19.3 would be
to revert f30c256ec0.

Fixes: f30c256ec0 ("freedreno/ir3: enable pre-fs texture fetch for a6xx")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
4bb697d938 freedreno/ir3: only tex instructions have wrmask
At the ir3 level, we would assume that we could use wrmask to mask
off other components of an instruction returning a vecN when they are
not used.  Which would let RA use components not written for other live
values.  But this is only true for tex instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
bdf6b7018c freedreno/ir3: re-work shader inputs/outputs
Allow inputs/outputs to be vecN (ie. whatever their actual size is), and
use split to get scalar components of inputs, and collect to gather up
scalar components of outputs.

The main motivation is to simplify RA, by only having to consider split/
collect to figure out where values need to land in consecutive scalar
registers, rather than having to also deal with left/right neighbors.

Because of varying packing, and the resulting fractional location
(location_frac), to implement load_input/store_output, it is still
convenient to have a table of scalar inputs/outputs.  We move this to
the compile ctx (since it is only needed for nir->ir3).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:57:52 -08:00
Rob Clark
2aae13f642 freedreno/ir3: simplify creating sysval inputs
In almost all places, the add_sysval_input() is paired directly with a
create_input().  (The one exception is frag shader ij bary coord, and
this exception will go away in a later patch.)

So go ahead and clean this up before reworking input/output handling.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
68d2ec5f7e freedreno/ir3: remove first-vertex sysval
This is a driver-param (loaded from uniform), not a sysval (populated by
hw into a register).  So it has no value to having a sysval slot.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
7b2166785a freedreno/ir3: helper to print ir if debug enabled
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
7a5f073da3 freedreno/ir3: show input/output wrmask's in disasm
Currently it is always 0x1 (scalar), but that will change in a later
patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
c00a67171c freedreno/ir3: add input/output iterators
We can at least get rid of the if-not-NULL check in a bunch of places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
b2417801e5 freedreno/ir3: remove impossible condition
We keep kill's alive w/ keeps these days, rather than a fake output.
This condition was left over from prior to that change.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
611258d578 freedreno/ir3: rename fanin/fanout to collect/split
If I'm going to refactor a bit to use these meta instructions to also
handle input/output, then might as well cleanup the names first.
Nouveau also uses collect/split for names of these meta instructions,
and I like those names better.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
4af86bd0b9 freedreno/ir3: remove half-precision output
This doesn't really work, we can't necessarily just change the outputs
to half-precision like this in anything but simple cases.

Keep the shader key entry around though, eventually with proper mediump
support we could use this with a nir pass to use lower precision frag
shader outputs when the render target format has <= 16b/component.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
089b105396 freedreno/ir3: fix valgrind complaint with STLW
The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but
`instr->regs[4]` is not.

```
Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'..
==29239== Invalid read of size 8
==29239==    at 0x5BE9CDC: emit_cat6 (ir3.c:841)
==29239==    by 0x5BEA1BF: ir3_assemble (ir3.c:921)
==29239==    by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133)
==29239==    by 0x5BDF193: assemble_variant (ir3_shader.c:162)
==29239==    by 0x5BDF407: create_variant (ir3_shader.c:215)
==29239==    by 0x5BDF4DB: shader_variant (ir3_shader.c:241)
==29239==    by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257)
==29239==    by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80)
==29239==    by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96)
==29239==    by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119)
==29239==    by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186)
==29239==    by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290)
==29239==  Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd
==29239==    at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so)
==29239==    by 0x61BD35B: ralloc_size (ralloc.c:119)
==29239==    by 0x61BD41B: rzalloc_size (ralloc.c:151)
==29239==    by 0x5BE599B: ir3_alloc (ir3.c:45)
==29239==    by 0x5BEA583: instr_create (ir3.c:984)
==29239==    by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000)
==29239==    by 0x5BEE317: ir3_STLW (ir3.h:1431)
==29239==    by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903)
==29239==    by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802)
==29239==    by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339)
==29239==    by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426)
==29239==    by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474)
==29239==
```

Probably this only triggers in non-optimized builds?

Fixes: 1f3b52ce50 ("freedreno/a6xx: Add register offset for STG/LDG")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rafael Antognolli
a4da6008b6 iris: Use mocs from isl_dev.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
d4f628235e anv: Use mocs settings from isl_dev.
v2: Remove device->default_mocs and external_mocs (Jason).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
2b01636ddb intel/isl: Add MOCS settings to isl_device.
Centralize mocs settings into isl.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rob Clark
d509a46225 freedreno: fix eglDupNativeFenceFD error
We can end up with scenarios where last_fence is associated with a batch
that is flushed through some other path before needs_out_fence_fd gets
set.  Resulting in returning a fence that has no backing fd.

The simplest thing is to just skip the optimization to try and avoid
no-op batches when a fence-fd is requested.  This should normally be
just once a frame anyways.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:38:16 -08:00
Brian Paul
bd49dedae0 nir: fix a couple signed/unsigned comparison warnings in nir_builder.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-12 11:44:02 -07:00
Brian Paul
a69e105361 s/APIENTRY/GLAPIENTRY/ in teximage.c
The later is the right symbol for entrypoint functions.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:44:01 -07:00
Lepton Wu
5c2d307a10 android: mesa: Revert "android: mesa: revert "Enable asm unconditionally""
Commit 45206d7673 fixed PIC issue of x86 asm stub.
We can enable asm for Android x86 now. This should sightly improve performance.

Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-11-12 18:09:43 +00:00
Rhys Perry
6914b0236f aco: combine read_invocation and shuffle implementations
They do mostly the same thing now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
2c98d79d11 aco: don't propagate vgprs into v_readlane/v_writelane
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
5a1bacb6f9 aco: fix read_invocation with VGPR lane index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
c877f4d320 nir/divergence: improve DA of shuffle
If the data is uniform, then it's really a uniform copy. If the index is
uniform, then it's really a read_invocation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
f97d933426 aco: fix shuffle with uniform operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
3204e83768 aco: use DPP instead of exec modification when lowering GFX10 shuffles
Seems we can use DPP's row_mask field to get an effect similar to
modifying exec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Daniel Schürmann
746b9380bd aco: rematerialize s_movk instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
b6f5085dfe aco: preserve kill flag on moved operands during RA
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
a2a6880743 aco: fix invalid access on Pseudo_instructions
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Erik Faye-Lund
5b09a7e2e4 zink: remove no-longer-needed hack
It seems whatever was causing this is no longer an issue. So let's get
rid of the hack here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-12 13:30:35 +00:00
Erik Faye-Lund
e1c87bbb4b zink: implement buffer-to-buffer copies 2019-11-12 12:40:49 +00:00
Erik Faye-Lund
9352991880 zink: always allow transfer to/from buffers 2019-11-12 12:40:49 +00:00
Danylo Piliaiev
d4c8182018 intel/blorp: Fix usage of uninitialized memory in key hashing
The automatically generated padding in structs contains
undefined values, force pack the structs to eliminate the
padding. Otherwise structs with the same values may generate
different hashes.

Valgrind output:

Conditional jump or move depends on uninitialised value(s)
 util_fast_urem32 (fast_urem_by_const.h:71)
 hash_table_search (hash_table.c:262)
 _mesa_hash_table_search (hash_table.c:296)
 anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318)
 anv_pipeline_cache_search (anv_pipeline_cache.c:335)
 lookup_blorp_shader (anv_blorp.c:38)
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112)
 blorp_mcs_partial_resolve (blorp_clear.c:1205)
 anv_image_mcs_op (anv_blorp.c:1742)
 anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774)
 transition_color_buffer (genX_cmd_buffer.c:1159)
 cmd_buffer_end_subpass (genX_cmd_buffer.c:4840)

Uninitialised value was created by a stack allocation
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103)

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:29 +02:00
Danylo Piliaiev
3349b4b056 i965/program_cache: Lift restriction on shader key size
This will allow usage of packed structs which may have size
not divisible by 4.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:24 +02:00
Jason Ekstrand
0c7e0c5599 spirv: Fix the MSVC build
Fixes: 9cc4c2c916 "spirv: Add a vtn_decorate_pointer helper"
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 08:34:55 +00:00
Erik Faye-Lund
9b8964d064 nir: patch up deref-vars when lowering clip-planes
Otherwise, we fail validation and potentially generate invalid code.
Let's fix up the mode of the accesses to the variable.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 09:13:22 +01:00
Samuel Pitoiset
bef7b2f805 ac: handle pointer types to LDS in ac_get_elem_bits()
This fixes crashes with some
dEQP-VK.spirv_assembly.instruction.spirv1p4.* tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-12 08:32:15 +01:00
Jonathan Marek
01cae57c80 freedreno: add Adreno 640 ID
A640 seems to work without any other changes (glmark and vkcube).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-11 20:46:01 -05:00