Commit graph

108521 commits

Author SHA1 Message Date
Rob Clark
7a5f073da3 freedreno/ir3: show input/output wrmask's in disasm
Currently it is always 0x1 (scalar), but that will change in a later
patch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
c00a67171c freedreno/ir3: add input/output iterators
We can at least get rid of the if-not-NULL check in a bunch of places.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
b2417801e5 freedreno/ir3: remove impossible condition
We keep kill's alive w/ keeps these days, rather than a fake output.
This condition was left over from prior to that change.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
611258d578 freedreno/ir3: rename fanin/fanout to collect/split
If I'm going to refactor a bit to use these meta instructions to also
handle input/output, then might as well cleanup the names first.
Nouveau also uses collect/split for names of these meta instructions,
and I like those names better.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
4af86bd0b9 freedreno/ir3: remove half-precision output
This doesn't really work, we can't necessarily just change the outputs
to half-precision like this in anything but simple cases.

Keep the shader key entry around though, eventually with proper mediump
support we could use this with a nir pass to use lower precision frag
shader outputs when the render target format has <= 16b/component.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rob Clark
089b105396 freedreno/ir3: fix valgrind complaint with STLW
The instruction has 3 src regs, so `instr->regs[0..3]` are valid, but
`instr->regs[4]` is not.

```
Test case 'dEQP-GLES31.functional.shaders.linkage.es31.tessellation.varying.rules.output_superfluous_declaration'..
==29239== Invalid read of size 8
==29239==    at 0x5BE9CDC: emit_cat6 (ir3.c:841)
==29239==    by 0x5BEA1BF: ir3_assemble (ir3.c:921)
==29239==    by 0x5BDF0A7: ir3_shader_assemble (ir3_shader.c:133)
==29239==    by 0x5BDF193: assemble_variant (ir3_shader.c:162)
==29239==    by 0x5BDF407: create_variant (ir3_shader.c:215)
==29239==    by 0x5BDF4DB: shader_variant (ir3_shader.c:241)
==29239==    by 0x5BDF553: ir3_shader_get_variant (ir3_shader.c:257)
==29239==    by 0x5BA85F7: ir3_shader_variant (ir3_gallium.c:80)
==29239==    by 0x5BA7703: ir3_cache_lookup (ir3_cache.c:96)
==29239==    by 0x5B8B8B3: fd6_emit_get_prog (fd6_emit.h:119)
==29239==    by 0x5B8C137: fd6_draw_vbo (fd6_draw.c:186)
==29239==    by 0x5BB1FBB: fd_draw_vbo (freedreno_draw.c:290)
==29239==  Address 0xb97f2d0 is 0 bytes after a block of size 240 alloc'd
==29239==    at 0x4848D54: malloc (in /usr/lib/aarch64-linux-gnu/valgrind/vgpreload_memcheck-arm64-linux.so)
==29239==    by 0x61BD35B: ralloc_size (ralloc.c:119)
==29239==    by 0x61BD41B: rzalloc_size (ralloc.c:151)
==29239==    by 0x5BE599B: ir3_alloc (ir3.c:45)
==29239==    by 0x5BEA583: instr_create (ir3.c:984)
==29239==    by 0x5BEA5DF: ir3_instr_create2 (ir3.c:1000)
==29239==    by 0x5BEE317: ir3_STLW (ir3.h:1431)
==29239==    by 0x5BF12D3: emit_intrinsic_store_shared_ir3 (ir3_compiler_nir.c:903)
==29239==    by 0x5BF418B: emit_intrinsic (ir3_compiler_nir.c:1802)
==29239==    by 0x5BF5D07: emit_instr (ir3_compiler_nir.c:2339)
==29239==    by 0x5BF603F: emit_block (ir3_compiler_nir.c:2426)
==29239==    by 0x5BF624B: emit_cf_list (ir3_compiler_nir.c:2474)
==29239==
```

Probably this only triggers in non-optimized builds?

Fixes: 1f3b52ce50 ("freedreno/a6xx: Add register offset for STG/LDG")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-11-12 13:55:03 -08:00
Rafael Antognolli
a4da6008b6 iris: Use mocs from isl_dev.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
d4f628235e anv: Use mocs settings from isl_dev.
v2: Remove device->default_mocs and external_mocs (Jason).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rafael Antognolli
2b01636ddb intel/isl: Add MOCS settings to isl_device.
Centralize mocs settings into isl.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 20:41:52 +00:00
Rob Clark
d509a46225 freedreno: fix eglDupNativeFenceFD error
We can end up with scenarios where last_fence is associated with a batch
that is flushed through some other path before needs_out_fence_fd gets
set.  Resulting in returning a fence that has no backing fd.

The simplest thing is to just skip the optimization to try and avoid
no-op batches when a fence-fd is requested.  This should normally be
just once a frame anyways.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:38:16 -08:00
Brian Paul
bd49dedae0 nir: fix a couple signed/unsigned comparison warnings in nir_builder.h
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-12 11:44:02 -07:00
Brian Paul
a69e105361 s/APIENTRY/GLAPIENTRY/ in teximage.c
The later is the right symbol for entrypoint functions.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-12 11:44:01 -07:00
Lepton Wu
5c2d307a10 android: mesa: Revert "android: mesa: revert "Enable asm unconditionally""
Commit 45206d7673 fixed PIC issue of x86 asm stub.
We can enable asm for Android x86 now. This should sightly improve performance.

Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>
2019-11-12 18:09:43 +00:00
Rhys Perry
6914b0236f aco: combine read_invocation and shuffle implementations
They do mostly the same thing now.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
2c98d79d11 aco: don't propagate vgprs into v_readlane/v_writelane
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
5a1bacb6f9 aco: fix read_invocation with VGPR lane index
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
c877f4d320 nir/divergence: improve DA of shuffle
If the data is uniform, then it's really a uniform copy. If the index is
uniform, then it's really a read_invocation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Rhys Perry
f97d933426 aco: fix shuffle with uniform operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
2019-11-12 17:21:38 +00:00
Rhys Perry
3204e83768 aco: use DPP instead of exec modification when lowering GFX10 shuffles
Seems we can use DPP's row_mask field to get an effect similar to
modifying exec.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-11-12 17:21:38 +00:00
Daniel Schürmann
746b9380bd aco: rematerialize s_movk instructions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
b6f5085dfe aco: preserve kill flag on moved operands during RA
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Daniel Schürmann
a2a6880743 aco: fix invalid access on Pseudo_instructions
Fixes: 93c8ebfa78 aco: Initial commit of independent AMD compiler

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-11-12 15:59:48 +00:00
Erik Faye-Lund
5b09a7e2e4 zink: remove no-longer-needed hack
It seems whatever was causing this is no longer an issue. So let's get
rid of the hack here.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-11-12 13:30:35 +00:00
Erik Faye-Lund
e1c87bbb4b zink: implement buffer-to-buffer copies 2019-11-12 12:40:49 +00:00
Erik Faye-Lund
9352991880 zink: always allow transfer to/from buffers 2019-11-12 12:40:49 +00:00
Danylo Piliaiev
d4c8182018 intel/blorp: Fix usage of uninitialized memory in key hashing
The automatically generated padding in structs contains
undefined values, force pack the structs to eliminate the
padding. Otherwise structs with the same values may generate
different hashes.

Valgrind output:

Conditional jump or move depends on uninitialised value(s)
 util_fast_urem32 (fast_urem_by_const.h:71)
 hash_table_search (hash_table.c:262)
 _mesa_hash_table_search (hash_table.c:296)
 anv_pipeline_cache_search_locked (anv_pipeline_cache.c:318)
 anv_pipeline_cache_search (anv_pipeline_cache.c:335)
 lookup_blorp_shader (anv_blorp.c:38)
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1112)
 blorp_mcs_partial_resolve (blorp_clear.c:1205)
 anv_image_mcs_op (anv_blorp.c:1742)
 anv_cmd_predicated_mcs_resolve (genX_cmd_buffer.c:774)
 transition_color_buffer (genX_cmd_buffer.c:1159)
 cmd_buffer_end_subpass (genX_cmd_buffer.c:4840)

Uninitialised value was created by a stack allocation
 blorp_params_get_mcs_partial_resolve_kernel (blorp_clear.c:1103)

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:29 +02:00
Danylo Piliaiev
3349b4b056 i965/program_cache: Lift restriction on shader key size
This will allow usage of packed structs which may have size
not divisible by 4.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-12 13:59:24 +02:00
Jason Ekstrand
0c7e0c5599 spirv: Fix the MSVC build
Fixes: 9cc4c2c916 "spirv: Add a vtn_decorate_pointer helper"
Tested-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 08:34:55 +00:00
Erik Faye-Lund
9b8964d064 nir: patch up deref-vars when lowering clip-planes
Otherwise, we fail validation and potentially generate invalid code.
Let's fix up the mode of the accesses to the variable.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-12 09:13:22 +01:00
Samuel Pitoiset
bef7b2f805 ac: handle pointer types to LDS in ac_get_elem_bits()
This fixes crashes with some
dEQP-VK.spirv_assembly.instruction.spirv1p4.* tests.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-11-12 08:32:15 +01:00
Jonathan Marek
01cae57c80 freedreno: add Adreno 640 ID
A640 seems to work without any other changes (glmark and vkcube).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-11-11 20:46:01 -05:00
Luis Mendes
0cb5c96a83 radv: fix radv secure compile feature breaks compilation on armhf EABI and aarch64
__NR_select is not defined the same way across architectures, sometimes is
not even defined, like in armhf EABI and aarch64.

Signed-off-by: Luis Mendes <luis.p.mendes@gmail.com>

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2042
2019-11-12 11:47:20 +11:00
Marek Olšák
3a23af9f44 st/mesa: remove unused TGSI-only debug printing functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:12 -05:00
Marek Olšák
d29a332862 st/mesa: add ST_DEBUG=nir to print NIR shaders
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:10 -05:00
Marek Olšák
265abc54f8 st/mesa: print TCS/TES/GS/CS TGSI in the right place & keep disk cache enabled
The old place only printed on a disk cache miss, which is why the disk
cache was disabled.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:08 -05:00
Marek Olšák
98e27e5e28 st/mesa: remove \n being only printed in debug builds after printed TGSI
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:07 -05:00
Marek Olšák
c3351bb44b st/mesa: rename DEBUG_TGSI -> DEBUG_PRINT_IR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:45:04 -05:00
Marek Olšák
e00791c552 st/mesa: fix Sanctuary and Tropics by disabling ARB_gpu_shader5 for them
They use the "sample" keyword as a variable name.

Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-11-11 19:23:37 -05:00
Lionel Landwerlin
34f32a6d66 anv: implement VK_KHR_timeline_semaphore
v2: Fix inverted condition in vkGetPhysicalDeviceExternalSemaphoreProperties()

v3: Add anv_timeline_* helpers (Jason)

v4: Avoid variable shadowing (Jason)
    Split timeline wait/signal device operations (Jason/Lionel)

v5: s/point/signal_value/ (Jason)
    Drop piece of drm-syncobj timeline code (Jason)

v6: Add missing sync_fd semaphore signaling (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand
5a4f15ef2c anv: Plumb timeline semaphore signal/wait values through from the API
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
edc6606d4e anv/wsi: signal the semaphore in the acquireNextImage
We seem to have forgotten about the semaphore in the
acquireNextImageInfo.

v2: Signal semaphore/fence regardless of presentation status (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Jason Ekstrand
b10b455c1d anv: Lock around fetching sync file FDs from semaphores
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
246261f0ad anv: prepare the driver for delayed submissions
Timeline semaphore introduce support for wait before signal behavior,
which means that it is now allowed to call vkQueueSubmit() with wait
semaphores not yet submitted for execution. Our kernel driver requires
all of the wait primitives to be created before calling the execbuf
ioctl. As a result, we must delay submissions in the userspace driver.
This change store the necessary information to be able to delay a
VkSubmitInfo submission to the kernel driver.

v2: Fold count++ into array access (Jason)
    Move queue list to another patch (Jason)

v3: Document cleanup of temporary semaphores (Jason)

v4: Track semaphores of SYNC_FD type that needs updating after delayed
    submission

v5: Don't forget to update sync_fd in signaled semaphores after
    submission (Jason)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
3e22363537 anv: refcount semaphores
Delayed submissions required by timeline semaphores mean we need to be
able to update the sync fd backed semaphores in a delayed fashion.
This could mean a race between the application destroying the
semaphore and the submission code trying to update it with the new
sync fd.

This change prepares semaphores to be refcounted, we'll most likely
only take a reference for cases where we signal a sync fd semaphore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
3da798c9f1 anv: prepare driver to report submission error through queues
When we will submit to i915 from a submission thread, we won't be able
to directly report the error to the user (in particular through the
debug report callbacks). So prepare 2 paths to report errors device ->
notifying the user immediately, queue -> notifying the user the next
time an entry point is called.

In this change we still report directly for both paths, this will
change in the next commit.

v2: Split NULL batch parameter handling in
    anv_queue_submit_simple_batch() in a different commit

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
89de271bc2 anv: allow NULL batch parameter to anv_queue_submit_simple_batch
We can reuse device->trivial_batch_bo

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
f606c12731 anv: move queue init/finish to anv_queue.c
Prepare the queue initialization to take on more responsabilities and
possibly fail.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
206ab49ba1 anv: expose timeout helpers outside of anv_queue.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
2f4dcc8a1c anv: detach batch emission allocation from device
In the future we'll have 2 different allocations depending on whether
we're using threaded submission or not.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-11-11 21:46:51 +00:00
Lionel Landwerlin
935f8f0e56 anv: remove list items on batch fini
This doesn't seem to fix anything because those destroy() calls happen
right before the command buffer object & its list of batch_bo is also
destroyed. Still looks a bit cleaner.

v2: Found a second occurence

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Fixes: 26ba0ad54d ("vk: Re-name command buffer implementation files")
Cc: <mesa-stable@lists.freedesktop.org>
2019-11-11 21:46:51 +00:00