
117072 commits

Author SHA1 Message Date
Marek Olšák
a8a0e5c03c radeonsi: don't ignore PIPE_FLUSH_ASYNC
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2019-04-26 15:44:39 -04:00
Eric Anholt
fb0611df3d v3d: Fix detection of TMU write sequences in register spilling.
We can't use the QPU functions to detect this until register allocation is
done and we've moved inst->dst into inst->qpu.

Fixes bad TMU sequences from register spilling in
KHR-GLES31.core.compute_shader.shared-max.
2019-04-26 12:42:30 -07:00
Eric Anholt
18894a5e5a v3d: Fix detection of the last ldtmu before a new TMU op.
We were looking at the start instruction, instead of scanning through the
list of following instructions to find any more ldtmus.
2019-04-26 12:42:30 -07:00
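The forward scan described in 18894a5e5a can be sketched roughly as below. This is only an illustration under stated assumptions: the `op_kind` enum and `is_last_ldtmu()` are hypothetical names, and the real v3d compiler walks its own instruction list rather than a flat array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds; the real v3d compiler uses its own IR. */
enum op_kind { OP_ALU, OP_TMU_WRITE, OP_LDTMU };

/*
 * Returns true if the ldtmu at index i is the last one of its TMU sequence.
 * Scans the instructions that follow index i and stops at the first TMU
 * write (the start of a new TMU op) or at the end of the list.
 */
static bool
is_last_ldtmu(const enum op_kind *insts, size_t count, size_t i)
{
   for (size_t j = i + 1; j < count; j++) {
      if (insts[j] == OP_LDTMU)
         return false;   /* another result load follows */
      if (insts[j] == OP_TMU_WRITE)
         return true;    /* a new TMU op begins before any more ldtmus */
   }
   return true;          /* end of the list: nothing else to load */
}
```

The point of the sketch is that the decision depends on the instructions after the ldtmu, not on the ldtmu itself.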
Eric Anholt
575caab895 v3d: Re-add support for memory_barrier_shared.
Looks like I lost it in a rebase conflict resolution.  We'd hit the
unknown intrinsic assertion in
KHR-GLES31.core.compute_shader.shared-struct.

Fixes: 6b1c659825 ("v3d: Add Compute Shader compilation support.")
2019-04-26 12:42:30 -07:00
Eric Anholt
971a13d805 Revert "v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER."
This reverts commit ccce940947, leaving a
note as to why we had to (corruption in chromium, breaking some GLES3.1
tests).
2019-04-26 12:42:30 -07:00
Eric Anholt
49071b2e3f v3d: Don't try to update the shadow texture for separate stencil.
There are two cases where v3d's sampler view's resource doesn't match the
base's: shadow textures for sampling from raster, and pointing at the
separate depth texture for z32f_s8x24.  We only want to update shadow for
the first case.

Fixes
dEQP-GLES31.functional.stencil_texturing.render.depth32f_stencil8_draw
when run after the previous testcase.
2019-04-26 12:42:30 -07:00
Eric Anholt
4358904c06 v3d: Add a note about i/o indirection for future performance work.
2019-04-26 12:42:30 -07:00
Eric Anholt
c74d0e7f62 vc4: Use _mesa_hash_table_remove_key() where appropriate.
2019-04-26 12:42:30 -07:00
Eric Anholt
d8486c2ad7 v3d: Use _mesa_hash_table_remove_key() where appropriate.
2019-04-26 12:42:30 -07:00
Eric Anholt
24587ae8ae v3d: Assert that we do request the normal texturing return data.
An unused tex should be DCEed, but if it wasn't we'd run into trouble with
not doing a TMUWT.
2019-04-26 12:42:30 -07:00
Eric Anholt
42210a4351 v3d: Apply the GFXH-930 workaround to the case where the VS loads attrs.
We were emitting a dummy load for when the VS doesn't load any attributes,
but we also need to emit a dummy load for when the render VS loads
attributes but the binner VS doesn't.  Fixes simulator assertion failures
and GPU hangs on KHR-GLES31.core.texture_gather.*
2019-04-26 12:42:30 -07:00
Eric Anholt
448fc3ea42 v3d: Fill in the ignored segment size fields to appease new simulator.
We are assured that the input segment size field is ignored for
!separate_segs mode, and now the simulator wants an in-range value set
regardless of whether it's functionally ignored or not.
2019-04-26 12:40:31 -07:00
Tapani Pälli
af06963d24 glsl: use empty brace initializer
Fixes the following warning with clang:
   warning: suggest braces around initialization of subobject

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-04-26 12:24:41 -07:00
coypu
976004d0e7 gbm: don't return void
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2019-04-26 12:04:26 -07:00
Tapani Pälli
7a7f182dac nir: use braces around subobject in initializer
Used the same syntax as elsewhere in the Mesa sources, verified result
against MSVC with godbolt.org.

Fixes the following warning with clang:
   warning: suggest braces around initialization of subobject

v2: empty braces -> braces around subobject (Caio, Kristian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-04-26 12:01:22 -07:00
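The brace style the nir commit settles on (braces around the first subobject) can be illustrated with a small stand-alone case. The struct names here are hypothetical and not taken from NIR. Empty braces (`{}`) are not valid ISO C before C23, which may be why v2 moved away from them; that motivation is an assumption.

```c
#include <assert.h>

/* Hypothetical aggregate with a nested aggregate as its first member. */
struct inner { int a; int b; };
struct outer { struct inner first; int c; };

static int
sum_zero_initialized(void)
{
   /* Braces around the first subobject; all remaining members are
    * zero-initialized as well. */
   struct outer o = { { 0 } };
   return o.first.a + o.first.b + o.c;
}
```

Writing `struct outer o = { 0 };` has the same effect, but it is the form that can trigger the "suggest braces around initialization of subobject" warning quoted above.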
Kristian H. Kristensen
a7c70bb2a1 freedreno/drm: Quiet pointer to u64 conversion warning
2019-04-26 11:58:44 -07:00
Alok Hota
8bfb34fd0a swr/rast: enforce use of tile offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-04-26 13:00:45 -05:00
Alok Hota
0e49963212 swr/rast: AVX512 support compiled in by default
- Emulation of AVX512 built into SIMDLIB
  - Remove associated macros
- Remove knobs controlling AVX512 and let emulation handle it
- Refactor variable names for SIMD16

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-04-26 13:00:38 -05:00
Alok Hota
0bf1df2bb6 swr/rast: Remove deprecated 4x2 backend code
- Use 8x2 tiling by default
  - Remove associated macros
- Use SIMDLIB emulation for SIMD16 on SIMD8 hardware
- Remove code rot in Load/StoreTile

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2019-04-26 13:00:24 -05:00
Tomasz Figa
e8bf4efceb llvmpipe: Always return some fence in flush (v2)
If there is no last fence, due to no rendering happening yet, just
create a new signaled fence and return it, to match the expectations of
the EGL sync fence API.

Fixes random "Could not create sync fence 0x3003" assertion failures from
Skia on Android, coming from the following code:

https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427

Reproducible especially with thread count >= 4.

One could make the driver always keep the reference to the last fence,
but:

 - the driver seems to explicitly destroy the fence whenever a rendering
   pass completes and changing that would require a significant functional
   change to the code. (Specifically, in lp_scene_end_rasterization().)

 - it still wouldn't solve the problem of an EGL sync fence being created
   and waited on without any rendering happening at all, which is
   also likely to happen with Android code pointed to in the commit.

Therefore, the simple approach of always creating a fence is taken,
similarly to other drivers, such as radeonsi.

Tested with piglit llvmpipe suite with no regressions and following
tests fixed:

egl_khr_fence_sync
 conformance
  eglclientwaitsynckhr_flag_sync_flush
  eglclientwaitsynckhr_nonzero_timeout
  eglclientwaitsynckhr_zero_timeout
  eglcreatesynckhr_default_attributes
  eglgetsyncattribkhr_invalid_attrib
  eglgetsyncattribkhr_sync_status

v2:
 - remove the useless lp_fence_reference() dance (Nicolai),
 - explain why creating the dummy fence is the right approach.

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-26 11:26:33 +01:00
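The approach taken in e8bf4efceb (return the last fence if there is one, otherwise a new fence that is already signalled) can be sketched as follows. The `fence` and `context` types and both function names are hypothetical simplifications, not llvmpipe's actual structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified fence and context types; not llvmpipe's own. */
struct fence { int refcount; bool signalled; };
struct context { struct fence *last_fence; };

static struct fence *
fence_create(bool signalled)
{
   struct fence *f = calloc(1, sizeof(*f));
   if (f) {
      f->refcount = 1;
      f->signalled = signalled;
   }
   return f;
}

/* Always hands the caller a fence it holds a reference to
 * (NULL only if allocation fails). */
static struct fence *
flush_get_fence(struct context *ctx)
{
   if (ctx->last_fence) {
      ctx->last_fence->refcount++;
      return ctx->last_fence;
   }
   /* No rendering has happened yet: return a fresh, signalled fence. */
   return fence_create(true);
}
```

A caller that waits on the returned fence then completes immediately in the no-rendering case instead of failing for lack of a fence.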
Emil Velikov
591955d82d llvmpipe: correctly handle waiting in llvmpipe_fence_finish
Currently if the timeout differs from 0, we'll end up with infinite
wait... even if the user is perfectly clear they don't want that.

Use the new lp_fence_timedwait() helper guarding both waits in an
!lp_fence_signalled block like the rest of llvmpipe.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-04-26 11:26:33 +01:00
Emil Velikov
5b284fe6bc llvmpipe: add lp_fence_timedwait() helper
The function is analogous to lp_fence_wait() while taking a timeout
(ns) parameter, as needed for EGL fence/sync.

v2:
 - use absolute UTC time, as per spec (Gustaw)
 - bail out on cnd_timedwait() failure (Gustaw)

v3:
 - check count/rank under mutex (Gustaw)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
2019-04-26 11:26:33 +01:00
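The v2 note on 5b284fe6bc mentions using absolute UTC time. One piece of that, turning a relative timeout in nanoseconds into an absolute deadline, might look like the sketch below. `deadline_from_timeout()` and the `SKETCH_NSEC_PER_SEC` macro are hypothetical names, and the sketch does not guard against overflow of `tv_sec` for very large timeouts.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000ull

/* Adds a relative timeout (ns) to a base time and keeps tv_nsec normalized
 * to the range [0, 1e9). */
static struct timespec
deadline_from_timeout(struct timespec now, uint64_t timeout_ns)
{
   struct timespec abs = now;
   uint64_t nsec = (uint64_t)now.tv_nsec + timeout_ns % SKETCH_NSEC_PER_SEC;

   abs.tv_sec += (time_t)(timeout_ns / SKETCH_NSEC_PER_SEC +
                          nsec / SKETCH_NSEC_PER_SEC);
   abs.tv_nsec = (long)(nsec % SKETCH_NSEC_PER_SEC);
   return abs;
}
```

In real use, `now` would typically come from `timespec_get(&now, TIME_UTC)`, and the result would be passed to C11 `cnd_timedwait()`, which expects an absolute TIME_UTC based time point rather than a duration.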
Emil Velikov
bd0c4e360d vulkan/wsi: don't use DUMB_CLOSE for normal GEM handles
Currently we get normal GEM handles from PrimeFDToHandle, yet we close
them with DUMB_CLOSE. Use GEM_CLOSE instead.

Fixes: da997ebec9 ("vulkan: Add KHR_display extension using DRM [v10]")
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Keith Packard <keithp@keithp.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-04-26 11:26:33 +01:00
Emil Velikov
c962a78f18 vulkan/wsi: check if the display_fd given is master
As effectively required by the extension, we need to ensure we're master.

Currently drivers employ vendor specific solutions, which check if the
device behind the fd is capable*, yet none of them do the master check.

*In the radv case, if acceleration is available.

Instead of duplicating the check in each driver, keep it where it's
needed and used.

Note this copies libdrm's drmIsMaster() to avoid depending on bleeding
edge version of the library.

v2: set the fd to -1 if not master (Bas)

Fixes: da997ebec9 ("vulkan: Add KHR_display extension using DRM [v10]")
Cc: Andres Rodriguez <andresx7@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Keith Packard <keithp@keithp.com>
Reported-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-04-26 11:26:33 +01:00
Emil Velikov
1a9367c134 turnip: drop dead close(master_fd)
The fd is -1, thus the block of if (fd != -1) close(fd) is dead code.

Cc: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-04-26 11:26:33 +01:00
Jason Ekstrand
00d4e78ea9 nir/algebraic: Optimize integer cast-of-cast
These have been popping up more and more with the OpenCL work and other
bits causing extra conversions to/from 64-bit.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-04-26 04:26:08 -05:00
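The commit message for 00d4e78ea9 does not list the exact patterns it adds. As a plain C illustration of why an integer cast-of-cast can collapse into one conversion, consider the cases below; the function names are hypothetical and this is not NIR algebraic syntax.

```c
#include <assert.h>
#include <stdint.h>

/* Widen to 64 bits, then narrow to 16: the intermediate type is redundant. */
static uint16_t
narrow_via_64(uint32_t x)
{
   return (uint16_t)(uint64_t)x;
}

/* The single conversion an optimizer could emit instead. */
static uint16_t
narrow_direct(uint32_t x)
{
   return (uint16_t)x;
}

/* Sign-extending 8 -> 64 and truncating to 32 equals sign-extending 8 -> 32. */
static int32_t
extend_via_64(int8_t x)
{
   return (int32_t)(int64_t)x;
}
```

In both cases the 64-bit intermediate never affects the bits that survive, which is the kind of extra conversion to/from 64-bit the message refers to.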
Jason Ekstrand
934f178341 anv/descriptor_set: Don't fully destroy sets in pool destroy/reset
In 105002bd2d, we fixed a memory leak bug where we weren't properly
destroying descriptor sets when destroying/resetting a descriptor pool.
However, the only real leak that happened was that we take a
reference to the descriptor set layout in the descriptor set and we
weren't dropping our reference.  Everything else in the descriptor set
is tied to the pool itself and doesn't need to be freed on a per-set
basis.  This commit changes the destroy/reset functions to only bother
walking the list of sets to unref the layouts and otherwise we just
assume that the whole-pool destroy/reset takes care of the rest.

Now that we're doing more non-trivial things with descriptor sets such
as allocating things with util_vma_heap, per-set destruction is starting
to show up on perf traces.  This takes reset back to where it's supposed
to be as a cheap whole-pool operation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-26 05:40:28 +00:00
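The idea in 934f178341 (walk the sets only to unref their layouts, then reclaim everything else in one whole-pool step) can be sketched with simplified stand-ins. All types, fields, and `pool_reset()` here are hypothetical and much simpler than the anv driver's real objects.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SETS 8

/* Hypothetical, simplified stand-ins for the driver's objects. */
struct set_layout { int refcount; };
struct descriptor_set { struct set_layout *layout; };
struct descriptor_pool {
   struct descriptor_set sets[MAX_SETS]; /* storage owned by the pool */
   size_t set_count;
   size_t next_offset;                   /* bump allocator cursor */
};

static void
pool_reset(struct descriptor_pool *pool)
{
   /* Per-set work is limited to dropping the layout reference. */
   for (size_t i = 0; i < pool->set_count; i++) {
      pool->sets[i].layout->refcount--;
      pool->sets[i].layout = NULL;
   }
   /* Everything else is reclaimed in one step for the whole pool. */
   pool->set_count = 0;
   pool->next_offset = 0;
}
```

The per-set loop stays cheap because it never touches an allocator; only the reference counts need individual attention.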
Jason Ekstrand
baf4802e3e anv: Better handle 32-byte alignment of descriptor set buffers
In c520f4dec9, we chose to align the sizes of descriptor set buffers to
32 bytes.  We have to align the descriptor set buffer to 32B so that
it's valid for using with push constants.  We align the size as well so
we don't leave lots of holes with util_vma_heap_alloc.  Unfortunately,
we were only aligning it for alloc and not for free so we were still
creating piles of holes when we delete descriptor sets.  This causes
terrible perf for the allocator once we've deleted piles of descriptor
sets.

This commit reworks the code so that we align the descriptor set buffer
size to 32B for both alloc and free.  The result is that it takes the
new crucible vkResetDescriptorPool from 104.567719 to 2.898354 seconds.

Fixes: c520f4dec9 "anv: Add a concept of a descriptor buffer"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110497
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-26 05:40:28 +00:00
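The alloc/free mismatch described in baf4802e3e comes down to simple arithmetic, sketched below. `align_u32()` mirrors a common round-up helper; `DESC_BUFFER_ALIGN` and `hole_if_freed_unaligned()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_BUFFER_ALIGN 32u

/* Rounds size up to the next multiple of a power-of-two alignment. */
static uint32_t
align_u32(uint32_t size, uint32_t alignment)
{
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Bytes left stranded if a block allocated with the aligned size is
 * freed with the raw size instead. */
static uint32_t
hole_if_freed_unaligned(uint32_t size)
{
   return align_u32(size, DESC_BUFFER_ALIGN) - size;
}
```

For a 40-byte descriptor buffer the allocation covers 64 bytes, so freeing only 40 would strand 24 bytes; using the same aligned size on both paths avoids that.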
Dave Airlie
d946cbe9f5 nir: fix bit_size in lower indirect derefs.
This fixes a case where we are expecting 64-bit but generate
32-bit consts and validate gets angry.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-04-26 12:59:43 +10:00
Kenneth Graunke
529ace7887 iris: Silence unused function warning
2019-04-25 17:33:56 -07:00
Marek Olšák
c5f65bfe6c glsl: fix shader_storage_blocks_write_access for SSBO block arrays (v2)
This fixes KHR-GL45.compute_shader.resources-max on radeonsi.

Fixes: 4e1e8f684b "glsl: remember which SSBOs are not read-only and pass it to gallium"

v2: use is_interface_array, protect against assertion failures in u_bit_consecutive

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-04-25 18:57:38 -04:00
Rob Clark
a6ab27dcab docs/features: update GL too
Forgot to update corresponding entries for desktop GL.. kinda wish we
didn't have to update both GLES and GL tables.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 15:48:19 -07:00
Rob Clark
7a57cfbed6 freedreno/a6xx: sample-shading support
Enables:

  OES_sample_shading
  OES_sample_variables
  OES_shader_multisample_interpolation

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
ee2e3a07bb freedreno/ir3: sample-shading support
The compiler support for:

  OES_sample_shading
  OES_sample_variables
  OES_shader_multisample_interpolation

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
85949c52b4 freedreno: wire up core sample-shading support
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
c8e825aaac freedreno/ir3: fix load_interpolated_input slot
The so->inputs[] table is in units of vec4

Fixes: 7ff6705b8d freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
49f922d96c freedreno/a6xx: add VALIDREG/CONDREG helper macros
There are a few places that we check if a shader stage input reg is
used/valid (ie. not r63.x).. and there are about to be a bunch more.
So add some helper macros for less open-coding.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
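Helper macros of the kind 49f922d96c describes could look something like the sketch below. The register encoding, the macro bodies, and `build_flags()` are all assumptions made for illustration; only the macro names and the "r63.x means unused" convention come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: register number in the upper bits, component in the
 * low two bits. */
#define REGID(num, comp)  ((uint32_t)(((num) << 2) | ((comp) & 0x3)))
#define INVALID_REG       REGID(63, 0)          /* r63.x marks "unused" */

/* Hypothetical helper macros in the spirit of the commit. */
#define VALIDREG(r)       ((r) != INVALID_REG)
#define CONDREG(r, val)   (VALIDREG(r) ? (val) : 0u)

/* Builds a flags word, setting a bit only when the matching input is used. */
static uint32_t
build_flags(uint32_t samp_id_reg, uint32_t face_reg)
{
   return CONDREG(samp_id_reg, 0x1u) | CONDREG(face_reg, 0x2u);
}
```

Compared with open-coding `reg != regid(63, 0) ? bit : 0` at each site, the macros keep the validity convention in one place.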
Rob Clark
f4b4d6cf23 freedreno/ir3: rename frag_vcoord -> ij_pixel
Since this is what the value actually is.  Cleanup the name before
adding more different i,j related values for sample-shading.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
5be415fc2b freedreno/ir3: remove bogus assert
tex instruction can actually return 16b values.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
2f0b9d2249 freedreno/ir3: lower load_barycentric_at_offset
Calculates i,j at specified offset within a pixel.  A new load_size_ir3
intrinsic is used in conjunction with fddx/fddy to translate the offset
into primitive space and adjust the i,j from load_barycentric_pixel
accordingly.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
c4f423aa36 freedreno/ir3: lower load_barycentric_at_sample
This lowers load_barycentric_at_sample to load_sample_pos_from_id plus
load_barycentric_at_offset.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
4e3ce224a7 freedreno: update generated headers
Pull in updates for sample shading.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
6d6ec2d4d2 freedreno/ir3: cleanup instruction builder macros
De-duplicate the "normal" and "flags" versions of the macros, and while
at it go ahead and add "flags" versions for all the remaining macros,
since we'll at least need INSTR1F in a following commit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
77b3b96a3b freedreno/ir3: more emit-cat5 fixes
Couple more opcodes which don't take a sampler id as first arg.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
9032f0690c freedreno/ir3: fix rgetpos decoding
It takes an argument.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
4d08c1b595 compiler: rename SYSTEM_VALUE_VARYING_COORD
And add corresponding enums for different sorts of varying
interpolation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
96d2e4ab8a freedreno: add robustness support
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:31 -07:00
Rob Clark
6503918689 freedreno/drm: update for robustness
Update UABI header and add FD_PP_PGTABLE and FD_NR_FAULTS params.

Robustness can be supported by a kernel which provides the new ABI if it
also indicates that per-process pagetables are in use.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-04-25 14:13:07 -07:00
Alyssa Rosenzweig
77d091d0c5 panfrost/midgard: Add new bitwise ops
These fused NOT-ops could maybe help somehow...?

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-25 20:37:46 +00:00
Alyssa Rosenzweig
bcabcfe3ad panfrost/midgard: Identify inand
This was previously thought to be inot, but it's actually a bit more
general than that! :)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-04-25 20:37:45 +00:00