Commit graph

111705 commits

Author SHA1 Message Date
Ian Romanick
39f4dc23a5 intel/fs: Mark source 0 of bcsel as needing Boolean resolve
The other sources of the bcsel behave like the sources of an and or
other logical operation.  However, source zero behaves differently.
It is evaluated as a Boolean, so it needs to be resolved.

No shader-db changes, but the tests mentioned in the bug get a couple
instructions added back.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110857
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-06-11 12:12:07 -07:00
Rob Clark
f9f89df8bc freedreno/a5xx: enable a540
Tested-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-06-11 12:03:10 -07:00
Rob Clark
832010f6ac freedreno/a6xx: enable UBWC by default
Flip the FD_MESA_DEBUG flag to a disable rather than enable, drop the
obsolete comment (and bonus, drop unused softpin debug flag)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
81cc555e9a freedreno/a6xx: disallow UBWC for z24s8
This is slightly annoying because it *mostly* works.. but we have some
issues to sort out about how to blit z24s8/x24s8/z24x8 with UBWC before
we can enable UBWC by default.  For now it is a step forward to at least
enable it for non-z/s while we figure out how to blit z24s8+UBWC.

(The basic issue is that pretending z24s8 is an equivalently sized rgba
format for the purpose of blitting falls apart when UBWC is in the
picture.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
4f1319a17d freedreno/a6xx: use correct UBWC reg builders
No functional change, the registers have the same layout as MRT flags
pitch reg.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
d42ce659ed freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
490baa6974 freedreno/a6xx: disable UBWC for some formats
An older blob claims to support UBWC w/ r32ui an r32i, but not r32f.
Results from deqp indicate that it doesn't work with r32ui and r32i.

This *could* also just mean that use as "IBO" (image) is more limited
than as texture, although blob also doesn't seem to bother to try to use
UBWC with images at all, so hard to know for sure.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
8ddffa75c0 freedreno/a6xx: handle non-UWC-compatible image views
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
dac3bc9862 freedreno/a6xx: handle non-UBWC-compatible texture views
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
fe5c7b2b75 freedreno: add helper to uncompress UBWC resource
We'll need this for a few edge cases, like image/sampler view that uses
a format that UBWC does not support with a resource originally created
in a format that UBWC does support.

NOTE we *could* in some cases do an in-place uncompress.  But that has
a couple potential sharp edges:

 1) the uncompressed buffer could have different layout, ie. a5xx
    with meta and pixel data of layers/levels interleaved.

 2) if it comes mid-batch, it would force flush, or somehow fixing
    up cmdstream for draws already emitted.  But with the resource
    shadowing approach we can rely on batch re-ordering to avoid
    splitting things.. older draws see the older compressed version,
    newer draws see the new uncompressed version of the rsc.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
846b8a76bd freedreno: handle images in rebind_resource()
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
c6ae354299 freedreno: allow null discard box in shadow path
When uncompressing a UBWC buffer, we don't want to discard anything.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
12201d7a8b freedreno: swap UBWC state in shadow path
It doesn't come up yet, as so far we only hit this path with linear
buffers.  But it will when we start re-using the shadow path for
uncompressing UBWC buffers.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
3c9a31eb50 freedreno: add modifier param to fd_try_shadow_resource()
To uncompress UBWC, I want to re-use the shadow path, but we'll need a
way to request that the new buffer is not compressed.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Rob Clark
3b05a120a3 freedreno: correct modifier for UBWC buffers
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-06-11 10:55:27 -07:00
Chia-I Wu
15323c14fd virgl: consider newly created resources idle
A newly created resource can be regarded as idle.  We don't care if
the RESOURCE_CREATE command has been retired, unless it is used for
fencing.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-11 10:03:54 -07:00
Chia-I Wu
9e4452cfd9 virgl: make resource_wait/resource_is_busy cheaper
The round trip to the kernel is expensive.  Add a local cache to
avoid it when possible.

There is a race condition when two contexts access the same resource
at the same time (e.g., ctx1 submits a cmdbuf that accesses a
resource while ctx2 maps the resource).  But that is probably an app
bug in the first place.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-11 10:03:54 -07:00
Chia-I Wu
ddc90be907 virgl: add virgl_drm_{alloc,free,clear}_res_list
Helpers to work with resource list.  virgl_drm_release_all_res is
removed.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-11 10:03:54 -07:00
Chia-I Wu
71465fe569 virgl: do not cache external resources
We should not reuse a resource for other purposes when it can still
be accessed by another process or device.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-06-11 10:03:54 -07:00
Alyssa Rosenzweig
7d43999e63 panfrost: Enable AFBC on depth/stencil
This seems to be a performance win, but more rigorous testing is
necessary to figure out the exact circumstances when this is good/bad.
Incidentally, this fixes non-aligned ZS.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig
15f62b8e7c panfrost: Linear depth/stencil should be aligned
We might render to it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:46:43 -07:00
Alyssa Rosenzweig
d7ad29ce25 panfrost/midgard: Decode LOD/bias registers
For constant LODs/biases, we can use an immediate embedded in the
texture (already decoded); for non-constant, we have to use a register
squeezed into the usual immediate field, which is decoded here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
b4a3296e77 panfrost/midgard: Decode texture offset register swizzle
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
4e9e42cc56 panfrost/midgard/disasm: include textureGather()
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
6c18ae33bc panfrost/midgard: Support negative immediate offsets
It's not at all clear why this work for texelFetch but not texture.
Maybe the top bits are dual-purpose on other texturing ops...?

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
4d8157f12d panfrost/midgard: Fix redunant mask redundancy
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
3dee556c4e panfrost/midgard/disasm: Print LOD for texelFetch
Its encoding differs slightly from the LOD used in normal texture calls.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
cda9f32909 panfrost/midgard: Identify the in_reg_full field
This is clear for texelFetch, hence the confusion with Bifrost's filter
field, but it's much more general in reality.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
445a7b523f panfrost/midgard/disasm: Correctly dump bias/LOD
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
873a3ed342 panfrost/midgard/disasm: Cleanup texture op code
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
289405392d panfrost/midgard/disasm: Add missing space
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
f4ee8d055c panfrost/midgard/disasm: LOD immediate/register select
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:19 -07:00
Alyssa Rosenzweig
59fa7c95c8 panfrost/midgard/disasm: Use texture op name bare
This allows us to show a call to textureLod in a reasonable way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig
109460f03a panfrost/midgard/disasm: Varying perspective divides
With an extra flag, we're able to do a perspective division "for free"
while loading a varying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig
fc472007e7 panfrost/midgard: Add perspective division opcodes
...on the load/store unit, not the ALUs. Looks goofy but hey.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig
b0396d6dda panfrost/midgard: Print texture offsets
This patch identifies the two modes of offsets in a texture instruction
(immediate and register, disambiguated by the bit-once-known-as
"has_offset") and implements disassembly for both.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:18 -07:00
Alyssa Rosenzweig
ed1c48e91d panfrost/midgard: Expand texture to 4-channel swizzle
This eliminates some unknowns, clarifies 3D textures, and will maybe
help with array/shadow textures?

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-06-11 08:44:18 -07:00
Juan A. Suarez Romero
b586ed51f3 docs: update calendar, add news item and link release notes for 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2019-06-11 17:38:22 +02:00
Juan A. Suarez Romero
cc7fc7e319 docs: Add SHA256 sums for 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 2a5b4e2b9f)
2019-06-11 15:26:42 +00:00
Juan A. Suarez Romero
7e8e49475c docs: Add release notes for 19.1.0
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 1517811f4f)
2019-06-11 15:26:38 +00:00
Samuel Iglesias Gonsálvez
32e1d85cb6 radv: assert on inline uniform blocks in radv_CmdPushDescriptorSetKHR()
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().

These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:

"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
 that descriptor sets must not be allocated using this layout, and
 descriptors are instead pushed by vkCmdPushDescriptorSetKHR."

"If flags contains
 VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
 elements of pBindings must not have a descriptorType of
 VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".

There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-11 16:32:27 +02:00
Samuel Iglesias Gonsálvez
d0c52ff610 anv: ignore inline uniform blocks in anv_CmdPushDescriptorSetKHR()
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().

These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:

"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
that descriptor sets must not be allocated using this layout, and
descriptors are instead pushed by vkCmdPushDescriptorSetKHR."

"If flags contains
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
elements of pBindings must not have a descriptorType of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".

There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-06-11 16:25:53 +02:00
Eric Engestrom
773ff93bc4 egl: compare the whole list of attributes
`memcmp()` compares a given number of bytes, but `EGLAttrib` is larger than a byte.

Fixes: 8e991ce539 "egl: handle the full attrib list in display::options"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-06-11 12:18:09 +00:00
Eduardo Lima Mitev
3fb7b1fd35 freedreno/a5xx: Fix indirect draw max_indices calculation
The number of elements to draw should not be affected by the offset.

A similar fix was submitted for a6xx at 79180a05.

Fixes these dEQP tests on a5xx:

dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-06-11 08:28:45 +02:00
Samuel Pitoiset
40699f74b8 radv: remove extra assignment in radv_decompress_resolve_subpass_src()
baseArrayLayer is defined twice, trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-11 08:17:22 +02:00
Samuel Pitoiset
c39a1611ab radv: add radv_get_resolve_pipeline() helper in the graphics path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-11 08:06:42 +02:00
Samuel Pitoiset
b06d1f029d radv: do not decompress all image layers before resolving inside a subpass
When decompressing resolve source images, we should rely on the
framebuffer layer count instead of resolving all images layers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-11 08:06:39 +02:00
Samuel Pitoiset
4efbd963ec radv: initialize the aspect mask when decompressing resolve source images
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-11 08:06:35 +02:00
Samuel Pitoiset
c31a07fa85 radv: perform proper layout transitions before resolving
Use an explicit pipeline barrier for doing layout transitions
instead of duplicating some code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-11 08:06:32 +02:00
Samuel Pitoiset
92fa6264cb radv: do not resolve all image layers with compute inside a subpass
When resolving inside a subpass, we should rely on the framebuffer
layer count instead of resolving all images layers. This should
improve performance of layered resolves a bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-11 08:06:28 +02:00