Commit graph

738 commits

Author SHA1 Message Date
Jason Ekstrand
bd56ce8ce5 anv: Implement VK_KHR_shader_atomic_int64
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
79fb0d27f3 anv: Implement SSBOs bindings with GPU addresses in the descriptor BO
This commit adds a new way for ANV to do SSBO bindings by just passing a
GPU address in through the descriptor buffer and using the A64 messages
to access the GPU address directly.  This means that our variable
pointers are now "real" pointers instead of a vec2(BTI, offset) pair.
This carries a few of advantages:

 1. It lets us support a virtually unbounded number of SSBO bindings.

 2. It lets us implement VK_KHR_shader_atomic_int64 which we couldn't
    implement before because those atomic messages are only available
    in the bindless A64 form.

 3. It's way better than messing around with bindless handles for SSBOs
    which is the only other option for VK_EXT_descriptor_indexing.

 4. It's more future looking, maybe?  At the least, this is what NVIDIA
    does (they don't have binding based SSBOs at all).  This doesn't a
    priori mean it's better, it just means it's probably not terrible.

The big disadvantage, of course, is that we have to start doing our own
bounds checking for robustBufferAccess again have to push in dynamic
offsets.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
e7a1e8f735 anv: Add a has_a64_buffer_access to anv_physical_device
This is more descriptive and a bit nicer than checking for gen >= 8 &&
use_softpin everywhere.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
146deec9ef anv/pipeline: Add skeleton support for spilling to bindless
If the number of surfaces or samplers exceeds what we can put in a
table, we will want to spill out to bindless.  There is no bindless
support yet but this gets us the basic framework that will be used by
later commits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
a5a0dc08f1 anv: Add a #define for the max binding table size
This also fixes a bug where we mis-calculate maximum binding table sizes
and may return true in vkGetDescriptorSetLayoutSupport even for sets too
large to fit in a binding table.

Fixes: ddc4069122 "anv: Implement VK_KHR_maintenance3"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
3b755b52e8 anv: Put image params in the descriptor set buffer on gen8 and earlier
This is really where they belong; not push constants.  The one downside
here is that we can't push them anymore for compute shaders.  However,
that's a general problem and we should figure out how to push descriptor
sets for compute shaders.  This lets us bump MAX_IMAGES to 64 on BDW and
earlier platforms because we no longer have to worry about push constant
overhead limits.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Jason Ekstrand
83b943cc2f anv: Make all VkDeviceMemory BOs resident permanently
We spend a lot of time in the driver adding things to hash sets to track
residency.  The reality is that a properly built Vulkan app uses large
memory objects and sub-allocates from them.  In a typical frame, most of
if not all of those allocations are going to be resident for the entire
frame so we're really not saving ourselves much by tracking fine-grained
residency.  Just throwing everything in the validation list does make it
a little bit more expensive inside the kernel to walk the list and
ensure that all our VA is in order.  However, without relocations, the
overhead of that is pretty small.

If we ever do run into a memory pressure situation where the fine-
grained residency could even potentially help, we would likely be
swapping one page out to make room for another within the draw call and
performance is totally lost at that point.  We're better off swapping
out other apps and just letting ours run a whole frame.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-19 19:56:42 +00:00
Lionel Landwerlin
dfd79079da anv: fix uninitialized pthread cond clock domain
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 843775bab7 ("anv: Rework fences")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-18 23:23:03 +01:00
Jason Ekstrand
db4a70e678 anv: Drop some unneeded ANV_FROM_HANDLE for physical devices
Ever since 48ed2a7bb0, we've had one at the top of the function.

Reviewed-by: Caio Marcelo de Oliveira Filho caio.oliveira@intel.com
2019-04-18 20:12:57 +00:00
Jason Ekstrand
981209d175 anv: Re-sort the GetPhysicalDeviceFeatures2 switch statement
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-04-18 20:12:57 +00:00
Iago Toral Quiroga
c2b8fb9a81 anv/device: expose VK_KHR_shader_float16_int8 in gen8+
v2 (Jason):
 - Merge shaderFloat16 and shaderInt8 enablement into a single patch.
 - Merge extension enable.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2019-04-18 13:23:03 +02:00
Tapani Pälli
624789e370 compiler/glsl: handle case where we have multiple users for types
Both Vulkan and OpenGL might be using glsl_types simultaneously or we
can also have multiple concurrent Vulkan instances using glsl_types.
Patch adds a one time init to track number of users and will release
types only when last user calls _glsl_type_singleton_decref().

This change fixes glsl_type memory leaks we have with anv driver.

v2: reuse hash_mutex, cleanup, apply fix also to radv driver and
    rename helper functions (Jason)

v3: move init, destroy to happen on GL context init and destroy

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-16 12:58:00 +03:00
Jason Ekstrand
90108deb27 anv: Update to use the new features struct names
These were updated in version 1.1.106 of vulkan.h to make more sense
with the extension names.  We may as well keep with the times.

Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-15 13:25:43 +00:00
Lionel Landwerlin
9e7b0988d6 anv: leave the top 4Gb of the high heap VMA unused
In 628c9ca908 I forgot to apply the same -4Gb of the high address
of the high heap VMA. This was previously computed in the
HIGH_HEAP_MAX_ADDRESS.

Many thanks to James for pointing this out.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Xiong, James <james.xiong@intel.com>
Fixes: 628c9ca908 ("anv: store heap address bounds when initializing physical device")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-13 12:08:23 +00:00
Lionel Landwerlin
628c9ca908 anv: store heap address bounds when initializing physical device
We can then reuse those bounds to initialize the VMA heaps at logical
device creation.

This fixes an issue on EHL which has only 36bits of VMA. We were
incorrectly using the fixed 48bits upper bound to initialize the
logical device heap, resulting in addresses beyong the device's
limits.

v2: Don't confuse heap size (limited by system memory) and VMA size
   (limited by number of addressing bits the platform has)

v3: Fix low heap vma_size :( (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
2019-04-11 22:56:43 +01:00
Juan A. Suarez Romero
ec7a33af58 anv: advertise 8 subtexel/mipmap precision bits
So far ANV was advertising 4 bits for both subTexelPrecisionBits and
mipmapPrecisionBits. But these values were not actually verified.

But it seems the right value is actually 8 bits for both cases.

Unfortunately Intel PRM does not clarify how many bits the hardware use.
For the mipmap case, there is the following reference in PRM Volume 6
(3D Media GPGPU), specifically in LOD Computation Pseudocode:

```
Bias: S4.8
MinLod: U4.8
MaxLod: U4.8
Base: U4.1
MIPCnt: U4
SurfMinLod: U4.8
ResMinLod: U4.8
``

We have other clues, though:

- On one side, dEQP-VK.texture.explicit_lod.* tests fail when using 4
bits, but work when using 8 bits. These tests try to mimic the expected
behaviour as much real as possible, and they use the reported
subTexelPrecisionBits and mipmapPrecisionBits reported to get this.

- On the other side, the equivalent driver for Windows is reporting 8
bits for both elements. Not sure if they got to verify it from the PRM
or from a diffent source.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-04-09 15:28:42 +00:00
Caio Marcelo de Oliveira Filho
45a4129392 anv: Implement VK_NV_compute_shader_derivatives
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-08 19:29:33 -07:00
Jason Ekstrand
ce47999cee Revert "anv/radv: release memory allocated by glsl types during spirv_to_nir"
This reverts commit 4e1bbb000c.  It turns
out that some DXVK apps due to some implementation detail of DXVK or
other create and destroy instances in an interleaved way.  Freeing the
glsl_type memory without being a bit more careful causes use-after-free
issues.  Looks like we need to try again.
2019-03-27 11:24:58 -05:00
Gurchetan Singh
620df57dbb anv: fix build on Nougat
AHardwareBuffer is only available on O and above.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-03-21 15:36:39 -07:00
Tapani Pälli
4e1bbb000c anv/radv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

   ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
   ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
   ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
   ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
   ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
   ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
   ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
   ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
   ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
   ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
   ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
   ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
   ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance
v3: apply fix also to radv driver

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-21 08:30:22 +02:00
Jason Ekstrand
9a129510f5 anv: Bump maxComputeWorkgroupInvocations
We initially set this lower because we didn't have SIMD32 support yet
but we've supported SIMD32 for quite some time now.  We should bump it
up to the real limit.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-20 09:26:56 -05:00
Jason Ekstrand
887041c763 anv: Implement VK_EXT_host_query_reset
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-18 14:48:41 +00:00
Jason Ekstrand
13099d4490 anv: Stop using VK_TRUE/FALSE
We've been fairly inconsistent about this so we should really choose
whether we're going to use VK_TRUE/FALSE or the C boolean values.  The
Vulkan #defines are set to 1 and 0 respectively so it's the same value
as C gives you when you cast a boolean expression to an integer.  Since
there are several places where we set a VkBool32 to a C logical
expression, let's just embrace C booleans and stop using the VK defines.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-13 17:58:27 -05:00
Tapani Pälli
bef354321b anv: revert "anv: release memory allocated by glsl types during spirv_to_nir"
This reverts commit 47fc359822.

Reason is that patch did not take in to account situation where we might
have both OpenGL and Vulkan using glsl_types at the same time.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-12 14:12:36 +02:00
Tapani Pälli
47fc359822 anv: release memory allocated by glsl types during spirv_to_nir
Fixes leaks for each glsl_type generated:

  ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18
  ==32470==    at 0x483880B: malloc (vg_replace_malloc.c:309)
  ==32470==    by 0x4C43F4A: ralloc_size (ralloc.c:119)
  ==32470==    by 0x4C44014: rzalloc_size (ralloc.c:151)
  ==32470==    by 0x4C44258: rzalloc_array_size (ralloc.c:215)
  ==32470==    by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114)
  ==32470==    by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146)
  ==32470==    by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501)
  ==32470==    by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269)
  ==32470==    by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018)
  ==32470==    by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365)
  ==32470==    by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490)
  ==32470==    by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173)

v2: move release call to vkDestroyInstance

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-11 13:14:45 +02:00
Timothy Arceri
051b4064da anv: add support for dumping shader info via VK_EXT_debug_report
This information will be used by the vkpipeline-db tool.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-11 16:16:04 +11:00
Kenneth Graunke
4787bc944a isl: Add a swizzle parameter to isl_buffer_fill_state()
This is necessary for legacy texture buffer object formats, where we'll
need to use a swizzle to fake e.g. luminance.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-07 11:39:27 -08:00
Lionel Landwerlin
acb50d6b1f intel/decoders: handle decoding MI_BBS from ring
An MI_BATCH_BUFFER_START in the ring buffer acts as a second level
batchbuffer (aka jump back to ring buffer when running into a
MI_BATCH_BUFFER_END).

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-03-07 15:08:31 +00:00
Caio Marcelo de Oliveira Filho
69cc6272fb anv: Implement VK_EXT_external_memory_host
v2: Ignore the import if handleType == 0. (Jason)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 12:59:50 -08:00
Jason Ekstrand
43f40dc7cb anv: Implement VK_EXT_inline_uniform_block
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-05 10:06:50 -06:00
Tapani Pälli
3bb8768b9d anv: toggle on support for VK_EXT_ycbcr_image_arrays
We already propagate coord_components correctly and did not have
layer restrictions for ycbcr formats.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-03-05 10:39:17 +00:00
Lionel Landwerlin
32ffd90002 anv: add support for INTEL_DEBUG=bat
As requested by Ken ;)

v2: Also decode simple batches (Caio)
    Fix u_vector usage issues (Lionel)

v3: Make binding/instruction/state/surface available (Lionel)

v4: Going through device pools for simple batches (Lionel)
    Centralize search BO callbacks into anv_device.c (Lionel)

v5: Clear decoded batch buffer var after use (Caio)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-03-02 12:53:21 +00:00
Juan A. Suarez Romero
4f917e6a61 anv: advertise 8 subpixel precision bits
On one side, when emitting 3DSTATE_SF, VertexSubPixelPrecisionSelect is
used to select between 8 bit subpixel precision (value 0) or 4 bit
subpixel precision (value 1). As this value is not set, means it is
taking the value 0, so 8 bit are used.

On the other side, in the Vulkan CTS tests, if the reference rasterizer,
which uses 8 bit precision, as it is used to check what should be the
expected value for the tests, is changed to use 4 bit as ANV was
advertising so far, some of the tests will fail.

So it seems ANV is actually using 8 bits.

v2: explicitly set 3DSTATE_SF::VertexSubPixelPrecisionSelect (Jason)

v3: use _8Bit definition as value (Jason)

v4: (by Jason)
anv: Explicitly set 3DSTATE_CLIP::VertexSubPixelPrecisionSelect

This field was added on gen8 even though there's an identically defined
one in 3DSTATE_SF.

CC: Jason Ekstrand <jason@jlekstrand.net>
CC: Kenneth Graunke <kenneth@whitecape.org>
CC: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-22 17:53:55 +01:00
Lionel Landwerlin
f509213675 anv: implement VK_EXT_depth_clip_enable
A new extension allowing the user to explictly specify the clipping
behavior.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-20 09:57:58 +00:00
Eric Engestrom
f1374805a8 drm-uapi: use local files, not system libdrm
There was an issue recently caused by the system header being included
by mistake, so let's just get rid of this include path and always
explicitly #include "drm-uapi/FOO.h"

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
2019-02-14 11:20:00 +00:00
Caio Marcelo de Oliveira Filho
5299c9cbcc anv: skip bit6 swizzle detection in Gen8+
It is always false on Gen8+.  Also, move the variable definition near
its use.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-04 20:44:41 -08:00
Jason Ekstrand
48ed2a7bb0 anv: Implement VK_EXT_buffer_device_address
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-02-01 17:09:42 -06:00
Eric Engestrom
9af77fcf98 anv: drop always-successful VkResult
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-25 09:45:27 +00:00
Jason Ekstrand
ac0f8a6ea0 anv: Implement transform feedback queries
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:57 -06:00
Jason Ekstrand
2be89cbd82 anv: Implement vkCmdDrawIndirectByteCountEXT
Annoyingly, this requires that we implement integer division on the
command streamer.  Fortunately, we're only ever dividing by constants so
we can use the mulh+add+shift trick and it's not as bad as it sounds.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Jason Ekstrand
36ee2fd61c anv: Implement the basic form of VK_EXT_transform_feedback
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-22 10:42:56 -06:00
Danylo Piliaiev
1952fd8d2c anv: Implement VK_EXT_conditional_rendering for gen 7.5+
Conditional rendering affects next functions:
- vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
- vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
- vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
- vkCmdClearAttachments

Value from conditional buffer is cached into designated register,
MI_PREDICATE is emitted every time conditional rendering is enabled
and command requires it.

v2: by Jason Ekstrand
  - Use vk_find_struct_const instead of manually looping
  - Move draw count loading to prepare function
  - Zero the top 32-bits of MI_ALU_REG15

v3: Apply pipeline flush before accessing conditional buffer
 (The issue was found by Samuel Iglesias)

v4: - Remove support of Haswell due to possible hardware bug
    - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to
       define registers in one place.

v5: thanks to Jason Ekstrand and Lionel Landwerlin
    - Workaround the fact that MI_PREDICATE_RESULT is not
      accessible on Haswell by manually calculating
      MI_PREDICATE_RESULT and re-emitting MI_PREDICATE
      when necessary.

v6: suggested by Lionel Landwerlin
    - Instead of calculating the result of predicate once - re-emit
      MI_PREDICATE to make it easier to investigate error states.

v7: suggested by Jason
    - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL
      if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.

v8: suggested by Lionel
    - Precompute conditional predicate's result to
      support secondary command buffers.
    - Make prepare_for_draw_count_predicate more readable.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 18:31:44 +00:00
Rafael Antognolli
643248b66a anv: Remove state flush.
We have all the state buffers snooped, so we don't need to clflush
everything anymore.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-17 15:08:22 -08:00
Iago Toral Quiroga
f92c5bc8f3 anv/device: fix maximum number of images supported
We had defined MAX_IMAGES as 8, which we used to size the array for
image push constant data. The comment there stated that this was for
gen8, but anv_nir_apply_pipeline_layout runs for all gens and writes
that array, asserting that we don't exceed that number of images,
which imposes a limit of MAX_IMAGES on all gens.

Furthermore, despite this, we are exposing up to 64 images per shader
stage on all gens, gen8 included.

This patch lowers the number of images we expose in gen8 to 8 and
keeps 64 images for gen9+ while making sure that only pre-SKL gens
use push constant space to handle images.

v2:
 - <= instead of < in the assert (Eric, Lionel)
 - Change the way the assertion is written (Eric)

v3:
 - Revert the way the assertion is written to the form it had in v1,
   the version in v2 was not equivalent and was incorrect. (Lionel)

v4:
 - gen9+ doesn't need push constants for images at all (Jason)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (v3)
2019-01-17 07:59:00 +01:00
Jason Ekstrand
5e4f9ea363 anv: Implement VK_KHR_depth_stencil_resolve 2019-01-14 10:16:52 -06:00
Eric Engestrom
4f5a526789 anv: drop unneeded KHR suffix
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:47:56 +00:00
Jason Ekstrand
754eff07d2 anv: Sort properties and features switch statements
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-01-07 18:41:15 -06:00
Tapani Pälli
f1654fa7e3 anv/android: support creating images from external format
Since we don't know the exact format at creation time, some initialization
is done only when bound with memory in vkBindImageMemory.

v2: demand dedicated allocation in vkGetImageMemoryRequirements2 if
    image has external format

v3: refactor prepare_ahw_image, support vkBindImageMemory2,
    calculate stride correctly for rgb(x) surfaces, rename as
    'resolve_ahw_image'

v4: rebase to b43f955037 changes
v5: add some assertions to verify input correctness (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
c79a528d2b anv/android: support import/export of AHardwareBuffer objects
v2: add support for non-image buffers (AHARDWAREBUFFER_FORMAT_BLOB)
v3: properly handle usage bits when creating from image
v4: refactor, code cleanup (Jason)
v5: rebase to b43f955037 changes,
    initialize bo flags as ANV_BO_EXTERNAL (Lionel)
v6: add assert that anv_bo_cache_import succeeds, add comment
    about multi-bo support to clarify current implementation (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00
Tapani Pälli
5c65c60d6c anv: refactor, remove else block in AllocateMemory
This makes it cleaner to introduce more cases where we import memory
from different types of external memory buffers.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-19 09:38:41 +02:00