Commit graph

2341 commits

Author SHA1 Message Date
Plamena Manolova
4fe2317601 anv: Implement new way for setting streamout buffers.
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-29 19:21:20 +00:00
Plamena Manolova
f9ad73cdfd anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures.
Add depth bounds testing to the list of supported
physical device features.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-29 16:05:33 +00:00
Caio Marcelo de Oliveira Filho
e2155158e9 anv: Fix output of INTEL_DEBUG=bat for chained batches
The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those.  Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.

Fixes: 32ffd90002 ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 19:34:54 -07:00
Eric Engestrom
ea8116908c anv: add a couple printflike() annotations
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-28 23:17:16 +00:00
Nanley Chery
6451008e8b intel: Refactor blorp_can_hiz_clear_depth()
Prepare this function to be used in iris and to handle new Gen12 behavior.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:06 -07:00
Nanley Chery
c50f8b2fc9 intel: Support HIZ_CCS in isl_surf_get_ccs_surf
Add an extra aux parameter which will be filled out with CCS if the
first two isl_surf parameters fit the requirements for HiZ_CCS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
8af1853331 anv/private: Modify aux slice helpers for Gen12 CCS
The isl_surf structs for Gen12's CCS won't describe how many slices in
the main surface can be compressed. All slices will be compressable if
CCS is enabled, so lookup the main surface's logical dimension.

v2. Add a space before a `?`. (Jordan)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:05 -07:00
Nanley Chery
300d77c2fa anv/cmd_buffer: Don't assume CCS_E includes CCS_D
There's no longer a clear-only compression mode of CCS on Gen12+.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:05 -07:00
Nanley Chery
4f0b5f9732 anv/image: Disable CCS_D on Gen12+
Clear-only compression no longer exists on TGL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
0eaf293b47 anv/formats: Disable I915_FORMAT_MOD_Y_TILED_CCS on TGL+
The format of the CCS has changed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 10:47:04 -07:00
Nanley Chery
d0fcc2dd50 anv: Properly allocate aux-tracking space for CCS_E
add_aux_state_tracking_buffer() actually checks the aux usage when
determining how many dwords to allocate for state tracking. Move the
function call to the point after the CCS_E aux usage is assigned.

Fixes: de3be61801 ("anv/cmd_buffer: Rework aux tracking")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:04 -07:00
Nanley Chery
698d723a6d anv/blorp: Use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR
Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+.

Fixes: 6c9f9a82d7 ("intel/genxml,isl: Add gen12 render surface state changes")
Reported-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-10-28 10:47:04 -07:00
Plamena Manolova
939ddccb7a anv: Add support for depth bounds testing.
In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction
to enable depth bounds testing.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-28 14:13:04 +00:00
Timothy Arceri
7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(eg. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Lionel Landwerlin
6af8a4acc4
anv: Add aux-map translation for gen12+
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-28 00:09:14 -07:00
Jordan Justen
7737f56544
anv/gen12: Write GFX_AUX_TABLE base address register
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:14 -07:00
Jordan Justen
d4a3299ba1
anv/gen12: Initialize aux map context
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:13 -07:00
Jordan Justen
062022f2e4
anv: Implement aux-map allocator interface
This interface allows the aux-map code in the intel/common library to
allocate and free buffers.

Reworks:
 * free gen_buffer in gen_aux_map_buffer_free. (Rafael)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-28 00:09:13 -07:00
Eric Engestrom
493903199c anv: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:09:14 +00:00
Caio Marcelo de Oliveira Filho
06aecb14c0 anv: Implement VK_KHR_vulkan_memory_model
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-24 11:39:56 -07:00
Eric Engestrom
47571a01ec anv: fix error message
`strerror()` takes an `errno`, not the negative value returned by the
`ioctl()`.
Instead of fixing this as `"%s", strerror(errno)`, let's just use the
`"%m"` shortcut for it.

Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-24 13:57:40 +00:00
Lionel Landwerlin
2b5f30b1d9 anv: implement VK_INTEL_performance_query
v2: Introduce the appropriate pipe controls
    Properly deal with changes in metric sets (using execbuf parameter)
    Record marker at query end

v3: Fill out PerfCntr1&2

v4: Introduce vkUninitializePerformanceApiINTEL

v5: Use new execbuf extension mechanism

v6: Fix comments in genX_query.c (Rafael)
    Use PIPE_CONTROL workarounds (Rafael)
    Refactor on the last kernel series update (Lionel)

v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-23 05:41:15 +00:00
Lionel Landwerlin
0dfa643feb anv: fix unwind of vkCreateDevice fail
We're skipping the context destruction in some cases which is the
grand scheme of thing is not that important because closing device->fd
will destroy the associated context as well.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Fixes: b30e01aef5 ("anv: fix memory leak on device destroy")
2019-10-22 20:44:26 +00:00
Lionel Landwerlin
b30e01aef5 anv: fix memory leak on device destroy
v2: handle vma destruction if vkCreateDevice fails (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1959
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2019-10-20 08:02:22 +00:00
Lionel Landwerlin
3f8f52b241 anv: fix vkUpdateDescriptorSets with inline uniform blocks
With inline uniform blocks descriptor, the meaning of descriptorCount
is a number of bytes to copy into the descriptor. Don't try to use
that size as an index into the descriptor table.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 43f40dc7cb ("anv: Implement VK_EXT_inline_uniform_block")
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1195
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-19 13:16:40 +03:00
Caio Marcelo de Oliveira Filho
58286c7969 anv: Advertise VK_KHR_spirv_1_4
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-14 08:25:42 -07:00
Eric Engestrom
960038d550 anv: add exported symbols check
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-13 17:40:47 +01:00
Marek Olšák
dd4cc56ebd nir: add a strip parameter to nir_serialize
so that drivers don't have to call nir_strip manually.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-10-10 15:47:07 -04:00
Jason Ekstrand
c7e5d24d8f anv/pipeline: Capture serialized NIR
This allows the serialized NIR to be displayed in RenderDoc and similar
tools.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-09 22:28:01 +00:00
Caio Marcelo de Oliveira Filho
44978baece anv: Disable fast clears when running with INTEL_DEBUG=nofc
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2019-10-09 13:29:26 -07:00
Caio Marcelo de Oliveira Filho
9560c9b498 anv: Enable VK_EXT_shader_subgroup_{ballot,vote}
Anvil now supports and passes Vulkan CTS tests matching

    dEQP-VK.subgroups.*.ext_shader_subgroup_ballot.*
    dEQP-VK.subgroups.*.ext_shader_subgroup_vote.*

and crucible tests matching

    func.shader-ballot.*
    func.shader-subgroup-vote.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-08 16:34:00 -07:00
Tapani Pälli
e4a826b2c8 anv/android: fix images created with external format support
This fixes a case where user first creates image and then later binds it
with memory created from AHW buffer.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-08 07:19:05 +03:00
Caio Marcelo de Oliveira Filho
f7ca072ab2 anv: Implement VK_KHR_shader_clock
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-07 09:12:12 -07:00
Rafael Antognolli
cdc331c6f9 anv/block_pool: Align anv_block_pool state to 64 bits.
On 64 bits platforms, some atomic operations like __sync_fetch_and_add()
have constant time, but on 32 bits platforms they are implemented with a
loop and might take much longer.

Additionally, it seems like if their operands are not aligned to 64
bits, they also require extra memory accesses. From the Intel
Architecture's Developer Manual Vol. 1, 4.1.1:

 "A word or doubleword operand that crosses a 4-byte boundary or a
 quadword operand that crosses an 8-byte boundary is considered
 unaligned and requires two separate memory bus cycles for access."

Forcing the u64 field to be aligned to 64 bits seems to make the unit
tests that are stressing this finish much faster.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-03 12:40:33 -07:00
Lionel Landwerlin
da2d67fc3b anv: gem-stubs: return a valid fd got anv_gem_userptr()
Fixes invalid close(-1) in the unit tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-25 22:02:51 +03:00
Kenneth Graunke
b9e93db208 intel: Increase Gen11 compute shader scratch IDs to 64.
From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9a ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-09-23 16:59:40 -07:00
Kenneth Graunke
50c0dd8621 Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"
This reverts commit 729de1488f.

It turns out that, although the register is in the logical context,
it isn't whitelisted, so we can't actually write it from userspace
batch buffers.  The write just becomes a noop, which is why we saw
no performance changes.

I manually whitelisted it, and still observed no performance gains, but
it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments
on the iris driver.  So we might need to fix something before enabling
this.  To prevent it randomly getting turned on should the kernel ever
whitelist this register, we revert the patch for now.
2019-09-23 16:31:23 -07:00
Jason Ekstrand
7d861ab812 anv: Advertise VK_KHR_shader_subgroup_extended_types
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
2019-09-20 18:02:15 +00:00
Eric Engestrom
3c1a24de07 anv: implement ICD interface v4
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-20 08:31:58 +00:00
Eric Engestrom
19db95e78e anv: split instance dispatch table
This effectively breaks the instance dispatch table in 2 with entry
points using a physical device as first argument getting their own
dispatch table.

As a result we now have to check instance & physical device dispatch
table instead of just the instance dispatch table before.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-09-20 08:31:58 +00:00
Jason Ekstrand
0c4e89ad5b Move blob from compiler/ to util/
There's nothing whatsoever compiler-specific about it other than that's
currently where it's used.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-19 19:56:22 +00:00
Arcady Goldmints-Orlov
5ec5fecc26 anv: fix descriptor limits on gen8
Later generations support bindless for samplers, images, and buffers and
thus per-stage descriptors are not limited by the binding table size.
However, gen8 doesn't support bindless images and thus needs to report a
lower per-stage limit so that all combinations of descriptors that fit
within the advertised limits are reported as supported by
vkGetDescriptorSetLayoutSupport.

Fixes test dEQP-VK.api.maintenance3_check.descriptor_set
Fixes: 79fb0d27f3 ("anv: Implement SSBOs bindings with GPU addresses in the descriptor BO")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-19 09:10:40 -05:00
Samuel Iglesias Gonsálvez
f5dd6dfe01 anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls
This adds support for
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and
enables de Vulkan and SPIR-V extensions.

Also, notice that this includes the updates applied to the
VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension
VK_KHR_shader_float_controls v4 and Vulkan 1.1.116.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-17 23:39:19 +03:00
Lionel Landwerlin
0616b7ac90 vulkan: add vk_x11_strict_image_count option
This option strictly allocate the minImageCount given by the
application at swapchain creation.

This works around application that do not deal with the fact that the
implementation allocates more images than the minimum specified.

v2: Add values in default drirc (Bas)

v3: specify engine name/version (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111522
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Lionel Landwerlin
04dc6074cf driconfig: add a new engine name/version parameter
Vulkan applications can register with the following structure :

typedef struct VkApplicationInfo {
    VkStructureType    sType;
    const void*        pNext;
    const char*        pApplicationName;
    uint32_t           applicationVersion;
    const char*        pEngineName;
    uint32_t           engineVersion;
    uint32_t           apiVersion;
} VkApplicationInfo;

This enables the Vulkan implementations to apply workarounds based off
matching this description.

Here we add a new parameter for matching the driconfig options with
the following :

    <device driver="anv">
        <application engine_name_match="MyOwnEngine.*" engine_versions="10:12,40:42">
            <option name="blaaah" value="true" />
        </application>
    </device>

v2: switch engine name match to use regexps

v3: Verify that the regexec returns REG_NOMATCH for match failure (Eric)

v4: Add missing bit that went to the following commit (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Anuj Phogat
729de1488f intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM
Initial benchmarking didn't show any performance benefits. But it might eventually.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-11 11:29:37 -07:00
Jason Ekstrand
34541be7b0 intel/blorp: Use wide formats for nicely aligned stencil clears
In the case where the stencil clear is nicely aligned, we can clear
stencil much more efficiently by mapping it as a wide format (say
RGBA32_UINT) and blasting out the stencil clear value with a repclear.
On Unigine Heaven, this makes one stencil clear go from non-trivial to
unnoticeable when looking at per-draw timings.

In order for this change to work properly, ANV needs to do a bit more
flushing around depth and stencil clears.  i965 and iris already have
the cache tracking logic to handle this so no changes are required
there.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:35:09 +00:00
Eric Engestrom
037b5b567f anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Eric Engestrom
4dcb1fff19 anv: add support for driconf
No option is supported yet, this is just the boilerplate.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-09-06 23:16:05 +01:00
Jordan Justen
9790cfcefa
anv,iris: L3ALLOC register replaces L3CNTLREG for gen12
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-06 13:11:25 -07:00