mesa/src/intel/vulkan
Jason Ekstrand dd4db84640 anv: Use on-the-fly surface states for dynamic buffer descriptors
We have a performance problem with dynamic buffer descriptors.  Because
we are currently implementing them by pushing an offset into the shader
and adding that offset onto the already existing offset for the UBO/SSBO
operation, all UBO/SSBO operations on dynamic descriptors are indirect.
The back-end compiler implements indirect pull constant loads using what
basically amounts to a texelFetch instruction.  For pull constant loads
with constant offsets, however, we use an oword block read message which
goes through the constant cache and reads a whole cache line at a time.
Because of these two things, direct pull constant loads are much faster
than indirect pull constant loads.  Because all loads from dynamically
bound buffers are indirect, the user takes a substantial performance
penalty when using this "performance" feature.

There are two potential solutions I have seen for this problem.  The
alternate solution is to continue pushing offsets into the shader but
wire things up in the back-end compiler so that we use the oword block
read messages anyway.  The only reason we can do this is because we know
a priori that the dynamic offsets are uniform and 16-byte aligned.
Unfortunately, thanks to the 16-byte alignment requirement of the oword
messages, we can't do some general "if the indirect offset is uniform,
use an oword message" sort of thing.

The solution implemented here, setting up surface states on the fly with
the dynamic offset baked in, is recommended for a few reasons:

 1. Surface states are relatively cheap.  We've been using on-the-fly
    surface state setup for some time in GL and it works well.  Also,
    dynamic offsets with on-the-fly surface state should still be
    cheaper than allocating new descriptor sets every time you want to
    change a buffer offset, which is really the only requirement of the
    dynamic offsets feature.

 2. This requires substantially less compiler plumbing.  Not only can we
    delete the entire apply_dynamic_offsets pass but we can also avoid
    having to add architecture for passing dynamic offsets to the back-
    end compiler in such a way that it can continue using oword messages.

 3. We get robust buffer access range-checking for free.  Because the
    offset and range are baked into the surface state, we no longer need
    to pass ranges around and do bounds-checking in the shader.

 4. Once we finally get UBO pushing implemented, it will be much easier
    to handle pushing chunks of dynamic descriptors if the compiler
    remains blissfully unaware of dynamic descriptors.
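
To illustrate point 3, here is a minimal sketch of how a dynamic offset
might be folded into the offset and range used to fill out a surface state
at bind time.  The struct and function names are hypothetical stand-ins for
illustration, not the actual anv data structures, and the clamping policy
shown is an assumption rather than a description of the driver code.

```c
#include <stdint.h>

/* Hypothetical stand-in for a buffer descriptor; not an anv structure. */
struct buffer_binding {
    uint64_t buffer_size; /* size of the underlying buffer in bytes */
    uint64_t offset;      /* static offset from the descriptor write */
    uint64_t range;       /* range from the descriptor write */
};

/* Hypothetical offset/range pair that would be baked into a surface state. */
struct surface_range {
    uint64_t offset;
    uint64_t range;
};

/*
 * Add the dynamic offset to the static one and clamp the result so the
 * surface never extends past the end of the buffer.  With the bounds baked
 * into the surface state, the shader needs no extra offset or range check.
 */
static struct surface_range
apply_dynamic_offset(const struct buffer_binding *b, uint32_t dynamic_offset)
{
    struct surface_range s;

    s.offset = b->offset + dynamic_offset;
    if (s.offset > b->buffer_size)
        s.offset = b->buffer_size;

    uint64_t remaining = b->buffer_size - s.offset;
    s.range = b->range < remaining ? b->range : remaining;

    return s;
}
```

For a 256-byte buffer bound at offset 64 with range 128, a dynamic offset
of 32 yields offset 96 and range 128, while a dynamic offset of 128 yields
offset 192 and a range clamped to 64.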

This commit improves performance of The Talos Principle on ULTRA
settings by around 50% and brings it nicely into line with OpenGL
performance.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-13 07:58:00 -07:00
..
tests anv: fold the tests' makefile 2016-05-01 08:38:04 +01:00
.gitignore anv: Suffix the intel_icd file with the host CPU 2016-10-21 09:30:20 -07:00
anv_allocator.c anv: Add missing error-checking to anv_block_pool_init (v2) 2016-11-28 21:11:25 +00:00
anv_batch_chain.c anv: Rename clflush_range and state_clflush 2017-02-21 12:26:35 -08:00
anv_blorp.c anv: Stall before fast-clear operations 2017-03-13 07:57:03 -07:00
anv_cmd_buffer.c anv: Use on-the-fly surface states for dynamic buffer descriptors 2017-03-13 07:58:00 -07:00
anv_descriptor_set.c anv: Use on-the-fly surface states for dynamic buffer descriptors 2017-03-13 07:58:00 -07:00
anv_device.c anv: Accurately advertise dynamic descriptor limits 2017-03-13 07:57:03 -07:00
anv_dump.c anv/cmd_buffer: Move Begin/End/Execute to genX_cmd_buffer.c 2016-10-17 17:41:35 -07:00
anv_entrypoints_gen.py anv: add VK_KHR_descriptor_update_template support 2017-03-02 10:34:06 +00:00
anv_formats.c anv: Use vk_foreach_struct for handling extension structs 2017-02-14 16:15:39 -08:00
anv_gem.c anv/device: Return the right error for failed maps 2016-11-09 18:17:48 -08:00
anv_gem_stubs.c anv: remove define _DEFAULT_SOURCE 2016-05-23 12:09:11 +01:00
anv_genX.h anv: Add support for the PMA fix on Broadwell 2017-02-14 14:18:55 -08:00
anv_image.c anv: Add a helper for working with VK_WHOLE_SIZE for buffers 2017-03-13 07:57:03 -07:00
anv_intel.c anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTEL 2016-11-22 15:15:45 +00:00
anv_nir.h anv: Add an input attachment lowering pass 2016-11-22 13:44:55 -08:00
anv_nir_apply_pipeline_layout.c anv: Add support for shaderStorageImageWriteWithoutFormat 2017-02-14 08:16:52 -08:00
anv_nir_lower_input_attachments.c anv/lower_input_attachments: honor sample index parameter to subpassLoad() 2017-01-26 08:11:21 +01:00
anv_nir_lower_push_constants.c spirv: compute push constant access offset & range 2017-01-04 21:14:17 +00:00
anv_pass.c anv/pass: Store subpass attachment reference list 2017-03-02 13:17:55 -08:00
anv_pipeline.c anv: Use on-the-fly surface states for dynamic buffer descriptors 2017-03-13 07:58:00 -07:00
anv_pipeline_cache.c anv: Store UUID in physical device. 2016-11-28 19:46:05 +00:00
anv_private.h anv: Use on-the-fly surface states for dynamic buffer descriptors 2017-03-13 07:58:00 -07:00
anv_util.c anv: Add a performance warning helper 2017-03-07 15:22:16 -08:00
anv_wsi.c vulkan/wsi/radv: add initial prime support (v1.1) 2017-02-27 05:42:16 +10:00
anv_wsi_wayland.c anv/wsi: Don't include wayland headers 2017-03-13 11:16:30 +00:00
anv_wsi_x11.c vulkan/wsi/radv: add initial prime support (v1.1) 2017-02-27 05:42:16 +10:00
dev_icd.json.in anv: Replace "abi_versions" with correct "api_version". 2016-10-25 12:55:39 -07:00
gen7_cmd_buffer.c anv: Get rid of the stub() macros 2017-03-07 15:22:16 -08:00
gen8_cmd_buffer.c anv: Take a device parameter in anv_state_flush 2017-02-21 12:26:35 -08:00
genX_blorp_exec.c anv: Take a device parameter in anv_state_flush 2017-02-21 12:26:35 -08:00
genX_cmd_buffer.c anv: Use on-the-fly surface states for dynamic buffer descriptors 2017-03-13 07:58:00 -07:00
genX_gpu_memcpy.c intel: Share URB configuration code between GL and Vulkan. 2016-11-19 11:40:01 -08:00
genX_pipeline.c anv: Store the user's VkAttachmentReference 2017-03-02 13:17:55 -08:00
genX_query.c anv: Put everything about queries in genX_query.c 2017-02-21 12:26:35 -08:00
genX_state.c anv: Emit 3DSTATE_HS/TE/DS packets. 2017-01-10 13:27:31 -08:00
intel_icd.json.in anv: Replace "abi_versions" with correct "api_version". 2016-10-25 12:55:39 -07:00
TODO anv: Enable MSAA compression 2017-02-23 12:10:42 -08:00
vk_format_info.h anv: Set up binding tables and surface states for input attachments 2016-11-22 13:44:55 -08:00