mesa/src/amd/vulkan
Marek Olšák e640d5a9c3 amd: vectorize SMEM loads aggressively, allow overfetching for ACO
If there is a 4-byte hole between 2 loads, they are vectorized. Example:
    load 4 + hole 4 + load 8 -> load 16
This helps GLSL uniform loads, which are often sparse. See the code for more
info.
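
The hole rule above can be sketched in a few lines of C. This is only an illustration, not the code from the commit: `struct smem_load`, `MAX_HOLE_BYTES` and `try_merge_loads` are hypothetical names, and the 4-byte limit is taken from the sentence above. Alignment and hardware load-size constraints are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one scalar load: byte offset and byte size. */
struct smem_load {
   uint32_t offset;
   uint32_t size;
};

/* Assumption based on the commit text: a hole of up to 4 bytes is tolerated. */
#define MAX_HOLE_BYTES 4u

/* Try to merge two loads, where a must not start after b.
 * On success, *out covers both loads plus any hole between them.
 */
static bool
try_merge_loads(struct smem_load a, struct smem_load b, struct smem_load *out)
{
   assert(a.offset <= b.offset);

   uint32_t a_end = a.offset + a.size;
   uint32_t b_end = b.offset + b.size;

   /* Bytes between the end of a and the start of b (0 if they touch or overlap). */
   uint32_t hole = b.offset > a_end ? b.offset - a_end : 0;
   if (hole > MAX_HOLE_BYTES)
      return false;

   out->offset = a.offset;
   out->size = (a_end > b_end ? a_end : b_end) - a.offset;
   return true;
}
```

With a 4-byte load at offset 0 and an 8-byte load at offset 8, the sketch produces a single 16-byte load at offset 0, which matches the example in the commit message. The 4 overfetched bytes are simply unused.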

RADV could get better code by vectorizing later.

radeonsi+ACO - TOTALS FROM AFFECTED SHADERS (45482/58355)
  Spilled SGPRs: 841 -> 747 (-11.18 %)
  Code Size: 67552396 -> 65291092 (-3.35 %) bytes
  Max Waves: 714439 -> 714520 (0.01 %)

This should have no effect on LLVM because ac_build_buffer_load scalarizes
SMEM, but it's improved for some reason:

radeonsi+LLVM - TOTALS FROM AFFECTED SHADERS (4673/58355)
  Spilled SGPRs: 1450 -> 1282 (-11.59 %)
  Spilled VGPRs: 106 -> 107 (0.94 %)
  Scratch size: 101 -> 102 (0.99 %) dwords per thread
  Code Size: 14994624 -> 14956316 (-0.26 %) bytes
  Max Waves: 66679 -> 66735 (0.08 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29399>
2025-01-09 22:01:54 +00:00
bvh vulkan/runtime,radv: Add shared BVH building framework 2024-12-01 20:08:35 +01:00
layers radv: fix capturing RT pipelines that return VK_OPERATION_DEFERRED_KHR for RGP 2024-12-17 17:12:27 +00:00
meta radv/meta: do not create redundant pipeline layout objects 2025-01-03 09:11:59 +00:00
nir ac/nir: lower more loads in ac_nir_lower_intrinsics_to_args instead of drivers 2025-01-02 17:36:55 +00:00
tests radv: Add radv_nir_lower_hit_attrib_derefs_tests 2023-11-02 15:48:36 +00:00
winsys radv/amdgpu: Set VCN version for ac_parse_ib 2024-12-27 08:17:16 +00:00
.clang-format radv/clang-format: Do not indent C++ modifiers 2023-11-02 15:48:36 +00:00
.editorconfig
meson.build radv: advertise Vulkan 1.4 on GFX8+ 2024-12-03 10:21:55 +00:00
radv_acceleration_structure.c radv/rt: Fix memleak in radv_init_header() 2025-01-07 09:49:56 +00:00
radv_aco_shader_info.h ac,radv,radeonsi: enable TCS input reads from VGPRs for all compatible loads 2024-12-18 11:07:59 +00:00
radv_android.c radv: add address binding report support for BOs imported with a fd 2024-12-03 08:13:13 +00:00
radv_android.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_buffer.c radv: report same buffer aligment for DGC preprocessed buffer 2024-12-16 14:53:56 +00:00
radv_buffer.h radv: add address binding report support for BOs imported with a ptr 2024-12-03 08:13:13 +00:00
radv_buffer_view.c radv: Remap 10 and 12 bit formats to 16 bit formats 2024-10-16 14:30:15 +00:00
radv_buffer_view.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_check_va.py radv: Add sparse mappings to radv_check_va.py. 2024-11-29 12:57:42 +00:00
radv_cmd_buffer.c radv: program DB_RENDER_OVERRIDE correctly on GFX12 2025-01-09 07:39:23 +00:00
radv_cmd_buffer.h radv: rename color output state to fragment output state 2024-12-23 08:09:26 +00:00
radv_constants.h radv: remove RADV_MAX_DRM_DEVICES 2024-10-07 11:42:37 +00:00
radv_cp_dma.c radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_cp_dma.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_cp_reg_shadowing.c radv: pad GFX preambles IBs with only one NOP 2024-08-21 14:55:04 +00:00
radv_cp_reg_shadowing.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_cs.c radv: don't assume that TC_ACTION_ENA invalidates L1 cache on gfx9 2024-06-11 06:15:12 +00:00
radv_cs.h radv: rename radeon perfctr uconfig helpers 2024-05-15 11:34:35 +00:00
radv_debug.c radv: dump the Mesa version with RADV_DEBUG=hang 2024-12-09 18:25:24 +00:00
radv_debug.h radv: remove remaining discard to demote options 2024-12-11 17:59:13 +00:00
radv_descriptor_set.c radv: use vk_descriptor_type_is_dynamic 2024-12-19 15:12:58 +00:00
radv_descriptor_set.h radv: use blake3 for hashing pipeline layouts 2024-07-10 07:35:19 +00:00
radv_device.c radv: use common calibrated timestamp support 2025-01-07 03:39:29 +00:00
radv_device.h radv/meta: convert the blit2d GFX pipelines to vk_meta 2024-12-31 10:32:50 +00:00
radv_device_memory.c radv: promote VK_KHR_map_memory2 to core 1.4 API 2024-12-03 10:21:55 +00:00
radv_device_memory.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_dgc.c radv: fix destroying DGC pipelines 2024-12-31 10:57:46 +00:00
radv_dgc.h radv/meta: convert DGC pipeline layout to vk_meta 2024-12-29 18:31:50 +00:00
radv_event.c radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_event.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_formats.c radv: fix disabling DCC for stores with drirc 2024-12-12 09:11:37 +00:00
radv_formats.h treewide: Stop putting enum in front of Vulkan enum types 2024-12-02 17:22:49 +00:00
radv_image.c radv: Fix sampling from image layers of video decode target 2025-01-03 01:28:07 +00:00
radv_image.h radv: optimize the pipe misaligned L2 cache invalidation on GFX11 2024-11-12 17:27:39 +00:00
radv_image_view.c radv: Fix sampling from image layers of video decode target 2025-01-03 01:28:07 +00:00
radv_image_view.h radv: Fix sampling from image layers of video decode target 2025-01-03 01:28:07 +00:00
radv_instance.c radv: add radv_lower_terminate_to_discard and enable for Indiana Jones 2024-12-12 19:54:39 +00:00
radv_instance.h radv: add radv_lower_terminate_to_discard and enable for Indiana Jones 2024-12-12 19:54:39 +00:00
radv_llvm_helper.cpp amd,radeonsi: reduce legacy::PassManager use to only run backend passes 2024-10-05 09:10:06 +00:00
radv_llvm_helper.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_nir_to_llvm.c amd: lower load_local_invocation_index in NIR 2025-01-02 17:36:55 +00:00
radv_nir_to_llvm.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_perfcounter.c radv: Fix shader mask for SQ_WGP SPM counters 2024-07-16 16:10:11 +00:00
radv_perfcounter.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_physical_device.c radv: disable VRS coarse shading with 8x MSAA on GFX12 2025-01-07 18:56:24 +00:00
radv_physical_device.h radv: use common calibrated timestamp support 2025-01-07 03:39:29 +00:00
radv_pipeline.c amd: vectorize SMEM loads aggressively, allow overfetching for ACO 2025-01-09 22:01:54 +00:00
radv_pipeline.h radv: promote VK_KHR_maintenance5 to core 1.4 API 2024-12-03 10:21:55 +00:00
radv_pipeline_binary.c radv: fix generating the global key for pipeline binaries 2024-10-09 21:15:48 +00:00
radv_pipeline_binary.h radv: add initial support for pipeline binaries 2024-09-10 08:19:52 +00:00
radv_pipeline_cache.c radv: fix printing with RADV_DEBUG=psocachestats 2024-11-25 07:36:49 +00:00
radv_pipeline_cache.h radv: add initial support for pipeline binaries 2024-09-10 08:19:52 +00:00
radv_pipeline_compute.c radv: promote VK_KHR_maintenance5 to core 1.4 API 2024-12-03 10:21:55 +00:00
radv_pipeline_compute.h radv: fix skipping on-disk shaders cache when not useful 2024-11-20 10:01:26 +00:00
radv_pipeline_graphics.c ac/nir: extract a load_subgroup_id lowered helper 2025-01-02 17:36:55 +00:00
radv_pipeline_graphics.h radv: pass extra graphics pipeline create info using pNext 2024-12-29 17:51:03 +00:00
radv_pipeline_rt.c amd: vectorize SMEM loads aggressively, allow overfetching for ACO 2025-01-09 22:01:54 +00:00
radv_pipeline_rt.h radv: add initial support for pipeline binaries 2024-09-10 08:19:52 +00:00
radv_printf.c radv: promote VK_KHR_maintenance5 to core 1.4 API 2024-12-03 10:21:55 +00:00
radv_printf.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_query.c radv/meta: reduce length of some cache keys 2025-01-03 09:11:59 +00:00
radv_query.h radv: rename GDS queries to emulated queries 2024-12-19 08:08:53 +00:00
radv_queue.c radv: program DB_RENDER_OVERRIDE correctly on GFX12 2025-01-09 07:39:23 +00:00
radv_queue.h radv: promote VK_KHR_global_priority to core 1.4 API 2024-12-03 10:21:54 +00:00
radv_radeon_winsys.h radv: Move ac_addrlib to the physical device 2024-10-28 20:06:38 +00:00
radv_rmv.c radv: Store range rather than bo_size in VkBuffer/VkImage. 2024-04-16 16:29:57 +02:00
radv_rmv.h radv/rmv: fix image binds logging for disjoint images 2024-04-10 11:23:40 +00:00
radv_rra.c radv: promote VK_KHR_maintenance5 to core 1.4 API 2024-12-03 10:21:55 +00:00
radv_rra.h radv/rra: Reduce the memory requirement of copy_after_build 2024-06-21 17:47:53 +00:00
radv_sampler.c ac,radv,radeonsi: introduce a helper to build a sampler descriptor 2024-05-17 13:43:12 +00:00
radv_sampler.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_sdma.c radv: fix stencil only copies of depth/stencil images with SDMA 2024-12-04 09:30:36 +00:00
radv_sdma.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_shader.c radv: run peephole_select in optimize_nir_algebraic 2025-01-08 09:56:39 +00:00
radv_shader.h radv: switch to the new TCS LDS/offchip size computation 2024-12-18 11:07:59 +00:00
radv_shader_args.c ac/nir: split local_invocation_ids to 3 separate VGPR inputs 2025-01-02 17:36:55 +00:00
radv_shader_args.h radv: rename shader_query_state to task_state 2024-09-24 06:00:00 +00:00
radv_shader_info.c radv: Rename layer_input to reads_layer in PS info. 2025-01-02 14:07:51 +00:00
radv_shader_info.h radv: Rename layer_input to reads_layer in PS info. 2025-01-02 14:07:51 +00:00
radv_shader_object.c radv: fix missing variants for the last VGT stage with shader object 2024-12-17 09:50:52 +00:00
radv_shader_object.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_spm.c radv: resize the SPM bo when it's too small 2024-10-29 18:33:17 +00:00
radv_spm.h radv: resize the SPM bo when it's too small 2024-10-29 18:33:17 +00:00
radv_sqtt.c radv: use common calibrated timestamp support 2025-01-07 03:39:29 +00:00
radv_sqtt.h radv: add new start/stop sqtt helpers for capturing with SQTT 2024-11-28 07:03:21 +00:00
radv_video.c radv/video: Remove dt_field_mode handling code 2025-01-03 01:28:07 +00:00
radv_video.h radv/video: Remove dt_field_mode handling code 2025-01-03 01:28:07 +00:00
radv_video_enc.c radv/video: Fix bitstreamStartOffset including dstBufferOffset 2024-12-03 22:19:43 +00:00
radv_wsi.c radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00
radv_wsi.h radv: use SPDX-License-Identifier 2024-04-08 07:17:31 +00:00