mesa/src/amd/vulkan
Konstantin Seurer c18a7d0e2b radv: Emit compressed primitive nodes on GFX12
The normal encode pass writes batches to a section in build scratch
memory. Those batches contain information about the internal node and
the primitive nodes. The encoder is split to avoid the register
pressure of the compressor and maximize occupancy.

The compressor works in two passes because one pass cannot guarantee
that every primitive node (except) has at least two triangles. This
guarantee is used to advertise a smaller acceleration structure size to
the application.

During compression, every invocation processes at most two triangles.
Groups of 8 invocations are used to support the maximum triangle count
of 16 that the hardware supports.

The first step of compression is loading the triangle(s). Shared
vertices are deduplicated early to avoid doing it in the compression
loop. The compression loop tries to add triangles to a list of triangles
until the computed node size needed for storing the triangles reaches
the hardware node size. For this, each invocation first deduplicates
vertices with the triangles that have already been picked. It then
computes the node size of the picked triangles plus the candidate
triangles of the current invocation. The candidate triangles of the
invocation that computed the smallest size are added to the list.
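
The selection loop described above can be illustrated with a small
sequential C sketch. It is not the radv shader code: the cost model
(NODE_BYTES, HEADER_BYTES, VERTEX_BYTES, TRI_BYTES), the struct tri type
and the fill_node() helper are all hypothetical, each candidate is a
single triangle instead of up to two per invocation, and the inner loop
over candidates stands in for the invocations that evaluate their
candidates in parallel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cost model, NOT the real GFX12 node layout. */
#define NODE_BYTES   128
#define HEADER_BYTES 8
#define VERTEX_BYTES 12
#define TRI_BYTES    4
#define MAX_TRIS     16

struct tri {
   int v[3]; /* vertex ids; equal ids mean a shared vertex */
};

/* Count distinct vertex ids among the picked triangles plus one candidate. */
static size_t
unique_vertices(const struct tri *tris, const bool *picked, size_t n, size_t cand)
{
   int seen[3 * MAX_TRIS];
   size_t count = 0;
   for (size_t i = 0; i < n; i++) {
      if (!picked[i] && i != cand)
         continue;
      for (int k = 0; k < 3; k++) {
         bool dup = false;
         for (size_t s = 0; s < count; s++)
            dup |= seen[s] == tris[i].v[k];
         if (!dup)
            seen[count++] = tris[i].v[k];
      }
   }
   return count;
}

static size_t
node_size(size_t tri_count, size_t vertex_count)
{
   return HEADER_BYTES + tri_count * TRI_BYTES + vertex_count * VERTEX_BYTES;
}

/* Greedily fill one node. Marks the chosen triangles in picked[] and
 * returns how many triangles ended up in the node. */
size_t
fill_node(const struct tri *tris, bool *picked, size_t n)
{
   assert(n <= MAX_TRIS);
   size_t count = 0;
   for (;;) {
      size_t best = n, best_size = 0;
      /* Every remaining candidate computes the size the node would have. */
      for (size_t c = 0; c < n; c++) {
         if (picked[c])
            continue;
         size_t size = node_size(count + 1, unique_vertices(tris, picked, n, c));
         if (best == n || size < best_size) {
            best = c;
            best_size = size;
         }
      }
      /* Stop when nothing is left or the cheapest candidate does not fit. */
      if (best == n || best_size > NODE_BYTES)
         return count;
      picked[best] = true;
      count++;
   }
}
```

With this made-up cost model, triangles that share vertices with the
ones already picked are cheaper to add, so they are chosen first and
more of them fit into one node.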

Because it may not be possible to fit every triangle into the same node,
there can be multiple hardware nodes which are written in parallel for
optimal performance. If there are no nodes with only one triangle, all
nodes are written. Otherwise, compression of the batch is aborted and
the index of the batch is written to build scratch memory. The second
compression pass repeats the steps above, but only for those aborted
batches. The nodes with only one triangle can now be merged, and are.
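
The first-pass decision described above (write the batch, or abort it
and record it for the second pass) can be sketched in C as follows. This
is an illustration under assumptions, not the radv implementation:
struct retry_list, its fixed capacity of 64 and first_pass_accepts() are
hypothetical stand-ins for the list of aborted batches kept in build
scratch memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the aborted-batch list in build scratch memory. */
struct retry_list {
   uint32_t count;
   uint32_t batch_indices[64];
};

/* Returns true if all nodes of the batch may be written in the first pass.
 * If any node holds exactly one triangle, the batch index is recorded for
 * the second pass and false is returned. */
bool
first_pass_accepts(const uint32_t *tri_counts, uint32_t node_count,
                   uint32_t batch_index, struct retry_list *retry)
{
   for (uint32_t i = 0; i < node_count; i++) {
      if (tri_counts[i] == 1) {
         assert(retry->count < 64);
         retry->batch_indices[retry->count++] = batch_index;
         return false;
      }
   }
   return true;
}
```

The second pass would then iterate over batch_indices only, which keeps
the retry work proportional to the number of aborted batches.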

It cannot be determined during box node encode which triangles will be
compressed together, so the encoder also has to fix up the parent box
node's child infos.

Reviewed-by: Natalie Vock <natalie.vock@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36965>
2025-10-21 19:32:55 +00:00
bvh radv: Emit compressed primitive nodes on GFX12 2025-10-21 19:32:55 +00:00
layers amd: add and use utility functions for LDS size encoding 2025-10-15 11:20:08 +00:00
meta radv: Emit compressed primitive nodes on GFX12 2025-10-21 19:32:55 +00:00
nir treewide: use nir_store_global alias of nir_build_store_global 2025-10-21 12:37:58 +02:00
tests build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
winsys amd: change radeon_info::lds_size_per_workgroup for GFX10+ to 64KB 2025-10-15 11:20:09 +00:00
.clang-format
.editorconfig
meson.build amd/common: merge radv_nir_opt_access_speculate() into ac_nir_flag_smem_for_loads() 2025-10-14 16:33:12 +00:00
radv_acceleration_structure.c radv: Emit compressed primitive nodes on GFX12 2025-10-21 19:32:55 +00:00
radv_aco_shader_info.h radv: calculate LDS allocation requirements independently from the compiler 2025-10-15 11:20:07 +00:00
radv_android.c radv: use AHARDWAREBUFFER_USAGE_CAMERA_MASK 2025-07-09 03:47:06 +00:00
radv_android.h
radv_buffer.c Revert "radv,driconf: Add radv_force_64k_sparse_alignment config" 2025-06-13 06:43:47 +00:00
radv_buffer.h radv: switch to device address from vk_buffer 2025-03-06 09:46:01 +00:00
radv_buffer_view.c radv: Remove offset parameter from radv_make_texel_buffer_descriptor. 2025-05-02 09:13:14 +00:00
radv_buffer_view.h radv: Remove offset parameter from radv_make_texel_buffer_descriptor. 2025-05-02 09:13:14 +00:00
radv_check_va.py radv: Add sparse mappings to radv_check_va.py. 2024-11-29 12:57:42 +00:00
radv_cmd_buffer.c amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
radv_cmd_buffer.h radv: upload and emit dynamic descriptors separately from push constants 2025-10-14 15:34:43 +00:00
radv_constants.h radv: re-format using clang-format 2025-09-09 05:48:56 +00:00
radv_cp_dma.c radv: use ac_emit_cp_pfp_sync_me() more 2025-10-16 06:31:37 +00:00
radv_cp_dma.h radv: switch to radv_cmd_stream everywhere 2025-08-08 11:49:23 +00:00
radv_cp_reg_shadowing.c radv: Add amd_ip_type to radv_cmd_stream 2025-10-14 12:33:13 +00:00
radv_cp_reg_shadowing.h radv: switch to radv_cmd_stream everywhere 2025-08-08 11:49:23 +00:00
radv_cs.c amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
radv_cs.h amd: move CP emit helpers to ac_cmdbuf_cp.c/h 2025-10-21 13:31:20 +02:00
radv_debug.c radv: replace radeon_cmdbuf by ac_cmdbuf completely 2025-10-08 18:00:15 +00:00
radv_debug.h radv: enable the global BO list by default 2025-10-14 08:12:36 +00:00
radv_debug_nir.c nir: remove manual nir_store_global 2025-10-21 12:37:58 +02:00
radv_debug_nir.h radv: Add RADV_DEBUG=validatevas for address validation in nir 2025-08-15 10:32:35 +00:00
radv_descriptor_pool.c radv: simplify error handling when creating descriptor pools 2025-10-21 06:43:29 +00:00
radv_descriptor_pool.h radv: move descriptor pool implementation to radv_descriptor_pool.c/h 2025-06-27 07:55:35 +00:00
radv_descriptor_set.c radv: simplify allocating pool entries for descriptor sets 2025-10-21 06:43:29 +00:00
radv_descriptor_set.h radv: split descriptor set and descriptor utils in separate files 2025-06-27 07:55:37 +00:00
radv_descriptor_update_template.c radv: reduce the combined image/sampler desc size on GFX11+ 2025-08-14 06:47:30 +00:00
radv_descriptor_update_template.h radv: move descriptor update implementation to radv_descriptor_update_template.c/h 2025-06-27 07:55:37 +00:00
radv_descriptors.c radv: reduce the combined image/sampler desc size on GFX11+ 2025-08-14 06:47:30 +00:00
radv_descriptors.h radv: reduce the combined image/sampler desc size on GFX11+ 2025-08-14 06:47:30 +00:00
radv_device.c radv: Add amd_ip_type to radv_cmd_stream 2025-10-14 12:33:13 +00:00
radv_device.h radv: Use extra context for video encode queue with multiple VCN instances 2025-09-01 10:56:31 +00:00
radv_device_memory.c radv: Allocate BOs as implicit sync even if the WSI is doing implicit sync. 2025-10-10 19:17:04 +00:00
radv_device_memory.h radv: add import and export handle_type in radv_alloc_memory 2025-03-03 08:26:51 +00:00
radv_dgc.c treewide: use nir_store_global alias of nir_build_store_global 2025-10-21 12:37:58 +02:00
radv_dgc.h radv: Remove unneeded forward declaration of qf from dgc header 2025-10-14 12:33:19 +00:00
radv_event.c
radv_event.h
radv_formats.c radv: always return optimalDeviceAccess=TRUE for block-compressed formats 2025-10-14 10:14:45 +00:00
radv_formats.h build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
radv_host_image_copy.c radv: implement vkTransitionImageLayout() 2025-07-15 09:12:16 +00:00
radv_image.c radv,ac: Split has_tc_compat_zrange_bug into Z and ZS, document it 2025-10-02 08:29:49 +00:00
radv_image.h radv,ac: Split has_tc_compat_zrange_bug into Z and ZS, document it 2025-10-02 08:29:49 +00:00
radv_image_view.c vulkan: Drop the driver_internal from vk_image_view_init/create() 2025-09-05 23:34:14 +00:00
radv_image_view.h radv: add a function to create an image view for HiZ surfaces 2025-08-12 13:48:09 +00:00
radv_instance.c radv: enable the global BO list by default 2025-10-14 08:12:36 +00:00
radv_instance.h radv: disable radv_disable_hiz_his_gfx12 for Mafia Definition Edition 2025-09-11 15:21:50 +00:00
radv_llvm_helper.cpp amd,radeonsi: reduce legacy::PassManager use to only run backend passes 2024-10-05 09:10:06 +00:00
radv_llvm_helper.h
radv_nir_to_llvm.c radv: use CU mode when LDS is used 2025-10-15 13:37:48 +01:00
radv_nir_to_llvm.h
radv_perfcounter.c amd: add a predicate parameter to ac_emit_cp_copy_data() 2025-10-21 13:31:20 +02:00
radv_perfcounter.h radv: Remove qf from radv_spm/sqtt/perfcounter where applicable 2025-10-14 12:33:20 +00:00
radv_physical_device.c radv: Disable compute queues when the regalloc bug is present 2025-10-15 18:08:49 +00:00
radv_physical_device.h radv: allow to select a different HiZ workaround on GFX12 2025-09-01 07:02:24 +00:00
radv_pipeline.c amd: keep ac_shader_config::lds_size unaligned 2025-10-15 11:20:09 +00:00
radv_pipeline.h radv: rename indirect_descriptor_sets to indirect_descriptors 2025-10-10 13:22:03 +00:00
radv_pipeline_binary.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
radv_pipeline_binary.h radv: add initial support for pipeline binaries 2024-09-10 08:19:52 +00:00
radv_pipeline_cache.c radv: Add RADV_DEBUG=validatevas for address validation in nir 2025-08-15 10:32:35 +00:00
radv_pipeline_cache.h radv: remove unused forwarded declarations of pipeline layout 2025-08-18 07:25:34 +00:00
radv_pipeline_compute.c radv: rename indirect_descriptor_sets to indirect_descriptors 2025-10-10 13:22:03 +00:00
radv_pipeline_compute.h radv: rename indirect_descriptor_sets to indirect_descriptors 2025-10-10 13:22:03 +00:00
radv_pipeline_graphics.c treewide: don't check before free 2025-10-15 23:01:33 +00:00
radv_pipeline_graphics.h radv: pre-compute vgt_outprim_type 2025-09-15 19:10:39 +00:00
radv_pipeline_layout.c radv: move pipeline layout implementation to radv_pipeline_layout.c/h 2025-06-25 07:52:12 +00:00
radv_pipeline_layout.h radv: move pipeline layout implementation to radv_pipeline_layout.c/h 2025-06-25 07:52:12 +00:00
radv_pipeline_rt.c amd: add and use utility functions for LDS size encoding 2025-10-15 11:20:08 +00:00
radv_pipeline_rt.h all: rename gl_shader_stage to mesa_shader_stage 2025-08-06 10:28:40 +08:00
radv_query.c amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
radv_query.h radv: re-run clang-format 2025-07-16 09:10:33 +02:00
radv_queue.c amd: add a predicate parameter to ac_emit_cp_copy_data() 2025-10-21 13:31:20 +02:00
radv_queue.h radv: replace radeon_cmdbuf by ac_cmdbuf completely 2025-10-08 18:00:15 +00:00
radv_radeon_winsys.h radv: replace radeon_cmdbuf by ac_cmdbuf completely 2025-10-08 18:00:15 +00:00
radv_rmv.c radv: remove unnecessary radv_graphics_pipeline::is_ngg 2025-09-02 06:18:05 +00:00
radv_rmv.h
radv_rra.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
radv_rra.h radv/rra: Increase rra_validation_context::location 2025-07-22 14:40:33 +00:00
radv_rra_gfx10_3.c radv/rra: Only write used BLAS 2025-07-22 14:40:33 +00:00
radv_rra_gfx12.c radv/rra/gfx12: Properly validate geometry indices 2025-10-21 19:32:54 +00:00
radv_sampler.c radv: Actually fail custom border color sampler creation. 2025-10-10 14:25:54 +00:00
radv_sampler.h radv: fix capture/replay with sampler border color 2025-09-12 06:51:51 +00:00
radv_sdma.c amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
radv_sdma.h amd,radv: move SDMA utility helpers to common code 2025-10-21 13:31:20 +02:00
radv_shader.c radv: allow WGP mode with task/mesh 2025-10-15 13:37:48 +01:00
radv_shader.h amd: keep ac_shader_config::lds_size unaligned 2025-10-15 11:20:09 +00:00
radv_shader_args.c radv: allow to inline all push constants even with dynamic descriptors 2025-10-14 15:34:43 +00:00
radv_shader_args.h radv: declare a new user SGPR for dynamic descriptors 2025-10-14 15:34:43 +00:00
radv_shader_info.c amd: add and use utility functions for LDS size encoding 2025-10-15 11:20:08 +00:00
radv_shader_info.h radv: calculate LDS allocation requirements independently from the compiler 2025-10-15 11:20:07 +00:00
radv_shader_object.c treewide: don't check before free 2025-10-15 23:01:33 +00:00
radv_shader_object.h all: rename gl_shader_stage to mesa_shader_stage 2025-08-06 10:28:40 +08:00
radv_spm.c radv: Remove qf from radv_spm/sqtt/perfcounter where applicable 2025-10-14 12:33:20 +00:00
radv_spm.h radv: Remove qf from radv_spm/sqtt/perfcounter where applicable 2025-10-14 12:33:20 +00:00
radv_sqtt.c radv: Fix crash in sqtt due to uninitalized value 2025-10-17 06:10:46 +00:00
radv_sqtt.h radv: switch to radv_cmd_stream everywhere 2025-08-08 11:49:23 +00:00
radv_video.c radv/video: Fill maxCodedExtent caps first 2025-10-17 17:58:08 +00:00
radv_video.h radv/video: Fix waiting on encode feedback query 2025-10-06 10:32:54 +00:00
radv_video_enc.c radv/video: Fix waiting on encode feedback query 2025-10-06 10:32:54 +00:00
radv_wsi.c vulkan/wsi: Make get_blit_queue return a struct vk_queue * 2025-08-22 23:05:03 +00:00
radv_wsi.h