mesa/src/amd/common
Marek Olšák 3943ed8199 ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM
ACO is still not perfect:
* It generates s_wait_loadcnt 0x0-0x3 when the only required wait instruction
  is s_wait_loadcnt 0x5.
* It generates a lot of unnecessary jumps and blocks for uniform loop breaks.
  Only scc1 jumps are necessary to break the loop. This is 10x better than
  LLVM, but even ACO might consider using nir_intrinsic_ordered_add_loop_gfx12_amd
  for the best performance.

How to print the streamout asm on any GPU:
    PIGLIT_PLATFORM=gbm AMD_FORCE_FAMILY=gfx12_16pipe AMD_DEBUG=vs,mono,asm,useaco ../piglit/bin/shader-io-rate vs_out_xfb
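
The command above can also be assembled programmatically, which makes it easier to switch the forced GPU family or drop `useaco` for a side-by-side comparison. The following is a minimal sketch and not part of Mesa or piglit. The helper names `streamout_asm_command` and `run` are hypothetical, and it assumes that omitting `useaco` from `AMD_DEBUG` selects the LLVM backend. The environment variables, flags, and test path are copied verbatim from the command above.

```python
import os
import subprocess


def streamout_asm_command(family="gfx12_16pipe", use_aco=True,
                          piglit_bin="../piglit/bin/shader-io-rate"):
    """Build the environment overrides and argv for the command above.

    Hypothetical helper: only the variable names, flag names, and the
    test name "vs_out_xfb" come from the commit message.
    """
    debug_flags = ["vs", "mono", "asm"]
    if use_aco:
        # Assumption: without this flag the driver uses its LLVM backend.
        debug_flags.append("useaco")
    env = {
        "PIGLIT_PLATFORM": "gbm",
        "AMD_FORCE_FAMILY": family,
        "AMD_DEBUG": ",".join(debug_flags),
    }
    argv = [piglit_bin, "vs_out_xfb"]
    return env, argv


def run(env, argv):
    """Run the test with the overrides layered on the current environment."""
    return subprocess.run(argv, env={**os.environ, **env}, check=False)


if __name__ == "__main__":
    env, argv = streamout_asm_command()
    # Print the equivalent shell command line instead of executing it.
    print(" ".join(f"{k}={v}" for k, v in env.items()), " ".join(argv))
```

Calling `run(*streamout_asm_command(use_aco=False))` would, under the assumption stated above, dump the LLVM-generated asm for comparison with the ACO output.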

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32570>
2024-12-16 07:35:07 +00:00
.clang-format amd: import libdrm_amdgpu ioctl wrappers 2024-11-25 21:03:41 -05:00
ac_binary.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_binary.h ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
ac_cmdbuf.c amd: do not emit PA_SU_PRIM_FILTER_CNTL in the common GFX preamble 2024-10-25 07:41:22 +00:00
ac_cmdbuf.h ac,radeonsi,radv: add common GFX preambles 2024-08-27 14:14:57 +00:00
ac_debug.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_debug.h ac: Add VCN IB parser 2024-09-23 19:25:08 +00:00
ac_descriptors.c amd: Rename GFX1103_R1/R2 to PHOENIX/2 2024-11-20 02:14:40 +00:00
ac_descriptors.h ac,radv,radeonsi: add a function to build texture descriptors 2024-06-06 10:15:10 +00:00
ac_drm_fourcc.h ac/surf: add more modifiers to gfx12 supported list 2024-12-16 07:35:06 +00:00
ac_fake_hw_db.h amd: include amdgpu_drm.h from mesa instead of system for ac_fake_hw_db.h 2024-12-03 12:02:06 +00:00
ac_formats.c ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_formats.h ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_gather_context_rolls.c ac: Improve context roll readability 2024-03-19 16:08:14 +00:00
ac_gpu_info.c amd: add GFX v11.5.3 support 2024-12-11 19:14:34 +00:00
ac_gpu_info.h ac/gpuinfo: add use_userq and AMD_USERQ variable 2024-12-03 12:02:06 +00:00
ac_hw_stage.h amd: Move ac_hw_stage to its own file 2023-07-03 21:12:45 +00:00
ac_ib_parser.c ac/parse_ib: Replace the parameter list with ac_ib_parser 2024-03-19 16:08:13 +00:00
ac_linux_drm.c amd: add new AMDGPU_INFO subquery for userqueue metadata 2024-12-03 12:02:06 +00:00
ac_linux_drm.h amd: add new AMDGPU_INFO subquery for userqueue metadata 2024-12-03 12:02:06 +00:00
ac_msgpack.c ac/msgpack: make fixstrs a const char 2023-08-22 11:33:10 +00:00
ac_msgpack.h ac/msgpack: make fixstrs a const char 2023-08-22 11:33:10 +00:00
ac_nir.c ac/nir: have ac_nir_lower_mem_access_bit_sizes preserve >128 bit SMEM 2024-12-09 16:56:29 +00:00
ac_nir.h ac/nir/ngg: Add ability to store primitive ID as per-primitive. 2024-12-12 18:11:45 +00:00
ac_nir_cull.c ac/nir/cull: Slightly refactor control flow for small primitive culling. 2024-11-22 01:01:35 +01:00
ac_nir_helpers.h ac/nir: Shorten the name of ac_nir_calc_io_offset_mapped. 2024-08-08 16:55:02 +00:00
ac_nir_lower_esgs_io_to_mem.c ac/nir: Shorten the name of ac_nir_calc_io_offset_mapped. 2024-08-08 16:55:02 +00:00
ac_nir_lower_global_access.c ac/nir: add ACCESS_CAN_REORDER to lowered load_global_constant 2024-11-13 12:59:26 +00:00
ac_nir_lower_image_opcodes_cdna.c ac/nir/cdna: don't use image_descriptor intrinsics if the src is a descriptor 2024-06-25 10:09:08 +00:00
ac_nir_lower_ngg.c ac/lower_ngg: improve streamout code generation for gfx12/ACO to match LLVM 2024-12-16 07:35:07 +00:00
ac_nir_lower_ps.c ac/nir: export alpha to MRTZ.a and one to MRT0.a for alpha-to-one on GFX11 2024-12-11 10:50:31 +00:00
ac_nir_lower_resinfo.c ac/nir: set .image_dim and .image_array for all opcodes 2024-09-27 19:21:55 +00:00
ac_nir_lower_subdword_loads.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
ac_nir_lower_taskmesh_io_to_mem.c nir: add ACCESS_CP_GE_COHERENT_AMD 2024-04-30 17:17:25 +00:00
ac_nir_lower_tess_io_to_mem.c ac/nir: get pass_tessfactors_by_reg from nir_gather_tcs_info 2024-11-16 21:58:29 -05:00
ac_nir_lower_tex.c nir: change signature of nir_src_is_divergent() 2024-10-24 10:06:17 +00:00
ac_nir_meta.h ac/nir/meta: tune clear/copy_buffer performance for gfx6-10.3 2024-09-17 15:19:32 -04:00
ac_nir_meta_cs_blit.c ac/nir: set .image_dim and .image_array for all opcodes 2024-09-27 19:21:55 +00:00
ac_nir_meta_cs_clear_copy_buffer.c ac/nir/meta: tune clear/copy_buffer performance for gfx6-10.3 2024-09-17 15:19:32 -04:00
ac_nir_meta_ps_resolve.c ac/nir: import the MSAA resolving pixel shader from radeonsi 2024-06-08 05:48:11 +00:00
ac_nir_opt_outputs.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
ac_nir_opt_shared_append.c amd/nir: add ac_nir_opt_shared_append 2024-09-19 16:21:47 +00:00
ac_parse_ib.c ac/parse_ib: print VA for the SDMA CONSTANT_FILL/WRITE packets 2024-12-03 15:29:40 +00:00
ac_perfcounter.c ac/perfcounter: fix buffer overflow 2024-11-08 13:31:02 +00:00
ac_perfcounter.h ac/perfcounter: compute the number of global instances of TCP,SQ,GL1C and GL2C 2023-09-14 14:17:19 +00:00
ac_pm4.c amd: use a valid size for ac_pm4_state allocation 2024-07-22 10:09:34 +00:00
ac_pm4.h ac,radeonsi: import PM4 state from RadeonSI 2024-06-06 20:26:47 +00:00
ac_rgp.c ac/rgp: assume GFX11_5 use the same SQTT/RGP versions as GFX11 2024-07-17 16:25:19 +00:00
ac_rgp.h ac/rgp: update dumping queue event records to the capture 2023-11-13 08:53:09 +00:00
ac_rgp_elf_object_pack.c amd: Use align64 instead of ALIGN for 64 bit value parameter 2024-01-03 22:02:17 +00:00
ac_rtld.c ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_rtld.h ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_shader_args.c nir: change "user_data_amd" sysval from 4 to 8 components 2024-04-13 16:45:08 +00:00
ac_shader_args.h radv/rt: Track ray_launch_size reads 2024-05-28 12:23:45 +00:00
ac_shader_debug_info.h amd: Add ac_shader_debug_info 2024-11-11 08:39:13 +00:00
ac_shader_util.c radv: fix alpha-to-coverage with alpha-to-one without MRTZ 2024-12-12 10:07:25 +00:00
ac_shader_util.h ac/nir/ngg: Implement optional primitive compaction. 2024-11-25 01:56:20 +01:00
ac_shadowed_regs.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_shadowed_regs.h amd: add a new helper that prints all non-shadowed regs 2023-06-17 23:42:21 +00:00
ac_spm.c ac/spm: do not abort when the SPM BO is too small 2024-10-29 18:33:17 +00:00
ac_spm.h ac/spm: do not abort when the SPM BO is too small 2024-10-29 18:33:17 +00:00
ac_sqtt.c ac/sqtt: make VA helpers static 2024-06-14 10:32:17 +02:00
ac_sqtt.h ac/sqtt: make VA helpers static 2024-06-14 10:32:17 +02:00
ac_surface.c ac/surf: add more modifiers to gfx12 supported list 2024-12-16 07:35:06 +00:00
ac_surface.h ac/surface: Add RADEON_SURF_VIDEO_REFERENCE 2024-12-02 13:48:22 +00:00
ac_surface_meta_address_test.c ac: add 'polaris12' gpu to ac_fake_hw_db 2024-11-08 13:31:02 +00:00
ac_surface_modifier_test.c ac/surface/tests: support all block sizes 2024-12-16 07:35:06 +00:00
ac_uvd_dec.h ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
ac_vcn.h ac/vcn: allow sq signature package to be skipped 2024-11-29 10:01:49 +10:00
ac_vcn_av1_default.h ac/radeonsi: add av1 defaults header file from radeonsi 2023-06-16 05:53:44 +00:00
ac_vcn_dec.c ac/radv/radeon: move film grain init to common code. 2024-06-19 20:51:53 +00:00
ac_vcn_dec.h radv/video: support event for pre-VCN4 decode queues 2024-11-29 10:03:48 +10:00
ac_vcn_enc.c ac: Add ac_vcn_init_enc_cmds 2024-09-20 06:58:29 +00:00
ac_vcn_enc.h radeonsi/vcn: Enable VCN4 AV1 encode WA 2024-11-01 14:05:04 +00:00
ac_vcn_enc_av1_default_cdf.h ac,radeonsi: move vcn enc av1 default cdf file to common 2023-09-14 07:51:24 +00:00
amd_family.c amd: add GFX v11.5.3 support 2024-12-11 19:14:34 +00:00
amd_family.h amd: add GFX v11.5.3 support 2024-12-11 19:14:34 +00:00
amd_kernel_code_t.h amd/common: add AMD_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32 property 2023-08-31 20:30:03 +00:00
gfx10_format_table.h amd/common: only pass gfx_level to ac_get_gfx10_format_table() 2024-05-22 08:31:39 +00:00
gfx10_format_table.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
meson.build amd: import libdrm_amdgpu ioctl wrappers 2024-11-25 21:03:41 -05:00
sid.h winsys/amdgpu: send hdp flush packet for userq 2024-12-03 12:02:06 +00:00
sid_tables.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00