mesa/src/amd/common
Georg Lehmann fd77cc7c32 ac/nir/lower_ps: move exports after packing alu
If ACO's WQM section ends just before the first export, mixing ALU and
exports means the ALU instructions in question can't be reordered as much
by the ILP scheduler.

Foz-DB Navi31:
Totals from 8959 (11.31% of 79188) affected shaders:
Instrs: 5977212 -> 5978494 (+0.02%); split: -0.02%, +0.04%
CodeSize: 32982732 -> 32987876 (+0.02%); split: -0.01%, +0.03%
Latency: 35218073 -> 35216277 (-0.01%); split: -0.02%, +0.02%
InvThroughput: 5149751 -> 5149696 (-0.00%); split: -0.00%, +0.00%
SClause: 220552 -> 220551 (-0.00%); split: -0.01%, +0.01%
PreVGPRs: 313203 -> 313069 (-0.04%); split: -0.06%, +0.01%

Foz-DB Navi21:
Totals from 8895 (11.21% of 79377) affected shaders:
MaxWaves: 219280 -> 219272 (-0.00%); split: +0.00%, -0.01%
Instrs: 5393330 -> 5393366 (+0.00%); split: -0.00%, +0.00%
CodeSize: 29921900 -> 29922024 (+0.00%); split: -0.00%, +0.00%
VGPRs: 406664 -> 406688 (+0.01%); split: -0.00%, +0.01%
Latency: 35653975 -> 35652220 (-0.00%); split: -0.02%, +0.02%
InvThroughput: 7992134 -> 7992032 (-0.00%); split: -0.00%, +0.00%
SClause: 223784 -> 223786 (+0.00%)
Copies: 370984 -> 370983 (-0.00%)
PreVGPRs: 314323 -> 314330 (+0.00%); split: -0.01%, +0.01%
VALU: 3800023 -> 3800022 (-0.00%)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33417>
2025-02-08 17:31:18 +00:00
nir ac/nir/lower_ps: move exports after packing alu 2025-02-08 17:31:18 +00:00
virtio ac/virtio: add virtio-only AMDGPU_GEM_CREATE flag 2025-01-16 12:24:33 +00:00
.clang-format amd: import libdrm_amdgpu ioctl wrappers 2024-11-25 21:03:41 -05:00
ac_binary.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_binary.h ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
ac_cmdbuf.c radv: fix fetching draw vertex data from counter buffers with transform feedback 2025-02-07 07:59:39 +00:00
ac_cmdbuf.h ac,radeonsi,radv: add common GFX preambles 2024-08-27 14:14:57 +00:00
ac_debug.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_debug.h ac: Add VCN IB parser 2024-09-23 19:25:08 +00:00
ac_descriptors.c ac/descriptors: allow to configure DCC for buffer descriptors 2025-01-30 08:18:22 +00:00
ac_descriptors.h ac/descriptors: allow to configure DCC for buffer descriptors 2025-01-30 08:18:22 +00:00
ac_drm_fourcc.h ac/surf: add more modifiers to gfx12 supported list 2024-12-16 07:35:06 +00:00
ac_fake_hw_db.h ac/fake_hw_db: deobfuscate GPU name strings 2025-01-29 07:20:02 +00:00
ac_formats.c ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_formats.h ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_gather_context_rolls.c ac: Improve context roll readability 2024-03-19 16:08:14 +00:00
ac_gpu_info.c ac/nir/ngg: Add and use a has_ngg_passthru_no_msg field to ac_gpu_info. 2025-01-30 15:26:45 +00:00
ac_gpu_info.h ac/nir/ngg: Add and use a has_ngg_passthru_no_msg field to ac_gpu_info. 2025-01-30 15:26:45 +00:00
ac_hw_stage.h amd: Move ac_hw_stage to its own file 2023-07-03 21:12:45 +00:00
ac_ib_parser.c ac/parse_ib: Replace the parameter list with ac_ib_parser 2024-03-19 16:08:13 +00:00
ac_linux_drm.c amd: add ac_drm_device_get_cookie 2025-01-22 14:55:56 +00:00
ac_linux_drm.h amd: add ac_drm_device_get_cookie 2025-01-22 14:55:56 +00:00
ac_msgpack.c ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_msgpack.h ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_parse_ib.c ac/parse_ib: Parse VCN IB_COMMON_OP_WRITEMEMORY 2024-12-27 08:17:16 +00:00
ac_perfcounter.c ac/perfcounter: fix buffer overflow 2024-11-08 13:31:02 +00:00
ac_perfcounter.h ac/perfcounter: compute the number of global instances of TCP,SQ,GL1C and GL2C 2023-09-14 14:17:19 +00:00
ac_pm4.c amd: use a valid size for ac_pm4_state allocation 2024-07-22 10:09:34 +00:00
ac_pm4.h ac,radeonsi import PM4 state from RadeonSI 2024-06-06 20:26:47 +00:00
ac_rgp.c ac/rgp: assume GFX11_5 use the same SQTT/RGP versions as GFX11 2024-07-17 16:25:19 +00:00
ac_rgp.h ac/rgp: update dumping queue event records to the capture 2023-11-13 08:53:09 +00:00
ac_rgp_elf_object_pack.c amd: Use align64 instead of ALIGN for 64 bit value parameter 2024-01-03 22:02:17 +00:00
ac_rtld.c ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_rtld.h ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_shader_args.c nir: change "user_data_amd" sysval from 4 to 8 components 2024-04-13 16:45:08 +00:00
ac_shader_args.h ac/nir: split local_invocation_ids to 3 separate VGPR inputs 2025-01-02 17:36:55 +00:00
ac_shader_debug_info.h amd: Add ac_shader_debug_info 2024-11-11 08:39:13 +00:00
ac_shader_util.c radeonsi: fix PS prolog not counting used fragcoord VGPRs correctly 2025-01-29 07:19:40 +00:00
ac_shader_util.h radeonsi: fix PS prolog not counting used fragcoord VGPRs correctly 2025-01-29 07:19:40 +00:00
ac_shadowed_regs.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_shadowed_regs.h amd: add a new helper that prints all non-shadowed regs 2023-06-17 23:42:21 +00:00
ac_spm.c ac/spm: do not abort when the SPM BO is too small 2024-10-29 18:33:17 +00:00
ac_spm.h ac/spm: do not abort when the SPM BO is too small 2024-10-29 18:33:17 +00:00
ac_sqtt.c ac/sqtt: update programming SQTT on GFX12 2025-01-20 23:50:10 +00:00
ac_sqtt.h ac/sqtt: update programming SQTT on GFX12 2025-01-20 23:50:10 +00:00
ac_surface.c ac/surface: always allow LINEAR modifier for color formats 2025-02-06 01:48:25 +00:00
ac_surface.h ac,radv,radeonsi: add new GFX12_DCC_WRITE_COMPRESS_DISABLE tiling flag 2025-02-03 21:12:07 +00:00
ac_surface_meta_address_test.c ac: add 'polaris12' gpu to ac_fake_hw_db 2024-11-08 13:31:02 +00:00
ac_surface_modifier_test.c amd: update addrlib 2024-12-26 21:02:21 +00:00
ac_uvd_dec.h radeonsi/uvd: Set decode target swizzle mode on GFX9 2025-01-17 08:53:05 +00:00
ac_vcn.h ac/vcn: allow sq signature package to be skipped 2024-11-29 10:01:49 +10:00
ac_vcn_av1_default.h ac/radeonsi: add av1 defaults header file from radeonsi 2023-06-16 05:53:44 +00:00
ac_vcn_dec.c ac/vcn_dec: Fix AV1 film grain on VCN5 2025-02-07 13:13:45 +00:00
ac_vcn_dec.h ac/vcn_dec: Fix AV1 film grain on VCN5 2025-02-07 13:13:45 +00:00
ac_vcn_enc.c ac: Add ac_vcn_init_enc_cmds 2024-09-20 06:58:29 +00:00
ac_vcn_enc.h radeonsi/vcn: Enable VCN4 AV1 encode WA 2024-11-01 14:05:04 +00:00
ac_vcn_enc_av1_default_cdf.h ac,radeonsi: move vcn enc av1 default cdf file to common 2023-09-14 07:51:24 +00:00
amd_family.c amd: drop support for LLVM 15, 16, 17 2025-02-01 04:22:30 +00:00
amd_family.h radeonsi/vcn: Add vcn_5_0_1 support 2025-01-16 23:41:28 +00:00
amd_kernel_code_t.h amd/common: add AMD_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32 property 2023-08-31 20:30:03 +00:00
gfx10_format_table.h amd/common: only pass gfx_level to ac_get_gfx10_format_table() 2024-05-22 08:31:39 +00:00
gfx10_format_table.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
meson.build ac/nir/ngg: Move GS lowering to separate file. 2025-01-30 15:26:46 +00:00
sid.h ac,radeonsi: add SDMA DCC tiling for GFX12+ 2025-01-30 08:18:22 +00:00
sid_tables.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00