mesa/src/amd/common
Rhys Perry 6dbf44ad9c ac/nir: allow less than one register of overfetch
This is to allow vectorization of 8/16-bit loads, which can later be
cheaply lowered to a 32-bit load.

fossil-db (gfx1201):
Totals from 178 (0.22% of 79377) affected shaders:
MaxWaves: 4138 -> 4102 (-0.87%)
Instrs: 619714 -> 617917 (-0.29%); split: -0.32%, +0.03%
CodeSize: 3364396 -> 3352724 (-0.35%); split: -0.38%, +0.03%
VGPRs: 12896 -> 12980 (+0.65%); split: -0.19%, +0.84%
SpillSGPRs: 546 -> 545 (-0.18%)
Latency: 7589585 -> 7406076 (-2.42%); split: -2.45%, +0.04%
InvThroughput: 1926356 -> 1879866 (-2.41%); split: -2.42%, +0.00%
VClause: 12301 -> 11750 (-4.48%)
SClause: 13614 -> 13583 (-0.23%); split: -0.45%, +0.22%
Copies: 82207 -> 82265 (+0.07%); split: -0.10%, +0.17%
Branches: 19284 -> 19266 (-0.09%)
PreSGPRs: 9525 -> 9457 (-0.71%)
PreVGPRs: 12366 -> 12421 (+0.44%)
VALU: 347928 -> 348020 (+0.03%); split: -0.01%, +0.04%
SALU: 82620 -> 82519 (-0.12%); split: -0.19%, +0.07%
VMEM: 22248 -> 21430 (-3.68%)
SMEM: 17951 -> 17843 (-0.60%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34162>
2025-05-08 13:30:50 +00:00
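
To illustrate the idea in the commit message above, here is a minimal sketch of an overfetch check for merging two small loads. It is not the code in ac/nir: the names `byte_range`, `merged_overfetch_bytes`, and `can_merge_loads` are hypothetical, and it assumes a register is 32 bits, that the two ranges do not overlap, and that the merged load is rounded up to whole registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: one register holds 4 bytes (32 bits). */
#define REG_BYTES 4u

/* Hypothetical description of the bytes one load reads. */
struct byte_range {
   uint32_t offset; /* byte offset of the load */
   uint32_t size;   /* bytes read, e.g. 1 for an 8-bit load */
};

/* Bytes fetched but unused if the two loads become one load that is
 * rounded up to whole registers. Assumes the ranges do not overlap. */
static uint32_t merged_overfetch_bytes(struct byte_range a, struct byte_range b)
{
   uint32_t start = a.offset < b.offset ? a.offset : b.offset;
   uint32_t end_a = a.offset + a.size;
   uint32_t end_b = b.offset + b.size;
   uint32_t end = end_a > end_b ? end_a : end_b;

   /* Span covered by both loads, including any hole between them. */
   uint32_t span = end - start;
   /* Round the span up to a whole number of registers. */
   uint32_t fetched = (span + REG_BYTES - 1) / REG_BYTES * REG_BYTES;

   return fetched - (a.size + b.size);
}

/* Hypothetical policy: merge only if less than one register is overfetched. */
static bool can_merge_loads(struct byte_range a, struct byte_range b)
{
   return merged_overfetch_bytes(a, b) < REG_BYTES;
}
```

Under these assumptions, two 8-bit loads at byte offsets 0 and 2 would overfetch 2 bytes and be merged into one 32-bit load, while two 16-bit loads at offsets 0 and 8 would overfetch 8 bytes and stay separate.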
nir ac/nir: allow less than one register of overfetch 2025-05-08 13:30:50 +00:00
virtio ac/virtio: add virtio-only AMDGPU_GEM_CREATE flag 2025-01-16 12:24:33 +00:00
.clang-format amd: import libdrm_amdgpu ioctl wrappers 2024-11-25 21:03:41 -05:00
ac_binary.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_binary.h ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
ac_cmdbuf.c amd: stop using CLEAR_STATE on gfx11 2025-05-02 18:40:11 +00:00
ac_cmdbuf.h ac/cmdbuf: rework CB/DB cache controls for better perf 2025-03-06 21:10:49 +00:00
ac_debug.c amd: add initial common code for gfx12 2024-05-11 22:14:05 -04:00
ac_debug.h ac: Add VCN IB parser 2024-09-23 19:25:08 +00:00
ac_descriptors.c ac/descriptors: allow to configure DCC for buffer descriptors 2025-01-30 08:18:22 +00:00
ac_descriptors.h ac/descriptors: allow to configure DCC for buffer descriptors 2025-01-30 08:18:22 +00:00
ac_drm_fourcc.h ac/surf: add more modifiers to gfx12 supported list 2024-12-16 07:35:06 +00:00
ac_fake_hw_db.h ac: define physical VGPRs for fake hw overrides 2025-04-07 19:44:22 +00:00
ac_formats.c ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_formats.h ac,radeonsi: add ac_is_reduction_mode_supported() 2024-07-10 07:57:42 +00:00
ac_gather_context_rolls.c ac: Improve context roll readability 2024-03-19 16:08:14 +00:00
ac_gpu_info.c radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices 2025-05-07 08:26:33 +02:00
ac_gpu_info.h radv: Return VK_ERROR_INCOMPATIBLE_DRIVER for unsupported devices 2025-05-07 08:26:33 +02:00
ac_hw_stage.h amd: Move ac_hw_stage to its own file 2023-07-03 21:12:45 +00:00
ac_ib_parser.c ac/parse_ib: Replace the parameter list with ac_ib_parser 2024-03-19 16:08:13 +00:00
ac_linux_drm.c winsys/amdgpu: Add support for queue priority in Mesa 2025-05-08 04:29:29 +00:00
ac_linux_drm.h winsys/amdgpu: Add support for queue priority in Mesa 2025-05-08 04:29:29 +00:00
ac_msgpack.c ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_msgpack.h ac: remove unused code 2024-12-26 10:12:43 +00:00
ac_parse_ib.c ac/parse_ib: Parse VCN DYNAMIC_REFLIST_BUFFER 2025-03-29 08:50:49 +00:00
ac_perfcounter.c ac/perfcounter: add support for GFX12 2025-04-16 06:35:33 +00:00
ac_perfcounter.h ac/perfcounter: compute the number of global instances of TCP,SQ,GL1C and GL2C 2023-09-14 14:17:19 +00:00
ac_pm4.c amd: use a valid size for ac_pm4_state allocation 2024-07-22 10:09:34 +00:00
ac_pm4.h ac,radeonsi import PM4 state from RadeonSI 2024-06-06 20:26:47 +00:00
ac_rgp.c ac/rgp: bump instrumentation API version to 1.5 2025-03-14 08:20:57 +00:00
ac_rgp.h ac/rgp: update dumping queue event records to the capture 2023-11-13 08:53:09 +00:00
ac_rgp_elf_object_pack.c amd: Use align64 instead of ALIGN for 64 bit value parameter 2024-01-03 22:02:17 +00:00
ac_rtld.c ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_rtld.h ac/llvm: implement WA in nir to llvm 2024-06-20 13:14:33 +00:00
ac_shader_args.c ac: Don't include full nir.h anymore. 2025-02-12 22:33:07 +01:00
ac_shader_args.h ac/nir: split local_invocation_ids to 3 separate VGPR inputs 2025-01-02 17:36:55 +00:00
ac_shader_debug_info.h amd: Add ac_shader_debug_info 2024-11-11 08:39:13 +00:00
ac_shader_util.c ac: adjust maximum HS workgroup size 2025-05-08 02:54:13 +00:00
ac_shader_util.h ac,radeonsi: rework computing scratch wavesize and tmpring register 2025-04-17 10:35:40 +00:00
ac_shadowed_regs.c ac: remove gfx11_emulate_clear_state 2025-05-02 18:40:11 +00:00
ac_shadowed_regs.h ac,radv,radeonsi: use PM4 for shadowed registers 2025-03-28 20:50:22 +00:00
ac_spm.c radv: print more error messages during SPM initialization 2025-04-16 06:35:33 +00:00
ac_spm.h ac/spm: do not abort when the SPM BO is too small 2024-10-29 18:33:17 +00:00
ac_sqtt.c ac/sqtt: fix registers programming for GFX12 2025-03-14 08:20:57 +00:00
ac_sqtt.h ac/sqtt: fix registers programming for GFX12 2025-03-14 08:20:57 +00:00
ac_surface.c ac/surface: select 3D tile mode without overallocating too much for gfx6-8 2025-04-16 06:08:48 +00:00
ac_surface.h ac/nir: Move surface related NIR functions to separate file. 2025-02-12 22:33:07 +01:00
ac_surface_meta_address_test.c ac: add 'polaris12' gpu to ac_fake_hw_db 2024-11-08 13:31:02 +00:00
ac_surface_modifier_test.c amd: update addrlib 2024-12-26 21:02:21 +00:00
ac_uvd_dec.h radeonsi/uvd: Set decode target swizzle mode on GFX9 2025-01-17 08:53:05 +00:00
ac_vcn.h ac/vcn: allow sq signature package to be skipped 2024-11-29 10:01:49 +10:00
ac_vcn_av1_default.h ac/radeonsi: add av1 defaults header file from radeonsi 2023-06-16 05:53:44 +00:00
ac_vcn_dec.c ac/vcn_dec: Fix AV1 film grain on VCN5 2025-02-07 13:13:45 +00:00
ac_vcn_dec.h radeonsi/vcn: Add UDT support for VCN5 2025-02-26 13:07:10 +00:00
ac_vcn_enc.c ac: Add ac_vcn_init_enc_cmds 2024-09-20 06:58:29 +00:00
ac_vcn_enc.h radeonsi/vcn: Enable VCN4 AV1 encode WA 2024-11-01 14:05:04 +00:00
ac_vcn_enc_av1_default_cdf.h ac,radeonsi: move vcn enc av1 default cdf file to common 2023-09-14 07:51:24 +00:00
amd_family.c radv: add experimental support for AMD BC-250 board 2025-03-04 08:07:31 +00:00
amd_family.h ac: Add rt_version 2025-04-17 20:20:40 +00:00
gfx10_format_table.h amd/common: only pass gfx_level to ac_get_gfx10_format_table() 2024-05-22 08:31:39 +00:00
gfx10_format_table.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00
meson.build ac/nir: Move surface related NIR functions to separate file. 2025-02-12 22:33:07 +01:00
sid.h ac,radeonsi: define all SDMA DCC fields & use them, enable compressed writes 2025-03-06 21:10:54 +00:00
sid_tables.py ac,radeonsi,winsyses: switch to SPDX-License-Identifier: MIT 2023-05-24 21:48:19 +00:00