mesa/src/amd/common/nir
Rhys Perry 8829fc3bd6 amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering
Summary of changes:
- handle unaligned 16-bit scalar loads when supported_dword=true
- increase the size of 8/16/32/64-bit buffer loads which are not dword
  aligned, which can reduce the number of SMEM loads
- handle the case where "bytes" is less than "bit_size / 8"
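
The general idea behind widening a sub-dword or unaligned scalar load can be sketched in plain C. This is only an illustration under stated assumptions, not the code of the NIR pass: `load_aligned_dword` and `load_unaligned` are hypothetical helpers, the buffer is modeled as an array of dwords, and byte order within a dword is assumed to be little-endian.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a scalar memory load that can only fetch
 * whole, dword-aligned 32-bit values. */
static uint32_t load_aligned_dword(const uint32_t *buf, uint32_t dword_index)
{
   return buf[dword_index];
}

/* Load `bytes` (1..4) bytes starting at byte offset `offset`, using only
 * aligned dword loads. The access is widened to the enclosing dword(s),
 * then the requested bytes are extracted with a shift and a mask. */
static uint32_t load_unaligned(const uint32_t *buf, uint32_t offset, unsigned bytes)
{
   assert(bytes >= 1 && bytes <= 4);

   uint32_t index = offset / 4;       /* enclosing aligned dword */
   unsigned misalign = offset % 4;    /* byte position inside that dword */

   uint64_t lo = load_aligned_dword(buf, index);
   uint64_t hi = 0;

   /* A second dword is only needed when the access crosses a dword boundary. */
   if (misalign + bytes > 4)
      hi = load_aligned_dword(buf, index + 1);

   uint64_t combined = (hi << 32) | lo;
   uint64_t mask = (1ull << (bytes * 8)) - 1;

   return (uint32_t)((combined >> (misalign * 8)) & mask);
}
```

With a buffer holding the dwords 0x33221100 and 0x77665544, a 2-byte load at byte offset 1 needs one dword load, while a 2-byte load at byte offset 3 crosses the boundary and needs two.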

fossil-db (gfx1201):
Totals from 26 (0.03% of 79839) affected shaders:
Instrs: 12676 -> 12710 (+0.27%); split: -0.30%, +0.57%
CodeSize: 67272 -> 67384 (+0.17%); split: -0.24%, +0.40%
Latency: 44399 -> 44375 (-0.05%); split: -0.09%, +0.04%
SClause: 352 -> 344 (-2.27%)
SALU: 3972 -> 3992 (+0.50%)
SMEM: 554 -> 528 (-4.69%)

fossil-db (navi21):
Totals from 6 (0.01% of 79825) affected shaders:
Instrs: 2192 -> 2186 (-0.27%)
CodeSize: 12188 -> 12140 (-0.39%)
Latency: 10037 -> 10033 (-0.04%); split: -0.12%, +0.08%
SMEM: 124 -> 118 (-4.84%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: fbf0399517 ("amd/lower_mem_access_bit_sizes: lower all SMEM instructions to supported sizes")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
2025-10-21 22:10:34 +00:00
ac_nir.c treewide: use nir_load_global alias of nir_build_load_global 2025-10-21 12:37:58 +02:00
ac_nir.h amd: keep ac_shader_config::lds_size unaligned 2025-10-15 11:20:09 +00:00
ac_nir_create_gs_copy_shader.c ac/nir: set subgroup size for gs copy shader 2025-09-14 13:21:21 +00:00
ac_nir_cull.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_nir_helpers.h ac/nir: remove pack_clip_cull_distances option 2025-07-12 10:28:21 +00:00
ac_nir_lower_esgs_io_to_mem.c ac/nir: mark all input loads as reorderable and speculatable (for LICM) 2025-07-24 06:31:16 +00:00
ac_nir_lower_global_access.c ac/nir_lower_global_access: don't assume pack_64_2x32 is the same as u2u64 2025-10-08 08:53:58 +00:00
ac_nir_lower_image_opcodes_cdna.c treewide: simplify nir_def_rewrite_uses_after 2025-08-01 15:34:24 +00:00
ac_nir_lower_intrinsics_to_args.c treewide: use nir_def_as_* 2025-08-01 15:34:24 +00:00
ac_nir_lower_legacy_gs.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_nir_lower_legacy_vs.c ac/nir: remove pack_clip_cull_distances option 2025-07-12 10:28:21 +00:00
ac_nir_lower_mem_access_bit_sizes.c amd/lower_mem_access_bit_sizes: improve subdword/unaligned SMEM lowering 2025-10-21 22:10:34 +00:00
ac_nir_lower_ngg.c all: rename gl_shader_stage to mesa_shader_stage 2025-08-06 10:28:40 +08:00
ac_nir_lower_ngg_gs.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_nir_lower_ngg_mesh.c ac/nir/ngg_mesh: Lower num_subgroups to constant 2025-10-20 14:05:40 +00:00
ac_nir_lower_ps_early.c treewide: use nir_def_as_* 2025-08-01 15:34:24 +00:00
ac_nir_lower_ps_late.c ac/nir/lower_ps: remove barrier for end_invocation_interlock 2025-08-04 09:30:06 +00:00
ac_nir_lower_resinfo.c nir: Use nir_def_as_* in more places 2025-08-24 14:03:09 +00:00
ac_nir_lower_sin_cos.c
ac_nir_lower_taskmesh_io_to_mem.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_nir_lower_tess_io_to_mem.c amd: keep ac_shader_config::lds_size unaligned 2025-10-15 11:20:09 +00:00
ac_nir_lower_tex.c ac/nir: fix progress reporting in ac_nir_lower_tex 2025-09-24 08:20:27 +00:00
ac_nir_meta.h ac/nir/meta: allow compute blits with R5G6B5 & R5G5B5A1 formats on GFX9+ 2025-08-07 18:12:52 +00:00
ac_nir_meta_cs_blit.c ac/nir/meta: allow compute blits with R5G6B5 & R5G5B5A1 formats on GFX9+ 2025-08-07 18:12:52 +00:00
ac_nir_meta_cs_clear_copy_buffer.c build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
ac_nir_meta_ps_resolve.c treewide: use nir_def_as_* 2025-08-01 15:34:24 +00:00
ac_nir_opt_outputs.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
ac_nir_opt_pack_half.c
ac_nir_opt_shared_append.c
ac_nir_prerast_utils.c ac/nir: rename ac_nir_get_lds_gs_out_slot_offset -> ac_nir_get_gs_out_lds_offset 2025-07-12 10:28:21 +00:00
ac_nir_surface.c ac/nir: Move surface related NIR functions to separate file. 2025-02-12 22:33:07 +01:00
ac_nir_surface.h ac/nir: Move surface related NIR functions to separate file. 2025-02-12 22:33:07 +01:00