mesa/src/intel/compiler
Ian Romanick 0c089a5c32 brw: Eliminate duplicate fills
When the register allocator decides to spill a value, all reads of that
value are filled. This can result in cases where the same value is
filled many times in a single block. In those cases, the result of an
earlier fill may still be available when a later fill occurs.

This optimization replaces the later fill with a move from the result of
the earlier fill.

v2: Use FIXED_GRF for register overlap tests. Since this is after
register allocation, the VGRF values will not tell the whole truth.

v3: Use brw_transform_inst. Suggested by Caio. Add
brw_scratch_inst::offset instead of storing it as a source. Suggested by
Lionel.

v4: An intervening spill to the same location also invalidates the
value. 🤦

v5: Don't eliminate a fill if its destination partially overlaps the
preceding fill destination. Fixes failures in cooperative matrix CTS.
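
The rules above can be sketched on a toy instruction model. This is an
illustration only, not the code from this commit: Inst, Available,
overlaps, invalidate, and eliminate_duplicate_fills are hypothetical
names, registers are plain physical register numbers (standing in for
the FIXED_GRF comparison from v2), and scratch offsets are assumed to be
counted in register-sized units.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model; none of these names are Mesa's.
enum class Op { Fill, Spill, Mov, Other };

struct Inst {
   Op op;
   unsigned dst_reg;   // first physical register written (unused by Spill)
   unsigned src_reg;   // first physical register read (Spill and Mov)
   unsigned num_regs;  // registers covered; 0 means "writes nothing"
   uint32_t offset;    // scratch slot in register-sized units (Fill/Spill)
};

// Result of an earlier fill that is still intact in the register file.
struct Available {
   unsigned reg;
   unsigned num_regs;
   uint32_t offset;
};

// Half-open range overlap: [a, a + a_len) against [b, b + b_len).
static bool overlaps(unsigned a, unsigned a_len, unsigned b, unsigned b_len)
{
   return a_len > 0 && b_len > 0 && a < b + b_len && b < a + a_len;
}

template <typename Pred>
static void invalidate(std::vector<Available> &avail, Pred pred)
{
   avail.erase(std::remove_if(avail.begin(), avail.end(), pred), avail.end());
}

// Walks one basic block in order; returns the number of fills replaced.
unsigned eliminate_duplicate_fills(std::vector<Inst> &block)
{
   std::vector<Available> avail;
   unsigned progress = 0;

   for (Inst &inst : block) {
      if (inst.op == Op::Spill) {
         // v4 rule: a spill rewrites scratch, so fills of that slot are stale.
         invalidate(avail, [&](const Available &a) {
            return overlaps(a.offset, a.num_regs, inst.offset, inst.num_regs);
         });
         continue;
      }

      const bool was_fill = inst.op == Op::Fill;
      if (was_fill) {
         assert(inst.num_regs > 0);
         auto match = std::find_if(avail.begin(), avail.end(),
                                   [&](const Available &a) {
            return a.offset == inst.offset && a.num_regs == inst.num_regs;
         });
         if (match != avail.end()) {
            const bool same = match->reg == inst.dst_reg;
            // v5 rule: keep the fill if the destinations partially overlap.
            const bool partial = !same &&
               overlaps(match->reg, match->num_regs,
                        inst.dst_reg, inst.num_regs);
            if (!partial) {
               inst.op = Op::Mov;
               inst.src_reg = match->reg;
               progress++;
               if (same)
                  continue;  // self-move; register contents are unchanged
            }
         }
      }

      // Any write clobbers earlier fill results held in the written registers.
      invalidate(avail, [&](const Available &a) {
         return overlaps(a.reg, a.num_regs, inst.dst_reg, inst.num_regs);
      });

      // Whether the fill was kept or became a copy, dst now holds the value.
      if (was_fill)
         avail.push_back({inst.dst_reg, inst.num_regs, inst.offset});
   }

   return progress;
}
```

In this sketch a fill into the exact same registers becomes a self-move
that a later pass could delete. The state is reset for every block
because the commit only describes reuse within a single block.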

shader-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
total instructions in shared programs: 17249903 -> 17249653 (<.01%)
instructions in affected programs: 35550 -> 35300 (-0.70%)
helped: 20 / HURT: 0

total cycles in shared programs: 893092398 -> 893101836 (<.01%)
cycles in affected programs: 2501720 -> 2511158 (0.38%)
helped: 6 / HURT: 14

total fills in shared programs: 1901 -> 1776 (-6.58%)
fills in affected programs: 1757 -> 1632 (-7.11%)
helped: 20 / HURT: 0

fossil-db:

Lunar Lake, Meteor Lake, and DG2 had similar results. (Lunar Lake shown)
Totals:
Instrs: 929949528 -> 926770338 (-0.34%)
Cycle count: 105126671329 -> 104851299099 (-0.26%); split: -0.28%, +0.02%
Fill count: 6520785 -> 5021518 (-22.99%)

Totals from 54281 (2.69% of 2018922) affected shaders:
Instrs: 239616289 -> 236437099 (-1.33%)
Cycle count: 22051883404 -> 21776511174 (-1.25%); split: -1.33%, +0.08%
Fill count: 6406295 -> 4907028 (-23.40%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37827>
2025-11-26 17:20:13 +00:00
brw brw: Eliminate duplicate fills 2025-11-26 17:20:13 +00:00
elk nir: rename nir_lower_indirect_derefs -> nir_lower_indirect_derefs_to_if_else_trees 2025-11-20 05:42:11 +00:00
brw_device_sha1_gen_c.py meson: remove '--outdir' argument in script 2025-10-08 20:51:20 +00:00
brw_list.h intel: fork exec_node/list -> brw_exec_node/list as a private Intel utility 2025-07-31 20:23:02 +00:00
intel_gfx_ver_enum.h build: avoid redefining unreachable() which is standard in C23 2025-07-31 17:49:42 +00:00
intel_nir.c intel/compiler: Use nir_split_conversions() 2025-04-07 17:45:21 -05:00
intel_nir.h brw: add support for separate tessellation shader compilation 2025-09-05 07:46:17 +00:00
intel_nir_blockify_uniform_loads.c intel/nir_blockify_uniform_loads: use helpers 2025-10-09 09:50:20 -04:00
intel_nir_clamp_image_1d_2d_array_sizes.c treewide: simplify nir_def_rewrite_uses_after 2025-08-01 15:34:24 +00:00
intel_nir_clamp_per_vertex_loads.c brw: add support for separate tessellation shader compilation 2025-09-05 07:46:17 +00:00
intel_nir_lower_non_uniform_barycentric_at_sample.c treewide: add & use parent instr helpers 2025-11-12 21:22:13 +00:00
intel_nir_lower_non_uniform_resource_intel.c treewide: add & use parent instr helpers 2025-11-12 21:22:13 +00:00
intel_nir_lower_printf.c intel: Move intel_shader_reloc to common code and drop elk_shader_reloc 2025-10-09 07:01:46 +00:00
intel_nir_lower_shading_rate_output.c treewide: simplify nir_def_rewrite_uses_after 2025-08-01 15:34:24 +00:00
intel_nir_lower_sparse.c treewide: simplify nir_def_rewrite_uses_after 2025-08-01 15:34:24 +00:00
intel_nir_opt_peephole_ffma.c treewide: add & use parent instr helpers 2025-11-12 21:22:13 +00:00
intel_nir_opt_peephole_imul32x16.c treewide: add & use parent instr helpers 2025-11-12 21:22:13 +00:00
intel_nir_tcs_workarounds.c nir: make nir_block::predecessors & dom_frontier sets non-malloc'd 2025-08-21 06:13:48 +00:00
intel_prim.h intel: Re-unify brw_prim.h and elk_prim.h 2025-10-09 07:01:46 +00:00
intel_shader_enums.h intel: Move intel_shader_reloc to common code and drop elk_shader_reloc 2025-10-09 07:01:46 +00:00
meson.build brw: Move into a new src/intel/compiler/brw subdirectory 2025-10-09 07:01:47 +00:00