mesa/src/intel/compiler
Kenneth Graunke 8bca7e520c intel/brw: Only force g0's liveness to be the whole program if spilling
We don't actually need to extend g0's live range to the EOT message
generally - most messages that end a shader are headerless.  The main
implicit use of g0 is for constructing scratch headers.  With the last
two patches, we now consider scratch access that may exist in the IR
and already extend the liveness appropriately.

There is one remaining problem: spilling.  The register allocator will
create new scratch messages when spilling a register, which need to
create scratch headers, which need g0.  So, every new spill or fill
might extend the live range of g0, which would create new interference,
altering the graph.  This can be problematic.

However, when compiling SIMD16 or SIMD32 fragment shaders, we don't
allow spilling anyway.  So, why not use allow g0?  Also, when trying
various scheduling modes, we first try allocation without spilling.
If it works, great, if not, we try a (hopefully) less aggressive
schedule, and only allow spilling on the lowest-pressure schedule.

So, even for regular SIMD8 shaders, we can potentially gain the use
of g0 on the first few tries at scheduling+allocation.

Once we try to allocate with spilling, we go back to reserving g0
for the entire program, so that we can construct scratch headers at
any point.  We could possibly do better here, but this is simple and
reliable with some benefit.

Thanks to Ian Romanick for suggesting I try this approach.

fossil-db on Alchemist shows some more spill/fill improvements:

   Totals:
   Instrs: 149062395 -> 149053010 (-0.01%); split: -0.01%, +0.00%
   Cycles: 12609496913 -> 12611652181 (+0.02%); split: -0.45%, +0.47%
   Spill count: 52891 -> 52471 (-0.79%)
   Fill count: 101599 -> 100818 (-0.77%)
   Scratch Memory Size: 3292160 -> 3197952 (-2.86%)

   Totals from 416541 (66.59% of 625484) affected shaders:
   Instrs: 124058587 -> 124049202 (-0.01%); split: -0.01%, +0.01%
   Cycles: 3567164271 -> 3569319539 (+0.06%); split: -1.61%, +1.67%
   Spill count: 420 -> 0 (-inf%)
   Fill count: 781 -> 0 (-inf%)
   Scratch Memory Size: 94208 -> 0 (-inf%)

Witcher 3 shows a 33% reduction in scratch memory size, for example.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30319>
2024-08-01 16:37:34 -07:00
..
elk intel/elk: Fix undefined left shift of negative value in elk_texture_offset 2024-07-26 17:18:08 -07:00
tests intel/brw: Remove assembler tests for Gfx8- 2024-02-24 02:10:56 +00:00
brw_asm.c intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_asm.h intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_asm_internal.h intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_asm_tool.c intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_cfg.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_cfg.h intel/brw: Add a idom_tree::dominates(a, b) helper. 2024-06-08 02:18:56 -07:00
brw_compile_bs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_cs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_fs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_gs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_mesh.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_tcs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_tes.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compile_vs.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_compiler.c intel: Let compiler set indirect_ubos_use_sampler 2024-07-31 19:26:20 +00:00
brw_compiler.h intel/brw: Fix undefined shift by 64 of uint64_t in brw_compute_first_urb_slot_required 2024-07-26 17:17:15 -07:00
brw_dead_control_flow.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_debug_recompile.c intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_def_analysis.cpp intel/brw: Track the number of uses of each def in def_analysis 2024-06-18 09:02:25 +00:00
brw_device_sha1_gen_c.py intel/compiler: drop unused ray-tracing fields from cache hash 2024-03-22 00:01:28 +00:00
brw_disasm.c intel/disasm: Fix cache load/store disassembly for URB messages 2024-05-09 19:45:18 +00:00
brw_disasm.h intel/compiler: Merge intel_disasm.[ch] into corresponding brw files 2024-02-15 09:26:46 +00:00
brw_disasm_info.cpp intel/brw: Use fs_inst in disasm_annotate() 2024-02-29 21:14:13 -08:00
brw_disasm_info.h intel/brw: Use fs_inst in disasm_annotate() 2024-02-29 21:14:13 -08:00
brw_disasm_tool.c intel/brw: Remove Gfx8- code from disassembler 2024-02-28 05:45:38 +00:00
brw_eu.c intel/brw: Delete SAD2 and SADA2 opcodes 2024-06-10 16:47:50 -07:00
brw_eu.h intel/nir: add reloc delta to load_reloc_const_intel intrinsic 2024-05-15 13:13:38 +00:00
brw_eu_compact.c intel/brw: Fix undefined left shift of negative value in update_uip_jip 2024-07-26 17:17:53 -07:00
brw_eu_defines.h intel/brw: Make gl_SubgroupInvocation lane index loading SSA 2024-06-18 09:02:25 +00:00
brw_eu_emit.c intel/fs/gfx20+: Fix surface state address on extended descriptors for NIR scratch intrinsics. 2024-06-21 01:49:43 +00:00
brw_eu_validate.c intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs.cpp intel/brw: Only force g0's liveness to be the whole program if spilling 2024-08-01 16:37:34 -07:00
brw_fs.h intel/brw: Only force g0's liveness to be the whole program if spilling 2024-08-01 16:37:34 -07:00
brw_fs_bank_conflicts.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_builder.h intel/brw: Move VARYING_PULL_CONSTANT_LOAD from fs_visitor to fs_builder 2024-07-25 15:37:13 +00:00
brw_fs_cmod_propagation.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_combine_constants.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_fs_copy_propagation.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_cse.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_dead_code_eliminate.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_generator.cpp intel/brw: Record that SHADER_OPCODE_SCRATCH_HEADER uses g0 2024-08-01 16:37:31 -07:00
brw_fs_live_variables.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_live_variables.h intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_lower.cpp intel/brw: Move VARYING_PULL_CONSTANT_LOAD from fs_visitor to fs_builder 2024-07-25 15:37:13 +00:00
brw_fs_lower_dpas.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_lower_integer_multiplication.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_lower_pack.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_lower_regioning.cpp intel/brw: Disallow scalar byte to float conversions on DG2+ 2024-07-18 18:51:35 +00:00
brw_fs_lower_simd_width.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_nir.cpp intel/brw: Handle 16-bit sampler return payloads 2024-07-31 21:26:46 +00:00
brw_fs_opt.cpp intel/brw: Replace some fs_reg constructors with functions 2024-07-03 02:53:18 +00:00
brw_fs_opt_algebraic.cpp intel/brw: Rename fs_reg_* helpers to brw_reg_* 2024-07-03 02:53:19 +00:00
brw_fs_opt_virtual_grfs.cpp intel/brw: allocate large table in the heap instead of the stack 2024-07-03 12:10:28 +00:00
brw_fs_reg_allocate.cpp intel/brw: Only force g0's liveness to be the whole program if spilling 2024-08-01 16:37:34 -07:00
brw_fs_register_coalesce.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_saturate_propagation.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_scoreboard.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_sel_peephole.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_fs_thread_payload.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_fs_validate.cpp intel/brw: Move out of fs_visitor and rename print instructions 2024-07-25 15:37:13 +00:00
brw_fs_visitor.cpp intel/brw: Move interp_reg and per_primitive_reg out of fs_visitor 2024-07-25 15:37:13 +00:00
brw_fs_workaround.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_gram.y intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_inst.h intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_ir.h intel/brw: Fold backend_reg into fs_reg 2024-03-01 17:52:09 +00:00
brw_ir_allocator.h intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_ir_analysis.h
brw_ir_fs.h intel/brw: Move brw_reg helpers into brw_reg.h 2024-07-03 02:53:19 +00:00
brw_ir_performance.cpp intel/brw: Replace uses of fs_reg with brw_reg 2024-07-03 02:53:19 +00:00
brw_ir_performance.h intel/brw: Fold backend_shader into fs_visitor 2024-02-29 19:28:05 +00:00
brw_isa_info.h
brw_kernel.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
brw_kernel.h intel-clc: Use correct set of nir_options when building for Gfx8 2024-02-24 00:24:32 +00:00
brw_lex.l intel/brw: Split off assembler logic into library 2024-07-12 19:34:23 +00:00
brw_lower_logical_sends.cpp intel/brw: Record that SHADER_OPCODE_SCRATCH_HEADER uses g0 2024-08-01 16:37:31 -07:00
brw_nir.c intel/brw: Handle 16-bit sampler return payloads 2024-07-31 21:26:46 +00:00
brw_nir.h intel/nir: add printf lowering 2024-05-15 13:13:38 +00:00
brw_nir_analyze_ubo_ranges.c intel/brw/xe2: Update brw_nir_analyze_ubo_ranges to account for 512b physical registers 2024-04-01 00:00:03 +00:00
brw_nir_lower_alpha_to_coverage.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
brw_nir_lower_cooperative_matrix.c intel/brw/xe2+: Allow vec16 for cooperative matrix 2024-06-25 14:17:47 -07:00
brw_nir_lower_cs_intrinsics.c intel/brw: Reorganize lowering of LocalID/Index to handle Mesh/Task 2024-06-28 16:30:38 +00:00
brw_nir_lower_fsign.py intel/brw: Use range analysis to optimize fsign 2024-05-14 01:28:21 +00:00
brw_nir_lower_intersection_shader.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
brw_nir_lower_ray_queries.c intel/nir: only consider ray query variables in lowering 2024-02-24 12:56:30 +00:00
brw_nir_lower_rt_intrinsics.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
brw_nir_lower_shader_calls.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
brw_nir_lower_storage_image.c intel/brw: Remove Gfx8- code from lower storage image pass 2024-02-28 05:45:38 +00:00
brw_nir_rt.c treewide: use nir_def_replace sometimes 2024-06-21 15:36:56 +00:00
brw_nir_rt.h anv: support VK_PIPELINE_CREATE_RAY_TRACING_SKIP_* 2022-10-20 00:03:55 +00:00
brw_nir_rt_builder.h nir: Drop "SSA" from NIR language 2023-08-12 16:44:41 -04:00
brw_nir_trig_workarounds.py
brw_packed_float.c
brw_predicated_break.cpp intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_prim.h
brw_print.cpp intel/brw: Move out of fs_visitor and rename print instructions 2024-07-25 15:37:13 +00:00
brw_private.h intel/brw: fix subgroup size of geometry stages for lnl+ 2024-05-14 23:13:37 +00:00
brw_reg.h intel/brw: Fix undefined left shift of large UW value in brw_imm_uw 2024-07-26 17:17:56 -07:00
brw_reg_type.c intel/brw: Rename brw_reg_type_to_hw_type to brw_type_encode 2024-04-25 11:41:48 +00:00
brw_reg_type.h intel/brw: Make a helper for finding the largest of two types 2024-04-29 07:51:45 +00:00
brw_rt.h intel: Use ALIGN_POT instead of ALIGN inside macro define 2024-01-03 12:46:10 +00:00
brw_schedule_instructions.cpp intel/brw: Only force g0's liveness to be the whole program if spilling 2024-08-01 16:37:34 -07:00
brw_shader.cpp intel/brw: Move remaining compile stages to their own files 2024-07-25 15:37:13 +00:00
brw_simd_selection.cpp intel/brw: fix subgroup size of geometry stages for lnl+ 2024-05-14 23:13:37 +00:00
brw_vue_map.c intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
intel_clc.c intel/clc: Free disk_cache 2024-07-24 20:46:28 +00:00
intel_gfx_ver_enum.h intel/compiler: Rename brw_gfx_ver_enum.h to intel_gfx_ver_enum.h 2024-02-16 22:35:05 +00:00
intel_nir.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir.h intel/nir: add printf lowering 2024-05-15 13:13:38 +00:00
intel_nir_blockify_uniform_loads.c brw: blockify load_global_const_block_intel 2024-06-21 08:29:44 +00:00
intel_nir_clamp_image_1d_2d_array_sizes.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_clamp_per_vertex_loads.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_conversions.c intel/nir: Don't needlessly split u2f16 for nir_type_uint32 2024-07-11 02:37:05 -07:00
intel_nir_lower_non_uniform_barycentric_at_sample.c intel/compiler: Ensure load_barycentric_at_sample and load_interpolated_input remain together 2024-04-04 23:42:27 +00:00
intel_nir_lower_non_uniform_resource_intel.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_printf.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_shading_rate_output.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_sparse.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_texture.c intel/compiler: Pack texture LOD and offset to a single 32-bit value 2024-02-27 00:22:46 +00:00
intel_nir_opt_peephole_ffma.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_opt_peephole_imul32x16.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_tcs_workarounds.c intel/nir: Set src_type on TCS quads workaround store_output 2024-05-02 13:58:21 -07:00
intel_shader_enums.h intel/compiler: Use "intel" prefix for walk_order enum 2024-02-21 00:38:35 +00:00
meson.build intel/brw: Move printing functions to its own file 2024-07-25 15:37:13 +00:00
test_eu_compact.cpp intel/brw: Stop using long BRW_REGISTER_TYPE enum names 2024-04-25 11:41:48 +00:00
test_eu_validate.cpp intel/brw: Rename brw_reg_type_to_hw_type to brw_type_encode 2024-04-25 11:41:48 +00:00
test_fs_cmod_propagation.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_fs_combine_constants.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_fs_copy_propagation.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_fs_cse.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_fs_saturate_propagation.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_fs_scoreboard.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_predicated_break.cpp intel/brw: Move calculate_cfg out of fs_visitor 2024-07-25 15:37:13 +00:00
test_simd_selection.cpp intel: Remove brw_ prefix from process debug function 2024-02-16 22:35:05 +00:00
test_vf_float_conversions.cpp