mesa/src/intel/compiler
Ian Romanick 2594fcadd4 brw: Split virtual GRFs again at the end of optimizations
Logical sends and load_payload can have large VGRFs that cannot be
split. Once all of the lowering passes and optimization passes that
might eliminate any of those instructions have completed, try to split
larger VGRFs one last time.

Register allocation can only handle VGRFs up to a certain size, so this
is the last opportunity to prevent later failures due to VGRFs that are
too large.

Closes: #13239

shader-db:

Lunar Lake, Meteor Lake, DG2, and Tiger Lake had similar results. (Lunar Lake shown)
total instructions in shared programs: 17114494 -> 17114496 (<.01%)
instructions in affected programs: 2790 -> 2792 (0.07%)
helped: 2 / HURT: 4

total cycles in shared programs: 886617364 -> 886315282 (-0.03%)
cycles in affected programs: 4067540 -> 3765458 (-7.43%)
helped: 48 / HURT: 9

Ice Lake and Skylake had similar restuls. (Ice Lake shown)
total instructions in shared programs: 20799801 -> 20799691 (<.01%)
instructions in affected programs: 1210 -> 1100 (-9.09%)
helped: 1 / HURT: 0

total cycles in shared programs: 865495386 -> 865498990 (<.01%)
cycles in affected programs: 60132 -> 63736 (5.99%)
helped: 2 / HURT: 1

total spills in shared programs: 3987 -> 3981 (-0.15%)
spills in affected programs: 24 -> 18 (-25.00%)
helped: 1 / HURT: 0

total fills in shared programs: 3535 -> 3519 (-0.45%)
fills in affected programs: 36 -> 20 (-44.44%)
helped: 1 / HURT: 0

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 208647246 -> 208646499 (-0.00%); split: -0.00%, +0.00%
Cycle count: 31257819536 -> 31263957016 (+0.02%); split: -0.02%, +0.04%
Max live registers: 66160877 -> 66155728 (-0.01%)

Totals from 34703 (4.91% of 707053) affected shaders:
Instrs: 13766639 -> 13765892 (-0.01%); split: -0.02%, +0.01%
Cycle count: 3693572086 -> 3699709566 (+0.17%); split: -0.15%, +0.32%
Max live registers: 4843852 -> 4838703 (-0.11%)

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36202>
2025-07-18 19:04:01 +00:00
..
elk intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
tests intel/compiler tests: fix path-to-string conversion 2025-06-23 08:26:29 +00:00
brw_analysis.cpp brw: Make brw_range use half-open ranges 2025-04-09 19:06:49 +00:00
brw_analysis.h brw: Track the largest VGRF size in liveness analysis 2025-04-11 20:34:51 +00:00
brw_analysis_def.cpp brw: Add basic infrastructure for load_reg pseudo op 2025-04-04 06:45:02 +00:00
brw_analysis_liveness.cpp brw: Track the largest VGRF size in liveness analysis 2025-04-11 20:34:51 +00:00
brw_analysis_performance.cpp intel/compiler: Add support for MSAA typed load/store messages 2025-03-07 23:06:14 +00:00
brw_asm.c brw: Rework label tracking in assembler 2025-03-06 17:06:20 -08:00
brw_asm.h brw: Fix size in assembler when compacting 2025-03-03 20:43:56 +00:00
brw_asm_internal.h brw: Rework label tracking in assembler 2025-03-06 17:06:20 -08:00
brw_asm_tool.c intel/compiler tests: fix variable type for getopt_long() return value 2025-06-23 08:26:29 +00:00
brw_builder.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_builder.h brw: Handle bfloat16 dest and src0 operands for DPAS 2025-07-02 20:06:59 +00:00
brw_cfg.cpp brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_cfg.h brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_compile_bs.cpp brw: Don't use simd_select for BS shaders 2025-07-02 19:48:04 +00:00
brw_compile_cs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_fs.cpp brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_compile_gs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_mesh.cpp intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_compile_tcs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_tes.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_vs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compiler.c intel: Add INTEL_DEBUG=no-vrt 2025-07-13 21:11:02 +00:00
brw_compiler.h intel: Add INTEL_DEBUG=no-vrt 2025-07-13 21:11:02 +00:00
brw_debug_recompile.c intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_device_sha1_gen_c.py intel/compiler: drop unused ray-tracing fields from cache hash 2024-03-22 00:01:28 +00:00
brw_disasm.c brw/disasm: Fix Gfx11 3src-instructions dst register disassembly 2025-07-08 19:49:09 +00:00
brw_disasm.h intel/brw: support for dumping shader line numbers 2025-04-08 19:39:53 +00:00
brw_disasm_info.cpp intel/brw: support for dumping shader line numbers 2025-04-08 19:39:53 +00:00
brw_disasm_info.h intel/brw: Rename fs_inst to brw_inst 2025-01-31 00:57:21 +00:00
brw_disasm_tool.c intel/brw: Remove Gfx8- code from disassembler 2024-02-28 05:45:38 +00:00
brw_eu.c intel/compiler: fix SHA generation for shader replace 2025-05-27 22:57:19 +00:00
brw_eu.h brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_eu_compact.c brw: Add support for GOTO/JOIN in the assembler 2025-03-06 17:06:20 -08:00
brw_eu_defines.h brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets 2025-06-22 10:55:24 +00:00
brw_eu_emit.c brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_eu_inst.h brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_eu_validate.c brw: Only apply GRF 127 send workaround to Gfx9 2025-07-15 19:35:42 +00:00
brw_from_nir.cpp intel/compiler: Drop unused param from set_memory_address 2025-07-14 03:46:21 +00:00
brw_generator.cpp brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_generator.h brw: factor out base prog_data setting 2025-02-22 08:30:22 +00:00
brw_gram.y brw: Add EU assembler support for bfloat16 2025-03-25 05:23:37 +00:00
brw_inst.cpp brw: Handle bfloat16 dest and src0 operands for DPAS 2025-07-02 20:06:59 +00:00
brw_inst.h brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_isa_info.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_kernel.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_kernel.h intel: rework CL pre-compile 2025-01-25 03:28:07 +00:00
brw_lex.l brw: Add EU assembler support for bfloat16 2025-03-25 05:23:37 +00:00
brw_load_reg.cpp brw: Add passes to generate and lower load_reg 2025-04-04 06:45:02 +00:00
brw_lower.cpp intel/compiler: Centralize type stomping logic for Gen12.5 restrictions 2025-05-22 06:46:18 +00:00
brw_lower_dpas.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_lower_integer_multiplication.cpp brw: Remove bblock_t parameters from various passes 2025-03-06 23:33:38 +00:00
brw_lower_logical_sends.cpp brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_lower_pack.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_lower_regioning.cpp brw: Consider bfloat16 in lower regioning pass 2025-04-29 16:29:37 +00:00
brw_lower_scoreboard.cpp brw: Fix comparison with unordered_mode when making baked dependency 2025-07-14 20:28:54 +00:00
brw_lower_simd_width.cpp brw/nir: add intrinsics to read attribute payload register indirectly 2025-05-08 06:48:35 +00:00
brw_lower_subgroup_ops.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_nir.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_nir_analyze_ubo_ranges.c intel/compiler: take reg_unit size into account with ubo ranges 2025-01-07 21:38:06 +00:00
brw_nir_lower_alpha_to_coverage.c nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries 2025-06-26 18:20:54 +00:00
brw_nir_lower_cooperative_matrix.c brw: Use convert_cmat_intel intrinsic 2025-06-27 01:26:22 +00:00
brw_nir_lower_cs_intrinsics.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
brw_nir_lower_fs_barycentrics.c brw: add lowering passes for FS barycentric inputs 2025-05-20 20:57:59 +00:00
brw_nir_lower_fs_msaa.c brw/anv: add provoking vertex to fs_msaa_flags 2025-05-20 20:57:58 +00:00
brw_nir_lower_fsign.py intel/brw: Use range analysis to optimize fsign 2024-05-14 01:28:21 +00:00
brw_nir_lower_immediate_offsets.c brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_nir_lower_intersection_shader.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_ray_queries.c intel/rt: Update BVH instance leaf load for Xe3+ 2025-04-21 20:10:45 +00:00
brw_nir_lower_rt_intrinsics.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_rt_intrinsics_pre_trace.c brw: add pre ray trace intrinsic moves 2025-05-06 13:34:53 +00:00
brw_nir_lower_sample_index_in_coord.c intel/compiler: Lower sample index into coord for MSRT messages 2025-03-07 23:06:14 +00:00
brw_nir_lower_shader_calls.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_storage_image.c brw: implement read without format lowering 2025-06-06 12:28:42 +00:00
brw_nir_lower_texel_address.c intel/compiler: Use correct enum type 2025-03-13 20:11:10 +00:00
brw_nir_lower_texture.c brw: move texture offset packing to NIR 2025-03-29 02:15:18 +00:00
brw_nir_opt_fsat.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
brw_nir_rt.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_rt.h intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_rt_builder.h intel/rt: Update BVH instance leaf load for Xe3+ 2025-04-21 20:10:45 +00:00
brw_nir_trig_workarounds.py
brw_nir_wa_18019110168.c treewide: use VARYING_BIT_* 2025-07-04 19:01:04 +00:00
brw_opt.cpp brw: Split virtual GRFs again at the end of optimizations 2025-07-18 19:04:01 +00:00
brw_opt_address_reg_load.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_opt_algebraic.cpp brw/algebraic: Convert some NOT to MOV 2025-04-28 19:44:23 +00:00
brw_opt_bank_conflicts.cpp brw: Only apply GRF 127 send workaround to Gfx9 2025-07-15 19:35:42 +00:00
brw_opt_cmod_propagation.cpp brw/cmod: Allow integer CMP to ADD propagation only for Z and NZ 2025-04-28 19:44:23 +00:00
brw_opt_combine_constants.cpp brw: Use brw_inst::block in Combine Constants 2025-03-06 23:33:38 +00:00
brw_opt_copy_propagation.cpp brw: Consider bfloat16 in copy propagation 2025-04-29 16:29:37 +00:00
brw_opt_cse.cpp brw: make HALT instruction act as barrier in new CSE pass 2025-04-29 20:28:24 +00:00
brw_opt_dead_code_eliminate.cpp brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_opt_register_coalesce.cpp brw: don't generate invalid instructions 2025-06-04 06:08:26 +00:00
brw_opt_saturate_propagation.cpp brw: Clean up saturate propagation after non-defs version removal 2025-04-09 19:06:48 +00:00
brw_opt_txf_combiner.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_opt_virtual_grfs.cpp brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs() 2025-04-11 20:34:51 +00:00
brw_packed_float.c
brw_prim.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_print.cpp brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_private.h intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_reg.cpp intel/compiler: Use unreachable instead of assert(!"...") 2025-03-13 20:11:10 +00:00
brw_reg.h brw: add new helper for immediate integer register with type 2025-06-22 10:55:24 +00:00
brw_reg_allocate.cpp brw/reg_allocate: Don't access out of bounds in non-debug builds 2025-07-18 19:04:01 +00:00
brw_reg_type.c brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_reg_type.h brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_rt.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_schedule_instructions.cpp brw: Use live->max_vgrf_size in pre-RA scheduling 2025-04-11 20:34:51 +00:00
brw_shader.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_shader.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_simd_selection.cpp intel/brw/xe3+: Optimize CS/TASK/MESH compile time optimistically assuming SIMD32. 2025-01-29 23:39:32 +00:00
brw_spirv.c nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() 2025-07-08 15:33:59 +00:00
brw_thread_payload.cpp brw: Embed at_end() inside brw_builder(brw_shader *) constructor 2025-03-06 23:33:38 +00:00
brw_thread_payload.h intel/brw: Rename fs_visitor to brw_shader 2025-02-11 09:13:28 +00:00
brw_validate.cpp brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets 2025-06-22 10:55:24 +00:00
brw_vue_map.c intel/compiler: use ffsll instead of ffsl in brw_vue_map.c 2025-05-11 00:50:21 +02:00
brw_workaround.cpp brw: fix Wa_22013689345 emission 2025-04-10 16:44:28 +00:00
intel_gfx_ver_enum.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
intel_nir.c intel/compiler: Use nir_split_conversions() 2025-04-07 17:45:21 -05:00
intel_nir.h anv: lower input vertices for TCS unconditionally 2025-05-08 06:48:34 +00:00
intel_nir_blockify_uniform_loads.c intel: replace RANGE_BASE by BASE for uniform block loads 2025-06-22 10:55:23 +00:00
intel_nir_clamp_image_1d_2d_array_sizes.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_clamp_per_vertex_loads.c anv: lower input vertices for TCS unconditionally 2025-05-08 06:48:34 +00:00
intel_nir_lower_non_uniform_barycentric_at_sample.c intel: switch to nir_metadata_divergence 2025-02-13 10:08:43 +00:00
intel_nir_lower_non_uniform_resource_intel.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_printf.c nir: drop printf_base_identifier 2025-02-05 20:33:15 +00:00
intel_nir_lower_shading_rate_output.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_sparse.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_opt_peephole_ffma.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_opt_peephole_imul32x16.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_tcs_workarounds.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
intel_shader_enums.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
meson.build anv/brw: move Wa_18019110168 handling to backend 2025-06-28 05:55:32 +00:00
test_eu_compact.cpp intel/brw: Rename brw_inst_* helpers to brw_eu_inst_* 2024-12-30 17:16:15 +00:00
test_eu_validate.cpp brw: Allow DPAS with BF on Gfx125 2025-04-14 18:23:43 +00:00
test_helpers.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_helpers.h brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_insert_load_reg.cpp brw: Add passes to generate and lower load_reg 2025-04-04 06:45:02 +00:00
test_lower_scoreboard.cpp brw: Fix comparison with unordered_mode when making baked dependency 2025-07-14 20:28:54 +00:00
test_opt_algebraic.cpp brw/algebraic: Don't optimize float SEL.CMOD to MOV 2025-04-15 23:59:31 +00:00
test_opt_cmod_propagation.cpp brw/cmod: Don't propagate from CMP to possible Inf + (-Inf) 2025-04-28 19:44:23 +00:00
test_opt_combine_constants.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
test_opt_copy_propagation.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_opt_cse.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_opt_register_coalesce.cpp brw: don't generate invalid instructions 2025-06-04 06:08:26 +00:00
test_opt_saturate_propagation.cpp brw/sat: Eliminate non-defs saturate propagation 2025-04-04 06:45:02 +00:00
test_simd_selection.cpp intel: Switch uint64_t intel_debug to a bitset 2025-04-22 23:09:26 +00:00
test_vf_float_conversions.cpp