mesa/src/intel/compiler
Jason Ekstrand a4b36cd3dd intel/fs: Coalesce when the src live range is contained in the dst
Consider the following case:

    // g119-123 are written somewhere above
    mul.sat(16)   g67<1>F    g6.4<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g69<1>F    g6.5<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g71<1>F    g6.6<0,1,0>F   g125<8,8,1>F
    mov(16)       g119<1>F   g67<8,8,1>F
    mov(16)       g121<1>F   g69<8,8,1>F
    mov(16)       g123<1>F   g71<8,8,1>F

We should be able to coalesce it into

    mul.sat(16)   g119<1>F   g6.4<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g121<1>F   g6.5<0,1,0>F   g125<8,8,1>F
    mul.sat(16)   g123<1>F   g6.6<0,1,0>F   g125<8,8,1>F

What's stopping us is an overly conservative check for writes to the two
registers being coalesced.  The check walks over the intersection of
their live ranges and checks for no writes to either one.  However,
because the register which starts the live range (the mul.sat in this
case) is inside that intersection, we flag it as a write in the
intersection and don't coalesce.  However, this case is safe because the
destination register of the copy is never read after the source is
written.

Shader-db changes on ICL:

    total instructions in shared programs: 16043613 -> 16042610 (<.01%)
    instructions in affected programs: 43036 -> 42033 (-2.33%)
    helped: 226
    HURT: 0
    helped stats (abs) min: 1 max: 30 x̄: 4.44 x̃: 4
    helped stats (rel) min: 0.09% max: 26.67% x̄: 4.89% x̃: 3.43%
    95% mean confidence interval for instructions value: -4.86 -4.02
    95% mean confidence interval for instructions %-change: -5.57% -4.22%
    Instructions are helped.

    total cycles in shared programs: 334766372 -> 334710124 (-0.02%)
    cycles in affected programs: 617548 -> 561300 (-9.11%)
    helped: 214
    HURT: 2
    helped stats (abs) min: 15 max: 1512 x̄: 263.21 x̃: 212
    helped stats (rel) min: 0.30% max: 75.36% x̄: 25.30% x̃: 21.58%
    HURT stats (abs)   min: 40 max: 40 x̄: 40.00 x̃: 40
    HURT stats (rel)   min: 0.15% max: 0.15% x̄: 0.15% x̃: 0.15%
    95% mean confidence interval for cycles value: -277.91 -242.90
    95% mean confidence interval for cycles %-change: -27.58% -22.55%
    Cycles are helped.

No spill/fill changes or gained/lost

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4627>
2020-04-21 01:00:24 +00:00
..
brw_cfg.cpp intel/compiler: Pass backend_shader * to cfg_t() 2020-03-09 04:44:12 +00:00
brw_cfg.h intel/cfg: Add first/last_block helpers 2020-04-17 14:48:06 +00:00
brw_clip.h
brw_clip_line.c
brw_clip_point.c
brw_clip_tri.c i965: Don't emit MOVs with undefined registers for Gen4 point clipping. 2018-02-28 15:03:51 -08:00
brw_clip_unfilled.c
brw_clip_util.c
brw_compile_clip.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_compile_sf.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_compiler.c nir, intel: Move use_scoped_memory_barrier to nir_options 2020-02-24 19:12:11 +00:00
brw_compiler.h anv: Emit pushed UBO bounds checking code in the back-end compiler 2020-04-17 14:48:06 +00:00
brw_dead_control_flow.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_dead_control_flow.h
brw_debug_recompile.c intel/compiler: Add a "base class" for program keys 2019-07-10 19:35:55 +00:00
brw_disasm.c intel/gen12: Take into account opcode when decoding SWSB 2020-02-18 09:17:51 -08:00
brw_disasm_info.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_disasm_info.h
brw_eu.cpp intel/disasm: SEND has two sources on Gen12+ 2020-01-31 17:23:39 +00:00
brw_eu.h intel/fs,vec4: Properly account SENDs in IVB memory fence 2020-04-20 09:29:09 -07:00
brw_eu_compact.c intel/compiler: Handle invalid compacted immediates 2020-01-22 00:19:21 +00:00
brw_eu_defines.h intel/gen12: Take into account opcode when decoding SWSB 2020-02-18 09:17:51 -08:00
brw_eu_emit.c intel/fs,vec4: Properly account SENDs in IVB memory fence 2020-04-20 09:29:09 -07:00
brw_eu_util.c
brw_eu_validate.c intel/eu/validate: Don't validate regions of sends 2020-01-31 17:23:39 +00:00
brw_fs.cpp anv: Emit pushed UBO bounds checking code in the back-end compiler 2020-04-17 14:48:06 +00:00
brw_fs.h intel/compiler: Add support for variable workgroup size 2020-04-09 19:23:12 -07:00
brw_fs_bank_conflicts.cpp intel/fs/gen6: Constrain barycentric source of LINTERP during bank conflict mitigation. 2020-01-17 13:22:29 -08:00
brw_fs_builder.h intel/compiler: Fixup operands in fs_builder::emit() that takes array 2020-04-17 08:21:47 -07:00
brw_fs_cmod_propagation.cpp intel/compiler: fix cmod propagation optimisations 2020-03-11 21:21:25 +00:00
brw_fs_combine_constants.cpp intel/compiler: Move idom tree calculation and related logic into analysis object 2020-03-06 10:21:03 -08:00
brw_fs_copy_propagation.cpp intel/compiler: Only GE and L modifiers are commutative for SEL 2020-04-17 08:21:43 -07:00
brw_fs_cse.cpp intel/compiler/fs: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:57 -08:00
brw_fs_dead_code_eliminate.cpp intel/compiler/fs: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:57 -08:00
brw_fs_generator.cpp intel/fs,vec4: Properly account SENDs in IVB memory fence 2020-04-20 09:29:09 -07:00
brw_fs_live_variables.cpp intel/compiler: Silence unused parameter warning in fs_live_variables::setup_one_read 2020-04-17 08:21:40 -07:00
brw_fs_live_variables.h intel/compiler: Silence unused parameter warning in fs_live_variables::setup_one_read 2020-04-17 08:21:40 -07:00
brw_fs_lower_pack.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_fs_lower_regioning.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_fs_nir.cpp intel/compiler: Add support for variable workgroup size 2020-04-09 19:23:12 -07:00
brw_fs_reg_allocate.cpp intel/compiler: Mark some methods and parameters const 2020-03-09 04:44:11 +00:00
brw_fs_register_coalesce.cpp intel/fs: Coalesce when the src live range is contained in the dst 2020-04-21 01:00:24 +00:00
brw_fs_saturate_propagation.cpp intel/compiler/fs: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:57 -08:00
brw_fs_scoreboard.cpp intel/compiler: Silence unused parameter warning in update_inst_scoreboard 2020-04-17 08:21:42 -07:00
brw_fs_sel_peephole.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_fs_validate.cpp intel: disable FS IR validation in release mode. 2018-10-15 18:10:27 -07:00
brw_fs_visitor.cpp intel/fs: Allow multiple slots for position 2020-04-07 17:16:09 +00:00
brw_gen_enum.h intel/compiler: Extract GEN_* macros into separate file 2020-01-22 00:19:20 +00:00
brw_inst.h intel/compiler: Fix array bounds warning on GCC 10. 2020-01-22 08:35:18 +01:00
brw_interpolation_map.c intel/compiler: Silence unused parameter warning in brw_interpolation_map.c 2019-03-06 08:35:36 -08:00
brw_ir.h intel/compiler: Move base IR definitions into a separate header file 2020-03-06 10:20:11 -08:00
brw_ir_allocator.h intel/ir: Don't allow allocating zero registers 2018-12-11 21:26:23 -06:00
brw_ir_analysis.h intel/compiler: Define more detailed analysis dependency classes 2020-03-06 10:20:37 -08:00
brw_ir_fs.h intel/fs: Rework fs_inst::is_copy_payload() into multiple classification helpers. 2020-01-17 13:21:19 -08:00
brw_ir_vec4.h intel/compiler: Mark some methods and parameters const 2020-03-09 04:44:11 +00:00
brw_nir.c intel/nir: Enable load/store vectorization 2020-04-03 20:26:54 +00:00
brw_nir.h intel/compiler: detect if atomic load store operations are used 2020-03-16 10:34:21 +00:00
brw_nir_analyze_boolean_resolves.c intel/fs: Mark source 0 of bcsel as needing Boolean resolve 2019-06-11 12:12:07 -07:00
brw_nir_analyze_ubo_ranges.c intel/compiler: Do not qsort zero sized array 2020-02-19 12:07:24 +02:00
brw_nir_attribute_workarounds.c nir/builder: Remove the use_fmov parameter from nir_swizzle 2019-05-24 08:38:11 -05:00
brw_nir_clamp_image_1d_2d_array_sizes.c intel: Implement Gen12 workaround for array textures of size 1 2020-01-26 22:27:03 +02:00
brw_nir_lower_alpha_to_coverage.c nir: Add alpha_to_coverage lowering pass 2019-10-21 11:27:29 -07:00
brw_nir_lower_conversions.c intel/compiler: add a NIR pass to lower conversions 2019-04-18 11:05:18 +02:00
brw_nir_lower_cs_intrinsics.c intel/compiler: Add support for variable workgroup size 2020-04-09 19:23:12 -07:00
brw_nir_lower_image_load_store.c intel/compiler: detect if atomic load store operations are used 2020-03-16 10:34:21 +00:00
brw_nir_lower_mem_access_bit_sizes.c anv: Improve brw_nir_lower_mem_access_bit_sizes 2020-04-03 20:26:54 +00:00
brw_nir_opt_peephole_ffma.c util: rename list_empty() to list_is_empty() 2019-10-28 11:24:38 +00:00
brw_nir_tcs_workarounds.c util: use C99 declaration in the for-loop set_foreach() macro 2018-10-25 12:43:18 +01:00
brw_nir_trig_workarounds.py intel/nir: do not apply the fsin and fcos trig workarounds for consts 2019-09-17 23:39:18 +03:00
brw_packed_float.c intel/compiler: Cast to target type before shifting left 2019-10-24 16:19:23 +02:00
brw_predicated_break.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_reg.h Move compiler.h and imports.h/c from src/mesa/main into src/util 2020-03-27 21:00:09 +00:00
brw_reg_type.c intel/compiler: Handle invalid inputs to brw_reg_type_to_*() 2020-01-22 00:19:21 +00:00
brw_reg_type.h intel/compiler: Add a INVALID_{,HW_}REG_TYPE macros 2020-01-22 00:19:20 +00:00
brw_schedule_instructions.cpp intel/compiler: Mark visitor parameters to scheduler const 2020-03-09 04:44:12 +00:00
brw_shader.cpp intel/compiler: CSEL can do saturate 2020-04-17 08:21:46 -07:00
brw_shader.h intel/compiler: Mark some methods and parameters const 2020-03-09 04:44:11 +00:00
brw_vec4.cpp intel/compiler: Pass shader_stats for each SIMD mode 2020-03-09 04:44:12 +00:00
brw_vec4.h intel/compiler: Mark some methods and parameters const 2020-03-09 04:44:11 +00:00
brw_vec4_builder.h intel/compiler: Lower flrp32 on Gen11+ 2018-02-28 11:15:47 -08:00
brw_vec4_cmod_propagation.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_vec4_copy_propagation.cpp intel/compiler: Pass detailed dependency classes to invalidate_analysis() 2020-03-06 10:20:39 -08:00
brw_vec4_cse.cpp intel/compiler/vec4: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:59 -08:00
brw_vec4_dead_code_eliminate.cpp intel/compiler/vec4: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:59 -08:00
brw_vec4_generator.cpp intel/fs,vec4: Properly account SENDs in IVB memory fence 2020-04-20 09:29:09 -07:00
brw_vec4_gs_nir.cpp intel/vec4: Drop all of the 64-bit varying code 2019-07-31 18:14:09 -05:00
brw_vec4_gs_visitor.cpp intel/fs: Allow multiple slots for position 2020-04-07 17:16:09 +00:00
brw_vec4_gs_visitor.h
brw_vec4_live_variables.cpp intel/compiler: Drop invalidate_live_intervals() 2020-03-06 10:21:01 -08:00
brw_vec4_live_variables.h intel/compiler/vec4: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:59 -08:00
brw_vec4_nir.cpp intel/vec4: fix valgrind errors with vf_values array 2020-02-07 09:06:18 +00:00
brw_vec4_reg_allocate.cpp intel/compiler/vec4: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:59 -08:00
brw_vec4_surface_builder.cpp intel/compiler: Re-prefix non-logical surface opcodes with VEC4 2019-02-28 16:58:20 -06:00
brw_vec4_surface_builder.h intel/vec4: Drop dead code for handling typed surface messages 2019-02-28 16:58:20 -06:00
brw_vec4_tcs.cpp intel/fs: Allow multiple slots for position 2020-04-07 17:16:09 +00:00
brw_vec4_tcs.h intel/compiler: Silence unused parameter warnings in vec4_tcs_visitor 2020-04-17 08:21:37 -07:00
brw_vec4_tes.cpp intel/vec4: Drop all of the 64-bit varying code 2019-07-31 18:14:09 -05:00
brw_vec4_tes.h
brw_vec4_visitor.cpp intel/compiler/vec4: Switch liveness analysis to IR analysis framework 2020-03-06 10:20:59 -08:00
brw_vec4_vs.h i965: Use NIR to lower legacy userclipping. 2019-07-24 18:00:13 +00:00
brw_vec4_vs_visitor.cpp i965: Use NIR to lower legacy userclipping. 2019-07-24 18:00:13 +00:00
brw_vue_map.c intel/fs: Allow multiple slots for position 2020-04-07 17:16:09 +00:00
brw_wm_iz.cpp intel: Use a system value for gl_FragCoord 2019-07-29 23:30:26 +00:00
gen6_gs_visitor.cpp intel/compiler: Prevent warnings in the following patch 2019-01-09 16:42:41 -08:00
gen6_gs_visitor.h
meson.build intel: drop unused include directories 2020-03-28 21:36:54 +01:00
test_eu_compact.cpp intel/compiler: Test compaction on Gen <= 12 2020-01-22 00:19:21 +00:00
test_eu_validate.cpp util: Remove tmp argument from BITSET_FOREACH_SET macro 2020-01-23 01:52:43 +00:00
test_fs_cmod_propagation.cpp intel/compiler: fix cmod propagation optimisations 2020-03-11 21:21:25 +00:00
test_fs_copy_propagation.cpp intel/compiler: Pass backend_shader * to cfg_t() 2020-03-09 04:44:12 +00:00
test_fs_saturate_propagation.cpp intel/compiler: Pass backend_shader * to cfg_t() 2020-03-09 04:44:12 +00:00
test_fs_scoreboard.cpp intel/compiler: Pass backend_shader * to cfg_t() 2020-03-09 04:44:12 +00:00
test_vec4_cmod_propagation.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vec4_copy_propagation.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vec4_dead_code_eliminate.cpp i965/vec4/dce: Don't narrow the write mask if the flags are used 2018-12-17 13:47:06 -08:00
test_vec4_register_coalesce.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vf_float_conversions.cpp