mesa/src/intel/compiler
Jason Ekstrand 69244fc72a intel/fs: Lower large local arrays to scratch
Shader-db results on Kaby Lake:

    total instructions in shared programs: 14929212 -> 14880028 (-0.33%)
    instructions in affected programs: 72428 -> 23244 (-67.91%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 2165 max: 15981 x̄: 8590.00 x̃: 7624
    helped stats (rel) min: 56.06% max: 74.52% x̄: 67.55% x̃: 72.08%
    HURT stats (abs)   min: 1178 max: 1178 x̄: 1178.00 x̃: 1178
    HURT stats (rel)   min: 350.60% max: 361.35% x̄: 355.97% x̃: 355.97%
    95% mean confidence interval for instructions value: -11947.03 -348.97
    95% mean confidence interval for instructions %-change: -125.72% 202.37%
    Inconclusive result (%-change mean confidence interval includes 0).

    total cycles in shared programs: 368585300 -> 342557344 (-7.06%)
    cycles in affected programs: 28144921 -> 2116965 (-92.48%)
    helped: 6
    HURT: 2
    helped stats (abs) min: 1404978 max: 7766106 x̄: 4353922.00 x̃: 3890682
    helped stats (rel) min: 82.01% max: 95.57% x̄: 89.95% x̃: 92.28%
    HURT stats (abs)   min: 47778 max: 47798 x̄: 47788.00 x̃: 47788
    HURT stats (rel)   min: 278.20% max: 282.98% x̄: 280.59% x̃: 280.59%
    95% mean confidence interval for cycles value: -5900438.73 -606550.27
    95% mean confidence interval for cycles %-change: -140.79% 146.16%
    Inconclusive result (%-change mean confidence interval includes 0).

    total spills in shared programs: 9243 -> 8901 (-3.70%)
    spills in affected programs: 2718 -> 2376 (-12.58%)
    helped: 4
    HURT: 4

    total fills in shared programs: 21831 -> 10141 (-53.55%)
    fills in affected programs: 11804 -> 114 (-99.03%)
    helped: 6
    HURT: 2

    total sends in shared programs: 815912 -> 815912 (0.00%)
    sends in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    LOST:   1
    GAINED: 3

The helped shaders are all compute shaders in Aztec Ruins.  There is
also a compute shader in synmark2 OglCSDof that's helped but it doesn't
show up in above shader-db results because it went from SIMD8 to SIMD16.
That shader improves enough to yield an 15-20% performance boost to the
benchmark as a whole on my KBL laptop.  The hurt shaders are a couple
shaders in Kerbal Space Program and a couple in Aztec Ruins.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-11-11 17:17:02 +00:00
..
brw_cfg.cpp intel/ir: Represent physical edge of unconditional CONTINUE instruction. 2019-10-11 12:24:16 -07:00
brw_cfg.h intel/ir: Represent physical and logical subsets of the CFG. 2019-10-11 12:24:16 -07:00
brw_clip.h i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_line.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_point.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_tri.c i965: Don't emit MOVs with undefined registers for Gen4 point clipping. 2018-02-28 15:03:51 -08:00
brw_clip_unfilled.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_util.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compile_clip.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_compile_sf.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_compiler.c intel/fs/gen12: Use TCS 8_PATCH mode. 2019-10-11 12:24:16 -07:00
brw_compiler.h intel/ir/gen12: Update assert in brw_stage_has_packed_dispatch(). 2019-10-11 12:24:16 -07:00
brw_dead_control_flow.cpp
brw_dead_control_flow.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_debug_recompile.c intel/compiler: Add a "base class" for program keys 2019-07-10 19:35:55 +00:00
brw_disasm.c intel/compiler: Refactor disassembly of sources in 3src instruction 2019-10-21 20:32:43 -07:00
brw_disasm_info.c intel/common: move gen_debug to intel/dev 2019-04-10 13:15:33 -07:00
brw_disasm_info.h i965: Stop including brw_cfg.h in brw_disasm_info.h 2017-11-17 21:51:16 -08:00
brw_eu.cpp intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction. 2019-10-11 12:24:16 -07:00
brw_eu.h intel/fs: Add DWord scattered read/write opcodes 2019-11-11 17:17:02 +00:00
brw_eu_compact.c intel/compiler: Add instruction compaction support on Gen12 2019-10-30 11:11:50 -07:00
brw_eu_defines.h intel/fs: Add DWord scattered read/write opcodes 2019-11-11 17:17:02 +00:00
brw_eu_emit.c intel/compiler: Set bits according to source file 2019-10-21 20:32:43 -07:00
brw_eu_util.c intel/compiler: whitespace cleanups 2017-03-13 11:16:35 +00:00
brw_eu_validate.c intel/compiler: Don't left-shift by >= the number of bits of the type 2019-10-24 16:16:49 +02:00
brw_fs.cpp intel/fs: Implement the new load/store_scratch intrinsics 2019-11-11 17:17:02 +00:00
brw_fs.h intel/fs: Implement the new load/store_scratch intrinsics 2019-11-11 17:17:02 +00:00
brw_fs_bank_conflicts.cpp i965/fs: unspills shoudn't use grf127 as dest since Gen8+ 2018-07-12 18:02:26 +02:00
brw_fs_builder.h intel/compiler: remove the operand restriction for src1 on GLK 2019-11-05 00:08:34 +00:00
brw_fs_cmod_propagation.cpp intel/fs: Allow cmod propagation across reads and writes of different flags 2019-06-05 17:03:45 -07:00
brw_fs_combine_constants.cpp intel/compiler: Don't move immediate in register 2019-10-21 20:32:43 -07:00
brw_fs_copy_propagation.cpp intel/fs/copy-prop: Don't walk all the ACPs for each instruction 2019-05-10 09:10:17 -05:00
brw_fs_cse.cpp Revert "intel/compiler: split is_partial_write() into two variants" 2019-04-25 09:19:10 +02:00
brw_fs_dead_code_eliminate.cpp intel/fs: Properly stride NULL replacement regs in DCE 2019-07-17 18:44:35 +00:00
brw_fs_generator.cpp intel/fs: Implement the new load/store_scratch intrinsics 2019-11-11 17:17:02 +00:00
brw_fs_live_variables.cpp intel/compiler: Fix C++ one definition rule violations 2019-10-28 12:02:40 +02:00
brw_fs_live_variables.h intel/compiler: Fix C++ one definition rule violations 2019-10-28 12:02:40 +02:00
brw_fs_lower_pack.cpp
brw_fs_lower_regioning.cpp intel/fs: Add an UNDEF instruction to avoid excess live ranges 2019-06-04 14:27:30 -05:00
brw_fs_nir.cpp intel/fs: Implement the new load/store_scratch intrinsics 2019-11-11 17:17:02 +00:00
brw_fs_reg_allocate.cpp intel/fs: Skip registers faster when setting spill costs 2019-06-04 14:37:56 +00:00
brw_fs_register_coalesce.cpp Revert "intel/compiler: split is_partial_write() into two variants" 2019-04-25 09:19:10 +02:00
brw_fs_saturate_propagation.cpp Revert "intel/compiler: split is_partial_write() into two variants" 2019-04-25 09:19:10 +02:00
brw_fs_scoreboard.cpp intel/fs/gen12: Demodernize software scoreboard lowering pass. 2019-10-11 12:24:16 -07:00
brw_fs_sel_peephole.cpp Revert "intel/compiler: split is_partial_write() into two variants" 2019-04-25 09:19:10 +02:00
brw_fs_validate.cpp intel: disable FS IR validation in release mode. 2018-10-15 18:10:27 -07:00
brw_fs_visitor.cpp intel/fs: Check for NULL key in fs_visitor constructor 2019-10-24 16:20:04 +02:00
brw_inst.h intel/compiler: Add instruction compaction support on Gen12 2019-10-30 11:11:50 -07:00
brw_interpolation_map.c intel/compiler: Silence unused parameter warning in brw_interpolation_map.c 2019-03-06 08:35:36 -08:00
brw_ir_allocator.h intel/ir: Don't allow allocating zero registers 2018-12-11 21:26:23 -06:00
brw_ir_fs.h intel/fs/gen12: Add scheduling information to the IR. 2019-10-11 12:24:16 -07:00
brw_ir_vec4.h intel: Don't propagate conditional modifiers if a UD source is negated 2018-10-10 13:13:12 -05:00
brw_nir.c intel/fs: Lower large local arrays to scratch 2019-11-11 17:17:02 +00:00
brw_nir.h intel/nir: Plumb devinfo through lower_mem_access_bit_sizes 2019-11-11 17:17:02 +00:00
brw_nir_analyze_boolean_resolves.c intel/fs: Mark source 0 of bcsel as needing Boolean resolve 2019-06-11 12:12:07 -07:00
brw_nir_analyze_ubo_ranges.c nir: Add explicit signs to image min/max intrinsics 2019-08-21 17:19:55 +00:00
brw_nir_attribute_workarounds.c nir/builder: Remove the use_fmov parameter from nir_swizzle 2019-05-24 08:38:11 -05:00
brw_nir_lower_alpha_to_coverage.c nir: Add alpha_to_coverage lowering pass 2019-10-21 11:27:29 -07:00
brw_nir_lower_conversions.c intel/compiler: add a NIR pass to lower conversions 2019-04-18 11:05:18 +02:00
brw_nir_lower_cs_intrinsics.c intel/fs: Don't loop when lowering CS intrinsics 2019-04-08 19:29:33 -07:00
brw_nir_lower_image_load_store.c nir: Add explicit signs to image min/max intrinsics 2019-08-21 17:19:55 +00:00
brw_nir_lower_mem_access_bit_sizes.c intel/fs: Implement the new load/store_scratch intrinsics 2019-11-11 17:17:02 +00:00
brw_nir_opt_peephole_ffma.c util: rename list_empty() to list_is_empty() 2019-10-28 11:24:38 +00:00
brw_nir_tcs_workarounds.c util: use C99 declaration in the for-loop set_foreach() macro 2018-10-25 12:43:18 +01:00
brw_nir_trig_workarounds.py intel/nir: do not apply the fsin and fcos trig workarounds for consts 2019-09-17 23:39:18 +03:00
brw_packed_float.c intel/compiler: Cast to target type before shifting left 2019-10-24 16:19:23 +02:00
brw_predicated_break.cpp intel/ir: Represent physical and logical subsets of the CFG. 2019-10-11 12:24:16 -07:00
brw_reg.h intel/compiler: Expand size of the 'nr' field 2019-01-09 16:42:41 -08:00
brw_reg_type.c intel/compiler: Remove unreachable() from brw_reg_type.c 2019-10-30 11:11:50 -07:00
brw_reg_type.h intel/compiler: add a brw_reg_type_is_integer helper 2019-04-18 11:05:18 +02:00
brw_schedule_instructions.cpp intel/fs: Add DWord scattered read/write opcodes 2019-11-11 17:17:02 +00:00
brw_shader.cpp intel/fs: Add DWord scattered read/write opcodes 2019-11-11 17:17:02 +00:00
brw_shader.h intel/nir: Take a nir_tex_instr and src index in brw_texture_offset 2019-04-14 22:25:56 +02:00
brw_vec4.cpp intel/fs: Drop the gl_program from fs_visitor 2019-08-25 01:02:52 -05:00
brw_vec4.h intel/compiler: Fill a compiler statistics struct 2019-08-12 22:56:07 +00:00
brw_vec4_builder.h intel/compiler: Lower flrp32 on Gen11+ 2018-02-28 11:15:47 -08:00
brw_vec4_cmod_propagation.cpp intel/compiler: use correct swizzle for replacement 2019-02-27 20:06:42 +00:00
brw_vec4_copy_propagation.cpp intel/compiler: Re-prefix non-logical surface opcodes with VEC4 2019-02-28 16:58:20 -06:00
brw_vec4_cse.cpp i965/vec4: Allow CSE on subset VF constant loads 2018-03-08 15:26:26 -08:00
brw_vec4_dead_code_eliminate.cpp i965/vec4/dce: Don't narrow the write mask if the flags are used 2018-12-17 13:47:06 -08:00
brw_vec4_generator.cpp intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too 2019-10-30 21:27:03 -07:00
brw_vec4_gs_nir.cpp intel/vec4: Drop all of the 64-bit varying code 2019-07-31 18:14:09 -05:00
brw_vec4_gs_visitor.cpp intel/compiler: Fill a compiler statistics struct 2019-08-12 22:56:07 +00:00
brw_vec4_gs_visitor.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_live_variables.cpp intel/compiler: Fix C++ one definition rule violations 2019-10-28 12:02:40 +02:00
brw_vec4_live_variables.h intel/compiler: Fix C++ one definition rule violations 2019-10-28 12:02:40 +02:00
brw_vec4_nir.cpp intel/vec4: Set brw_stage_prog_data::has_ubo_pull 2019-10-30 16:05:57 +00:00
brw_vec4_reg_allocate.cpp intel/compiler: Prevent warnings in the following patch 2019-01-09 16:42:41 -08:00
brw_vec4_surface_builder.cpp intel/compiler: Re-prefix non-logical surface opcodes with VEC4 2019-02-28 16:58:20 -06:00
brw_vec4_surface_builder.h intel/vec4: Drop dead code for handling typed surface messages 2019-02-28 16:58:20 -06:00
brw_vec4_tcs.cpp intel/fs/gen12: Use TCS 8_PATCH mode. 2019-10-11 12:24:16 -07:00
brw_vec4_tcs.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_tes.cpp intel/vec4: Drop all of the 64-bit varying code 2019-07-31 18:14:09 -05:00
brw_vec4_tes.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_visitor.cpp intel/vec4: Delete vec4_visitor::emit_lrp 2019-07-08 11:30:11 -07:00
brw_vec4_vs.h i965: Use NIR to lower legacy userclipping. 2019-07-24 18:00:13 +00:00
brw_vec4_vs_visitor.cpp i965: Use NIR to lower legacy userclipping. 2019-07-24 18:00:13 +00:00
brw_vue_map.c intel/compiler: silence a warning of using different enum type 2019-06-25 10:09:22 +03:00
brw_wm_iz.cpp intel: Use a system value for gl_FragCoord 2019-07-29 23:30:26 +00:00
gen6_gs_visitor.cpp intel/compiler: Prevent warnings in the following patch 2019-01-09 16:42:41 -08:00
gen6_gs_visitor.h
meson.build nir: Add alpha_to_coverage lowering pass 2019-10-21 11:27:29 -07:00
test_eu_compact.cpp intel/eu: Encode and decode native instruction opcodes from/to IR opcodes. 2019-10-11 12:24:16 -07:00
test_eu_validate.cpp intel/eu/validate/gen12: Add TGL to eu_validate tests. 2019-10-30 14:08:51 -07:00
test_fs_cmod_propagation.cpp intel/fs: Drop the gl_program from fs_visitor 2019-08-25 01:02:52 -05:00
test_fs_copy_propagation.cpp intel/fs: Drop the gl_program from fs_visitor 2019-08-25 01:02:52 -05:00
test_fs_saturate_propagation.cpp intel/fs: Drop the gl_program from fs_visitor 2019-08-25 01:02:52 -05:00
test_fs_scoreboard.cpp intel/fs/gen12: Add tests for scoreboard pass 2019-10-17 10:02:35 -07:00
test_vec4_cmod_propagation.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vec4_copy_propagation.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vec4_dead_code_eliminate.cpp i965/vec4/dce: Don't narrow the write mask if the flags are used 2018-12-17 13:47:06 -08:00
test_vec4_register_coalesce.cpp i965/vec4: Silence unused parameter warnings in vec4 compiler tests 2018-12-17 13:47:06 -08:00
test_vf_float_conversions.cpp