mesa/src/intel/compiler
Ian Romanick fabe3ead57 i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1
Funny story... a single shader was hurt for instructions, spills, and fills.
That same shader was also the most helped for cycles. #GPUsAreWeird

No changes on any other Intel platform.

v2: Refactor selection of atomic opcode to a separate function.
Suggested by Jason.
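
As a rough illustration of the v2 refactor, the opcode selection can be
sketched as a small standalone function. This is not the code from the
patch: AtomicOp and select_atomic_add_op are hypothetical names standing in
for the BRW_AOP_* opcodes and the helper, and the sketch assumes the caller
already knows whether the addend is a compile-time constant.

```cpp
#include <cstdint>

// Hypothetical stand-ins for BRW_AOP_ADD, BRW_AOP_INC and BRW_AOP_DEC.
enum class AtomicOp { Add, Inc, Dec };

// Pick the opcode for an atomicAdd. `is_constant` says whether the addend
// is a compile-time constant; `value` is only meaningful when it is.
constexpr AtomicOp
select_atomic_add_op(bool is_constant, std::int32_t value)
{
   return !is_constant ? AtomicOp::Add   // unknown addend: generic add
        : value == 1   ? AtomicOp::Inc   // atomicAdd(x, +1)
        : value == -1  ? AtomicOp::Dec   // atomicAdd(x, -1)
                       : AtomicOp::Add;  // any other constant
}

// Usage: only constant +1 and -1 change the opcode.
static_assert(select_atomic_add_op(true, 1) == AtomicOp::Inc, "+1 -> INC");
static_assert(select_atomic_add_op(true, -1) == AtomicOp::Dec, "-1 -> DEC");
static_assert(select_atomic_add_op(true, 2) == AtomicOp::Add, "other -> ADD");
static_assert(select_atomic_add_op(false, 1) == AtomicOp::Add, "non-const -> ADD");
```

Keeping the choice in one function means every atomic-add emission path can
share the same rule instead of repeating the +1 / -1 checks.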

Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14304116 -> 14304261 (<.01%)
instructions in affected programs: 12776 -> 12921 (1.13%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1
helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55%
HURT stats (abs)   min: 189 max: 189 x̄: 189.00 x̃: 189
HURT stats (rel)   min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87%
95% mean confidence interval for instructions value: -12.83 27.33
95% mean confidence interval for instructions %-change: -1.57% 0.31%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 527552861 -> 527531226 (<.01%)
cycles in affected programs: 1459195 -> 1437560 (-1.48%)
helped: 16
HURT: 2
helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6
helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03%
HURT stats (abs)   min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel)   min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -3699.81 1295.92
95% mean confidence interval for cycles %-change: -0.94% 0.30%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8025 -> 8033 (0.10%)
spills in affected programs: 208 -> 216 (3.85%)
helped: 1
HURT: 1

total fills in shared programs: 10989 -> 11040 (0.46%)
fills in affected programs: 444 -> 495 (11.49%)
helped: 1
HURT: 1

Ivy Bridge
total instructions in shared programs: 11709181 -> 11709153 (<.01%)
instructions in affected programs: 3505 -> 3477 (-0.80%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4
helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61%

total cycles in shared programs: 254741126 -> 254738801 (<.01%)
cycles in affected programs: 919067 -> 916742 (-0.25%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160
helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03%

total spills in shared programs: 4536 -> 4533 (-0.07%)
spills in affected programs: 40 -> 37 (-7.50%)
helped: 1
HURT: 0

total fills in shared programs: 4819 -> 4813 (-0.12%)
fills in affected programs: 94 -> 88 (-6.38%)
helped: 1
HURT: 0
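
For readers checking the percentages above, the relative change in each
"before -> after" line can be reproduced with a one-line formula. The sketch
below uses a hypothetical helper, percent_change, and assumes the figures are
plain (after - before) / before ratios. It is not part of the tool that
produced the report.

```cpp
// Relative change, in percent, between two report totals.
constexpr double
percent_change(long before, long after)
{
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}

// Ivy Bridge spills in affected programs: 40 -> 37 is exactly -7.5%.
static_assert(percent_change(40, 37) == -7.5, "spills, Ivy Bridge");

// Skylake instructions in affected programs: 12776 -> 12921, about 1.13%.
static_assert(percent_change(12776, 12921) > 1.13 &&
              percent_change(12776, 12921) < 1.14, "instructions, Skylake");
```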

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-28 15:35:38 -07:00
.gitignore
brw_cfg.cpp intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution. 2017-12-07 18:27:05 -08:00
brw_cfg.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_clip.h i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_line.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_point.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_tri.c i965: Don't emit MOVs with undefined registers for Gen4 point clipping. 2018-02-28 15:03:51 -08:00
brw_clip_unfilled.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_clip_util.c i965: Move clip program compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compile_clip.c i965: Rewrite disassembly annotation code 2017-11-17 12:14:38 -08:00
brw_compile_sf.c i965: Move SF compilation to the compiler 2017-05-26 07:58:01 -07:00
brw_compiler.c intel/compiler: Add brw_get_compiler_config_value for disk cache 2018-08-01 23:49:16 -07:00
brw_compiler.h intel/compiler: Add brw_get_compiler_config_value for disk cache 2018-08-01 23:49:16 -07:00
brw_dead_control_flow.cpp
brw_dead_control_flow.h intel/compiler: consistently use ifndef guards over pragma once 2017-03-22 16:55:22 +00:00
brw_disasm.c intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_disasm_info.c intel/compiler: Silence unused parameter warnings 2018-08-22 20:31:32 -07:00
brw_disasm_info.h i965: Stop including brw_cfg.h in brw_disasm_info.h 2017-11-17 21:51:16 -08:00
brw_eu.c intel/eu: print bytes instead of 32 bit hex value 2018-08-27 11:07:39 -07:00
brw_eu.h intel/compiler: Silence unused parameter warnings in brw_eu.h 2018-08-28 15:35:38 -07:00
brw_eu_compact.c intel/compiler: Add instruction compaction support on Gen11 2018-02-28 11:15:47 -08:00
brw_eu_defines.h intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_eu_emit.c intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_eu_util.c intel/compiler: whitespace cleanups 2017-03-13 11:16:35 +00:00
brw_eu_validate.c intel/compiler: relax brw_eu_validate for byte raw movs 2018-07-10 00:14:49 +02:00
brw_fs.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs.h intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_bank_conflicts.cpp i965/fs: unspills shoudn't use grf127 as dest since Gen8+ 2018-07-12 18:02:26 +02:00
brw_fs_builder.h intel/fs: Fix fs_builder::sample_mask_reg() for 32-wide FS dispatch. 2018-06-28 13:19:38 -07:00
brw_fs_cmod_propagation.cpp i965/fs: Propagate conditional modifiers from not instructions 2018-06-15 17:22:27 -07:00
brw_fs_combine_constants.cpp
brw_fs_copy_propagation.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_cse.cpp intel/fs: Replace the CINTERP opcode with a simple MOV 2018-05-29 15:44:50 -07:00
brw_fs_dead_code_eliminate.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_generator.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_live_variables.cpp intel/fs: Restrict live intervals to the subset possibly reachable from any definition. 2017-12-07 18:27:04 -08:00
brw_fs_live_variables.h intel/fs: Restrict live intervals to the subset possibly reachable from any definition. 2017-12-07 18:27:04 -08:00
brw_fs_lower_conversions.cpp intel/compiler: fix lower conversions to account for predication 2018-07-27 14:48:29 +02:00
brw_fs_lower_pack.cpp
brw_fs_nir.cpp i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1 2018-08-28 15:35:38 -07:00
brw_fs_reg_allocate.cpp i965/fs: unspills shoudn't use grf127 as dest since Gen8+ 2018-07-12 18:02:26 +02:00
brw_fs_register_coalesce.cpp
brw_fs_saturate_propagation.cpp i965/fs: Handle negating immediates on MADs when propagating saturates 2017-11-21 10:13:07 -08:00
brw_fs_sel_peephole.cpp i965/fs: Do not move MOVs writing the flag outside of control flow 2017-07-20 16:56:49 -07:00
brw_fs_surface_builder.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_surface_builder.h intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_fs_validate.cpp
brw_fs_visitor.cpp intel/fs: use uint type for per_slot_offset at GS 2018-07-09 15:28:48 +02:00
brw_inst.h intel/compiler: Expand untyped atomic message type field by a bit 2018-08-22 20:31:32 -07:00
brw_interpolation_map.c
brw_ir_allocator.h
brw_ir_fs.h intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot. 2018-05-29 15:44:50 -07:00
brw_ir_vec4.h intel/compiler: Add scheduler deps for instructions that implicitly read g0 2018-04-24 14:31:21 -04:00
brw_nir.c intel/nir: Enable nir_opt_find_array_copies 2018-08-23 21:47:51 -05:00
brw_nir.h intel/nir: Enable nir_opt_find_array_copies 2018-08-23 21:47:51 -05:00
brw_nir_analyze_boolean_resolves.c
brw_nir_analyze_ubo_ranges.c intel/compiler: Account for built-in uniforms in analyze_ubo_ranges 2018-07-23 15:28:17 -07:00
brw_nir_attribute_workarounds.c i965: Drop support for the legacy SNORM -> Float equation. 2018-01-02 16:51:42 -08:00
brw_nir_lower_cs_intrinsics.c i965/fs: Implement basic SPIR-V subgroup intrinsics 2018-03-07 12:13:47 -08:00
brw_nir_opt_peephole_ffma.c
brw_nir_tcs_workarounds.c nir: Get rid of nir_shader::stage 2017-10-20 12:49:17 -07:00
brw_nir_trig_workarounds.py python: Use the print function 2018-07-06 10:04:22 -07:00
brw_packed_float.c
brw_predicated_break.cpp
brw_reg.h intel/compiler: fix brw_imm_w for negative 16-bit integers 2018-05-03 11:40:25 +02:00
brw_reg_type.c intel/compiler: Check for unsupported register sizes. 2018-03-16 09:27:16 -07:00
brw_reg_type.h intel/compiler: Add Gen11+ native float type 2018-02-28 11:15:47 -08:00
brw_schedule_instructions.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_shader.cpp intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages 2018-08-22 20:31:32 -07:00
brw_shader.h i965: Add negative_equals methods 2018-03-26 08:50:43 -07:00
brw_vec4.cpp intel/compiler: silence -Wclass-memaccess warnings 2018-07-18 08:29:51 -07:00
brw_vec4.h i965/vec4: Fix null destination register in 3-source instructions 2018-03-26 08:50:44 -07:00
brw_vec4_builder.h intel/compiler: Lower flrp32 on Gen11+ 2018-02-28 11:15:47 -08:00
brw_vec4_cmod_propagation.cpp i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible 2018-07-02 19:19:16 -07:00
brw_vec4_copy_propagation.cpp i965: Support copy propagating of untyped atomic surface indexes. 2017-09-26 15:35:14 -07:00
brw_vec4_cse.cpp i965/vec4: Allow CSE on subset VF constant loads 2018-03-08 15:26:26 -08:00
brw_vec4_dead_code_eliminate.cpp i965/vec4/dce: improve track of partial flag register writes 2017-04-14 14:56:09 -07:00
brw_vec4_generator.cpp intel/eu: Use descriptor constructors for dataport write messages. 2018-07-09 23:46:57 -07:00
brw_vec4_gs_nir.cpp i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_gs_visitor.cpp intel/compiler: Silence unused parameter warnings brw_nir.c 2018-07-02 16:17:19 -07:00
brw_vec4_gs_visitor.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_live_variables.cpp
brw_vec4_live_variables.h i965/vec4: consider subregister offset in live variables 2017-04-14 14:56:08 -07:00
brw_vec4_nir.cpp i965/vec4: Properly handle sign(-abs(x)) 2018-07-06 16:20:07 -07:00
brw_vec4_reg_allocate.cpp i965/vec4: Return float from spill_cost_for_type() 2017-08-21 14:45:44 -07:00
brw_vec4_surface_builder.cpp i965/vec4: Fix swizzles on atomic sources. 2017-09-26 15:35:11 -07:00
brw_vec4_surface_builder.h
brw_vec4_tcs.cpp intel/fs: Remove program key argument from generator. 2018-06-28 13:19:38 -07:00
brw_vec4_tcs.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_tes.cpp i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_tes.h i965/vec4: Delete the system value infastructure 2017-05-09 15:08:07 -07:00
brw_vec4_visitor.cpp compiler: int8/uint8 support 2018-03-14 10:08:42 -04:00
brw_vec4_vs.h i965: Drop support for the legacy SNORM -> Float equation. 2018-01-02 16:51:42 -08:00
brw_vec4_vs_visitor.cpp i965: Drop support for the legacy SNORM -> Float equation. 2018-01-02 16:51:42 -08:00
brw_vue_map.c
brw_wm_iz.cpp intel/fs: Extend thread payload layout to SIMD32 2018-06-28 13:19:38 -07:00
gen6_gs_visitor.cpp i965/gen6/gs: Handle case where a GS doesn't allocate VUE 2018-06-26 08:18:55 +02:00
gen6_gs_visitor.h
meson.build meson: Build with Python 3 2018-08-10 15:15:09 -07:00
test_eu_compact.cpp intel/ir: Fix invalid type aliasing with undefined behavior in test_eu_compact. 2018-02-27 11:42:39 -08:00
test_eu_validate.cpp intel/compiler: Readd ICL to test_eu_validate.cpp 2018-03-22 09:56:09 -07:00
test_fs_cmod_propagation.cpp i965/fs: Propagate conditional modifiers from compares to adds 2018-03-26 08:50:43 -07:00
test_fs_copy_propagation.cpp
test_fs_saturate_propagation.cpp i965/fs: Check ADD/MAD with immediates in satprop unit test 2017-11-21 10:13:07 -08:00
test_vec4_cmod_propagation.cpp i965/vec4: Don't cmod propagate from CMP to ADD if the writemask isn't compatible 2018-07-02 19:19:16 -07:00
test_vec4_copy_propagation.cpp
test_vec4_register_coalesce.cpp
test_vf_float_conversions.cpp