mesa/src/intel/compiler
Ian Romanick 535caaf3e0 nir: Optimize uniform iadd, fadd, and ixor reduction operations
This adds optimizations for iadd, fadd, and ixor with reduce,
inclusive scan, and exclusive scan.

NOTE: The fadd and ixor optimizations had no shader-db or fossil-db
changes on any Intel platform.

NOTE 2: This change "fixes" arb_compute_variable_group_size-local-size
and base-local-size.shader_test on DG2 and MTL. It only changes which
code path is taken, so the tests no longer use whatever path was not
working properly before.

This is a subset of the things optimized by ACO. See also
https://gitlab.freedesktop.org/mesa/mesa/-/issues/3731#note_682802. The
min, max, iand, and ior exclusive_scan optimizations are not
implemented.

Broadwell on shader-db is not happy. I have not investigated.

v2: Silence some warnings about discarding const.

v3: Rename mbcnt to count_active_invocations. Add a big comment
explaining the differences between the two paths. Suggested by Rhys.

shader-db:

All Gfx9 and newer platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 20300384 -> 20299545 (<.01%)
instructions in affected programs: 19167 -> 18328 (-4.38%)
helped: 35 / HURT: 0

total cycles in shared programs: 842809750 -> 842766381 (<.01%)
cycles in affected programs: 2160249 -> 2116880 (-2.01%)
helped: 33 / HURT: 2

total spills in shared programs: 4632 -> 4626 (-0.13%)
spills in affected programs: 206 -> 200 (-2.91%)
helped: 3 / HURT: 0

total fills in shared programs: 5594 -> 5581 (-0.23%)
fills in affected programs: 664 -> 651 (-1.96%)
helped: 3 / HURT: 1

fossil-db results:

All Intel platforms had similar results. (Ice Lake shown)
Totals:
Instrs: 165551893 -> 165513303 (-0.02%)
Cycles: 15132539132 -> 15125314947 (-0.05%); split: -0.05%, +0.00%
Spill count: 45258 -> 45204 (-0.12%)
Fill count: 74286 -> 74157 (-0.17%)
Scratch Memory Size: 2467840 -> 2451456 (-0.66%)

Totals from 712 (0.11% of 656120) affected shaders:
Instrs: 598931 -> 560341 (-6.44%)
Cycles: 184650167 -> 177425982 (-3.91%); split: -3.95%, +0.04%
Spill count: 983 -> 929 (-5.49%)
Fill count: 2274 -> 2145 (-5.67%)
Scratch Memory Size: 52224 -> 35840 (-31.37%)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27044>
2024-02-27 09:44:11 -08:00
elk intel/compiler: Remove has_render_target_reads from wm_prog_data 2024-02-24 02:34:59 +00:00
tests intel/brw: Remove assembler tests for Gfx8- 2024-02-24 02:10:56 +00:00
brw_asm.h intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} 2024-02-15 09:26:46 +00:00
brw_asm_tool.c intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} 2024-02-15 09:26:46 +00:00
brw_cfg.cpp intel/compiler: Verify that DO is alone in the block 2023-12-08 20:21:28 +00:00
brw_cfg.h intel/compiler: Allow dumping CFG to a specific FILE* 2023-11-30 20:58:05 +00:00
brw_clip.h intel/compiler: Rename brw_vue_map to intel_vue_map 2024-02-14 22:31:23 -08:00
brw_clip_line.c intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_clip_point.c intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_clip_tri.c intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_clip_unfilled.c intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_clip_util.c intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_compile_clip.c intel/compiler: Move disassemble functions to own header file 2024-02-15 09:26:46 +00:00
brw_compile_ff_gs.c intel/compiler: Move disassemble functions to own header file 2024-02-15 09:26:46 +00:00
brw_compile_sf.c intel/compiler: Move disassemble functions to own header file 2024-02-15 09:26:46 +00:00
brw_compiler.c intel/brw: Assert Gfx9+ 2024-02-24 02:10:56 +00:00
brw_compiler.h intel/compiler: Remove has_render_target_reads from wm_prog_data 2024-02-24 02:34:59 +00:00
brw_dead_control_flow.cpp intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
brw_dead_control_flow.h intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
brw_debug_recompile.c
brw_device_sha1_gen_c.py intel/compiler: generate a hash function to use with the shader cache 2024-02-15 16:58:15 -08:00
brw_disasm.c intel/compiler: Add texture gather offset LOD/Bias message support 2024-02-27 00:22:46 +00:00
brw_disasm.h intel/compiler: Merge intel_disasm.[ch] into corresponding brw files 2024-02-15 09:26:46 +00:00
brw_disasm_info.c intel/compiler: Move disassemble functions to own header file 2024-02-15 09:26:46 +00:00
brw_disasm_info.h
brw_disasm_tool.c intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} 2024-02-15 09:26:46 +00:00
brw_eu.c intel/brw: Assert Gfx9+ 2024-02-24 02:10:56 +00:00
brw_eu.h intel/compiler: Include brw_disasm_info.h where its used 2024-02-15 09:26:46 +00:00
brw_eu_compact.c intel/compiler: Move disassemble functions to own header file 2024-02-15 09:26:46 +00:00
brw_eu_defines.h intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_eu_emit.c intel/eu/xe2+: Translate brw_reg fields in REG_SIZE units to physical 512b GRF units during codegen. 2024-01-20 19:55:31 +00:00
brw_eu_util.c
brw_eu_validate.c intel/compiler: Include brw_disasm_info.h where its used 2024-02-15 09:26:46 +00:00
brw_fs.cpp intel/compiler: Add texture gather offset LOD/Bias message support 2024-02-27 00:22:46 +00:00
brw_fs.h intel/brw: Move lower_integer_multiplication to its own file 2024-02-26 20:54:25 +00:00
brw_fs_bank_conflicts.cpp intel/brw: Pull bank_conflicts out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_builder.h intel/compiler: Initial bits for DPAS instruction 2023-12-29 20:24:16 -08:00
brw_fs_cmod_propagation.cpp intel/brw: Pull opt_cmod_propagation out of fs_visitor 2024-02-26 20:54:24 +00:00
brw_fs_combine_constants.cpp intel/brw: Pull opt_combine_constants out of fs_visitor 2024-02-26 20:54:24 +00:00
brw_fs_copy_propagation.cpp intel/compiler: Add texture gather offset LOD/Bias message support 2024-02-27 00:22:46 +00:00
brw_fs_cse.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_fs_dead_code_eliminate.cpp intel/brw: Pull dead_code_eliminate out of fs_visitor 2024-02-26 20:54:24 +00:00
brw_fs_generator.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_fs_live_variables.cpp intel/fs: Use linear allocator in fs_live_variables 2024-01-04 23:06:07 +00:00
brw_fs_live_variables.h
brw_fs_lower.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_fs_lower_dpas.cpp intel/fs: DPAS lowering 2023-12-29 20:27:15 -08:00
brw_fs_lower_integer_multiplication.cpp intel/brw: Move lower_integer_multiplication to its own file 2024-02-26 20:54:25 +00:00
brw_fs_lower_pack.cpp intel/brw: Pull lower_pack out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_lower_regioning.cpp intel/brw: Pull lower_regioning out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_lower_simd_width.cpp intel/compiler: Add texture gather offset LOD/Bias message support 2024-02-27 00:22:46 +00:00
brw_fs_nir.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_fs_opt.cpp intel/brw: Move optimize and small optimizations to brw_fs_opt.cpp 2024-02-26 20:54:25 +00:00
brw_fs_opt_algebraic.cpp intel/brw: Move fs algebraic to its own file 2024-02-26 20:54:25 +00:00
brw_fs_opt_virtual_grfs.cpp intel/brw: Move virtual GRF opts into their own file 2024-02-26 20:54:25 +00:00
brw_fs_reg_allocate.cpp intel/compiler: Make fs_builder include fs_visitor and not the other way 2023-12-12 19:36:14 +00:00
brw_fs_register_coalesce.cpp intel/brw: Pull register_coalesce out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_saturate_propagation.cpp intel/brw: Pull opt_saturate_propagation out of fs_visitor 2024-02-26 20:54:24 +00:00
brw_fs_scoreboard.cpp intel/brw: Pull lower_scoreboard out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_sel_peephole.cpp intel/brw: Pull peephole_sel out of fs_visitor 2024-02-26 20:54:25 +00:00
brw_fs_thread_payload.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
brw_fs_validate.cpp intel/compiler: Add basic CFG validation 2023-11-30 20:58:05 +00:00
brw_fs_visitor.cpp intel/compiler: Rename brw_vue_map to intel_vue_map 2024-02-14 22:31:23 -08:00
brw_fs_workaround.cpp intel/brw: Move workarounds to a separate file 2024-02-26 20:54:25 +00:00
brw_gram.y intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} 2024-02-15 09:26:46 +00:00
brw_inst.h intel/compiler/xe2: Fix for the removal of AccWrCtrl. 2024-01-12 20:18:03 +00:00
brw_interpolation_map.c intel/compiler: Rename brw_vue_map to intel_vue_map 2024-02-14 22:31:23 -08:00
brw_ir.h intel/compiler/xe2: Add extra flag registers. 2024-01-12 20:18:03 +00:00
brw_ir_allocator.h intel/compiler,intel/blorp,intel/vulkan: decouple vulkan driver and compiler from gallium 2023-08-03 22:00:15 +00:00
brw_ir_analysis.h
brw_ir_fs.h intel/compiler: Adjust sample_b parameter according to new layout 2024-02-27 00:22:46 +00:00
brw_ir_performance.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_ir_performance.h
brw_ir_vec4.h
brw_isa_info.h
brw_kernel.c intel-clc: Use correct set of nir_options when building for Gfx8 2024-02-24 00:24:32 +00:00
brw_kernel.h intel-clc: Use correct set of nir_options when building for Gfx8 2024-02-24 00:24:32 +00:00
brw_lex.l intel: Rename i965_{asm,disasm} tools to brw_{asm,disasm} 2024-02-15 09:26:46 +00:00
brw_lower_logical_sends.cpp intel/compiler: Add texture gather offset LOD/Bias message support 2024-02-27 00:22:46 +00:00
brw_mesh.cpp intel/compiler: Use glsl_type C helpers 2023-12-22 06:44:23 -08:00
brw_nir.c nir: Optimize uniform iadd, fadd, and ixor reduction operations 2024-02-27 09:44:11 -08:00
brw_nir.h anv: fixup push descriptor shader analysis 2024-02-19 11:10:29 +00:00
brw_nir_analyze_boolean_resolves.c intel: Collapse is_ssa checks 2023-08-03 22:40:29 +00:00
brw_nir_analyze_ubo_ranges.c intel/compiler: Remove unused parameter from brw_nir_analyze_ubo_ranges() 2023-11-08 18:10:31 +00:00
brw_nir_attribute_workarounds.c nir: Drop nir_dest 2023-08-14 21:22:53 +00:00
brw_nir_lower_alpha_to_coverage.c intel/compiler: Rename BRW_WM_MSAA_* enums to INTEL_MSAA_* 2024-02-14 22:31:23 -08:00
brw_nir_lower_cooperative_matrix.c intel/cmat: Generate better code for nir_intrinsic_cmat_insert 2023-12-29 20:28:54 -08:00
brw_nir_lower_cs_intrinsics.c intel/compiler: Use "intel" prefix for walk_order enum 2024-02-21 00:38:35 +00:00
brw_nir_lower_intersection_shader.c intel/nir/rt: fix reportIntersection() hitT handling 2023-11-17 07:06:30 +00:00
brw_nir_lower_ray_queries.c intel/nir: only consider ray query variables in lowering 2024-02-24 12:56:30 +00:00
brw_nir_lower_rt_intrinsics.c treewide: Use nir_before/after_impl for more elaborate cases 2023-08-30 19:30:58 +00:00
brw_nir_lower_shader_calls.c intel/rt: Don't directly generate umul_32x16 2024-02-02 00:02:05 +00:00
brw_nir_lower_storage_image.c intel/compiler: Rename brw_image_param to isl_image_param 2024-02-14 22:31:23 -08:00
brw_nir_rt.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
brw_nir_rt.h
brw_nir_rt_builder.h nir: Drop "SSA" from NIR language 2023-08-12 16:44:41 -04:00
brw_nir_trig_workarounds.py
brw_packed_float.c
brw_predicated_break.cpp intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
brw_prim.h
brw_private.h intel/compiler: Don't allocate memory for SIMD select error handling 2023-09-22 16:23:02 +00:00
brw_reg.h intel/eu/xe2+: Translate brw_reg fields in REG_SIZE units to physical 512b GRF units during codegen. 2024-01-20 19:55:31 +00:00
brw_reg_type.c
brw_reg_type.h
brw_rt.h intel: Use ALIGN_POT instead of ALIGN inside macro define 2024-01-03 12:46:10 +00:00
brw_schedule_instructions.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_shader.cpp intel/fs: Add fast path for ballot(true) 2024-02-27 08:37:46 -08:00
brw_shader.h intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
brw_simd_selection.cpp intel/fs/xe2+: Stop building SIMD8 compute-like shaders (CS/BS/TS/MS). 2023-12-22 10:37:00 -08:00
brw_vec4.cpp intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
brw_vec4.h intel/vec4: Stop passing around nir_dest 2023-08-14 21:22:53 +00:00
brw_vec4_builder.h
brw_vec4_cmod_propagation.cpp intel/compiler: Use C helpers to access builtin types 2023-12-15 03:09:19 +00:00
brw_vec4_copy_propagation.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
brw_vec4_cse.cpp
brw_vec4_dead_code_eliminate.cpp
brw_vec4_generator.cpp intel/compiler: Include brw_disasm_info.h where its used 2024-02-15 09:26:46 +00:00
brw_vec4_gs_nir.cpp intel/compiler: Use glsl_type C helpers 2023-12-22 06:44:23 -08:00
brw_vec4_gs_visitor.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
brw_vec4_gs_visitor.h intel/compiler: rework input parameters 2023-07-20 09:08:08 +00:00
brw_vec4_live_variables.cpp
brw_vec4_live_variables.h
brw_vec4_nir.cpp intel/compiler: Use glsl_type C helpers 2023-12-22 06:44:23 -08:00
brw_vec4_reg_allocate.cpp intel/compiler: Make MAX_VGRF_SIZE macro depend on devinfo and update it for Xe2. 2023-09-20 17:19:36 -07:00
brw_vec4_surface_builder.cpp
brw_vec4_surface_builder.h
brw_vec4_tcs.cpp intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
brw_vec4_tcs.h intel/compiler/xe2+: Represent dispatch_grf_start_reg in native GRF units. 2023-09-20 17:19:36 -07:00
brw_vec4_tes.cpp intel/compiler: Rename BRW_TESS_* enums to INTEL_TESS_* 2024-02-14 22:31:23 -08:00
brw_vec4_tes.h intel/compiler: rework input parameters 2023-07-20 09:08:08 +00:00
brw_vec4_visitor.cpp intel/compiler: Rename brw_image_param to isl_image_param 2024-02-14 22:31:23 -08:00
brw_vec4_vs.h intel/compiler: rework input parameters 2023-07-20 09:08:08 +00:00
brw_vec4_vs_visitor.cpp intel/compiler: rework input parameters 2023-07-20 09:08:08 +00:00
brw_vue_map.c intel/compiler: Rename brw_vue_map to intel_vue_map 2024-02-14 22:31:23 -08:00
gfx6_gs_visitor.cpp intel/compiler: Use C helpers to access builtin types 2023-12-15 03:09:19 +00:00
gfx6_gs_visitor.h intel/compiler: rework input parameters 2023-07-20 09:08:08 +00:00
intel_clc.c intel-clc: Use correct set of nir_options when building for Gfx8 2024-02-24 00:24:32 +00:00
intel_gfx_ver_enum.h intel/compiler: Rename brw_gfx_ver_enum.h to intel_gfx_ver_enum.h 2024-02-16 22:35:05 +00:00
intel_nir.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir.h intel/compiler: Pack texture LOD and offset to a single 32-bit value 2024-02-27 00:22:46 +00:00
intel_nir_blockify_uniform_loads.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_clamp_image_1d_2d_array_sizes.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_clamp_per_vertex_loads.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_conversions.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_non_uniform_barycentric_at_sample.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_non_uniform_resource_intel.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_shading_rate_output.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_sparse.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_lower_texture.c intel/compiler: Pack texture LOD and offset to a single 32-bit value 2024-02-27 00:22:46 +00:00
intel_nir_opt_peephole_ffma.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_opt_peephole_imul32x16.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_nir_tcs_workarounds.c intel/compiler: Rename the passes and files related to intel_nir.h 2024-02-16 22:35:05 +00:00
intel_shader_enums.h intel/compiler: Use "intel" prefix for walk_order enum 2024-02-21 00:38:35 +00:00
meson.build intel/brw: Move workarounds to a separate file 2024-02-26 20:54:25 +00:00
test_eu_compact.cpp intel/brw: Remove EU compaction tests for Gfx8- 2024-02-24 02:10:56 +00:00
test_eu_validate.cpp intel/brw: Remove EU validation tests for Gfx8- 2024-02-24 02:10:56 +00:00
test_fs_cmod_propagation.cpp intel/brw: Pull opt_cmod_propagation out of fs_visitor 2024-02-26 20:54:24 +00:00
test_fs_combine_constants.cpp intel/brw: Pull opt_combine_constants out of fs_visitor 2024-02-26 20:54:24 +00:00
test_fs_copy_propagation.cpp intel/brw: Pull opt_copy_propagation out of fs_visitor 2024-02-26 20:54:24 +00:00
test_fs_saturate_propagation.cpp intel/brw: Pull opt_saturate_propagation out of fs_visitor 2024-02-26 20:54:24 +00:00
test_fs_scoreboard.cpp intel/brw: Pull lower_scoreboard out of fs_visitor 2024-02-26 20:54:25 +00:00
test_predicated_break.cpp intel/brw: Use references for a couple of backend_shader passes 2024-02-26 20:54:25 +00:00
test_simd_selection.cpp intel: Remove brw_ prefix from process debug function 2024-02-16 22:35:05 +00:00
test_vec4_cmod_propagation.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
test_vec4_copy_propagation.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
test_vec4_dead_code_eliminate.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
test_vec4_register_coalesce.cpp intel/compiler: Rename DISPATCH_MODE_* enums to INTEL_DISPATCH_MODE_* 2024-02-14 22:31:23 -08:00
test_vf_float_conversions.cpp