mesa/src/intel/compiler
Caio Oliveira f8db53ccae brw: Fix comparison with unordered_mode when making baked dependency
The unordered mode stored in a dependency may be a bitmask rather than
a single mode.  In practice, only the "stronger" mode will stick.
Make sure that the code testing for the mode uses "&" instead of "==",
so that it does not prevent some valid combinations, e.g.

```
   // ...
   add(16)         g104<1>F        g94<1,1,0>F     g34<1,1,0>F     { align1 1H @7 $7.dst compacted };
```

which, without the fix, ends up as

```
   // ...
   sync nop(1)                     null<0,1,0>UB                   { align1 WE_all 1N F@7 };
   add(16)         g104<1>F        g94<1,1,0>F     g34<1,1,0>F     { align1 1H $7.dst compacted };
```

Enables two tests for the scoreboard pass that illustrate this case.
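
The difference between the two comparisons can be sketched in isolation.
This is a minimal illustration, not the code from brw_lower_scoreboard.cpp:
the enum, its values, and both helper names are hypothetical stand-ins for
the real mode type and the real check.

```cpp
// Hypothetical stand-in for the unordered (SBID) dependency modes.
// Each mode is a distinct bit, so several can be OR-ed into one field.
enum sbid_mode : unsigned {
   SBID_NULL = 0,
   SBID_SRC  = 1u << 0,
   SBID_DST  = 1u << 1,
   SBID_SET  = 1u << 2,
};

// Old-style test: true only when the field holds exactly one mode.
constexpr bool
mode_matches_eq(unsigned unordered, sbid_mode mode)
{
   return unordered == mode;
}

// Fixed test: true whenever the mode's bit is present in the mask.
constexpr bool
mode_matches_mask(unsigned unordered, sbid_mode mode)
{
   return (unordered & mode) != 0;
}

// A dependency that accumulated both SRC and DST is missed by "==" ...
static_assert(!mode_matches_eq(SBID_SRC | SBID_DST, SBID_DST),
              "equality misses a combined mask");
// ... but is recognized by "&".
static_assert(mode_matches_mask(SBID_SRC | SBID_DST, SBID_DST),
              "bit test accepts a combined mask");
// For a single mode both tests agree.
static_assert(mode_matches_eq(SBID_DST, SBID_DST) &&
              mode_matches_mask(SBID_DST, SBID_DST),
              "single mode behaves the same");
```

With a single mode the two tests are interchangeable, which is presumably
why the "==" form only misbehaves once a dependency carries more than one
mode bit.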

To measure the effect, re-enabled the accounting of sync.nop in the total
instruction count and got the following results.

```
   Totals:
   Instrs: 322041261 -> 321748285 (-0.09%)
   Cycle count: 22864587567 -> 22863073741 (-0.01%)
   Max dispatch width: 7989040 -> 7989024 (-0.00%); split: +0.00%, -0.00%

   Totals from 88212 (9.78% of 902056) affected shaders:
   Instrs: 102282050 -> 101989074 (-0.29%)
   Cycle count: 12787629859 -> 12786116033 (-0.01%)
   Max dispatch width: 525336 -> 525320 (-0.00%); split: +0.01%, -0.01%
```

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36096>
2025-07-14 20:28:54 +00:00
elk intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
tests intel/compiler tests: fix path-to-string conversion 2025-06-23 08:26:29 +00:00
brw_analysis.cpp brw: Make brw_range use half-open ranges 2025-04-09 19:06:49 +00:00
brw_analysis.h brw: Track the largest VGRF size in liveness analysis 2025-04-11 20:34:51 +00:00
brw_analysis_def.cpp brw: Add basic infrastructure for load_reg pseudo op 2025-04-04 06:45:02 +00:00
brw_analysis_liveness.cpp brw: Track the largest VGRF size in liveness analysis 2025-04-11 20:34:51 +00:00
brw_analysis_performance.cpp intel/compiler: Add support for MSAA typed load/store messages 2025-03-07 23:06:14 +00:00
brw_asm.c brw: Rework label tracking in assembler 2025-03-06 17:06:20 -08:00
brw_asm.h brw: Fix size in assembler when compacting 2025-03-03 20:43:56 +00:00
brw_asm_internal.h brw: Rework label tracking in assembler 2025-03-06 17:06:20 -08:00
brw_asm_tool.c intel/compiler tests: fix variable type for getopt_long() return value 2025-06-23 08:26:29 +00:00
brw_builder.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_builder.h brw: Handle bfloat16 dest and src0 operands for DPAS 2025-07-02 20:06:59 +00:00
brw_cfg.cpp brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_cfg.h brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_compile_bs.cpp brw: Don't use simd_select for BS shaders 2025-07-02 19:48:04 +00:00
brw_compile_cs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_fs.cpp brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_compile_gs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_mesh.cpp intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_compile_tcs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_tes.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compile_vs.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_compiler.c intel: Add INTEL_DEBUG=no-vrt 2025-07-13 21:11:02 +00:00
brw_compiler.h intel: Add INTEL_DEBUG=no-vrt 2025-07-13 21:11:02 +00:00
brw_debug_recompile.c intel/brw: Simplify @file annotations 2024-07-22 22:48:03 +00:00
brw_device_sha1_gen_c.py intel/compiler: drop unused ray-tracing fields from cache hash 2024-03-22 00:01:28 +00:00
brw_disasm.c brw/disasm: Fix Gfx11 3src-instructions dst register disassembly 2025-07-08 19:49:09 +00:00
brw_disasm.h intel/brw: support for dumping shader line numbers 2025-04-08 19:39:53 +00:00
brw_disasm_info.cpp intel/brw: support for dumping shader line numbers 2025-04-08 19:39:53 +00:00
brw_disasm_info.h intel/brw: Rename fs_inst to brw_inst 2025-01-31 00:57:21 +00:00
brw_disasm_tool.c intel/brw: Remove Gfx8- code from disassembler 2024-02-28 05:45:38 +00:00
brw_eu.c intel/compiler: fix SHA generation for shader replace 2025-05-27 22:57:19 +00:00
brw_eu.h brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_eu_compact.c brw: Add support for GOTO/JOIN in the assembler 2025-03-06 17:06:20 -08:00
brw_eu_defines.h brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets 2025-06-22 10:55:24 +00:00
brw_eu_emit.c brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_eu_inst.h brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_eu_validate.c brw: Update EU validation to allow packed BF mixed with packed F 2025-04-14 18:23:43 +00:00
brw_from_nir.cpp intel/compiler: Drop unused param from set_memory_address 2025-07-14 03:46:21 +00:00
brw_generator.cpp brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_generator.h brw: factor out base prog_data setting 2025-02-22 08:30:22 +00:00
brw_gram.y brw: Add EU assembler support for bfloat16 2025-03-25 05:23:37 +00:00
brw_inst.cpp brw: Handle bfloat16 dest and src0 operands for DPAS 2025-07-02 20:06:59 +00:00
brw_inst.h brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_isa_info.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_kernel.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_kernel.h intel: rework CL pre-compile 2025-01-25 03:28:07 +00:00
brw_lex.l brw: Add EU assembler support for bfloat16 2025-03-25 05:23:37 +00:00
brw_load_reg.cpp brw: Add passes to generate and lower load_reg 2025-04-04 06:45:02 +00:00
brw_lower.cpp intel/compiler: Centralize type stomping logic for Gen12.5 restrictions 2025-05-22 06:46:18 +00:00
brw_lower_dpas.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_lower_integer_multiplication.cpp brw: Remove bblock_t parameters from various passes 2025-03-06 23:33:38 +00:00
brw_lower_logical_sends.cpp brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_lower_pack.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_lower_regioning.cpp brw: Consider bfloat16 in lower regioning pass 2025-04-29 16:29:37 +00:00
brw_lower_scoreboard.cpp brw: Fix comparison with unordered_mode when making baked dependency 2025-07-14 20:28:54 +00:00
brw_lower_simd_width.cpp brw/nir: add intrinsics to read attribute payload register indirectly 2025-05-08 06:48:35 +00:00
brw_lower_subgroup_ops.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_nir.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_nir_analyze_ubo_ranges.c intel/compiler: take reg_unit size into account with ubo ranges 2025-01-07 21:38:06 +00:00
brw_nir_lower_alpha_to_coverage.c nir: rename nir_lower_io_to_temporaries -> nir_lower_io_vars_to_temporaries 2025-06-26 18:20:54 +00:00
brw_nir_lower_cooperative_matrix.c brw: Use convert_cmat_intel intrinsic 2025-06-27 01:26:22 +00:00
brw_nir_lower_cs_intrinsics.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
brw_nir_lower_fs_barycentrics.c brw: add lowering passes for FS barycentric inputs 2025-05-20 20:57:59 +00:00
brw_nir_lower_fs_msaa.c brw/anv: add provoking vertex to fs_msaa_flags 2025-05-20 20:57:58 +00:00
brw_nir_lower_fsign.py intel/brw: Use range analysis to optimize fsign 2024-05-14 01:28:21 +00:00
brw_nir_lower_immediate_offsets.c brw: fix non constant BTI accesses with offsets 2025-07-02 01:04:06 +03:00
brw_nir_lower_intersection_shader.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_ray_queries.c intel/rt: Update BVH instance leaf load for Xe3+ 2025-04-21 20:10:45 +00:00
brw_nir_lower_rt_intrinsics.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_rt_intrinsics_pre_trace.c brw: add pre ray trace intrinsic moves 2025-05-06 13:34:53 +00:00
brw_nir_lower_sample_index_in_coord.c intel/compiler: Lower sample index into coord for MSRT messages 2025-03-07 23:06:14 +00:00
brw_nir_lower_shader_calls.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_lower_storage_image.c brw: implement read without format lowering 2025-06-06 12:28:42 +00:00
brw_nir_lower_texel_address.c intel/compiler: Use correct enum type 2025-03-13 20:11:10 +00:00
brw_nir_lower_texture.c brw: move texture offset packing to NIR 2025-03-29 02:15:18 +00:00
brw_nir_opt_fsat.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
brw_nir_rt.c intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_rt.h intel: Update all NIR_PASS_V to NIR_PASS 2025-07-14 19:25:52 +00:00
brw_nir_rt_builder.h intel/rt: Update BVH instance leaf load for Xe3+ 2025-04-21 20:10:45 +00:00
brw_nir_trig_workarounds.py
brw_nir_wa_18019110168.c treewide: use VARYING_BIT_* 2025-07-04 19:01:04 +00:00
brw_opt.cpp intel/compiler: Centralize type stomping logic for Gen12.5 restrictions 2025-05-22 06:46:18 +00:00
brw_opt_address_reg_load.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
brw_opt_algebraic.cpp brw/algebraic: Convert some NOT to MOV 2025-04-28 19:44:23 +00:00
brw_opt_bank_conflicts.cpp intel/brw: Rename fs_visitor to brw_shader 2025-02-11 09:13:28 +00:00
brw_opt_cmod_propagation.cpp brw/cmod: Allow integer CMP to ADD propagation only for Z and NZ 2025-04-28 19:44:23 +00:00
brw_opt_combine_constants.cpp brw: Use brw_inst::block in Combine Constants 2025-03-06 23:33:38 +00:00
brw_opt_copy_propagation.cpp brw: Consider bfloat16 in copy propagation 2025-04-29 16:29:37 +00:00
brw_opt_cse.cpp brw: make HALT instruction act as barrier in new CSE pass 2025-04-29 20:28:24 +00:00
brw_opt_dead_code_eliminate.cpp brw: Remove adjust_block_ips and brw_inst::remove() with defer 2025-03-29 00:25:51 +00:00
brw_opt_register_coalesce.cpp brw: don't generate invalid instructions 2025-06-04 06:08:26 +00:00
brw_opt_saturate_propagation.cpp brw: Clean up saturate propagation after non-defs version removal 2025-04-09 19:06:48 +00:00
brw_opt_txf_combiner.cpp brw: Simplify brw_builder "insert before inst" constructor 2025-03-06 23:33:38 +00:00
brw_opt_virtual_grfs.cpp brw: Don't assert about MAX_VGRF_SIZE in brw_opt_split_virtual_grfs() 2025-04-11 20:34:51 +00:00
brw_packed_float.c
brw_prim.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_print.cpp brw: encode the offset into the message descriptor for Xe2 2025-06-22 10:55:24 +00:00
brw_private.h intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_reg.cpp intel/compiler: Use unreachable instead of assert(!"...") 2025-03-13 20:11:10 +00:00
brw_reg.h brw: add new helper for immediate integer register with type 2025-06-22 10:55:24 +00:00
brw_reg_allocate.cpp intel: Add INTEL_DEBUG=no-vrt 2025-07-13 21:11:02 +00:00
brw_reg_type.c brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_reg_type.h brw: Add BRW_TYPE_BF for bfloat16 2025-03-25 05:23:37 +00:00
brw_rt.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
brw_schedule_instructions.cpp brw: Use live->max_vgrf_size in pre-RA scheduling 2025-04-11 20:34:51 +00:00
brw_shader.cpp intel/debug: shader dump filter 2025-05-23 19:57:02 +00:00
brw_shader.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
brw_simd_selection.cpp intel/brw/xe3+: Optimize CS/TASK/MESH compile time optimistically assuming SIMD32. 2025-01-29 23:39:32 +00:00
brw_spirv.c nir: add nir_vectorize_cb callback parameter to nir_lower_phis_to_scalar() 2025-07-08 15:33:59 +00:00
brw_thread_payload.cpp brw: Embed at_end() inside brw_builder(brw_shader *) constructor 2025-03-06 23:33:38 +00:00
brw_thread_payload.h intel/brw: Rename fs_visitor to brw_shader 2025-02-11 09:13:28 +00:00
brw_validate.cpp brw: introduce MEMORY_LOGICAL_ADDRESS_OFFSET to encode address offsets 2025-06-22 10:55:24 +00:00
brw_vue_map.c intel/compiler: use ffsll instead of ffsl in brw_vue_map.c 2025-05-11 00:50:21 +02:00
brw_workaround.cpp brw: fix Wa_22013689345 emission 2025-04-10 16:44:28 +00:00
intel_gfx_ver_enum.h intel/compiler: Use #pragma once instead of header guards 2024-12-11 19:47:44 +00:00
intel_nir.c intel/compiler: Use nir_split_conversions() 2025-04-07 17:45:21 -05:00
intel_nir.h anv: lower input vertices for TCS unconditionally 2025-05-08 06:48:34 +00:00
intel_nir_blockify_uniform_loads.c intel: replace RANGE_BASE by BASE for uniform block loads 2025-06-22 10:55:23 +00:00
intel_nir_clamp_image_1d_2d_array_sizes.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_clamp_per_vertex_loads.c anv: lower input vertices for TCS unconditionally 2025-05-08 06:48:34 +00:00
intel_nir_lower_non_uniform_barycentric_at_sample.c intel: switch to nir_metadata_divergence 2025-02-13 10:08:43 +00:00
intel_nir_lower_non_uniform_resource_intel.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_printf.c nir: drop printf_base_identifier 2025-02-05 20:33:15 +00:00
intel_nir_lower_shading_rate_output.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_lower_sparse.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_opt_peephole_ffma.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_opt_peephole_imul32x16.c treewide: use nir_metadata_control_flow 2024-06-17 16:28:14 -04:00
intel_nir_tcs_workarounds.c treewide: Switch to nir_progress 2025-02-26 15:19:53 +00:00
intel_shader_enums.h brw: handle wa_18019110168 with independent shader compilation 2025-06-28 05:55:35 +00:00
meson.build anv/brw: move Wa_18019110168 handling to backend 2025-06-28 05:55:32 +00:00
test_eu_compact.cpp intel/brw: Rename brw_inst_* helpers to brw_eu_inst_* 2024-12-30 17:16:15 +00:00
test_eu_validate.cpp brw: Allow DPAS with BF on Gfx125 2025-04-14 18:23:43 +00:00
test_helpers.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_helpers.h brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_insert_load_reg.cpp brw: Add passes to generate and lower load_reg 2025-04-04 06:45:02 +00:00
test_lower_scoreboard.cpp brw: Fix comparison with unordered_mode when making baked dependency 2025-07-14 20:28:54 +00:00
test_opt_algebraic.cpp brw/algebraic: Don't optimize float SEL.CMOD to MOV 2025-04-15 23:59:31 +00:00
test_opt_cmod_propagation.cpp brw/cmod: Don't propagate from CMP to possible Inf + (-Inf) 2025-04-28 19:44:23 +00:00
test_opt_combine_constants.cpp brw: Add brw_builder::uniform() 2025-04-04 23:07:21 +00:00
test_opt_copy_propagation.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_opt_cse.cpp brw: Simplify the test code for brw passes 2025-03-13 17:43:17 +00:00
test_opt_register_coalesce.cpp brw: don't generate invalid instructions 2025-06-04 06:08:26 +00:00
test_opt_saturate_propagation.cpp brw/sat: Eliminate non-defs saturate propagation 2025-04-04 06:45:02 +00:00
test_simd_selection.cpp intel: Switch uint64_t intel_debug to a bitset 2025-04-22 23:09:26 +00:00
test_vf_float_conversions.cpp