From 76b4255cd8075c3e41a28fe0575838fff0baafca Mon Sep 17 00:00:00 2001
From: Francisco Jerez
Date: Fri, 10 Feb 2023 19:35:35 -0800
Subject: [PATCH] intel/fs: Fix register coalesce in presence of
 force_writemask_all copy source writes.

This fixes the behavior of register coalesce in cases where the source
of a copy is written elsewhere in the program by a force_writemask_all
instruction, which could cause the overwrite to be executed for an
inactive channel under non-uniform control flow, causing
can_coalesce_vars() to give incorrect results.  This has been reported
in cases like:

> while (true) {
>    x = imageSize(img);
>    if (non_uniform_condition()) {
>       y = x;
>       break;
>    }
> }
> use(y);

Currently the register coalesce pass would coalesce x and y in the
example above, which is invalid since in the example above imageSize()
is implemented as a force_writemask_all SEND message, whose result is
broadcast to all channels, so when a given channel executes 'y = x'
and breaks out of the loop, another divergent channel can execute a
subsequent iteration of the loop overwriting 'x' with a different
value, hence coalescing y and x into the same register changes the
behavior of the program.

Note that this is a regression introduced by commit a4b36cd3dd30.  In
order to avoid the problem without reverting that patch, we prevent
register coalesce if there is an overwrite of the source with
force_writemask_all behavior inconsistent with the copy and this
occurs anywhere in the intersection of the live ranges of source and
destination, even if it occurs lexically before the copy, since it
might be physically executed after the copy under divergent loop
control flow.

Fixes: a4b36cd3dd30 ("intel/fs: Coalesce when the src live range is contained in the dst")
Reported-by: Lionel Landwerlin
Reviewed-by: Lionel Landwerlin
Acked-by: Matt Turner
Part-of:
---
 src/intel/compiler/brw_fs_register_coalesce.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/intel/compiler/brw_fs_register_coalesce.cpp b/src/intel/compiler/brw_fs_register_coalesce.cpp
index 51e9af4a72f..24aa25c94e6 100644
--- a/src/intel/compiler/brw_fs_register_coalesce.cpp
+++ b/src/intel/compiler/brw_fs_register_coalesce.cpp
@@ -177,7 +177,8 @@ can_coalesce_vars(const fs_live_variables &live, const cfg_t *cfg,
          /* See the big comment above */
          if (regions_overlap(scan_inst->dst, scan_inst->size_written,
                              inst->src[0], inst->size_read(0))) {
-            if (seen_copy || scan_block != block)
+            if (seen_copy || scan_block != block ||
+                (scan_inst->force_writemask_all && !inst->force_writemask_all))
                return false;
             seen_src_write = true;
          }
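
The hazard described in the commit message can be reproduced with a
small stand-alone model.  This is only an illustrative sketch, not code
from the compiler: the four-channel width, the Reg type, run_loop(), the
per-iteration value 100 + iter and the rule "channel c breaks at
iteration c" are all assumptions made up for the example.  It emulates
the loop from the message with 'x' written for every channel (as a
force_writemask_all write would do) and 'y = x' executed only for the
channels that are still active.

```cpp
#include <array>

constexpr int kChannels = 4;               // hypothetical SIMD width
using Reg = std::array<int, kChannels>;    // one value per channel

// Emulates the loop from the commit message.  When coalesce_x_and_y is
// true, 'y' aliases the storage of 'x', which is what register coalesce
// would do if it merged the two variables.
Reg run_loop(bool coalesce_x_and_y)
{
   Reg x{};
   Reg y_storage{};
   Reg &y = coalesce_x_and_y ? x : y_storage;

   std::array<bool, kChannels> active;
   active.fill(true);

   for (int iter = 0; iter < kChannels; iter++) {
      // x = imageSize(img): modelled as a force_writemask_all write, so
      // every channel is overwritten, including channels that already
      // left the loop.  The value changes per iteration (assumption).
      for (int c = 0; c < kChannels; c++)
         x[c] = 100 + iter;

      // if (non_uniform_condition()) { y = x; break; }
      // Assumed condition: channel c leaves the loop at iteration c.
      for (int c = 0; c < kChannels; c++) {
         if (active[c] && iter == c) {
            y[c] = x[c];        // copy under the channel's execution mask
            active[c] = false;  // break
         }
      }
   }

   return y;                    // use(y)
}
```

With separate storage, run_loop(false) keeps the value each channel saw
when it left the loop.  With shared storage, run_loop(true) returns the
value of the last iteration in every channel, because later
force_writemask_all writes of 'x' also clobber the channels that have
already executed 'y = x'.  This is the behavior change that the extra
condition in can_coalesce_vars() is meant to rule out.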