Mirror of https://gitlab.freedesktop.org/mesa/mesa.git (synced 2026-05-06 02:58:05 +02:00)

nak,nir: Use a simpler version of phis_to_regs_block in lower_cf

The original lower_phis_to_regs_block() is a little too clever. It
crawls up the predecessor tree until it finds a cross edge and places
the register writes as deep as it can. This breaks nak_nir_lower_cf().

Say you have a shader like...

    con %0 = load_uniform()
    con loop {
        if div {
        } else {
        }
        break;
    }
    con %1 = phi %0

The original lower_phis_to_regs_block() will turn it into

    con %0 = load_uniform()
    con %r = decl_reg();
    con loop {
        if div {
            reg_store(%r, %0)
        } else {
            reg_store(%r, %0)
        }
        break;
    }
    con %1 = reg_load(%r)

We then convert it into unstructured control-flow and run regs_to_ssa()
to get our phis back, which lowers each of the registers we inserted to
a phi tree. When we try to recover divergence information on phis by
looking at their sources, this works fine if each source maps directly
to a reg_store() which maps directly to a phi in the original IR.
However, because the reg_store() instructions are placed deeper, they
may introduce false divergence.

Switch to the simple version of nir_lower_phis_to_regs_block() which
places reg writes directly in phi predecessor blocks. We could probably
be more conservative and just avoid placing writes to uniform regs in
divergent control-flow but it's more robust to make the load/store_reg
intrinsics match the original phis directly.
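
To make the placement difference concrete, here is a small self-contained C sketch of the example above. It is a toy model, not Mesa code: `toy_block`, `toy_divergent_stores()`, and the "crawl up while every predecessor has a single successor" rule are assumptions chosen to reproduce the two placements described in this message. The real pass handles more cases (back edges, for instance).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_BLOCKS 6
#define TOY_MAX_PREDS 2

/* Toy basic block. Hypothetical type for illustration, not a Mesa struct. */
struct toy_block {
   int preds[TOY_MAX_PREDS];
   int num_preds;
   int num_succs;
   bool divergent_cf; /* block only runs under divergent control flow */
   bool has_store;    /* a reg_store() was placed at the end of this block */
};

static struct toy_block cfg[TOY_MAX_BLOCKS];

static void add_edge(int from, int to)
{
   assert(cfg[to].num_preds < TOY_MAX_PREDS);
   cfg[to].preds[cfg[to].num_preds++] = from;
   cfg[from].num_succs++;
}

/* CFG of the example shader:
 *   0: load_uniform   1: loop header, "if div"   2: then   3: else
 *   4: if/else merge + break                     5: block holding the phi
 */
static void build_example(void)
{
   memset(cfg, 0, sizeof(cfg));
   add_edge(0, 1);
   add_edge(1, 2);
   add_edge(1, 3);
   add_edge(2, 4);
   add_edge(3, 4);
   add_edge(4, 5);
   cfg[2].divergent_cf = true;
   cfg[3].divergent_cf = true;
}

/* Assumed "clever" rule: if every predecessor has exactly one successor,
 * push the store up into the predecessors; otherwise store here.
 * The toy CFG is acyclic, so no visited set is needed.
 */
static void place_deep(int block)
{
   bool all_single = cfg[block].num_preds > 0;
   for (int i = 0; i < cfg[block].num_preds; i++) {
      if (cfg[cfg[block].preds[i]].num_succs != 1)
         all_single = false;
   }
   if (!all_single) {
      cfg[block].has_store = true;
      return;
   }
   for (int i = 0; i < cfg[block].num_preds; i++)
      place_deep(cfg[block].preds[i]);
}

/* Returns how many stores for the phi in block 5 land in divergent blocks. */
int toy_divergent_stores(bool place_in_imm_preds)
{
   const int phi_pred = 4; /* the phi's only predecessor */

   build_example();
   if (place_in_imm_preds)
      cfg[phi_pred].has_store = true; /* simple: store in the phi predecessor */
   else
      place_deep(phi_pred);

   int count = 0;
   for (int b = 0; b < TOY_MAX_BLOCKS; b++)
      count += cfg[b].has_store && cfg[b].divergent_cf;
   return count;
}
```

In this model, `toy_divergent_stores(true)` reports 0 and `toy_divergent_stores(false)` reports 2, matching the two reg_store() calls that ended up inside the divergent if/else in the lowered shader shown earlier.
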

This fixes some shaders in Horizon: Zero Dawn Remastered.

Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>

Parent: 26e32417b9
Commit: c6e831ac44
1 changed file with 6 additions and 2 deletions

@@ -445,10 +445,14 @@ lower_cf_func(nir_function *func)
    nir_metadata_require(old_impl, nir_metadata_dominance | nir_metadata_divergence);

    /* First, we temporarily get rid of SSA. This will make all our block
-    * motion way easier.
+    * motion way easier. Ask the pass to place reg writes directly in the
+    * immediate predecessors of the phis instead of trying to be clever.
+    * This will ensure that we never get a write to a uniform register from
+    * non-uniform control flow and makes our divergence reconstruction for
+    * phis more reliable.
     */
    nir_foreach_block(block, old_impl)
-      nir_lower_phis_to_regs_block(block, false);
+      nir_lower_phis_to_regs_block(block, true);

    /* We create a whole new nir_function_impl and copy the contents over */
    func->impl = NULL;