nak,nir: Use a simpler version of phis_to_regs_block in lower_cf
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run

The original lower_phis_to_regs_block() is a little too clever.  It
crawls up the predecessor tree until it finds a cross edge and places
the register writes as deep as it can.  This breaks nak_nir_lower_cf().
Say you have a shader like...

    con %0 = load_uniform()
    con loop {
        if div {
        } else {
        }
        break;
    }
    con %1 = phi %0

The original lower_phis_to_regs_block() will turn it into

    con %0 = load_uniform()
    con %r = decl_reg();
    con loop {
        if div {
           reg_store(%r, %0)
        } else {
           reg_store(%r, %0)
        }
        break;
    }
    con %1 = reg_load(%r)

We then convert it into unstructured control-flow and run regs_to_ssa()
to get our phis back, which lowers each of the registers we inserted to
a phi tree.  When we try to recover divergence information on phis by
looking at their sources, this works fine if each source maps directly
to a reg_store() whic maps directly to a phi in the original IR.
However, because the reg_store() instructions are placed deeper, it may
introduce false divergence.

Switch to the simple version of nir_lower_phis_to_regs_block() which
places reg writes directly in phi predecessor blocks.  We could probably
be more conservative and just avoid placing writes to uniform regs in
divergent control-flow but it's more robust to make the load/store_reg
intrinsics match the original phis directly.

This fixes some shaders in Horizon: Zero Dawn Remastered

Fixes: b013d54e4f ("nak/lower_cf: Flag phis as convergent when possible")
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36914>
This commit is contained in:
Faith Ekstrand 2025-08-21 13:38:27 -04:00 committed by Marge Bot
parent 26e32417b9
commit c6e831ac44

View file

@ -445,10 +445,14 @@ lower_cf_func(nir_function *func)
nir_metadata_require(old_impl, nir_metadata_dominance | nir_metadata_divergence);
/* First, we temporarily get rid of SSA. This will make all our block
* motion way easier.
* motion way easier. Ask the pass to place reg writes directly in the
* immediate predecessors of the phis instead of trying to be clever.
* This will ensure that we never get a write to a uniform register from
* non-uniform control flow and makes our divergence reconstruction for
* phis more reliable.
*/
nir_foreach_block(block, old_impl)
nir_lower_phis_to_regs_block(block, false);
nir_lower_phis_to_regs_block(block, true);
/* We create a whole new nir_function_impl and copy the contents over */
func->impl = NULL;