brw: Don't lower phis involved in DPAS instructions to scalar

On my Arc A380 (DG2), this more than doubles the performance of Jeff
Bolz's cooperative matrix benchmark. With llama.cpp modified to use
cooperative matrix on DG2, performance is improved by 37%.

Closes: #15311
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Matt Corallo <git@bluematt.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41172>
Ian Romanick 2026-04-23 19:47:00 -07:00 committed by Marge Bot
parent 09b43966ba
commit e301817753

@@ -1701,6 +1701,25 @@ brw_nir_tag_speculative_access(nir_shader *nir)
static uint8_t
brw_nir_lower_phis_to_scalar_cb(const nir_instr *instr, const void *_)
{
   nir_phi_instr *phi = nir_instr_as_phi(instr);

   /* If a phi is used by DPAS or if a phi source is the result of a DPAS, do
    * not scalarize.
    */
   nir_foreach_phi_src(src, phi) {
      const nir_intrinsic_instr *intrin = nir_src_as_intrinsic(src->src);

      if (intrin != NULL && intrin->intrinsic == nir_intrinsic_dpas_intel)
         return 0;
   }

   nir_foreach_use(use_src, &phi->def) {
      const nir_intrinsic_instr *intrin = nir_src_as_intrinsic(*use_src);

      if (intrin != NULL && intrin->intrinsic == nir_intrinsic_dpas_intel)
         return 0;
   }

   return 1;
}
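
For readers without a Mesa tree at hand, the decision made by the callback above can be modelled in isolation. The following is only a sketch under stated assumptions: `toy_op`, `toy_phi`, and `toy_phi_scalarize_width` are hypothetical stand-ins invented for illustration and are not NIR types or functions. It assumes, based on the comment in the patch, that returning 0 means "leave the phi as a vector", and it infers that returning 1 requests scalar lowering.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "which kind of instruction is this";
 * not a Mesa definition.
 */
enum toy_op {
   TOY_OP_OTHER,
   TOY_OP_DPAS,
};

/* Hypothetical, simplified phi: it only records which kind of instruction
 * produced each source and which kind of instruction consumes the result.
 */
struct toy_phi {
   const enum toy_op *src_producers;
   size_t num_srcs;
   const enum toy_op *users;
   size_t num_users;
};

/* Mirrors the shape of the callback in the patch: return 0 (assumed to mean
 * "do not scalarize") if any source comes from a DPAS or any user is a DPAS,
 * otherwise return 1 (assumed to mean "lower to scalar").
 */
static uint8_t
toy_phi_scalarize_width(const struct toy_phi *phi)
{
   for (size_t i = 0; i < phi->num_srcs; i++) {
      if (phi->src_producers[i] == TOY_OP_DPAS)
         return 0;
   }

   for (size_t i = 0; i < phi->num_users; i++) {
      if (phi->users[i] == TOY_OP_DPAS)
         return 0;
   }

   return 1;
}
```

In this model, a phi that carries a DPAS accumulator around a loop has a DPAS both as one of its source producers and as one of its users, so either loop alone is enough to keep it vectorized. Checking both directions also covers phis where only one side touches a DPAS.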