nak/sm70: allow first parameter of hfma2 to be non-reg

Either Rb or Rc can be the non-register source, so copying only when
both are non-register should be sufficient.
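
The reasoning above can be sketched with a toy model. This is only an illustration: `Src`, `Builder`, `legalize`, and the helper names are hypothetical stand-ins, not NAK's real types or functions, and the choice to copy the second source in the "both not reg" case is an assumption. The `copy_src1` flag switches between the old behaviour (always copy src1 into a register) and the new one.

```rust
// Toy model only: hypothetical stand-ins, not NAK's real Src or LegalizeBuilder.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum Src {
    Reg(u32), // value already lives in a GPR
    Imm(u32), // any non-register source (immediate, constant buffer, ...)
}

impl Src {
    fn is_reg(&self) -> bool {
        matches!(self, Src::Reg(_))
    }
}

/// Counts the copies inserted while legalizing one instruction.
struct Builder {
    copies: u32,
    next_reg: u32,
}

impl Builder {
    fn copy_to_reg(&mut self, src: &mut Src) {
        *src = Src::Reg(self.next_reg);
        self.next_reg += 1;
        self.copies += 1;
    }

    fn copy_if_not_reg(&mut self, src: &mut Src) {
        if !src.is_reg() {
            self.copy_to_reg(src);
        }
    }

    // Assumption: when neither source is a register, the second one is copied.
    fn copy_if_both_not_reg(&mut self, a: &Src, b: &mut Src) {
        if !a.is_reg() && !b.is_reg() {
            self.copy_to_reg(b);
        }
    }
}

// The multiply operands commute, so a register can be moved into slot 0.
fn swap_if_not_reg(a: &mut Src, b: &mut Src) {
    if !a.is_reg() && b.is_reg() {
        std::mem::swap(a, b);
    }
}

/// Legalizes the three sources and returns the number of copies inserted.
/// `copy_src1 = true` models the old sequence, `false` the new one.
fn legalize(srcs: &mut [Src; 3], copy_src1: bool) -> u32 {
    let mut b = Builder { copies: 0, next_reg: 100 };
    let [src0, src1, src2] = srcs;
    swap_if_not_reg(src0, src1);
    b.copy_if_not_reg(src0);
    if copy_src1 {
        b.copy_if_not_reg(src1);
    }
    b.copy_if_both_not_reg(src1, src2);
    b.copies
}
```

In this model, sources `[Reg, Imm, Reg]` need one copy with the old sequence and none with the new one, while `[Imm, Imm, Imm]` needs two copies either way.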

Totals:
CodeSize: 14025216 -> 14022144 (-0.02%)
Static cycle count: 5313517 -> 5312651 (-0.02%)

Totals from 4 (0.30% of 1332) affected shaders:
CodeSize: 119168 -> 116096 (-2.58%)
Static cycle count: 33920 -> 33054 (-2.55%)

Only affects:
 q2rtx/q2rtx-rt-pipeline                42        -0.48%        -0.45%

This also helps with the coop matrix shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36104>
Dave Airlie 2025-03-14 16:41:06 +10:00 committed by Marge Bot
parent 5bee7c0b12
commit e6038645fa

@@ -1086,7 +1086,6 @@ impl SM70Op for OpHFma2 {
 let [src0, src1, src2] = &mut self.srcs;
 swap_srcs_if_not_reg(src0, src1, gpr);
 b.copy_alu_src_if_not_reg(src0, gpr, SrcType::F16v2);
-b.copy_alu_src_if_not_reg(src1, gpr, SrcType::F16v2);
 b.copy_alu_src_if_both_not_reg(src1, src2, gpr, SrcType::F16v2);
 }