intel/brw: Use helper to create accumulator register

This ensure the region triple <V,W,H> is set correctly, in this case the
desired region is a sequential like <8,8,1>.  Without the helper the
sequence we get is <0,1,0> -- which the generator currently partially
adjusts when emitting code, but is not sufficient when doing validation
earlier.

The code generated code is slightly modified.  From crucible test
func.shader.subtractSaturate.uint in the fragment shader for SIMD8, the
diff looks like

```
 mov(8)          acc0<1>UD       g21<8,8,1>UD                    { align1 1Q $0.dst };
-add.sat(8)      g22<1>UD        -acc0<0,1,0>UD  g16<8,8,1>UD    { align1 1Q @1 $0.dst };
+add.sat(8)      g22<1>UD        -acc0<8,8,1>UD  g16<8,8,1>UD    { align1 1Q @1 $0.dst };
```

Note that without the patch generator adjusted the hstride for acc0 used
as destination (see brw_set_dest), but kept the src region as is.  For
the source, it is not clear to me why the <0,1,0> would work correctly
here since it is a scalar, but using <8,8,1> it is correct.

Fixes: 58907568ec ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops")
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28059>
This commit is contained in:
Caio Oliveira 2024-03-08 08:36:03 -08:00 committed by Marge Bot
parent 90cf8d14d6
commit db8022dc4d

View file

@ -207,7 +207,8 @@ brw_fs_lower_sub_sat(fs_visitor &s)
*/
if (inst->exec_size == 8 && inst->src[0].type != BRW_REGISTER_TYPE_Q &&
inst->src[0].type != BRW_REGISTER_TYPE_UQ) {
fs_reg acc(ARF, BRW_ARF_ACCUMULATOR, inst->src[1].type);
fs_reg acc = retype(brw_acc_reg(inst->exec_size),
inst->src[1].type);
ibld.MOV(acc, inst->src[1]);
fs_inst *add = ibld.ADD(inst->dst, acc, inst->src[0]);