From bac10ef4aaad2d61ea1cc268d2c53d40add25553 Mon Sep 17 00:00:00 2001
From: Ian Romanick
Date: Wed, 4 Oct 2023 17:41:08 -0700
Subject: [PATCH] intel/fs: Add DP4A to get_lowered_simd_width

While working on cooperative matrix support, I noticed some invalid DP4A
instructions being generated.

    dp4a(32) g33<1>UD g21<8,8,1>UD g1.0<0,1,0>UD g9<1,1,1>UD

This violates the constraint that the destination or a source can only
access two consecutive GRFs.

I'm a little surprised that validation didn't catch this. Perhaps
because it's a 3 source instruction? Either way, it seems like a bigger
project to fix that.

Reviewed-by: Caio Oliveira
Fixes: 0f809dbf404 ("intel/compiler: Basic support for DP4A instruction")
Part-of:
---
 src/intel/compiler/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 1fdf6f05b5f..267bb6648a5 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -5154,6 +5154,7 @@ get_lowered_simd_width(const struct brw_compiler *compiler,
    const struct intel_device_info *devinfo = compiler->devinfo;
 
    switch (inst->opcode) {
+   case BRW_OPCODE_DP4A:
    case BRW_OPCODE_MOV:
    case BRW_OPCODE_SEL:
    case BRW_OPCODE_NOT: