From f2800deacbf1d8d55b18f8fe1ec01930328c220c Mon Sep 17 00:00:00 2001
From: Jordan Justen
Date: Fri, 22 Mar 2024 00:02:48 -0700
Subject: [PATCH] intel/brw/validate: Simplify grf span validation check by not using a mask

Previously this check would create a mask of the bytes used in the grf,
and then shift the mask. This worked well when there were 32 bytes in
the register because a 64-bit uint64_t could easily detect that bytes
were used in the next register. (The next register was the high 32 bits
of the `access_mask` variable.)

With Xe2, the register size becomes 64 bytes, meaning this strategy
doesn't work.

Instead of a mask, we can just check to see if more than one grf is
used during each loop iteration. (Suggested by Ken.)

This will make it easier to extend for Xe2 in a follow-on commit.

Verified this with
dEQP-VK.subgroups.arithmetic.compute.subgroupexclusivemul_u64vec4_requiredsubgroupsize
on Xe2, which otherwise would cause the program to fail to validate
because the check assumed a grf was 32 bytes.

Backport-to: 24.2
Signed-off-by: Jordan Justen
Reviewed-by: Kenneth Graunke
Part-of:
---
 src/intel/compiler/brw_eu_validate.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/intel/compiler/brw_eu_validate.c b/src/intel/compiler/brw_eu_validate.c
index 9fd3bb579ed..45e0698b025 100644
--- a/src/intel/compiler/brw_eu_validate.c
+++ b/src/intel/compiler/brw_eu_validate.c
@@ -1075,21 +1075,25 @@ general_restrictions_on_region_parameters(const struct brw_isa_info *isa,
       /* VertStride must be used to cross GRF register boundaries. This rule
        * implies that elements within a 'Width' cannot cross GRF boundaries.
        */
-      const uint64_t mask = (1ULL << element_size) - 1;
       unsigned rowbase = subreg;
 
       for (int y = 0; y < exec_size / width; y++) {
-         uint64_t access_mask = 0;
+         bool spans_grfs = false;
          unsigned offset = rowbase;
+         unsigned first_grf = offset / REG_SIZE;
 
          for (int x = 0; x < width; x++) {
-            access_mask |= mask << (offset % 64);
+            const unsigned end_byte = offset + (element_size - 1);
+            const unsigned end_grf = end_byte / REG_SIZE;
+            spans_grfs = end_grf != first_grf;
+            if (spans_grfs)
+               break;
             offset += hstride * element_size;
          }
 
          rowbase += vstride * element_size;
 
-         if ((uint32_t)access_mask != 0 && (access_mask >> 32) != 0) {
+         if (spans_grfs) {
             ERROR("VertStride must be used to cross GRF register boundaries");
             break;
          }
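
Illustration only, not part of the patch: a minimal standalone sketch of the two checks, to show why the mask approach breaks once a register holds 64 bytes. The helper names `row_spans_grfs` and `row_spans_grfs_mask32` and the `reg_size` parameter are hypothetical; in the patch the size comes from `REG_SIZE`, and the region parameters used in the note below are made up for the example rather than taken from the dEQP test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the new check: returns true if any row
 * of the region touches a GRF other than the one the row starts in.
 * reg_size is a parameter here (an assumption for illustration) so that
 * both 32-byte and 64-byte registers can be tried. */
static bool
row_spans_grfs(unsigned reg_size, unsigned subreg, unsigned exec_size,
               unsigned width, unsigned hstride, unsigned vstride,
               unsigned element_size)
{
   assert(reg_size > 0 && width > 0);
   unsigned rowbase = subreg;

   for (unsigned y = 0; y < exec_size / width; y++) {
      unsigned offset = rowbase;
      const unsigned first_grf = offset / reg_size;

      for (unsigned x = 0; x < width; x++) {
         /* Last byte touched by this element. */
         const unsigned end_byte = offset + (element_size - 1);
         if (end_byte / reg_size != first_grf)
            return true;
         offset += hstride * element_size;
      }
      rowbase += vstride * element_size;
   }
   return false;
}

/* Hypothetical helper mirroring the removed mask-based check. The low
 * and high halves of access_mask stand for two adjacent 32-byte
 * registers, so the register size is baked into the arithmetic. */
static bool
row_spans_grfs_mask32(unsigned subreg, unsigned exec_size, unsigned width,
                      unsigned hstride, unsigned vstride,
                      unsigned element_size)
{
   assert(width > 0 && element_size < 64);
   const uint64_t mask = (1ULL << element_size) - 1;
   unsigned rowbase = subreg;

   for (unsigned y = 0; y < exec_size / width; y++) {
      uint64_t access_mask = 0;
      unsigned offset = rowbase;

      for (unsigned x = 0; x < width; x++) {
         access_mask |= mask << (offset % 64);
         offset += hstride * element_size;
      }
      rowbase += vstride * element_size;

      if ((uint32_t)access_mask != 0 && (access_mask >> 32) != 0)
         return true;
   }
   return false;
}
```

For a row of eight contiguous 8-byte elements starting at byte 0 (bytes 0 through 63), `row_spans_grfs(32, ...)` and the mask version both report a span, while `row_spans_grfs(64, ...)` does not, since the whole row fits in one 64-byte register. The mask version has no way to express that second case.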