broadcom/compiler: Enable PER_QUAD TMU access only in uniform control flow

PER_QUAD TMU lookups will partially override the predication mask on TMU
writes. If some but not all lanes in a quad are predicated out, setting
PER_QUAD will force them all to be enabled. This can result in TMU
access to bogus addresses when in nonuniform control flow. Also, since
PER_QUAD is needed to make sure derivatives work with helper
invocations, and derivatives are undefined in nonuniform control flow,
there is no reason to leave it enabled in this case.
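
The override described above can be illustrated with a small standalone model. This is only a sketch, not driver code: the 16-lane width, the function names effective_tmu_lanes() and use_per_quad(), and the assumption that a fully predicated-out quad stays disabled are all hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 16     /* assumed SIMD width for this sketch */
#define QUAD_SIZE 4  /* lanes per quad */

/* Hypothetical model: given a predication mask (bit i set = lane i
 * enabled), return the lanes that actually perform the TMU access. */
uint16_t
effective_tmu_lanes(uint16_t pred_mask, bool per_quad)
{
        if (!per_quad)
                return pred_mask;

        uint16_t result = 0;
        for (int quad = 0; quad < LANES / QUAD_SIZE; quad++) {
                uint16_t quad_bits = (uint16_t)(0xf << (quad * QUAD_SIZE));

                /* If any lane of the quad is enabled, PER_QUAD forces
                 * all four lanes on. A fully disabled quad is assumed
                 * to stay disabled. */
                if (pred_mask & quad_bits)
                        result |= quad_bits;
        }
        return result;
}

/* Mirrors the condition introduced by this change: PER_QUAD only for
 * loads that are not in nonuniform control flow. */
bool
use_per_quad(bool is_load, bool in_nonuniform_control_flow)
{
        return is_load && !in_nonuniform_control_flow;
}

/* Example: only lane 0 is active inside a divergent branch. */
void
example(void)
{
        /* With PER_QUAD, lanes 1-3 also access the TMU, possibly with
         * bogus addresses. */
        assert(effective_tmu_lanes(0x0001, true) == 0x000f);

        /* With per-pixel lookups the predication mask is respected. */
        assert(effective_tmu_lanes(0x0001, use_per_quad(true, true)) == 0x0001);
}
```

In this model, a load in nonuniform control flow falls back to per-pixel lookups, so lanes that were predicated out never issue a TMU access.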

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7726>
Arcady Goldmints-Orlov 2020-12-15 13:55:23 -06:00 committed by Marge Bot
parent 79bde75131
commit 8f583df7b6


@@ -365,7 +365,7 @@ ntq_emit_tmu_general(struct v3d_compile *c, nir_intrinsic_instr *instr,
                 num_components = tmu_writes - 1;
         }
-        uint32_t perquad = is_load
+        uint32_t perquad = is_load && !vir_in_nonuniform_control_flow(c)
                 ? GENERAL_TMU_LOOKUP_PER_QUAD
                 : GENERAL_TMU_LOOKUP_PER_PIXEL;
         uint32_t config = (0xffffff00 |