mirror of
https://gitlab.freedesktop.org/mesa/mesa.git
synced 2025-12-29 01:30:08 +01:00
intel: Fix SIMD16 unaligned payload GRF reads on Gen4-5.

When the SIMD16 Gen4-5 fragment shader payload contains source depth
(g2-3), destination stencil (g4), and destination depth (g5-6), the
single register of stencil makes the destination depth unaligned.

We were generating this instruction in the RT write payload setup:

   mov(16) m14<1>F g5<8,8,1>F { align1 compr };

which is illegal: instructions with a source region spanning more than
one register need to be aligned to even registers. This is because the
hardware implicitly does (nr | 1) instead of (nr + 1) when splitting the
compressed instruction into two mov(8)'s.

I believe this would cause the hardware to load g5 twice, replicating
subspan 0-1's destination depth to subspan 2-3. This showed up as 2x2
artifact blocks in both TIS-100 and Reicast.

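To make the arithmetic concrete, here is a minimal C++ sketch, assuming the hardware picks the second half's register as (nr | 1) as described above. The helper names are hypothetical and exist only for this illustration; they are not Mesa or hardware interfaces.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only (not part of Mesa).

// Register the second mov(8) half reads if the hardware computes (nr | 1),
// which is the behaviour this commit message assumes.
constexpr unsigned second_half_reg_hw(unsigned nr)
{
   return nr | 1u;
}

// Register the second half is supposed to read: the one after nr.
constexpr unsigned second_half_reg_expected(unsigned nr)
{
   return nr + 1u;
}

// A compressed two-register read is only safe when both forms agree,
// which holds exactly when nr is even.
constexpr bool compressed_read_is_safe(unsigned nr)
{
   return second_half_reg_hw(nr) == second_half_reg_expected(nr);
}
```

For an even register such as g4 both forms give g5, so the split is harmless. For g5 the OR form gives g5 again rather than g6, which is consistent with the duplicated-subspan symptom.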
Normally, we rely on the register allocator to even-align our virtual
GRFs. But we don't control the payload, so we need to lower SIMD widths
to make it work. To fix this, we teach lower_simd_width about the
restriction, and then call it again after lower_load_payload (which is
what generates the offending MOV).
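
The restriction taught to lower_simd_width can be sketched in isolation as follows. This is a simplified stand-alone approximation, not the compiler's code: RegFile, Source, kRegSize and cap_simd_width are hypothetical stand-ins for the real IR types, REG_SIZE and the check inside get_fpu_lowered_simd_width.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the compiler's IR (assumptions).
enum class RegFile { VGRF, FIXED_GRF };

struct Source {
   RegFile file;        // register file the source lives in
   unsigned nr;         // register number
   unsigned size_read;  // bytes the instruction reads from this source
};

// One 256-bit GRF, in bytes.
constexpr unsigned kRegSize = 32;

// Cap the SIMD width at 8 on Gen4-5 when any source is an odd-numbered
// fixed (payload) register and the read spans more than one register.
unsigned cap_simd_width(int gen, unsigned max_width,
                        const std::vector<Source> &srcs)
{
   if (gen >= 6)
      return max_width;   // restriction only applied before Gen6

   for (const Source &s : srcs) {
      if (s.file == RegFile::FIXED_GRF && (s.nr & 1u) &&
          s.size_read > kRegSize)
         max_width = std::min(max_width, 8u);
   }
   return max_width;
}
```

With these assumptions, a 64-byte read from g5 on Gen4 is capped to SIMD8, while the same read from g4, a single-register read from g5, or a virtual GRF source leaves the width unchanged.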
Fixes: 8aee87fe4c (i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107212
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=13728
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Diego Viola <diego.viola@gmail.com>
parent 11b9f63a74
commit 08a5c395ab
1 changed file with 20 additions and 0 deletions
@@ -5115,6 +5115,25 @@ get_fpu_lowered_simd_width(const struct gen_device_info *devinfo,
       }
    }
 
+   if (devinfo->gen < 6) {
+      /* From the G45 PRM, Volume 4 Page 361:
+       *
+       *    "Operand Alignment Rule: With the exceptions listed below, a
+       *     source/destination operand in general should be aligned to even
+       *     256-bit physical register with a region size equal to two 256-bit
+       *     physical registers."
+       *
+       * Normally we enforce this by allocating virtual registers to the
+       * even-aligned class. But we need to handle payload registers.
+       */
+      for (unsigned i = 0; i < inst->sources; i++) {
+         if (inst->src[i].file == FIXED_GRF && (inst->src[i].nr & 1) &&
+             inst->size_read(i) > REG_SIZE) {
+            max_width = MIN2(max_width, 8);
+         }
+      }
+   }
+
    /* From the IVB PRMs:
     *  "When an instruction is SIMD32, the low 16 bits of the execution mask
     *   are applied for both halves of the SIMD32 instruction. If different
@@ -6321,6 +6340,7 @@ fs_visitor::optimize()
    if (OPT(lower_load_payload)) {
       split_virtual_grfs();
       OPT(register_coalesce);
+      OPT(lower_simd_width);
       OPT(compute_to_mrf);
       OPT(dead_code_eliminate);
    }