intel/fs: Always stall between the fences on Gen11+

Be conservative on Gfx11+ and always stall in a fence, since there are
two different fences and the shader might want to synchronize between them.

This change also brings back the original code block for the stall
between the fences, along with its comment, from commit b390ff3517.

v2: (Caio)
 - Re-arrange code block.
 - Adjust comment.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6958

Fixes: f7262462 ("intel/fs: Rework fence handling in brw_fs_nir.cpp")
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Tested-by: Mark Janes <markjanes@swizzler.org>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20996>
(cherry picked from commit 0c083d29a5)
Sagar Ghuge 2023-01-30 10:41:37 -08:00 committed by Dylan Baker
parent ccdb1221ea
commit d34ff0b916
2 changed files with 14 additions and 3 deletions


@@ -355,7 +355,7 @@
 "description": "intel/fs: Always stall between the fences on Gen11+",
 "nominated": true,
 "nomination_type": 1,
-"resolution": 0,
+"resolution": 1,
 "main_sha": null,
 "because_sha": "f726246297e56ae0b3fac1af072f57dce16700ab"
 },


@@ -4584,6 +4584,15 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
 assert(fence_regs_count <= ARRAY_SIZE(fence_regs));
+/* Be conservative in Gen11+ and always stall in a fence. Since
+ * there are two different fences, and shader might want to
+ * synchronize between them.
+ *
+ * TODO: Use scope and visibility information for the barriers from NIR
+ * to make a better decision on whether we need to stall.
+ */
+bool force_stall = devinfo->ver >= 11;
 /* There are four cases where we want to insert a stall:
  *
  * 1. If we're a nir_intrinsic_end_invocation_interlock. This is
@@ -4599,10 +4608,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
  * scheduling barrier to keep the compiler from moving things
  * around in an invalid way.
  *
- * 4. On platforms with LSC.
+ * 4. On Gen11+ and platforms with LSC, we have multiple fence types,
+ * without further information about the fence, we need to force a
+ * stall.
  */
 if (instr->intrinsic == nir_intrinsic_end_invocation_interlock ||
-fence_regs_count != 1 || devinfo->has_lsc) {
+fence_regs_count != 1 || devinfo->has_lsc || force_stall) {
 ubld.exec_all().group(1, 0).emit(
 FS_OPCODE_SCHEDULING_FENCE, ubld.null_reg_ud(),
 fence_regs, fence_regs_count);
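
For reference, the updated stall condition can be read as a standalone predicate. The following is only an illustrative sketch and not code from the tree: `DeviceInfoSketch`, `force_stall()` and `needs_stall()` are hypothetical names, and the struct carries just the two device fields the condition reads (`ver` and `has_lsc`). It covers only the condition visible in the hunk above, not the parts of the comment that the diff elides.

```cpp
// Hypothetical stand-in for the two device-info fields the condition reads.
struct DeviceInfoSketch {
    int ver;       // hardware generation, e.g. 9, 11, 12
    bool has_lsc;  // platform has LSC
};

// Mirrors `bool force_stall = devinfo->ver >= 11;` from the patch.
constexpr bool force_stall(const DeviceInfoSketch &devinfo)
{
    return devinfo.ver >= 11;
}

// Mirrors the patched `if` condition: stall for an end-of-interlock
// intrinsic, for anything other than exactly one fence register, on LSC
// platforms, and (new in this patch) unconditionally on ver >= 11.
constexpr bool needs_stall(const DeviceInfoSketch &devinfo,
                           bool is_end_invocation_interlock,
                           unsigned fence_regs_count)
{
    return is_end_invocation_interlock ||
           fence_regs_count != 1 ||
           devinfo.has_lsc ||
           force_stall(devinfo);
}

// A single fence on a pre-Gen11, non-LSC device does not trip this condition.
static_assert(!needs_stall(DeviceInfoSketch{9, false}, false, 1), "no stall");
// From ver 11 on, the same single fence now stalls.
static_assert(needs_stall(DeviceInfoSketch{11, false}, false, 1), "stall");
```

Under these assumptions, the only behavioural difference from the pre-patch condition is the `ver >= 11`, non-LSC, single-fence, non-interlock case, which previously skipped the scheduling fence.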