From f2ef103863c81ddc4be25692e6cee809d8f68be2 Mon Sep 17 00:00:00 2001
From: Connor Abbott
Date: Sat, 16 Jul 2022 22:58:59 +0200
Subject: [PATCH] tu: Treat CP_WAIT_FOR_ME as a cache invalidate

The workaround for draws that need a CP_WAIT_FOR_ME didn't work if the
barrier before the draw is in a separate command buffer from the draw.
The barrier would add a pending CP_WAIT_FOR_ME, but it would get dropped
on the floor at the end of the command buffer and the draw wouldn't have
a pending CP_WAIT_FOR_ME so it wouldn't emit one.

We don't know in the barrier if the destination is a draw with the
workaround, so we have two options:

- Emit any pending CP_WAIT_FOR_ME at the end of the command buffer (and
  before secondaries) in case there is a workaround draw later. This
  will emit an extra CP_WAIT_FOR_ME at the end of the command buffer in
  case there is an indirect command barrier.
- Always assume at the beginning of the command buffer that there is a
  pending CP_WAIT_FOR_ME. This will emit an extra CP_WAIT_FOR_ME before
  the first workaround-requiring draw in the command buffer, in case
  there was a barrier earlier.

The only draws requiring a workaround are currently
vkCmdDraw*IndirectCount(), which we assume are rarer than indirect
command barriers, so we implement the second option. This entails
treating it as a cache invalidate.

This fixes some upcoming dynamic rendering CTS tests that do
vkCmdDrawIndirectCount() in a secondary but put the barrier for it in
the primary.

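To illustrate the second option, here is a minimal standalone C sketch
(not the driver code). The flag names come from this patch; the bit
values, the cmd_buffer_model struct and the two model_* helpers are
hypothetical and exist only for the illustration. It assumes a command
buffer starts with every "invalidate" bit pending.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names follow the patch; the bit values are made up for this sketch. */
enum tu_cmd_flush_bits {
   TU_CMD_FLAG_CCU_INVALIDATE_DEPTH = 1 << 0,
   TU_CMD_FLAG_CCU_INVALIDATE_COLOR = 1 << 1,
   TU_CMD_FLAG_CACHE_INVALIDATE = 1 << 2,
   TU_CMD_FLAG_WAIT_FOR_ME = 1 << 3,

   TU_CMD_FLAG_ALL_INVALIDATE =
      TU_CMD_FLAG_CCU_INVALIDATE_DEPTH |
      TU_CMD_FLAG_CCU_INVALIDATE_COLOR |
      TU_CMD_FLAG_CACHE_INVALIDATE |
      TU_CMD_FLAG_WAIT_FOR_ME,
};

/* Hypothetical stand-in for per-command-buffer state. */
struct cmd_buffer_model {
   unsigned pending_flush_bits; /* work that may still be needed */
   unsigned wfm_emitted;        /* number of CP_WAIT_FOR_ME packets emitted */
};

/* Assumption: a fresh command buffer knows nothing about what ran before
 * it, so it starts with every invalidate bit pending. With this patch
 * that set includes TU_CMD_FLAG_WAIT_FOR_ME.
 */
static void
model_begin_cmd_buffer(struct cmd_buffer_model *cmd)
{
   assert(cmd != NULL);
   cmd->pending_flush_bits = TU_CMD_FLAG_ALL_INVALIDATE;
   cmd->wfm_emitted = 0;
}

/* A draw that needs the workaround emits CP_WAIT_FOR_ME only if one is
 * pending, then clears the pending bit.
 */
static void
model_draw_indirect_count(struct cmd_buffer_model *cmd)
{
   assert(cmd != NULL);
   if (cmd->pending_flush_bits & TU_CMD_FLAG_WAIT_FOR_ME) {
      cmd->wfm_emitted++;
      cmd->pending_flush_bits &= ~(unsigned)TU_CMD_FLAG_WAIT_FOR_ME;
   }
}
```

In this model, a secondary that starts with model_begin_cmd_buffer() and
then records two workaround draws emits exactly one CP_WAIT_FOR_ME: the
first draw pays for the conservative wait even though the barrier lived
in another command buffer, and the second draw finds nothing pending.
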
Fixes: 37939e9c546 ("turnip: Fix the lack of WFM before indirect draws")
Part-of:
(cherry picked from commit c5be4445004e4980a1897b904fc206b3d030c58f)
---
 .pick_status.json                 | 2 +-
 src/freedreno/vulkan/tu_private.h | 9 ++++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index 433c8311579..038f31da5ca 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -5638,7 +5638,7 @@
         "description": "tu: Treat CP_WAIT_FOR_ME as a cache invalidate",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "main_sha": null,
         "because_sha": "37939e9c5462b871b0b9b00a43c5c9bec1e10e9d"
     },
diff --git a/src/freedreno/vulkan/tu_private.h b/src/freedreno/vulkan/tu_private.h
index 28a9c5ea1c8..862d507c9dc 100644
--- a/src/freedreno/vulkan/tu_private.h
+++ b/src/freedreno/vulkan/tu_private.h
@@ -1068,7 +1068,14 @@ enum tu_cmd_flush_bits {
    TU_CMD_FLAG_ALL_INVALIDATE =
       TU_CMD_FLAG_CCU_INVALIDATE_DEPTH |
       TU_CMD_FLAG_CCU_INVALIDATE_COLOR |
-      TU_CMD_FLAG_CACHE_INVALIDATE,
+      TU_CMD_FLAG_CACHE_INVALIDATE |
+      /* Treat CP_WAIT_FOR_ME as a "cache" that needs to be invalidated when a
+       * a command that needs CP_WAIT_FOR_ME is executed. This means we may
+       * insert an extra WAIT_FOR_ME before an indirect command requiring it
+       * in case there was another command before the current command buffer
+       * that it needs to wait for.
+       */
+      TU_CMD_FLAG_WAIT_FOR_ME,
 };
 
 /* Changing the CCU from sysmem mode to gmem mode or vice-versa is pretty