anv: Fast clear depth/stencil surface in vkCmdClearAttachments

Instead of doing a slow depth clear, we can do a depth fast clear in
vkCmdClearAttachments.

Sascha Willems' occlusionquery demo shows a performance boost of more
than 2% with this series.

On Felix's Tigerlake with the GPU at fixed frequency, this patch
improves performance of RoTR by +0.5%.

v2: (Nanley Chery)
- Clear stencil surface along with depth.
- Check for multilayer resources.
- Look out for state.attachments.
- Fall back to slow clear on BDW and CHV if conditional rendering is
  enabled.
- Keep flush in same function.

v3: (Nanley Chery)
- Return immediately after fast clearing.
- Remove unnecessary comment.

v4: (Nanley Chery)
- Add assertion for BLORP_BATCH_NO_EMIT_DEPTH_STENCIL.
- Remove unnecessary local variable.
- Add 3DSTATE_WM_HZ_OP comment.

v5: (Nanley Chery)
- Fix comments.
- Don't take fast depth clear path if BLORP_BATCH_PREDICATE_ENABLE set.
- Refactor code in can_hiz_clear_att.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20175>
Sagar Ghuge 2020-09-21 17:56:08 -07:00 committed by Marge Bot
parent ee03b30e45
commit e488773b29
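
The eligibility checks described in the v2 and v5 notes can be sketched in isolation as below. This is only a rough illustration, not the driver code: the `sketch_*` names and the `SKETCH_BATCH_PREDICATE_ENABLE` flag are hypothetical stand-ins for the anv/BLORP types, and the final HiZ check that the patch delegates to anv_can_hiz_clear_ds_view() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the BLORP batch flag; not the real enum. */
enum sketch_batch_flags {
   SKETCH_BATCH_PREDICATE_ENABLE = 1u << 0,
};

/* Hypothetical stand-in for the layer fields of VkClearRect. */
struct sketch_clear_rect {
   uint32_t base_array_layer;
   uint32_t layer_count;
};

/* Returns true when a clear request passes the cheap checks that gate the
 * fast path in this sketch: no predication, a single rectangle, and only
 * the first layer. The real patch additionally asks
 * anv_can_hiz_clear_ds_view() whether the view itself supports a HiZ clear.
 */
static bool
sketch_can_fast_clear(uint32_t batch_flags, uint32_t rect_count,
                      const struct sketch_clear_rect *rects)
{
   /* Predicated clears must take the slow path. */
   if (batch_flags & SKETCH_BATCH_PREDICATE_ENABLE)
      return false;

   /* Only a single clear rectangle is handled. */
   if (rect_count > 1)
      return false;

   /* Only the first slice of the current view can be cleared. */
   if (rects[0].layer_count > 1 || rects[0].base_array_layer > 0)
      return false;

   return true;
}
```

Under these assumptions, a request with one rectangle covering layer 0 is accepted, while a multi-rectangle, multi-layer, or predicated request falls back to the slow clear.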


@@ -1299,6 +1299,52 @@ anv_fast_clear_depth_stencil(struct anv_cmd_buffer *cmd_buffer,
"after clear hiz");
}

static bool
can_hiz_clear_att(struct anv_cmd_buffer *cmd_buffer,
                  struct blorp_batch *batch,
                  const struct anv_attachment *ds_att,
                  const VkClearAttachment *attachment,
                  uint32_t rectCount, const VkClearRect *pRects)
{
   /* From Bspec's section MI_PREDICATE:
    *
    *    "The MI_PREDICATE command is used to control the Predicate state bit,
    *    which in turn can be used to enable/disable the processing of
    *    3DPRIMITIVE commands."
    *
    * Also from BDW/CHV Bspec's 3DSTATE_WM_HZ_OP programming notes:
    *
    *    "This command does NOT support predication from the use of the
    *    MI_PREDICATE register. To predicate depth clears and resolves on you
    *    must fall back to using the 3D_PRIMITIVE or GPGPU_WALKER commands."
    *
    * Since BLORP's predication is currently dependent on MI_PREDICATE, fall
    * back to the slow depth clear path when the BLORP_BATCH_PREDICATE_ENABLE
    * flag is set.
    */
   if (batch->flags & BLORP_BATCH_PREDICATE_ENABLE)
      return false;

   if (rectCount > 1) {
      anv_perf_warn(VK_LOG_OBJS(&cmd_buffer->device->vk.base),
                    "Fast clears for vkCmdClearAttachments supported only for rectCount == 1");
      return false;
   }

   /* When the BLORP_BATCH_NO_EMIT_DEPTH_STENCIL flag is set, BLORP can only
    * clear the first slice of the currently configured depth/stencil view.
    */
   assert(batch->flags & BLORP_BATCH_NO_EMIT_DEPTH_STENCIL);
   if (pRects[0].layerCount > 1 || pRects[0].baseArrayLayer > 0)
      return false;

   return anv_can_hiz_clear_ds_view(cmd_buffer->device, ds_att->iview,
                                    ds_att->layout,
                                    attachment->aspectMask,
                                    attachment->clearValue.depthStencil.depth,
                                    pRects->rect);
}

static void
clear_depth_stencil_attachment(struct anv_cmd_buffer *cmd_buffer,
                               struct blorp_batch *batch,
@@ -1313,6 +1359,18 @@ clear_depth_stencil_attachment(struct anv_cmd_buffer *cmd_buffer,
       s_att->vk_format == VK_FORMAT_UNDEFINED)
      return;

   const struct anv_attachment *ds_att = d_att->iview ? d_att : s_att;
   if (ds_att->iview &&
       can_hiz_clear_att(cmd_buffer, batch, ds_att, attachment, rectCount, pRects)) {
      anv_fast_clear_depth_stencil(cmd_buffer, batch, ds_att->iview->image,
                                   attachment->aspectMask,
                                   ds_att->iview->planes[0].isl.base_level,
                                   ds_att->iview->planes[0].isl.base_array_layer,
                                   pRects[0].layerCount, pRects->rect,
                                   attachment->clearValue.depthStencil.stencil);
      return;
   }

   bool clear_depth = attachment->aspectMask & VK_IMAGE_ASPECT_DEPTH_BIT;
   bool clear_stencil = attachment->aspectMask & VK_IMAGE_ASPECT_STENCIL_BIT;