Commit graph

27 commits

Author SHA1 Message Date
Emma Anholt
c1968deec2 turnip: Lazily call tu6_emit_descriptor_sets() at draw time.
This lets us batch up the state changes from multiple
vkCmdBindDescriptorSets, which ANGLE and zink will both do in a single
draw.

Improves ANGLE (sysmem) driver_overhead perf by 5.18806% +/- 1.03444% (n=5).
Improves ANGLE aztec_ruins_high perf by ~.3%. (clear result in the graph,
but the screen went to sleep mid way through and so it was high variance)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20084>
2022-12-19 19:14:02 +00:00
Connor Abbott
cb3872f2cd tu: Implement VK_EXT_descriptor_buffer
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19849>
2022-12-12 17:38:19 +00:00
Emma Anholt
f2414dc2a0 turnip: Drop the cs argument from tu6_emit_cache_flush*().
It's always draw_cs or cs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19939>
2022-11-29 19:30:25 +00:00
Dave Airlie
49c4c5cb64 turnip: use common command buffer status code.
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16922>
2022-11-11 05:01:24 +00:00
Connor Abbott
def56b531c tu: Support GMEM with layered rendering and multiview
It turns out that this actually is supported. GMEM can hold multiple
layers which are cleared, loaded, and resolved separately. The stride
between layers seems to be implicitly calculated based on the tile size,
and we have to match it when blitting to/from GMEM. One tricky thing is
that now we may realize that we don't have enough space for GMEM only
when computing the tiling config, because we may not know the number of
framebuffer layers until we have the framebuffer and too many
framebuffer layers will exhaust GMEM.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19505>
2022-11-08 16:35:02 +00:00
Connor Abbott
c8c7154c2e tu: Implement extendedDynamicState3ColorBlendEnable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
e63c8b3bf1 tu: Implement extendedDynamicState3ProvokingVertexMode
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
6b82998985 tu: Rename RASTERIZER_DISCARD state to PC_RASTER_CNTL
It also contains the rasterization stream.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
87bdddf8f1 tu: Implement extendedDynamicState3AlphaToCoverageEnable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
99caf95eba tu: Implement extendedDynamicState3Depth*Enable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
0e09559bd6 tu: Implement extendedDynamicState3TessellationDomainOrigin
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
55bbf56a17 tu: Implement extendedDynamicState3PolygonMode
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Connor Abbott
d20256eba3 tu: Combine GRAS_SU_CNTL drawstate with rast draw state
Emit GRAS_SU_CNTL, GRAS_CL_CNTL, the polygon mode, and the VRS registers
in one draw state. We're running out of draw states, and this saves a
draw state while preparing us for the rest of the rasterization state to
be dynamic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
2022-11-03 21:59:42 +00:00
Mark Collins
9248ce2978 tu: Only write A6XX_PC_PRIMITIVE_CNTL_0 if changed
Increases the score in the `draw` test in `vkoverhead` to 71809
from 67170 on a HDK 888.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19107>
2022-10-19 19:00:42 +00:00
Danylo Piliaiev
4eba6d71a8 tu: Lazily init VSC to fix dynamic rendering in secondary cmdbufs
Dynamic renderpasses need vsc_prim_strm_pitch, vsc_draw_strm_pitch
values, and a correct BO. The easiest way to solve this is to
lazily init VSC when it is needed, and not at every cmdbuf
initialization.

Fixes CTS tests (when running with TU_DEBUG=gmem,forcebin):
 dEQP-VK.draw.dynamic_rendering.complete_secondary_cmd_buff.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18996>
2022-10-10 18:31:15 +00:00
Connor Abbott
68f3c38c80 tu: Implement extendedDynamicState2PatchControlPoints
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18773>
2022-10-04 15:39:43 +00:00
Emma Anholt
64d0e94d2c turnip: Use the simplified stencil write flags for the LRZ-allowed check.
Traces of GLES games that ANGLE has taken frequently have no-op stencil
writes, which ANGLE and Zink both pass straight through.  Given that we
support dynamic stencil state updates via tu_CmdSetStencil*(), draw time
really is the time for deciding this state unfortunately.

Reuse the fancier stencil write enables check from "can we do early z?" in
"can we do LRZ?".  This gets one set of draws in among_us to have LRZ, but
I don't see a detectable performance difference.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18691>
2022-09-21 17:18:07 +00:00
Connor Abbott
1c6c8ce54b tu: Make MSAA emission always dynamic
This wasn't taking into account the dynamic primitive topology, and it
was suboptimal with dynamic rendering, because we don't know when
compiling the pipeline whether variable multisample rate is being used.
It's going to be even more difficult to support the current approach
with graphics pipeline library because the MSAA state is derived from
mulisample state, rasterization state, input assembly state, and
tessellation state, which may be in different pipelines. Just set it
dynamically based on the pipeline and re-emit it when the pipeline's
MSAA or rectangular/bresenham state differs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18554>
2022-09-21 11:20:15 +00:00
Danylo Piliaiev
34109c8c10 turnip: implement VK_EXT_multi_draw
vkoverhead running:
    * draw numbers are reported as thousands of operations per second
    * percentages for draw cases are relative to 'draw'
   0, draw,                                      29151,        100.0%
   1, draw_multi,                                35449,        121.6%
   2, draw_vertex,                               28907,        99.2%
   3, draw_multi_vertex,                         56658,        194.4%

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11502>
2022-09-14 13:18:02 +00:00
Emma Anholt
3ef13ef234 turnip: Treating non-d/s-write pipelines as not having d/s feedback loops.
A subpass in gfxbench has the depth buffer present, but not written to,
for a render pass using the depth buffer as an input attachment.  We can
skip single-prim-mode and the associated "oh no don't use sysmem" in that
case.

Improves gfxbench vk-5-normal perf by 1.56193% +/- 0.0743035% (n=14).
Part of #6327.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18241>
2022-09-02 16:47:02 +00:00
Jason Ekstrand
c052c6a333 tu: Move to the common command pool framework
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18324>
2022-09-01 20:17:25 +00:00
Dave Airlie
3c092f5cd8 turnip: use common command record result.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16918>
2022-09-01 14:11:55 +00:00
Emma Anholt
b0a74776d1 tu: Emit only as many VBs as we've ever seen bound on the command buffer.
A later CmdBindPipeline would shrink the two draw states' sizes to the
number of VBs the pipeline actually uses, but we can save some CPU
overhead and memory by not emitting all the unused VBs as well.

Improves zink drawoverhead throughput on test 5 (1 VB change) by 38.5178%
+/- 0.48738% (n=18).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17932>
2022-08-22 22:26:29 +00:00
Emma Anholt
29d725a8aa tu: Only emit as many bindless regs as we have seen descriptor sets.
Cuts 12 dwords of CS per draw in vk-5-normal's main renderpass.

zink drawoverhead -test 9 (1 texture change) throughput +0.898636% +/-
0.212647% (n=30).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17932>
2022-08-22 22:26:28 +00:00
Connor Abbott
7c7feab4e1 tu: Implement VK_EXT_vertex_input_dynamic_state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17554>
2022-08-05 03:22:00 +00:00
Connor Abbott
c82af0c43b tu: Decouple vertex input state from shader
Emit VFD_DECODE and VFD_DEST separately, similarly to what Gallium does.
This means we emit a few more VFD_DECODE for binning shaders and when
there are unused attributes, but hopefully the overhead won't be too
much. In exchange we lose one draw state, and in the future we can
pre-compute the dynamic vertex state independently of the shader, so
there should be lower CPU overhead with dynamic vertex inputs.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17554>
2022-08-05 03:22:00 +00:00
Chia-I Wu
8e61bee30c turnip: add tu_cmd_buffer.h
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17811>
2022-08-04 00:40:12 +00:00