Commit graph

2159 commits

Author SHA1 Message Date
Valentine Burley
c2af4f61a7 tu: Use vk_query_pool
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:29 +00:00
Valentine Burley
cc432c358a tu: Use the common versions of vkBegin/EndQuery()
Move all the logic into tu_CmdBegin/EndQueryIndexedEXT. CmdBegin/EndQuery in
the common runtime is a wrapper that calls tu_CmdBegin/EndQueryIndexedEXT with
index 0.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:29 +00:00
Valentine Burley
45a3c2d197 tu: Rename tu_query.cc/h to tu_query_pool.cc/h
Match the structure of the common Vulkan runtime and NVK.
Additionally update a comment to reflect the current state.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:28 +00:00
Valentine Burley
d8ebc632eb tu: Move buffer view related code to tu_buffer_view.cc/h
More code isolation. Match the structure of the common Vulkan runtime,
NVK and RADV.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:28 +00:00
Valentine Burley
09d224685d tu: Drop tu_buffer_view_init helper function
Simplify the code by inlining the logic from tu_buffer_view_init
directly into tu_CreateBufferView.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:28 +00:00
Valentine Burley
c21faf12e7 tu: Use vk_buffer_view
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29441>
2024-07-01 16:23:28 +00:00
Danylo Piliaiev
02b1d23fed tu: Enable LRZ feedback in sysmem
The perf benefits are to be observed but that's what blob is doing.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25345>
2024-06-26 15:53:51 +00:00
Danylo Piliaiev
2a33cd113a tu: Use LRZ feedback in gmem
We set LRZ_FEEDBACK_EARLY_LRZ_LATE_Z mask for rendering pass after
HW binning because:
- Draws with EARLY_Z contributed to depth buffer in BINNING stage;
- Draws with LATE_Z is what usually disables LRZ.
- Draws with EARLY_LRZ_LATE_Z are the ones we want because they
  represent the common case of FS with "discard".

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25345>
2024-06-26 15:53:51 +00:00
Danylo Piliaiev
229bd7b9b9 freedreno: Describe LRZ feedback mechanism
Some draws do write depth but cannot contribute to LRZ during the BINNING pass
e.g. when fragment shader has "discard" in it, however they can contribute to
LRZ during the RENDERING pass via LRZ feedback meachanism. This may allow the
draws that follow to depth test against the updated LRZ, this is especially
important if such "bad" draws were at the start of the renderpass.

LRZ feedback happens during the RENDERING pass when LRZ_FEEDBACK_ZMODE_MASK
is set, if draw has a6xx_ztest_mode that has corresponding flag set in
LRZ_FEEDBACK_ZMODE_MASK - its depth values would be used for feedback.

LRZ feedback alongside with LRZ testing also works during sysmem rendering.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25345>
2024-06-26 15:53:51 +00:00
Connor Abbott
78c5daf029 tu: Add early preamble statistic
It can affect performance if we accidentally disable early preamble so
record it here.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29903>
2024-06-26 15:16:38 +00:00
Connor Abbott
337fb7dec2 ir3, tu, freedreno: Move early_preamble to ir3_shader
The ir3_info is reset by ir3_collect_shader_info() on the expectation
that all info is collected inside that function. This meant that we were
accidentally disabling early preamble. Re-enable it.

We keep a copy in ir3_info for shader statistics in the next commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29903>
2024-06-26 15:16:38 +00:00
Valentine Burley
8cfdc099cd tu: Use the common version of vkQueueBindSparse
This is implemented in the common runtime. No need to provide a stub here.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29854>
2024-06-26 14:38:22 +00:00
Valentine Burley
d882198fc3 tu: Move buffer related code to tu_buffer.cc/h
More code isolation. Match the structure of the common Vulkan runtime,
NVK and RADV.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29854>
2024-06-26 14:38:22 +00:00
Valentine Burley
c0a9b0f8d6 tu: Use the common version of vkGetBufferMemoryRequirements2
Additionally simplify the code by inlining the logic from
tu_get_buffer_memory_requirements directly into
tu_GetDeviceBufferMemoryRequirements.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29854>
2024-06-26 14:38:22 +00:00
Valentine Burley
617291d2d9 tu: Advertise VK_KHR_shader_float_controls2
No Turnip or ir3 changes required, this was implemented in NIR by Intel.
Passes dEQP-VK.spirv_assembly.instruction.*.float_controls2.*

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29866>
2024-06-24 13:56:26 +00:00
Valentine Burley
0ad1c80250 tu: Drop tu_init_sampler helper function
Simplify the code by inlining the logic from tu_init_sampler
directly into tu_CreateSampler.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29808>
2024-06-21 19:30:06 +00:00
Valentine Burley
a931329146 tu: Move sampler related code to tu_sampler.cc/h
More code isolation. Match the structure of the common Vulkan runtime,
NVK and RADV.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29808>
2024-06-21 19:30:06 +00:00
Valentine Burley
739dfcf807 tu: Use device->vk.enabled_features instead of iterating twice
vk_device already has the list of enabled features, no need to iterate
twice on the pNext structs.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29808>
2024-06-21 19:30:06 +00:00
Valentine Burley
55fc7aea5f tu: Use vk_sampler
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29808>
2024-06-21 19:30:06 +00:00
Valentine Burley
75a6d185a0 tu: Switch to vk_ycbcr_conversion
Drop tu_sampler_ycbcr_conversion in favor of the common vk_ycbcr_conversion.
This allows using CreateSamplerYcbcrConversion and DestroySamplerYcbcrConversion
from the common runtime and will be required for vk_sampler and for using the
common ycbcr lowering later.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29808>
2024-06-21 19:30:06 +00:00
Alyssa Rosenzweig
da752ed7c1 treewide: use nir_def_replace sometimes
Two Coccinelle patches here. Didn't catch nearly as much as I would've liked but
it's a start.

Coccinelle patch:

    @@
    expression intr, repl;
    @@

    -nir_def_rewrite_uses(&intr->def, repl);
    -nir_instr_remove(&intr->instr);
    +nir_def_replace(&intr->def, repl);

Coccinelle patch:

    @@
    identifier intr;
    expression instr, repl;
    @@

    nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
    ...
    -nir_def_rewrite_uses(&intr->def, repl);
    -nir_instr_remove(instr);
    +nir_def_replace(&intr->def, repl);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> [etna]
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com> [r300]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29817>
2024-06-21 15:36:56 +00:00
Connor Abbott
8e6ecf3df8 tu: Don't WFI after every dispatch
I'm not sure why this was added back in 2019 before proper barrier
support, but it surely shouldn't be necessary now and is unnecessarily
serializing compute dispatches.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29815>
2024-06-21 11:06:35 +00:00
Connor Abbott
35c9b7fb90 tu: Fix unaligned indirect command synchronization
We need to wait to allow any previous uses to finish, and we have to
wait to allow the CACHE_INVALIDATE to finish before starting the
dispatch.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29815>
2024-06-21 11:06:35 +00:00
Connor Abbott
c7284c94ef tu: Use a7xx terminology for flushes
a7xx renamed events around flushing:

a6xx              a7xx
FLUSH             CLEAN
INVALIDATE        INVALIDATE
FLUSH+INVALIDATE  FLUSH

The FLUSH events stayed the same but now they also invalidate. By not
adopting the new CLEAN events, we're inadvertantly invalidating too
much.

This change is just a refactor, that makes generic code consistently use
the a7xx terminology. The next commit will actually make us use CLEAN.

Note that LRZ_FLUSH is deliberately not changed because it actually
also invalidates (and the real name on a6xx was FLUSH_AND_INVALIDATE).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29824>
2024-06-21 10:34:05 +00:00
Connor Abbott
0e220cd45a tu: Support VK_EXT_attachment_feedback_loop_dynamic_state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23374>
2024-06-21 09:06:53 +00:00
Connor Abbott
833a0cf76e tu: Use image aspects for feedback loops
For consistency with the dynamic state.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23374>
2024-06-21 09:06:53 +00:00
Danylo Piliaiev
2d2f19aa44 tu: Add enable_tp_ubwc_flag_hint feature to a7xx
On a740 TPL1_DBG_ECO_CNTL1.TP_UBWC_FLAG_HINT must be the same between
all drivers in the system, somehow having different values affects
BLIT_OP_SCALE. We cannot automatically match blob's value, so the
best thing we could do is a toggle.

Example:
 FD_DEV_FEATURES=enable_tp_ubwc_flag_hint=0

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29754>
2024-06-20 13:49:20 +00:00
Danylo Piliaiev
37ddf572b1 tu: Fix issues with render_pass tracepoint
cmd->state.attachments was accessed out of bounds, which somehow instead
of crash caused the tracepoint to be skipped.

drawcall_bandwidth_per_sample_sum was divided by 0 when there were no
draw calls in a renderpass.

Fixes: 1aab0fc4f5
("tu: Add attachments' UBWC info to renderpass tracepoint")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29752>
2024-06-19 12:11:10 +00:00
Connor Abbott
53ba1613ec tu: Implement early preamble
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Connor Abbott
472ce31e56 tu: Workaround early preamble HW bug
This seems to be reproducable only by running CTS in parallel with
deqp-runner.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27462>
2024-06-18 16:52:31 +00:00
Zan Dobersek
9845e99960 tu: avoid memory polling in occlusion query endings using ZPASS_DONE
On newer hardware where ZPASS_DONE events are used for sample count writes
the memory polling in occlusion query endings can be wholly avoided. A WFI
is still required, but the performance gain is still in the range of 10% on
the trivial occlusionquery demo.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
2024-06-18 11:39:57 +00:00
Zan Dobersek
5653c52151 tu: fix ZPASS_DONE interference between occlusion queries and autotuner
On newer devices where ZPASS_DONE events have sample count writing
abilities the firmware expects these events to come in begin-end pairs,
essentially corresponding to a typical occlusion query usage. Since this
event is also used in the autotuner we have to avoid event pairs to be
emitted in an interleaved fashion.

Additional renderpass state now tracks whether a given renderpass contains
an occlusion query. If so, autotuner will emit miscellaneous ZPASS_DONE
events in order to form its own begin-end pairs before and after the
renderpass commands.

Occlusion query behavior inside a renderpass doesn't change. But when used
outside of a renderpass, possible autotuner usage requires to again emit
ZPASS_DONE events that end up forming begin-end pairs of these events both
at the start and the end of the query.

Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Fixes: 4e6a1f8852 ("tu/autotune: Use `CP_EVENT_WRITE7::ZPASS_DONE` on A7XX")
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29403>
2024-06-18 11:39:57 +00:00
Alyssa Rosenzweig
15257b65c6 treewide: use nir_metadata_control_flow
Via Coccinelle patch:

    @@
    @@

    -nir_metadata_block_index | nir_metadata_dominance
    +nir_metadata_control_flow

...plus some manual fixups for call sites missed by coccinelle.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>
2024-06-17 16:28:14 -04:00
Valentine Burley
d9af1633a9 tu: Remove declaration of unused update_stencil_mask function
The update_stencil_mask function was removed when moving to the common
Vulkan dynamic state handling.

Fixes: 97da0a7734 ("tu: Rewrite to use common Vulkan dynamic state")
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29658>
2024-06-17 11:37:32 +00:00
Valentine Burley
5e9cb32c10 tu: Handle the new sync2 flags
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8277
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29658>
2024-06-17 11:37:32 +00:00
Danylo Piliaiev
1aab0fc4f5 tu: Add attachments' UBWC info to renderpass tracepoint
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29707>
2024-06-14 20:18:32 +00:00
Danylo Piliaiev
aba7140b38 tu: Add LRZ disable reason to renderpass tracepoint
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29707>
2024-06-14 20:18:32 +00:00
Danylo Piliaiev
96ed275a53 turnip: Implement VK_EXT_depth_clamp_zero_one
For A6XX it's a no-op, but A7XX+ doesn't clamp to [0,1] with disabled
depth clamp, to support VK_EXT_depth_clamp_zero_one we have to always
enable clamp and manually set depth range to [0,1] when rs->depth_clamp_enable
is false.

Passes:
 dEQP-VK.depth.*
 dEQP-GLES3.functional.fbo.depth.depth_test_clamp.* (zink)

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29387>
2024-06-12 12:58:32 +00:00
Valentine Burley
47bbaf000d tu: Handle all dependencies of CmdWaitEvents2
The spec describes pDependencyInfos as an array with eventCount elements.

Addresses: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10580

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29630>
2024-06-12 12:28:44 +00:00
Valentine Burley
a6a0730bd5 tu: Move event related related code to tu_event.cc/h
Match the structure of NVK and RADV. Pull all event related code from
tu_device.cc/h and tu_cmd_buffer.cc/h into one location.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29630>
2024-06-12 12:28:44 +00:00
Mark Collins
cc82f7f8ac tu: Emit GRAS_LRZ_DEPTH_BUFFER_INFO correctly
This register stores the depth format of the underlying depth
buffer, it seemingly doesn't change anything about the LRZ buffer
itself and has no behavioral changes over setting it to 0.

However, it's possible that there's some case where it does matter
so matching the proprietary driver's behavior is safer.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
9e936d3fde tu: Specify LRZ FC depth clear value on A7XX
A7XX allows setting the FC depth to an arbitrary F32 value rather
than being limited to 0.0/1.0, we use this to match the depth clear
value.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
15b02f4700 tu: Update LRZ FC dirty clear for A7XX
Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
db505ea565 tu: Update LRZ FC allocation for A7XX layout
The allocation size is now determined based off the LRFC structure
rather than hardcoding in A6XX's layout.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
bf5e8fb394 tu/lrz: Add structure for LRZ FC layout
The layout of the LRZ FC section has changed substantially between
A6XX and A7XX so the best way to express the layout was determined
to be a templated structure.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
c801fd9771 tu: Allow LRZ on A7XX
LRZ without FC should work with all the current changes.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
0068e75fc6 tu/lrz: Use actual CHIP rather than hardcoding A6XX
A lot of CHIP template parameters were hardcoded to A6XX rather than
the actual chip which would lead to an incorrect command stream being
generated.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
895c091cdd tu/lrz: Emit GRAS_LRZ_CNTL2 on A7XX
The functionality of GRAS_LRZ_CNTL on A6XX was split into GRAS_LRZ_CNTL
and GRAS_LRZ_CNTL2 on A7XX. The only new field is for the Z function to
be specified.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Mark Collins
f592483350 tu/shader: Allow LRZ when write pos with explicit early frag test
This is an exceptional case where any writes to gl_Depth should be
ignored, it means we can use LRZ in this case and don't need to
disable it.

Signed-off-by: Mark Collins <mark@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29453>
2024-06-07 10:18:10 +00:00
Faith Ekstrand
f8290aea48 turnip: Advertise VK_EXT_shader_replicated_composites
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29509>
2024-06-04 16:34:48 +00:00