Commit graph

186789 commits

Author SHA1 Message Date
Danylo Piliaiev
8b8c739ccd tu: Emit non-draw-state state at the first draw call
If this state was emitted at the point of previous RP, which
could happen if pipeline is not set at the start of current RP,
we have to emit non-draw-state state since it would become stale
in the next tile.

Fixes test with stale reg dbg:
 dEQP-VK.transform_feedback.primitives_generated_query.get.queue_reset.32bit.tese.xfb.color_write_disable_static.patch_list.pgq_default_xfb_default.two_draws.pqg_first.none_2_queries

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28326>
2024-03-26 11:44:53 +00:00
Danylo Piliaiev
5acdb22ba2 tu: Update RP state depending on pipeline in first RP draw
The pipeline used in RP may have been bound in another RP, so
we have to save relevant state and re-apply it on first draw.

Fixes GPU hang in the following test with forced binning + reg stomping:
 dEQP-VK.transform_feedback.primitives_generated_query.get.queue_reset.32bit.tese.xfb.color_write_disable_static.patch_list.pgq_default_xfb_default.two_draws.pqg_first.none_2_queries

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28326>
2024-03-26 11:44:53 +00:00
Valentine Burley
a19c511818 docs: Update features.txt for tu
VK_EXT_post_depth_coverage was implemented in
f1305d49d9 ("tu: Implement VK_EXT_post_depth_coverage").

Additionally mark that certain extensions are supported from a650
onwards rather than exclusively on that generation in features.txt
to match the formatting that the other drivers use.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28236>
2024-03-26 11:08:21 +00:00
Valentine Burley
98ae874344 tu: Trivially expose three VK_GOOGLE extensions
This patch exposes support for the following three extensions:

* VK_GOOGLE_decorate_string
* VK_GOOGLE_hlsl_functionality1
* VK_GOOGLE_user_type

There's nothing for the driver to do; it's all handled in spirv_to_nir.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28236>
2024-03-26 11:08:20 +00:00
Valentine Burley
05b9e0dfed tu: Expose VK_KHR_surface_protected_capabilities
This is implemented in common code.

Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28236>
2024-03-26 11:08:20 +00:00
Boris Brezillon
3bac815c78 pan/bi: Update the push constant count when emitting load_push_constant
This is needed for panvk.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
2024-03-26 11:10:44 +01:00
Boris Brezillon
d53e848936 pan/bi: Lower load_push_constant with dynamic indexing
Push constants are exposed as special registers on Bifrost/Valhall,
this means we can't index the push constant region with a dynamic
index. In order to support dynamic indexing, we need iterative CSELs
to select the right value from the access range.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
2024-03-26 11:10:44 +01:00
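
The CSEL-based lowering described in d53e848936 can be pictured with a small C model. This is only a sketch, not the Bifrost compiler code: `PUSH_WORDS` and `load_push_dynamic` are hypothetical names, and the loop stands in for a chain of select instructions that a compiler would unroll so that every register access uses a constant index.

```c
#include <stdint.h>

/* Hypothetical model: push constants live in a small register file that can
 * only be addressed with compile-time-constant indices. */
#define PUSH_WORDS 8

/* Return push[idx] for idx in [base, base + range) without a dynamically
 * indexed load. Each iteration compares idx against one constant slot and
 * keeps either that slot's value or the previous result, like one CSEL.
 * In a real lowering the loop would be fully unrolled. */
static uint32_t
load_push_dynamic(const uint32_t push[PUSH_WORDS], uint32_t idx,
                  uint32_t base, uint32_t range)
{
   uint32_t result = 0; /* value returned when idx is outside the range */

   for (uint32_t i = base; i < base + range && i < PUSH_WORDS; i++)
      result = (idx == i) ? push[i] : result;

   return result;
}
```

The cost grows linearly with the access range, which is why the range of the access matters for this kind of lowering.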
Boris Brezillon
1a07685bf1 pan/bi: Lower push constant accesses
On Bifrost, push constants are exposed as 64-bit registers which can
be accessed at a 32-bit granularity. Make sure push constant accesses
are lowered to guarantee a 32-bit alignment.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
2024-03-26 11:10:44 +01:00
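
To illustrate what lowering to 32-bit accesses means in 1a07685bf1, here is a minimal C sketch. It assumes push constants are visible as an array of 32-bit words with little-endian packing inside each word. The helper names are hypothetical and are not part of NIR or the Panfrost compiler.

```c
#include <stdint.h>

/* Read a 16-bit value at a 2-byte-aligned byte offset using only an
 * aligned 32-bit load, followed by a shift and a truncation. */
static uint16_t
load_push_u16(const uint32_t *push_words, uint32_t byte_offset)
{
   uint32_t word = push_words[byte_offset / 4];
   uint32_t shift = (byte_offset % 4) * 8;

   return (uint16_t)(word >> shift);
}

/* Read a 64-bit value at a 4-byte-aligned byte offset as two 32-bit
 * loads that are then packed together. */
static uint64_t
load_push_u64(const uint32_t *push_words, uint32_t byte_offset)
{
   uint32_t lo = push_words[byte_offset / 4];
   uint32_t hi = push_words[byte_offset / 4 + 1];

   return ((uint64_t)hi << 32) | lo;
}
```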
Boris Brezillon
bb8379557e nir: Extend nir_lower_mem_access_bit_sizes() to support push constants
Mali GPUs have a 32-bit alignment constraint on push constants. Extend
nir_lower_mem_access_bit_sizes() so it can lower bit sizes on push
constant accesses.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
2024-03-26 11:10:41 +01:00
Boris Brezillon
544f76dd13 nir: Extend nir_get_io_offset_src_number() to support load_push_constant
Will be needed to support push constants in
nir_lower_mem_access_bit_sizes().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
2024-03-26 11:09:37 +01:00
Boris Brezillon
595d362d4b panvk: Implement dynamic rendering entry points
Implement dynamic rendering entry points so we can get rid of the
render pass logic.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28167>
2024-03-26 09:06:43 +00:00
Boris Brezillon
8cba497701 panfrost: Move the image attribute offset adjustment to a NIR pass
The gallium and vulkan drivers deal with vertex attribute emission
differently. The gallium driver re-emits the VS attributes on each
draw, while the vulkan driver uses explicit attribute/image
descriptor dirtiness tracking, and could keep the attribute array
around if a new pipeline using a different number of attributes is
bound. If we want to be able to do that, we need to assign a fixed
offset for image attributes, such that the Vulkan descriptor
lowering pass knows where the images are in the attribute table.

We could teach the Bifrost backend how to deal with a custom offset,
but doing that in a lowering pass also simplifies the Midgard
code.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28200>
2024-03-26 09:24:25 +01:00
Iago Toral Quiroga
7992d44b24 v3dv: fix image creation when exceeding maxResourceSize
Fixes crashes in tests like
dEQP-VK.pipeline.monolithic.render_to_image.core.2d_array.huge.width_height_layers.r8g8b8a8_unorm
with CTS main.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28364>
2024-03-26 07:23:56 +00:00
Faith Ekstrand
0d2c5999fd nak: Don't write undefined FS outputs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
fb15a42357 nak: Simplify over-all I/O lowering
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
a1e8bba7fa nak: Drop lower_io_arrays_to_elements_no_indirects for FS outputs
All we really need is for them to have no indirects which we can ensure
via nir_lower_indirect_derefs.  Splitting into individual variables is a
relic of older attempts at FS output lowering and not needed.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
d4ac4ce112 nak/nir: Use nir_io_semantics for FS outputs
We also add a new nir_intrinsic_fs_out_nv, which is a lot simpler than
store_output to pass to the NAK back-end.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
278eaa5ab1 nak: Call nir_lower_io_to_temporaries for FS outputs
They can't be indirected and we also need the guarantee that all output
writes are in the last block in the shader or else our back-end copying
is sketchy.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
f46445a0f6 nak/nir: Clean up lower_fs_inputs a bit
There's no reason why every single case needs to have its own instance
of setting the cursor and rewriting the instruction.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
2b9a836ee3 nak: Break lower_fs_inputs into its own file
While we're at it, make the pass handle layer_id and front_face

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
bdb237a195 nak/nir: Use nir_io_semantics for varyings and attributes
This removes our reliance on driver_location for varyings and attributes
by using nir_io_semantics instead.  This is probably better as NIR seems
to be trending this direction long-term.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
3b967789f4 nak/nir: Emit nir_intrinsic_ipa_nv directly for FS system values
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
668880c8c8 nak/nir: Add a load_fs_input helper for flat inputs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
0d5cea7d81 nak/nir: Rename load_interpolated_input
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Faith Ekstrand
9cce4e6364 nak/nir: Emit nir_intrinsic_ald_nv directly for system values
These are simple enough that running them through the lowering code
really isn't gaining us anything.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
2024-03-26 05:57:12 +00:00
Patrick Lerda
2b4095d086 r300: fix NIR passes regression
The pass "nir_opt_constant_folding" is definitely required.

For instance, this issue is triggered on a R430 with "piglit/bin/shader_runner generated_tests/spec/glsl-1.10/execution/variable-indexing/fs-varying-array-mat2-col-rd.shader_test -auto -fb":
shader_runner: ../src/compiler/nir/nir_lower_int_to_float.c:239: lower_alu_instr: Assertion `nir_alu_type_get_base_type(info->output_type) != nir_type_int && nir_alu_type_get_base_type(info->output_type) != nir_type_uint' failed.

Fixes: 092299f18a ("r300: remove some late NIR passes")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28365>
2024-03-26 05:35:31 +00:00
Mike Blumenkrantz
bf5d203f24 zink: set dynamic rendering color attachment layouts
this is otherwise broken for fbfetch

Fixes: 2ad0146179 ("zink: use KHR_dynamic_rendering_local_read")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28340>
2024-03-26 02:13:52 +00:00
Yusuf Khan
561fae6845 nvk: fix valve segfault from setting a descriptor set from NULL
Reported by Nikita Vilunov and fix found by him when analyzing his
CS2 dump.

cc: mesa-stable

v2: these two need to be zero when set == NULL

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10719
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28353>
2024-03-26 01:52:48 +00:00
Yiwei Zhang
1a475c70b2 venus: add a more relaxed polling strategy
The default vn_relax mainly targets Vulkan commands expecting a
reply, like object creation and property queries. The defined relax reason
here is VN_RELAX_REASON_RING_SPACE. The polling strategy involves more
busy waits to overcome sleep penalty affecting cpu utilization, as well
as an edge case for Android system server which forces to sleep longer
even with trivial hrtimer interval.

However, for the below relax reasons:
- VN_RELAX_REASON_RING_SPACE
- VN_RELAX_REASON_FENCE
- VN_RELAX_REASON_SEMAPHORE
- VN_RELAX_REASON_QUERY

It's a waste of CPU cycles to do more busy waits if the initially
polled signals are not "ready". Having fewer busy waits there allows
jumping to higher orders of sleep sooner, disturbing the scheduler less
until signaled.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:24 +00:00
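
The trade-off described in 1a475c70b2 (busy-wait first, then sleep for progressively longer intervals) can be sketched as a pure function. This is not the venus implementation: `struct relax_profile`, its fields, and the doubling schedule are assumptions chosen only to show how a profile with fewer busy waits reaches longer sleeps sooner.

```c
#include <stdint.h>

/* Hypothetical waiting profile. */
struct relax_profile {
   uint32_t busy_wait_order; /* spin for the first 2^order iterations */
   uint32_t base_sleep_us;   /* first sleep after spinning stops */
   uint32_t max_sleep_us;    /* upper bound on a single sleep */
};

/* Return how long to sleep on iteration `iter` (0 means keep spinning).
 * The sleep doubles each time the iteration count doubles, up to the cap. */
static uint32_t
relax_sleep_us(const struct relax_profile *p, uint32_t iter)
{
   const uint32_t busy_iters = 1u << p->busy_wait_order;

   if (iter < busy_iters)
      return 0;

   uint32_t sleep = p->base_sleep_us;
   for (uint32_t n = iter / busy_iters; n > 1 && sleep < p->max_sleep_us; n >>= 1)
      sleep <<= 1;

   return sleep < p->max_sleep_us ? sleep : p->max_sleep_us;
}
```

With a default-like profile `{2, 10, 160}` the caller still spins on iteration 3, while a relaxed profile `{0, 10, 160}` is already sleeping by then.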
Yiwei Zhang
7dc2f62273 venus: decorate cmd enqueue macro internals with compiler hints
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:24 +00:00
Yiwei Zhang
0fa9950ef5 venus: deprecate unused perf env vars
So far there's no clear wins/losses from the non-default behavior of cmd
batching and base_sleep_us. Just drop those.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:24 +00:00
Yiwei Zhang
1e47ec2321 venus: avoid constant busy wait for query result waiting
Up to this commit in this MR, the gfxbench manhattan scores have been
improved by 10~15% with ANGLE-on-Venus on some AMD platforms.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:24 +00:00
Yiwei Zhang
88b64d14d8 venus: add enum vn_relax_reason
Better distinguish different kinds of client waiting and prepare for
applying different waiting profiles for different reasons.

Default case is avoided in reason string mapping so that below can be
hit upon compilation:
- error: enumeration value ‘XXX’ not handled in switch [-Werror=switch]

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:24 +00:00
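
The reason-to-string mapping without a default case, as described in 88b64d14d8, follows a common C pattern. The sketch below uses hypothetical `demo_` names rather than the actual venus definitions. When built with `-Werror=switch`, adding a new enumerator without a matching `case` fails the build instead of silently falling into a default.

```c
/* Hypothetical enum standing in for the driver's relax reasons. */
enum demo_relax_reason {
   DEMO_RELAX_REASON_RING_SPACE,
   DEMO_RELAX_REASON_FENCE,
   DEMO_RELAX_REASON_SEMAPHORE,
   DEMO_RELAX_REASON_QUERY,
};

static const char *
demo_relax_reason_string(enum demo_relax_reason reason)
{
   /* No default label on purpose: -Wswitch then reports any enumerator
    * that is not handled below. */
   switch (reason) {
   case DEMO_RELAX_REASON_RING_SPACE:
      return "ring space";
   case DEMO_RELAX_REASON_FENCE:
      return "fence";
   case DEMO_RELAX_REASON_SEMAPHORE:
      return "semaphore";
   case DEMO_RELAX_REASON_QUERY:
      return "query";
   }

   /* Only reached if the value is outside the enum's range. */
   return "unknown";
}
```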
Yiwei Zhang
d05eb97408 venus: further reduce idle timeout from 5ms to 1ms
Similar to the rationale for the 50ms -> 5ms adjustment before. When
there's enough cpu cycles, doing so would only help reduce cpu
utilization. When cpu is mostly drained, less host side unnecessary
polling is favored by the scheduler. Also in the latter case, it'd be
the non-primary ring, so it doesn't hurt to idle out faster.

Besides the theory, there's no regression in popular benchmarks, only
power wins. Making the idle timeout too small will lead to overhead
building up, e.g. from the initial notify to the ring being woken up, it's
about 200us. The notify op is more expensive than the ring thread doing a
few more polls. However, we normally would save many more polls by idling
out earlier. From my local testing, reducing down to 500us won't incur
any real perf regressions either.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:23 +00:00
Yiwei Zhang
30d7b3bdec venus: avoid excessive ring notifications
The ring notification can be blocked on renderer main thread if a vq cmd
is waiting for a ring cmd (via a different non-idle ring). This change
optimizes to only try waking up the ring on the idle timeout period.

Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
2024-03-26 00:37:23 +00:00
Dylan Baker
c81b6e5d4c nvk: drop meson version check that is always true
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28381>
2024-03-25 20:48:17 +00:00
José Roberto de Souza
0113a2d4b3 intel/decoder: Fix binding table pointer entry being marked as invalid
If an entry extended to the last byte of the BO, it was marked as
invalid even though it is valid.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28376>
2024-03-25 20:27:06 +00:00
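
The boundary condition fixed in 0113a2d4b3 is a classic inclusive-versus-exclusive range check. The decoder code itself is not shown in this log, so the sketch below only assumes the general shape of such a check; `demo_entry_in_bo` is a hypothetical helper. An entry that ends exactly at the end of the buffer must be accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if [offset, offset + entry_size) lies entirely inside a
 * buffer of bo_size bytes. The comparison is inclusive, so an entry that
 * reaches the last byte of the buffer is valid. The check is written as
 * a subtraction to avoid overflow in offset + entry_size. */
static bool
demo_entry_in_bo(uint64_t bo_size, uint64_t offset, uint64_t entry_size)
{
   return entry_size <= bo_size && offset <= bo_size - entry_size;
}
```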
Rob Clark
787079e52a pps: Config tweaks to avoid losing traces
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
Rob Clark
e1e57ea287 pps: Enable memory traces
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
Rob Clark
5154a0831e tu: Add perfetto memory tracing
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
Rob Clark
9936e91808 freedreno/drm: Add perfetto memory tracing
The design of the perfetto memory event is a bit more vk specific, but
we can abuse it to get a breakdown of memory usage for various purposes.
The memory_type parameter is (ab)used to get buffer vs image memory
split out into its own track/graph.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
Rob Clark
a3fb2b07aa freedreno: Add bo usage hints
These hints aren't used for allocation, but will be used to
differentiate the purpose of an allocation in the next commit.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
Rob Clark
db49237267 freedreno/pps: Don't re-init perfcntrs
init_perfcntr() can be called multiple times.  We don't want to
regenerate the list of counters (and overwrite/leak various other
things), so just bail if we've already initialized.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
2024-03-25 19:49:50 +00:00
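
The early bail-out described in db49237267 is an idempotent-initialization guard. A minimal sketch follows, with a hypothetical `struct demo_perf_driver` standing in for the real driver state. Without the guard, a second call would append the counters again.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_COUNTERS 4

/* Hypothetical driver state; not the freedreno/pps data structure. */
struct demo_perf_driver {
   bool initialized;
   size_t num_counters;
   int counter_ids[DEMO_MAX_COUNTERS];
};

static void
demo_init_perfcntr(struct demo_perf_driver *drv)
{
   /* May be called multiple times: only build the counter list once. */
   if (drv->initialized)
      return;

   for (size_t i = 0; i < DEMO_MAX_COUNTERS; i++)
      drv->counter_ids[drv->num_counters++] = (int)i;

   drv->initialized = true;
}
```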
Jesse Natalie
8498371b65 ci/debian: Update DirectX-Headers
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Jesse Natalie
ff802ca93b ci/windows: Update DirectX-Headers, Agility SDK, zlib, DXC, and WARP
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Jesse Natalie
267ae85a72 microsoft/compiler: Disable GS streams workaround for validator 1.8
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Jesse Natalie
811bed8a23 microsoft/compiler: domainLocation component index needs to be i8
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Jesse Natalie
007b0fdff0 dzn: Initialize memoryTypeBits for querying properties on imported handles
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Jesse Natalie
5957778c16 dzn: Include vulkan_core.h instead of vulkan.h in the device enum header
Prevents pulling in X11 "None" define into the DXCore implementation,
which conflicts with updated DXCore headers.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10803
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28339>
2024-03-25 19:11:35 +00:00
Boris Brezillon
d9d6514fbc panvk: Disable global offset on varying and non-VS attribute descriptors
We are not supposed to apply the vertex index offset to our varying or
non-VS attribute (AKA image) descriptors. While at it, explicitly set
offset_enable to true when emitting vertex attribute descriptors, to
clarify our intentions.

Fixes: c0d6539827 ("panvk: Drop support for Midgard")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28182>
2024-03-25 18:30:47 +00:00