If this state was emitted at the point of previous RP, which
could happen if pipeline is not set at the start of current RP,
we have to emit non-draw-state state since it would become stale
in the next tile.
Fixes test with stale reg dbg:
dEQP-VK.transform_feedback.primitives_generated_query.get.queue_reset.32bit.tese.xfb.color_write_disable_static.patch_list.pgq_default_xfb_default.two_draws.pqg_first.none_2_queries
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28326>
The pipeline used in RP may have been bound in another RP, so
we have to save relevant state and re-apply it on first draw.
Fixes GPU hang in the following test with forced binning + reg stomping:
dEQP-VK.transform_feedback.primitives_generated_query.get.queue_reset.32bit.tese.xfb.color_write_disable_static.patch_list.pgq_default_xfb_default.two_draws.pqg_first.none_2_queries
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28326>
VK_EXT_post_depth_coverage was implemented in
f1305d49d9 ("tu: Implement VK_EXT_post_depth_coverage").
Additionally mark that certain extensions are supported from a650
onwards rather than exclusively on that generation in features.txt
to match the formatting that the other drivers use.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28236>
This patch exposes support for the following three extensions:
* VK_GOOGLE_decorate_string
* VK_GOOGLE_hlsl_functionality1
* VK_GOOGLE_user_type
There's nothing for the driver to do; it's all handled in spirv_to_nir.
Signed-off-by: Valentine Burley <valentine.burley@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28236>
Push constants are exposed as special registers on Bifrost/Valhall,
this means we can't index the push constant region with a dynamic
index. In order to support dynamic indexing, we need iterative CSELs
to select the right value from the access range.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
On Bifrost, push constants are exposed as 64-bit registers which can
be accessed at a 32-bit granularity. Make sure push constant accesses
are lowered to guarantee a 32-bit alignment.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
Mali GPUs have a 32-bit alignment constraint on push constants. Extend
nir_lower_mem_access_bit_sizes() so it can lower bit sizes on push
constant accesses.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
Will be needed to support push constants in
nir_lower_mem_access_bit_sizes().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28175>
Implement dynamic rendering entry points so we can get rid of the
render pass logic.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28167>
The gallium and vulkan drivers deal with vertex attribute emission
differently. The gallium driver re-emits the VS attributes on each
draw, while the vulkan driver uses explicit attribute/image
descriptor dirtiness tracking, and could keep the attribute array
around if a new pipeline using a different number of attribute is
bound. If we want to be able to do that, we need to assign a fixed
offset for image attributes, such that the Vulkan descriptor
lowering pass knows where the images are in the attribute table.
We could teach the Bifrost backend how to deal with a custom offset
but it doing that in a lowering pass also simplifies the Midgard
code.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28200>
Fixes crashes in tests like
dEQP-VK.pipeline.monolithic.render_to_image.core.2d_array.huge.width_height_layers.r8g8b8a8_unorm
with CTS main.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28364>
All we really need is for them to have no indirects which we can ensure
via nir_lower_indirect_derefs. Splitting into individual variables is a
relic of older attempts at FS output lowering and not needed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
This removes our reliance on driver_locaiton for varyings and attributes
by using nir_io_semantics instead. This is probably better as NIR seems
to be trending this direction long-term.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28377>
The pass "nir_opt_constant_folding" is definitely required.
For instance, this issue is triggered on a R430 with "piglit/bin/shader_runner generated_tests/spec/glsl-1.10/execution/variable-indexing/fs-varying-array-mat2-col-rd.shader_test -auto -fb":
shader_runner: ../src/compiler/nir/nir_lower_int_to_float.c:239: lower_alu_instr: Assertion `nir_alu_type_get_base_type(info->output_type) != nir_type_int && nir_alu_type_get_base_type(info->output_type) != nir_type_uint' failed.
Fixes: 092299f18a ("r300: remove some late NIR passes")
Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Pavel OndraÄka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28365>
The default vn_relax is mainly targeting Vulkan commands expecting a
rely like object creation and property queries. The defined relax reason
here is VN_RELAX_REASON_RING_SPACE. The polling strategy involves more
busy waits to overcome sleep penalty affecting cpu utilization, as well
as an edge case for Android system server which forces to sleep longer
even with trivial hrtimer interval.
However, for the below relax reasons:
- VN_RELAX_REASON_RING_SPACE
- VN_RELAX_REASON_FENCE
- VN_RELAX_REASON_SEMAPHORE
- VN_RELAX_REASON_QUERY
It's a waste of cpu cycles if we do more busy waits if the initial
polled signals are not "ready". Having less busy waits there allows to
jump to higher order of sleeps sooner to disturb the scheduler less
until signaled.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
Up to this commit in this MR, the gfxbench manhattan scores have been
improved by 10~15% with ANGLE-on-Venus on some AMD platforms.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
Better distinguish different client waiting and prepare for applying
different waiting profile for different reasons.
Default case is avoided in reason string mapping so that below can be
hit upon compilation:
- error: enumeration value ‘XXX’ not handled in switch [-Werror=switch]
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
Similar to the rationale for the 50ms -> 5ms adjustment before. When
there's enough cpu cycles, doing so would only help reduce cpu
utilization. When cpu is mostly drained, less host side unnecessary
polling is favored by the scheduler. Also in the latter case, it'd be
the non-primary ring, so it doesn't hurt to idle out faster.
Besides the theory, there's no regression in popular benchmarks, but
only power wins. Making the idle timeout too small will lead to overhead
built up. e.g. From the initial notify to ring being waken up, it's
about 200us. The notify op is more expensive than ring thread doing a
few more polls. However, we normally would save many more polls by idle
out earlier. From my local testing, reducing down to 500us won't incur
and real perf regressions either.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
The ring notification can be blocked on renderer main thread if a vq cmd
is waiting for a ring cmd (via a different non-idle ring). This change
optimizes to only try waking up the ring on the idle timeout period.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28287>
If entry goes until the last byte of the bo it was being marked as
not valid while it is valid.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28376>
The design of the perfetto memory event is a bit more vk specific, but
we can abuse it to get a breakdown of memory usage for various purposes.
The memory_type parameter is (ab)used to get buffer vs image memory
split out into it's own track/graph.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
init_perfcntr() can be called multiple times. We don't want to
regenerate the list of counters (and overwrite/leak various other
things), so just bail if we've already initialized.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28058>
We are not supposed to apply the vertex index offset to our varying or
non-VS attribute (AKA image) descriptors. While at it, explicitly set
offset_enable to true when emitting vertex attribute descriptors, to
clarify our intentions.
Fixes: c0d6539827 ("panvk: Drop support for Midgard")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28182>