The Vulkan spec got clarified recently and it's invalid (hw can support
it though). Fixes new CTS dEQP-VK.api.buffer.invalid_buffer_features.*.
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13701>
It got introduced by "radv: optimize subpass barrier flushes for
imageless framebuffers" but never used in the final version.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13702>
Assumes cmd_buffer != NULL for this path to eliminate the checks in the generic code.
Benchmarks:
I made a microbenchmark based on Bas' vulkan microbench suite.
https://gitlab.freedesktop.org/bnieuwenhuizen/vulkan_microbench/-/merge_requests/1
In this benchmark, this improves vkDescriptorTemplateUpdate performance consistently by 36%.
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate 81.2 ns 81.2 ns 8573169
->
-------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------
DescriptorTemplateUpdate 52.9 ns 52.9 ns 13306065
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13342>
The driver should always know the attachments at this point. This
should reduce the number of L2 cache flushes for imageless framebuffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13291>
One CTS test is still failing and the fix isn't upstream yet.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13623>
It can be considered stable these days. This also now reports the
buffer size and if instruction timing is enabled/disabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13477>
It seems stable enough to turn it on by default. This replaces
RADV_THREAD_TRACE_PIPELINE by RADV_THREAD_TRACE_INSTRUCTION_TIMING
which is enabled by default.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13477>
This seems simpler and probably faster.
This also fixes a warning for these CTS tests:
dEQP-VK.pipeline.creation_feedback.graphics_tests.vertex_stage_geometry_stage_delayed_destroy_fragment_stage_delayed_destroy
dEQP-VK.pipeline.creation_feedback.graphics_tests.vertex_stage_geometry_stage_fragment_stage
because we no longer set found_in_application_cache=false for pipelines
with NGG GS.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13528>
This is allowed and the fragment shader should read zero.
No fossils db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13222>
If the application first uses nontrivial divisors, the driver emits
the vertex shader VA to the upload BO rather than directly via the
user SGPRs locations. But, if the vertex input dynamic state changes,
the driver might select a different VS prolog that no longer needs
nontrivial divisors.
In this case, the driver needs to re-emit the prolog inputs because
otherwise the VS prolog will jump to the PC that is emitted via the
user SGPR locations, and the previous one was somewhere in the
upload BO...
This fixes a GPU hang with Bioshock and Zink.
Fixes: d9c7a17542 ("radv: enable VK_EXT_vertex_input_dynamic_state")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13377>
Fix defect reported by Coverity Scan.
Resource leak (RESOURCE_LEAK)
leaked_storage: Variable prolog going out of scope leaks the storage it points to
Fixes: 80841196b2 ("radv: implement dynamic vertex input state using vertex shader prologs")
Suggested-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13402>
It might still be read later from the streamout buffer.
Fixes a regression with
ext_transform_feedback-builtin-varyings gl_PointSize and Zink.
Fixes: 92e1981a80 ("radv: Remove PSIZ output when it isn't needed.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13413>
The descriptor set was always 0 because it wasn't gathered by the
shader info pass.
This fixes CPU crashes with
arb_shader_texture_image_samples-builtin-image and Zink.
Cc: 21.3 mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13411>
This dereferences a NULL pointer and crash many tests with Zink.
Fixes: 92e1981a80 ("radv: Remove PSIZ output when it isn't needed.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13378>
unnecessarily dereferencing the vertex buffer info array here causes a
ton of cpu overhead due to bad cache locality, so just use a mask to
avoid loading X more cachelines into memory unnecessarily
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13320>
when the shader pipeline is known to not require any of the more complex
calculations, those calculations can be excluded from the dynamic update
code
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13320>
Only try to invalidate L2 if we actually hit one of the incoherent images.
Note we may actually insert some extra flushes at the end of a command
buffer so that we may asume the caches are clean the start of the next
command buffer. However, on average I think that case is uncommon
enough that being able to make assumptions at the start of a cmdbuffer
is beneficial. Especially since MSAA is somewhat rare in more recent
games.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13239>