For used by different counter.
Vulkan:
1. VK_QUERY_PIPELINE_STATISTIC_GEOMETRY_SHADER_PRIMITIVES_BIT,
sum generated primitives of all 4 streams when GS.
2. VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT, count generated
primitives for all 4 streams when VS/TES/GS.
3. VK_QUERY_TYPE_TRANSFORM_FEEDBACK_STREAM_EXT, count generated
and streamout primitives for all 4 streams when VS/TES/GS.
OpenGL:
1. GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB, sum generated
primitives for all 4 streams when GS.
2. GL_PRIMITIVES_GENERATED, count generated primitives for all 4
streams when VS/TES/GS.
3. GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN, count streamout
primitives for all 4 streams when VS/TES/GS.
pipeline_stat_query_enabled_amd is for Vulkan 1 and OpenGL 1.
xfb_query_enabled_amd is for Vulkan 2/3 and OpenGL 2/3.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19015>
User can enable/disable multi VK_QUERY_TYPE_PRIMITIVES_GENERATED_EXT
queries with same or different index.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19015>
some allocations require a different memory heap even when using the
same memory bits, so allow iterating over heaps of the same memory type
to find the one that works
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19281>
it's possible for drivers to have multiple heaps with identical flags,
so this will enable passing the heap that should actually be used for
allocation
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19281>
this is supposed to create the non-optimized pipeline first and then
compile the optimized version in the background, not the other way around
facepalm.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19201>
this is supposed to return the address at the start of the buffer,
not the address at the start of the memory allocation
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19274>
Mis-matched usage is a large percentage of buffer cache misses when
searching for applicable buffers. Separating these into separate
managers puts them into separate heaps and eliminates a significant
amount of CPU time spent searching.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19283>
This was similarly done for a3xx & a4xx with 4bdd226ab6
("freedreno/ir3: Switch to NIR for a3xx/a4xx's vertex id lowering.")
already, yet a5xx was missed which I noticed as Minecraft window
titlebar text corruption on OnePlus 5T with Adreno 540.
Cc: mesa-stable
Signed-off-by: Jami Kettunen <jami.kettunen@protonmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19263>
Wire up support for the two intrisics to get default tess levels (which
adds new HS driver params) and add a helper to generate a passthrough
TCS for a given VS. The passthrough TCS is cached in the VS, indexed by
patch_vertices, as the generated TCS is a function of the VS outputs and
the patch_vertices count.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
If using a passthrough TCS shader, we can't rely on having a new TCS
stage bound, so we need to invalidate the TCS state when the # of
patch_vertices changes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
Switch over to using the common nir helper.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
Based on si_create_passthrough_tcs() as that seemed the most generic of
the various different backend driver implementations. Uses the
load_tess_level_outer_default and load_tess_level_inner_default
intrinsics to load the gl_TessLevelOuter and gl_TessLevelInner values,
so driver will somehow need to implement those to load the values set
by pipe_context::set_tess_state() or similar.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
The job fits into 15 minutes of runtime, so deparalelize.
Stress-tested.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19211>
Copy the information from the first predecessor then check whether
it matches other predecessors and modify the data accordingly.
Marked for backporting to stable to make it possible to also
backport fixes based on this.
Cc: mesa-stable
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
According to perf, this roughly halves the impact of the post-RA
optimizer in ACO's compile times.
Measurement was taken using a debug optimized build using
NIR_DEBUG=novalidate RADV_DEBUG=nocache and replaying the Fossil DB
from the Doom Eternal shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18103>
get_query_result expects a pointer to a union pipe_query_result,
which is larger than GLuint64EXT, causing the memset it does to
overwrite the stack.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19055>
This doesn't make sense, opsel preserves the not selected half of the register,
p_insert zeros it.
No Foz-DB changes.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 54292e99c7 ("aco: optimize 32-bit extracts and inserts using SDWA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19253>
It's a revert of fcd53bebe6 ("aco: Define NOMINMAX in Meson build file")
Because 852d91edcd ("windows: Always set NOMINMAX to remove min/max macros")
did the same thing
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19233>
The check_warn variable is true a bit too often; in realtity it's not
*either* of these conditions that makes things lines, it's the last one
in the pipeline. But we already have a state for this, so let's reuse
that instead of recomputing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19048>
There's a few things that depends on the primitive rasterization type,
like point-sprite lowering and polygon offset. The effective state is a
combination of several other states, and we currently kinda wing it a
bit sometimes.
This should improve the situation. In particular, we now go backwards
through the pipeline, checking one overriding state at the time.
The end result should be that we don't end up lowering point-coord
replacement when not rendering with points.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19048>