When we set the visibility stream with CP_SET_PSEUDO_REG, it does two
things (or only one of the two, with concurrent binning):
- Set the "pseudo register" used by CP_SET_BIN_DATA5_OFFSET, which in
turn is used when decoding the vis. streams.
- Set the VSC register used by the binning pass.
Preemption with skipsaverestore obliterates the second, but not the
first. This means that before running the binning pass, we have to
re-emit these registers. I *think* this is what the blob does on a7xx.
On a6xx, where the pseudo register doesn't exist, the blob seems to
re-emit the preamble every time we re-allocate the visibility streams,
but we don't support a6xx yet so we can defer making that decision.
Fixes supertuxkart under zink with preemption enabled in the kernel.
Fixes: 1d2b479a3b ("tu: Allow being preempted on a7xx")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31937>
debian/x86_64_build-base was missing the package that provides
LLVMConfig.cmake, breaking cmake builds that need LLVM, like the
SPIRV-LLVM-Translator build.
The cross-builds were missing arch-test to allow dpkg to figure out
which arch it's running on.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31940>
We never took the time to actually test this, but it works fine.
Improves performance on Navi 10 in the following test cases:
Baldur's Gate 3 Vulkan: up to 10%
Witcher 3 D3D11: around 4%
Granite primitive stress test: 107%
FSR2 sample app: 57%
Notes:
NGG is still disabled on Navi 14.
Not tested on Navi 12.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>
Helps performance in Baldur's Gate 3 on Navi 10
when NGG culling is enabled.
Also fix the description of the RADV_PERFTEST=nggc env var.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31971>
This has the best fossil-db results across in a sweep from 0..15.
fossil-db results on Alderlake:
Instructions in all programs: 152849904 -> 152824116 (-0.0%)
SENDs in all programs: 7677830 -> 7677830 (+0.0%)
Loops in all programs: 48470 -> 48470 (+0.0%)
Cycles in all programs: 11988670382 -> 11987530942 (-0.0%)
Spills in all programs: 42863 -> 41777 (-2.5%)
Fills in all programs: 77114 -> 73044 (-5.3%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31990>
\[WHY\]
For color fill only case we see performance on Vpe1.1 are not doubled due
to CD are all 0, no odd CD
\[HOW\]
1. Dummy stream dst rect should be in the middle of target rect so the
two (dummy seg + bg only seg) are balanced, instead of target at upper
left corner which makes it imbalance
2. BG gap generation should consider more for collaboration mode
num_multiple
3. When pure bg case, skip dummy stream handling and go ahead do BG gap
generation
4. Update memory requirement for the new pure BG case flow to avoid run out of embedded buffer
4. Additional -- fix the random Collaborate data generation bug (benign)
\[TESTING\]
Vpelibtest app + nv12torgb case with debug flag bgcolorfill set on in
vpelibtestapp
Media player with/without bgcolorfillonly flag
Teams
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Navid Assadian <Navid.Assadian@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Signed-off-by: Tomson Chang <tomson.chang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>
\[WHY\]
1.Remove TODO comments that don't need action item
2.Delete the legacy command number check as it is now using a vector (i.e. without hard limit)
\[HOW\]
Remove TODO comments and delete the legacy command number check
Signed off by <tvisan@amd.com>
Reviewed-by: Roy Chan <Roy.Chan@amd.com>
Reviewed-by: Jesse Agate <Jesse.Agate@amd.com>
Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31918>
A while back Matt enabled shader_spilling_rate by default for anv.
But intel_clc doesn't use the driconf mechanism that we use there.
The GRL shaders spill a lot, and with us now compiling additional
generations of the shaders, Mesa build time is getting prohibitively
expensive. By setting this, we drop the time taken for a clean debug
build by approximately 35% on my current laptop.
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31993>
crrev.com/c/5980832 is updating kumquat api .. it's not
distro-packaged and used as a testing tool. For the
mechanism it is used (GfxstreamEnd2EndTests build target
in AOSP), this API change will not be disruptive.
Reviewed-by: Aaron Ruby <aruby@blackberry.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31929>
Allow KHR version of the line_rasterization extension to be
supported from the guest side.
Test: dEQP-VK.dynamic_state.monolithic.line_width.*
Reviewed-by: Aaron Ruby <aruby@blackberry.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31929>
Codegen does not automatically generate code for promoted
extensions, so we need to explicitly define support for
VK_EXT_line_rasterization to generate necessary code.
Reviewed-by: Aaron Ruby <aruby@blackberry.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31929>
The descriptor set optimization doesn't seem to help performance
during benchmarks and it seems to cause erratic behaviour during normal
operation. This disables it from being on by default.
Can be enabled by -feature VulkanBatchedDescriptorSetUpdate
Reviewed-by: Aaron Ruby <aruby@blackberry.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31929>
Now actually making use of new Xe KMD OA syncronization uAPI.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
Xe KMD added a uAPI to syncronze metrics id changes, so we can make
it wait for all previous workloads in exec_queue and all previous
metrics id changes to finish before start change it again.
This should make Vulkan queries more robust.
So this makes use of intel_bind_timeline to syncronize the metrics id
changes and xe_queue_get_syncobj_for_idle() to syncronize with
exec_queue.
As i915 and some versions of Xe KMD will not support it, this feature
will only be used then intel_bind_timeline parameter is not NULL and
timeline has a valid syncobj id.
At this patch level all callers will set it to NULL, next patch will
add and initialize timeline in ANV when supported by Xe KMD.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
Sync with:
commit 086ed1d51544bfc1123b93eccc2ae88e0fbf3d51
Merge: fb6c5b1fdc 53f4b30b05
Author: Dave Airlie <airlied@redhat.com>
Merge tag 'exynos-drm-next-for-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31283>
Move the session init params into common code, instead of having the
same code copied for each VCN version.
Change the padding logic to enforce minimum padding required for input
surface (in case the input surface is smaller than aligned coded size)
and also avoid setting the padding above supported maximum.
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31975>