Documentation is worded in a confusing way, which may be understood that
we don't have to set this field to get good results.
MESH part of this commit improves performance of vk_meshlet_cadscene
by a factor of 2 on A380.
Fixes: ef04caea9b ("anv: Implement Mesh Shading pipeline")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19412>
(cherry picked from commit d1d2dee970)
From empirical tests (on a660) R8G8 with UBWC enabled requires 256b
alignment, otherwise there would be a GPU fault during blits.
Set alignment to 4096 for all UBWC images since that's what blob does
and this area is heavily undertested.
Fixes GPU fault in Borderlands 3 running through DXVK.
cc: mesa-stable
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19298>
(cherry picked from commit 5eaca461a7)
In 7f6ecb8667 we added reference counting for descriptor set layouts,
however, we didn't realize that pools created without the flag
VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT don't free individual
descriptors and can only be reset or destroyed. Since we only drop
references when individual descriptor sets were destroyed, we would
leak set layouts referenced from descriptor sets allocated from these
pools.
Fix that by keeping a list of all allocated descriptor sets (no matter
whether VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT is present or
not) and then traversing the list dropping the references on pool resets
and destroys.
Fixes: 7f6ecb8667 ('v3dv: add reference counting for descriptor set layouts')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19337>
(cherry picked from commit 07c7d846e5)
The flush_resources recorded in the context need to stay alive until
the context is flushed, at which point additional resolve operations
are done to those resources. While the backing BO is alive due to being
referenced in the cmdstream, the resource might already be destroyed
at this point.
Keep a reference to the resource to make sure it is still available at
context flush time.
Fixes: 7b9d8d1936 ("etnaviv: flush used render buffers on context flush when neccessary")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19280>
(cherry picked from commit 16e0702ec7)
The previous list of passes contained several errors: "constprop" does not
exist anymore and a trailing ',' is not allowed. This made LLVMRunPasses
fail, but the error is silently ignored. Thus none of the listed passes
ran at all.
https://reviews.llvm.org/D85159 suggests "InstSimplify really should be
used anywhere ConstProp is being used" so replace constprop with
instsimplify and remove the trailing comma.
By enabling pass logging with
LLVMPassBuilderOptionsSetDebugLogging(opts, true),
the difference is visible. Before:
Running pass: AlwaysInlinerPass on [module]
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running analysis: ProfileSummaryAnalysis on [module]
Running pass: CoroConditionalWrapper on [module]
Running pass: AnnotationRemarksPass on fs_variant_partial (1162 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: AnnotationRemarksPass on fs_variant_whole (1110 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
After:
Running pass: AlwaysInlinerPass on [module]
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running analysis: ProfileSummaryAnalysis on [module]
Running pass: CoroConditionalWrapper on [module]
Running pass: AnnotationRemarksPass on fs_variant_partial (1162 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: AnnotationRemarksPass on fs_variant_whole (1110 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running pass: SROAPass on fs_variant_partial (1162 instructions)
Running analysis: DominatorTreeAnalysis on fs_variant_partial
Running analysis: AssumptionAnalysis on fs_variant_partial
Running analysis: TargetIRAnalysis on fs_variant_partial
Running pass: EarlyCSEPass on fs_variant_partial (1111 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: SimplifyCFGPass on fs_variant_partial (961 instructions)
Running pass: ReassociatePass on fs_variant_partial (961 instructions)
Running pass: PromotePass on fs_variant_partial (897 instructions)
Running pass: InstCombinePass on fs_variant_partial (897 instructions)
Running analysis: OptimizationRemarkEmitterAnalysis on fs_variant_partial
Running analysis: AAManager on fs_variant_partial
Running analysis: BasicAA on fs_variant_partial
Running analysis: ScopedNoAliasAA on fs_variant_partial
Running analysis: TypeBasedAA on fs_variant_partial
Running analysis: OuterAnalysisManagerProxy<llvm::ModuleAnalysisManager, llvm::Function> on fs_variant_partial
Running pass: SROAPass on fs_variant_whole (1110 instructions)
Running analysis: DominatorTreeAnalysis on fs_variant_whole
Running analysis: AssumptionAnalysis on fs_variant_whole
Running analysis: TargetIRAnalysis on fs_variant_whole
Running pass: EarlyCSEPass on fs_variant_whole (1059 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
Running pass: SimplifyCFGPass on fs_variant_whole (912 instructions)
Running pass: ReassociatePass on fs_variant_whole (912 instructions)
Running pass: PromotePass on fs_variant_whole (844 instructions)
Running pass: InstCombinePass on fs_variant_whole (844 instructions)
Running analysis: OptimizationRemarkEmitterAnalysis on fs_variant_whole
Running analysis: AAManager on fs_variant_whole
Running analysis: BasicAA on fs_variant_whole
Running analysis: ScopedNoAliasAA on fs_variant_whole
Running analysis: TypeBasedAA on fs_variant_whole
Running analysis: OuterAnalysisManagerProxy<llvm::ModuleAnalysisManager, llvm::Function> on fs_variant_whole
Fixes: 2037c34f24 ("gallivm: move to new pass manager to handle coroutines change.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19217>
(cherry picked from commit 28f4fcaa4f)
This doesn't make sense, opsel preserves the not selected half of the register,
p_insert zeros it.
No Foz-DB changes.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 54292e99c7 ("aco: optimize 32-bit extracts and inserts using SDWA")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19253>
(cherry picked from commit 616d3908dc)
I misread the ISA doc and got the order wrong.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: dae1629778 ("aco: disable sdwa on gfx11")
Fixes: e68e6c75ca ("aco: use v_perm_b32 to copy 0xff00/0x00ff/0xff/0x00")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19223>
(cherry picked from commit f32dde2902)
For example if src0 is 0x80000000 we should return 1, not 0.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: a5747f8ab3 ("nir: add opcodes for *find_msb_rev and lowering")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18951>
(cherry picked from commit 741dbadae0)
LLVM 15 changed the coroutine presplit function attribute in
735e6c40b5e9 [Coroutines] Convert coroutine.presplit to enum attr
This needed to be updated in mesa.
Cc: mesa-stable
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18815>
(cherry picked from commit bcb136d548)
Conflicts:
src/gallium/auxiliary/gallivm/lp_bld_intr.h
Reports d3d12 support for mapping all the contiguous planes.
This will be used by vaDeriveImage in the VA frontend
Fixes: a585d95803 ("radeonsi/vcn: WA 10bit encoding crash in vaapi")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18300>
(cherry picked from commit a1f904f7a3)
vaDeriveImage should check if the underlying gallium driver can map contiguous planes
before skipping with disallowlist.
Fixes: a585d95803 ("radeonsi/vcn: WA 10bit encoding crash in vaapi")
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18300>
(cherry picked from commit 81ae033b94)
Conflicts:
src/gallium/include/pipe/p_video_enums.h
This failed to take fabs of the first component, implementing an unintended
formula that would return the right results in some common cases but is wrong in
general:
max { x, |y|, |z| }
instead of the intended
max { |x|, |y|, |z| }
Reexpress the implementation to make correctness obvious.
Fixes: 272e927d0e ("nir/spirv: initial handling of OpenCL.std extension opcodes")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18754>
(cherry picked from commit fc5c671e87)
The loop over sources has to happen for every instruction, regardless of whether
we also need to register allocate the destination. The other source loops handle
this properly, but this one was missed.
Fixes spilling failure in shaders/android/angle/aztec_ruins/16.shader_test when
the input NIR is shuffled a bit (from reordering passes).
Fixes: 129d390bd8 ("pan/mdg: Fix bound setting in RA for sources")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19093>
(cherry picked from commit 829f769e60)
The register file on Midgard is not large enough to sustain 256 threads in a
threadgroup when all ISA-defined registers are used. As such, we want
to advertise the smallest MAX_THREADS_PER_BLOCK permissible by the spec to
avoid compiling shaders that will necessarily spill. The minimum-maximum in
OpenGL ES 3.1 is 128, so set that on Midgard.
6 compute shaders LOST in shader-db due to exceeding this new limit. These
shaders would fault if they were attempted to be executed.
Cc: mesa-stable
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19092>
(cherry picked from commit 9b19104a30)
Turn -DWINDOWS_NO_FUTEX to be pre_args for not need add direct dependencies
to dep_futex for libraries and executables.
So only add dependencies to idep_mesautil is enough.
And this will make sure all source code are either using Windows futex,
or use mtx_t consistently across different sources, other than mixed usage of
futex and mtx_t before this commit.
If -DWINDOWS_NO_FUTEX is not globally available, that would cause
/src/util/simple_mtx.h:116: undefined reference to `futex_wait'
This error is raised when
* compiled with -D min-windows-version=7
* moved futex_wait from futex.h to futex.c
* used simple_mtx_t in more codes
Or linkage error:
src/compiler/libcompiler.a.p/glsl_types.cpp.obj: in function `futex_wake':
/../../src/util/futex.h:154: undefined reference to `WaitOnAddress'
When:
* compiled with -D min-windows-version=7
* used simple_mtx_t in more codes
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7494
Fixes: c002bbeb2f ("util: Add a Win32 futex impl")
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19087>
(cherry picked from commit 2b64ff9284)
When the Anv pipeline got migrated to the runtime, we gain/lost a bit
of functionality which is that the disk cache is always read
regardless of VK_ENABLE_PIPELINE_CACHE=0.
This change brings the old behavior back.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 591da98779 ("vulkan: Add a common VkPipelineCache implementation")
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19050>
(cherry picked from commit cea113c977)
Otherwise users of `idep_vulkan_wsi` won't pull in the udev dependency,
which will cause the linker to fail later on in compiling.
The user of this dependency is lavapipe which would fail to link if this
isn't provided.
Fixes: 4885e63a6d (vulkan/wsi: implement missing wsi_register_device_event)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19037>
(cherry picked from commit b516f59490)
this ensures there's no weird perf happening, avoids using renderpass
instead of dynamic rendering, and avoids hitting an assert from broken
framebuffer construction
cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19065>
(cherry picked from commit d3880a6324)