The previous list of passes contained several errors: "constprop" does not
exist anymore and a trailing ',' is not allowed. This made LLVMRunPasses
fail, but the error is silently ignored. Thus none of the listed passes
ran at all.
https://reviews.llvm.org/D85159 suggests "InstSimplify really should be
used anywhere ConstProp is being used" so replace constprop with
instsimplify and remove the trailing comma.
By enabling pass logging with
LLVMPassBuilderOptionsSetDebugLogging(opts, true),
the difference is visible. Before:
Running pass: AlwaysInlinerPass on [module]
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running analysis: ProfileSummaryAnalysis on [module]
Running pass: CoroConditionalWrapper on [module]
Running pass: AnnotationRemarksPass on fs_variant_partial (1162 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: AnnotationRemarksPass on fs_variant_whole (1110 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
After:
Running pass: AlwaysInlinerPass on [module]
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running analysis: ProfileSummaryAnalysis on [module]
Running pass: CoroConditionalWrapper on [module]
Running pass: AnnotationRemarksPass on fs_variant_partial (1162 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: AnnotationRemarksPass on fs_variant_whole (1110 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
Running analysis: InnerAnalysisManagerProxy<llvm::FunctionAnalysisManager, llvm::Module> on [module]
Running pass: SROAPass on fs_variant_partial (1162 instructions)
Running analysis: DominatorTreeAnalysis on fs_variant_partial
Running analysis: AssumptionAnalysis on fs_variant_partial
Running analysis: TargetIRAnalysis on fs_variant_partial
Running pass: EarlyCSEPass on fs_variant_partial (1111 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_partial
Running pass: SimplifyCFGPass on fs_variant_partial (961 instructions)
Running pass: ReassociatePass on fs_variant_partial (961 instructions)
Running pass: PromotePass on fs_variant_partial (897 instructions)
Running pass: InstCombinePass on fs_variant_partial (897 instructions)
Running analysis: OptimizationRemarkEmitterAnalysis on fs_variant_partial
Running analysis: AAManager on fs_variant_partial
Running analysis: BasicAA on fs_variant_partial
Running analysis: ScopedNoAliasAA on fs_variant_partial
Running analysis: TypeBasedAA on fs_variant_partial
Running analysis: OuterAnalysisManagerProxy<llvm::ModuleAnalysisManager, llvm::Function> on fs_variant_partial
Running pass: SROAPass on fs_variant_whole (1110 instructions)
Running analysis: DominatorTreeAnalysis on fs_variant_whole
Running analysis: AssumptionAnalysis on fs_variant_whole
Running analysis: TargetIRAnalysis on fs_variant_whole
Running pass: EarlyCSEPass on fs_variant_whole (1059 instructions)
Running analysis: TargetLibraryAnalysis on fs_variant_whole
Running pass: SimplifyCFGPass on fs_variant_whole (912 instructions)
Running pass: ReassociatePass on fs_variant_whole (912 instructions)
Running pass: PromotePass on fs_variant_whole (844 instructions)
Running pass: InstCombinePass on fs_variant_whole (844 instructions)
Running analysis: OptimizationRemarkEmitterAnalysis on fs_variant_whole
Running analysis: AAManager on fs_variant_whole
Running analysis: BasicAA on fs_variant_whole
Running analysis: ScopedNoAliasAA on fs_variant_whole
Running analysis: TypeBasedAA on fs_variant_whole
Running analysis: OuterAnalysisManagerProxy<llvm::ModuleAnalysisManager, llvm::Function> on fs_variant_whole
Fixes: 2037c34f24 ("gallivm: move to new pass manager to handle coroutines change.")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19217>
For simple clients using the swap chain contention back pressure to regulate
their drawing and that don't query buffer age introduce a new DRI option with
which set to true (the default is false) we block the client until a new buffer
is available. This way we stall the client's execution until a new buffer is
available and the redrawing of the client starts only at this point and not
before.
The motivation for that is to reduce latency for clients that regulate their
drawing by swapchain contention back pressure. These clients draw whenever
possible and their drawing is implicitly stopping whenever we block. When we
block at the end of the swap and return only when a new buffer is available
the client can draw and we directly present. Otherwise the client would draw,
we block on the buffer becoming available, and only then show what the client
had drawn, usually one frame later.
Co-authored-by: Michel Dänzer <michel@daenzer.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14684>
This is known to produce bogus results for certain combinations of
operands, so don't use it. See this issue for details:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/6555
None of the shaders in shaderdb are affected by this change, as all of
the shaders containing an integer division require a higher GLSL version
than vc4 supports and are therefore all skipped.
This is port of the v3d change 73e8fc3efb
to vc4; we expect the impact to be the same as for v3d, based on testing
a custom shader.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18019>
We can produce slightly better code for these in the backend, so
do that.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18019>
some allocations require a different memory heap even when using the
same memory bits, so allow iterating over heaps of the same memory type
to find the one that works
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19281>
it's possible for drivers to have multiple heaps with identical flags,
so this will enable passing the heap that should actually be used for
allocation
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19281>
this is supposed to create the non-optimized pipeline first and then
compile the optimized version in the background, not the other way around
facepalm.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19201>
this is supposed to return the address at the start of the buffer,
not the address at the start of the memory allocation
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19274>
Mis-matched usage is a large percentage of buffer cache misses when
searching for applicable buffers. Separating these into separate
managers puts them into separate heaps and eliminates a significant
amount of CPU time spent searching.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19283>
If using a passthrough TCS shader, we can't rely on having a new TCS
stage bound, so we need to invalidate the TCS state when the # of
patch_vertices changes.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
Switch over to using the common nir helper.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19259>
The check_warn variable is true a bit too often; in realtity it's not
*either* of these conditions that makes things lines, it's the last one
in the pipeline. But we already have a state for this, so let's reuse
that instead of recomputing.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19048>
There's a few things that depends on the primitive rasterization type,
like point-sprite lowering and polygon offset. The effective state is a
combination of several other states, and we currently kinda wing it a
bit sometimes.
This should improve the situation. In particular, we now go backwards
through the pipeline, checking one overriding state at the time.
The end result should be that we don't end up lowering point-coord
replacement when not rendering with points.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19048>
This literal is consistence with st_api::name comes from global variable
st_gl_api that will be removed in following commits
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19197>
It's a global variable and have no need destroy
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19197>
As it's only accessed in st_manager.c, there is no need expose it
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19197>
Even if there is a slight difference of meaning between FIXME and
TODO, at some point we agreed to use just FIXME for all pending things
to do, just to make it easier to grepping for things that can be done.
And after all, one could argue that is there is something pending TO
DO, is that needs FIXING.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19225>
Extend the compiler/driver ABI to attach preambles, and plumb them into
the USC hardware when needed. This is the easy part!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>
The big discovery is the "number of uniform registers" field. I learned
about this one accidentally when my preamble shaders weren't working
right, because we had inadvertently hardcoded "at most 32 registers" :-)
In the course of identifying that field, I found that the pipeline
address is used as a tagged pointer, with some unknown field in the
bottom bits and alignment demanded. The XML is updated to account for
this.
I later found that there's also a "number of general purpose registers
used by the preamble shader" field. I missed this one first, because the
encoding is slightly different from the usual "number of general purpose
registers in the main shader" field. The specification is slightly
coarser. I don't know why the hardware needs that
information anyway -- occupancy of the preamble shader should be
irrelevant -- but it's not a big deal.
Finally I found that the "more than 4 textures?" bit is... not that. I
do not yet know what it is, but it is... not that.
These all use the new groups() modifier for GenXML
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>
This gets shader-db's runner working, in conjunction with a shader-db ./run
modified to set ASAHI_MESA_DEBUG=precompile. This flag triggers precompiles of
all shaders witha default key so we can exercise the compiler.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18813>
According meson about with_glx == 'xlib' check only defied '-DUSE_XSHM',
so the macro check GLX_INDIRECT_RENDERING make no-sense, removed it.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19221>