Add a separate pass which uses the analyze_ubo_ranges machinery to
construct ranges of readonly globals accessed in the shader and push
them to constants in the preamble, using ldg.k if possible. This is
enough to handle inline uniforms in turnip but also provides a base for
OpenCL, although the pass would need further work for that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26934>
In previous patches, we have moved the Intel specific lowering code in
brw_nir_lower_texture file. We can go ahead and drop the Intel specific
texture source too.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27458>
geometry shaders don't specify the input topology, only the class of topology.
normalize when generating a passthrough gs.
asahi will be more picky about this in the future.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Antonino Maniscalco <antonino.maniscalco@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27457>
Follow the blob and optimize subgroup operation using brcst.active and
getlast when supported.
The transformation consists of two parts. First, a NIR transform
replaces subgroup operations with a sequence of new brcst_active_ir3
intrinsics followed by a new [type]_clusters_ir3 intrinsic (where type
can be reduce, inclusive_scan, or exclusive_scan).
The brcst_active_ir3 intrinsic is lowered directly to a brcst.active
instruction. The other intrinsics get lowered to a new macro
(OPC_SCAN_CLUSTERS_MACRO) which later gets emitted as a loop (using
getlast/getone) that iterates all clusters and produces the requested
scan result.
OPC_SCAN_CLUSTERS_MACRO has a number of optional arguments. First, since
the exclusive scan result is not a natural by-product of the loop but
has to be calculated explicitly, its destination is optional. This is
necessary since adding it unconditionally will produce unused
instructions that won't be DCE'd anymore at this point. Second, when
performing 32b MUL_U reductions (that expand to multiple instructions),
an extra scratch register is necessary.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6387
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26950>
As per this optimisations description:
"Takes assignments to variables that are dereferenced only
once and pastes the RHS expression into where the variables
dereferenced."
However the optimisation is run at compile time before multiple
shaders from the same stage could have been pasted together.
So this optimisation can incorrectly assume a global is only
referenced once since it cannot see the other pieces of the
shader stage until link time.
Here we skip the optimisation if the variable is a global. We
could change it to only run at link time however this
optimisation is only run at link time if we are being forced
to use GLSL IR to inline a function that glsl to nir cannot
handle and this will also be removed in a future patchset.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10482
Fixes: d75a36a9ee ("glsl: remove do_copy_propagation_elements() optimisation pass")
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27351>
LLVM_LIB_DIR is a variable used for runtime compilations.
When cross compiling, LLVM_LIB_DIR must be set to the
libclang path on the target. So, this path should not
be retrieved during compilation but at runtime.
dladdr uses an address to search for a loaded library.
If a library is found, it returns information about it.
The path to the libclang library can therefore be
retrieved using one of its functions. This is useful
because we don't know the name of the libclang library
(libclang.so.X or libclang-cpp.so.X)
v2 (Karol): use clang::CompilerInvocation::CreateFromArgs for dladdr
v3 (Karol): follow symlinks to fix errors on debian
Fixes: e22491c832 ("clc: fetch clang resource dir at runtime")
Signed-off-by: Antoine Coutant <antoine.coutant@smile.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by (v1): Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
As we want to start using `dladdr`, this is needed to prevent `dladdr`
returning information of the wrong file.
Fixes tag as it's required by the actual fix.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fixes: e22491c832 ("clc: fetch clang resource dir at runtime")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>
Setting opencl-external-clang-headers to enabled while using shared LLVM
was broken and this option was mostly used for windows to force static
inclusion of opencl base headers.
Simply relying on the shared-llvm option here is enough to get what we
want.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25568>