The standard way to query options in mesa is `os_get_option()` which
abstracts platform-specific mechanisms to get config variables.
However in quite a few places `getenv()` is still used and this may
preclude controlling some options on some systems.
For instance it is not generally possible to use `MESA_DEBUG` on
Android.
So replace most `getenv()` occurrences with `os_get_option()` to
support configuration options more consistently across different
platforms.
Do the same with `secure_getenv()` replacing it with
`os_get_option_secure()`.
The bulk of the proposed changes are mechanically performed by the
following script:
-----------------------------------------------------------------------
#!/bin/sh
set -e
replace() {
# Don't replace in some files, for example where `os_get_option` is defined,
# or in external files
EXCLUDE_FILES_PATTERN='(src/util/os_misc.c|src/util/u_debug.h|src/gtest/include/gtest/internal/gtest-port.h)'
# Don't replace some "system" variables
EXCLUDE_VARS_PATTERN='("XDG|"DISPLAY|"HOME|"TMPDIR|"POSIXLY_CORRECT)'
git grep "[=!( ]$1(" -- src/ | cut -d ':' -f 1 | sort | uniq | \
grep -v -E "$EXCLUDE_FILES_PATTERN" | \
while read -r file;
do
# Don't replace usages of XDG_* variables or HOME
sed -E -e "/$EXCLUDE_VARS_PATTERN/!s/([=!\( ])$1\(/\1$2\(/g" -i "$file";
done
}
# Add const to os_get_option results, to avoid warning about discarded qualifier:
# warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
# but also errors in some cases:
# error: invalid conversion from ‘const char*’ to ‘char*’ [-fpermissive]
add_const_results() {
git grep -l -P '(?<!const )char.*os_get_option' | \
while read -r file;
do
sed -e '/^\s*const/! s/\(char.*os_get_option\)/const \1/g' -i "$file"
done
}
replace 'secure_getenv' 'os_get_option_secure'
replace 'getenv' 'os_get_option'
add_const_results
-----------------------------------------------------------------------
After this, the `#include "util/os_misc.h"` is also added in files where
`os_get_option()` was not used before.
And since the replacements from the script above generated some new
`-Wdiscarded-qualifiers` warnings, those have been addressed as well,
generally by declaring `os_get_option()` results as `const char *` and
adjusting some function declarations.
Finally some replacements caused new errors like:
-----------------------------------------------------------------------
../src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:127:31: error: no matching function for call to 'strtok'
127 | for (n = 0, option = strtok(env_llc_options, " "); option; n++, option = strtok(NULL, " ")) {
| ^~~~~~
/android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/bin/../sysroot/usr/include/string.h:124:17: note: candidate function not viable: 1st argument ('const char *') would lose const qualifier
124 | char* _Nullable strtok(char* _Nullable __s, const char* _Nonnull __delimiter);
| ^ ~~~~~~~~~~~~~~~~~~~
-----------------------------------------------------------------------
Those have been addressed too, copying the const string returned by
`os_get_option()` so that it could be modified.
In particular, the error above has been fixed by copying the `const
char *env_llc_options` variable in
`src/gallium/auxiliary/gallivm/lp_bld_misc.cpp` to a `char *` which can
be tokenized using `strtok()`.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38128>
nir_parallel_copy_instr can be emulated using an intrinsic for each
entry and an array of arrays that is used by the pass to remember which
copies belong together.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36483>
This reduces indirect indexing of 1-element arrays to indexing with 0.
Without this, we fail an assertion later.
Discovered when writing a test.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38113>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38097>
Slight differences due to different optimization order.
Totals from 135 (0.17% of 79839) affected shaders: (Navi48)
Instrs: 287852 -> 287527 (-0.11%); split: -0.15%, +0.03%
CodeSize: 1522972 -> 1521764 (-0.08%); split: -0.12%, +0.04%
Latency: 1806803 -> 1825754 (+1.05%); split: -0.08%, +1.12%
InvThroughput: 242693 -> 244703 (+0.83%); split: -0.02%, +0.84%
VClause: 4092 -> 4084 (-0.20%)
SClause: 7462 -> 7478 (+0.21%)
Copies: 20509 -> 20401 (-0.53%); split: -0.74%, +0.21%
Branches: 6395 -> 6386 (-0.14%)
PreSGPRs: 7334 -> 7337 (+0.04%); split: -0.03%, +0.07%
PreVGPRs: 6375 -> 6382 (+0.11%)
VALU: 151787 -> 151595 (-0.13%); split: -0.15%, +0.02%
SALU: 52967 -> 52910 (-0.11%); split: -0.23%, +0.12%
VMEM: 6704 -> 6696 (-0.12%)
SMEM: 12099 -> 12129 (+0.25%)
Tested on a small collection of 2518 shaders from Dredge with callgrind using RADV:
baseline:
nir_opt_algebraic was called 12917 times from radv_optimize_nir()
nir_opt_cse was called 15204 times from radv_optimize_nir()
relative time spent in radv_optimize_nir(): 31.48%
total instruction fetch cost: 28,642,638,021
with nir/algebraic: ad-hoc constant-fold ALU instructions
nir_opt_algebraic was called 12797 times from radv_optimize_nir()
nir_opt_cse was called 12963 times from radv_optimize_nir()
relative time spent in radv_optimize_nir(): 30.63%
total instruction fetch cost: 28,284,386,123
=> ~1.27% improvement in total compile times
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
The arr::in_bounds field was set unconditionally for every deref created
for a chain. For struct derefs, which don't have this field, this would
write to an unused memory location, which is probably why this never
caused issues.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: f19cbe98e3 ("nir,spirv: Preserve inbounds access information")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38110>
Wrapping jump instructions that are located inside ifs can break SSA
invariants because the else block no longer dominates the merge block.
Repair the SSA to make the validator happy again.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37957>
Most of the time, we can infer the type to append in
util_dynarray_append using __typeof__, which is standardized in C23 and
support in Jesse's MSMSVCV. This patch drops the type argument most of
the time, making util_dynarray a little more ergonomic to use.
This is done in four steps.
First, rename util_dynarray_append -> util_dynarray_append_typed
bash -c "find . -type f -exec sed -i -e 's/util_dynarray_append(/util_dynarray_append_typed(/g' \{} \;"
Then, add a new append that infers the type. This is much more ergonomic
for what you want most of the time.
Next, use type-inferred append as much as possible, via Coccinelle
patch (plus manual fixup):
@@
expression dynarray, element;
type type;
@@
-util_dynarray_append_typed(dynarray, type, element);
+util_dynarray_append(dynarray, element);
Finally, hand fixup cases that Coccinelle missed or incorrectly
translated, of which there were several because we can't used the
untyped append with a literal (since the sizeof won't do what you want).
All four steps are squashed to produce a single patch changing every
util_dynarray_append call site in tree to either drop a type parameter
(if possible) or insert a _typed suffix (if we can't infer). As such,
the final patch is best reviewed by hand even though it was
tool-assisted.
No Long Linguine Meals were involved in the making of this patch.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38038>
In !35844, there was some discussion about allowing 64-bit bcsel that
would be lowered in the driver. One challenge there would be if a 64-bit
bcsel was transformed into integer min or max by an algebraic
optimization. I believe these were the only algebraic patterns that
could create new integer min or max that would not be immediately
constant folded.
There were no shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38033>
Introduces an opt pass that attempts to optimize
load_barycentric_at_{sample,offset} with simpler load_barycentric_*
equivalents where possible, and optionally lowers
load_barycentric_at_sample to load_barycentric_at_offset with a position
derived from the sample ID instead.
Signed-off-by: Simon Perretta <simon.perretta@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37658>
nir_lower_compute_system_values will attempt to lower
load_workgroup_size unless workgroup_size_variable is set. For precomp
shaders, the workgroup size is set statically for each entrypoint by
nir_precompiled_build_variant. Because we call
lower_compute_system_values early, it sets the workgroup size to zero.
Temporarily setting workgroup_size_variable while we are still
processing all the entrypoints together inhibits this.
Signed-off-by: Olivia Lee <olivia.lee@collabora.com>
Fixes: 20970bcd96 ("panfrost: Add base of OpenCL C infrastructure")
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37799>
Not sure about creating u64vec16 loads, but creating unaligned loads is
possible with opt_if_rewrite_uniform_uses.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37953>
Those are normally uniform always, but for the purpose of fused
threads handling, we need to check their sources.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ca1533cd03 ("nir/divergence: add a new mode to cover fused threads on Intel HW")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37929>
df1876f615 ("nir: Mark negative re-distribution on fadd as imprecise")
fixed the fadd case by marking it as imprecise. This commit fixes the
ffma case for the same reason.
However, "imprecise" isn't necessary and nowadays we have "nsz" which is
more accurate here. Use that for both fadd and ffma.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Fixes: 62795475e8 ("nir/algebraic: Distribute source modifiers into instructions")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37930>
Introduce new NIR intrinsics to handle getting a "sink" read-only
address and another intrinsic to handle conversion of address to
read-write (allowing implementation to replace the "sink" read-only with
another address like required for Asahi)
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37914>