This calls nir_separate_merged_clip_cull_io in zink, which is better
than having to handle separate clip & cull arrays in all passes.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38452>
Only needed by zink. This clip/cull distance separation pass is needed
to remove nir_io_separate_clip_cull_distance_arrays, so that all shared
GLSL code only uses merged clip+cull distance outputs.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38452>
This aborts if a pass would make any progress. It can be used to assert that:
- our minimalist pass invocation loops in drivers are sufficient and don't
leave any unoptimized code in the shader
- our lowering is sufficient and other passes don't add instructions that
would cause lowering having to be repeated
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38406>
Based on existing softfloat64 support and Berkeley SoftFloat. This is
targeted at drivers that can't preserve denorms, so operations where
denorm support is irrelevant like conversions to/from integers aren't
handled.
Because the existing mechanism used by Gallium for softfloat64 doesn't
support includes, we unfortunately can't extract common code into a
header. This can be done later if we switch Gallium to using glslang and
spirv-to-nir.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37608>
We have to invert the condition to select the non-NaN source, instead of
just swapping a and b. The equivalent float32 function was failing
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp32.input_args.denorm_nmax_nan_preserve_frag
without this fix.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37608>
Either we need to save this pointer or toss it.
==146166==ERROR: AddressSanitizer: heap-use-after-free on address 0x7bfe77013920 at pc 0x7b9e6fd5b978 bp 0x7ffc30ef18e0 sp 0x7ffc30ef18d8
READ of size 4 at 0x7bfe77013920 thread T0
#0 0x7b9e6fd5b977 in get_header ../src/util/ralloc.c:83
#1 0x7b9e6fd5b977 in ralloc_parent ../src/util/ralloc.c:382
#2 0x7b9e6fd5b977 in reralloc_size ../src/util/ralloc.c:198
#3 0x7b9e6fd5b977 in reralloc_array_size ../src/util/ralloc.c:241
#4 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21
#5 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33
#6 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126
#7 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42
#8 0x7b9e6ff998e4 in nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200
#9 0x7b9e6f8aab4d in st_finalize_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:735
#10 0x7b9e6f0afb14 in st_create_common_variant ../src/mesa/state_tracker/st_program.c:858
#11 0x7b9e6f0be2d3 in st_get_common_variant ../src/mesa/state_tracker/st_program.c:973
#12 0x7b9e6f0bf9cf in st_precompile_shader_variant ../src/mesa/state_tracker/st_program.c:1478
#13 0x7b9e6f0bf9cf in st_finalize_program ../src/mesa/state_tracker/st_program.c:1596
#14 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633
#15 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#16 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#17 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#18 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
#19 0x7f9e7893dd65 in GOMP_parallel (/lib64/libgomp.so.1+0xdd65) (BuildId: 9cc501fdca53b5d4ab094f709486781c98573bc9)
#20 0x000000400d6a in main /home/alyssa/shader-db/run.c:689
#21 0x7f9e78011574 in __libc_start_call_main (/lib64/libc.so.6+0x3574) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
#22 0x7f9e78011627 in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x3627) (BuildId: 48c4b9b1efb1df15da8e787f489128bf31893317)
#23 0x000000401014 in _start (/home/alyssa/shader-db/run+0x401014) (BuildId: a83b8d830cc265be3f54ea3e7a21a0fb5156624b)
0x7bfe77013920 is located 0 bytes inside of 64-byte region [0x7bfe77013920,0x7bfe77013960)
freed by thread T0 here:
#0 0x7f9e782e5beb in free.part.0 (/usr/lib64/libasan.so.8+0xe5beb) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
#1 0x7b9e6fd5bc39 in unsafe_free ../src/util/ralloc.c:319
#2 0x7b9e6fd5bc39 in ralloc_free ../src/util/ralloc.c:264
#3 0x7b9e70063d81 in nir_sweep ../src/compiler/nir/nir_sweep.c:219
#4 0x7b9e6f0bf499 in st_finalize_program ../src/mesa/state_tracker/st_program.c:1585
#5 0x7b9e6f8b0127 in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:633
#6 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#7 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#8 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#9 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
previously allocated by thread T0 here:
#0 0x7f9e782e5e4b in realloc.part.0 (/usr/lib64/libasan.so.8+0xe5e4b) (BuildId: cab80046dbc1c97c6e14490acc37d079701f8d9a)
#1 0x7b9e6fd5a883 in resize ../src/util/ralloc.c:167
#2 0x7b9e705f83c2 in range_minimum_query_table_resize ../src/util/range_minimum_query.c:21
#3 0x7b9e7018af1d in realloc_info ../src/compiler/nir/nir_dominance_lca.c:33
#4 0x7b9e7018af1d in nir_calc_dominance_lca_impl ../src/compiler/nir/nir_dominance_lca.c:126
#5 0x7b9e6ff9815c in nir_metadata_require ../src/compiler/nir/nir_metadata.c:42
#6 0x7b9e6ff998e4 in nir_metadata_require_most ../src/compiler/nir/nir_metadata.c:200
#7 0x7b9e6f8b0ede in st_link_glsl_to_nir ../src/mesa/state_tracker/st_glsl_to_nir.cpp:550
#8 0x7b9e6f8b3611 in st_link_shader ../src/mesa/state_tracker/st_glsl_to_nir.cpp:816
#9 0x7b9e6f7bcf51 in link_program ../src/mesa/main/shaderapi.c:1412
#10 0x7b9e6f7bcf51 in link_program_error ../src/mesa/main/shaderapi.c:1474
#11 0x0000004020b0 in main._omp_fn.0 /home/alyssa/shader-db/run.c:872
Fixes: 17876a00af ("nir: Add a faster lowest common ancestor algorithm")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38412>
This is done by grep ALIGN( to align(
docs,*.xml,blake3 is excluded
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38365>
We add a bunch of new helpers to avoid the need to touch >parent_instr,
including the full set of:
* nir_def_is_*
* nir_def_as_*_or_null
* nir_def_as_* [assumes the right instr type]
* nir_src_is_*
* nir_src_as_*
* nir_scalar_is_*
* nir_scalar_as_*
Plus nir_def_instr() where there's no more suitable helper.
Also an existing helper is renamed to unify all the names, while we're
churning the tree:
* nir_src_as_alu_instr -> nir_src_as_alu
..and then we port the tree to use the helpers as much as possible, using
nir_def_instr() where that does not work.
Acked-by: Marek Olšák <maraeo@gmail.com>
---
To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm
taking this opportunity to clean up a lot of NIR patterns.
Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
gcc has a a false positive here, silenced with the pragmas, use separate commit
for easily revert latter once gcc fixed it.
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
Previously the type info for nested values was copied from the source
operand, rather than propagating the new type from the destination
operand.
Fixes: 4c363acf94 ("vtn: Allow for OpCopyLogical with different but compatible types")
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38248>
On Mali, we need not only clamp but also convert to float16 on Valhall+.
We could have a separate pass for this but it fits in nicely with the
rest of nir_lower_point_size() so we might as well put it there.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38379>
The backend has been fully ignoring all writemasks for a long time,
so it really doesn't make sense to have them on our custom intrinsics.
I'm not sure they even make sense for some of the block intrinsics.
Also, the store_ssbo -> store_ssbo_intel pass was not setting writemask
at all, leaving it at the default value of 0 (aka write nothing, if it
had been respected...)
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38343>
In some cases propagating through a bcsel may be harmful. If the bcsel
uses are unlikely to be eliminated in both branch of an if statement,
propagating through it may result in extra moves for phi instructions
and extended live ranges.
v2: Fix missing parameter in call. Noticed by Rhys. I fixed this on the
test machine, but I must have forgotten to propagate the change back to
my dev machine.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38321>
The size and stage parameters are left-overs from history. Originally,
the function acted on a list and so it needed an explicit stage and size
output. Now that it takes a NIR shader and a mode, we can just take the
stage from the shader and set num_(in|out)puts.
The one caller that actually used the explicit output parameter was
turnip. However, given that the helper sorts and re-numbers all the I/O
variables, it's not like changing num_(in|out)puts instead of writing it
to some other location is that big of a deal.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38297>
Some shaders, especially RTPSO shaders that have parts of the PSO
inlined, can become absolutely huge. Using a sparse bitset avoids
quadratic complexity in memory consumption for the liveness information.
This reduces peak memory usage in worst-case tests (hammering
compilation of many huge RTPSOs on 32 threads concurrently) by ~60%,
from 43GB to 18GB.
CPU time (seconds) differences for a workload with mostly small shaders:
Difference at 95.0% confidence
-5.27 +/- 1.08963
-0.88811% +/- 0.183626%
(Student's t, pooled s = 0.629735)
Peak resident set usage for the mostly-small workload:
Difference at 95.0% confidence
30809 +/- 13394.3
1.59276% +/- 0.69246%
(Student's t, pooled s = 7741.09)
CPU time for the heavy workload did not show any difference.
Co-authored-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37908>
This is more robust because it ensures that we only ever check the
location on something that we know is an outupt. Also, if it's an
output then we know (thanks, validation!) that it's a variable.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38265>
While we're here, make the variable handling a little more robust by
checking the deref mode before assuming there's a reachable variable.
Reviewed-by: Olivia Lee <olivia.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38265>
The standard way to query options in mesa is `os_get_option()` which
abstracts platform-specific mechanisms to get config variables.
However in quite a few places `getenv()` is still used and this may
preclude controlling some options on some systems.
For instance it is not generally possible to use `MESA_DEBUG` on
Android.
So replace most `getenv()` occurrences with `os_get_option()` to
support configuration options more consistently across different
platforms.
Do the same with `secure_getenv()` replacing it with
`os_get_option_secure()`.
The bulk of the proposed changes are mechanically performed by the
following script:
-----------------------------------------------------------------------
#!/bin/sh
set -e
replace() {
# Don't replace in some files, for example where `os_get_option` is defined,
# or in external files
EXCLUDE_FILES_PATTERN='(src/util/os_misc.c|src/util/u_debug.h|src/gtest/include/gtest/internal/gtest-port.h)'
# Don't replace some "system" variables
EXCLUDE_VARS_PATTERN='("XDG|"DISPLAY|"HOME|"TMPDIR|"POSIXLY_CORRECT)'
git grep "[=!( ]$1(" -- src/ | cut -d ':' -f 1 | sort | uniq | \
grep -v -E "$EXCLUDE_FILES_PATTERN" | \
while read -r file;
do
# Don't replace usages of XDG_* variables or HOME
sed -E -e "/$EXCLUDE_VARS_PATTERN/!s/([=!\( ])$1\(/\1$2\(/g" -i "$file";
done
}
# Add const to os_get_option results, to avoid warning about discarded qualifier:
# warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
# but also errors in some cases:
# error: invalid conversion from ‘const char*’ to ‘char*’ [-fpermissive]
add_const_results() {
git grep -l -P '(?<!const )char.*os_get_option' | \
while read -r file;
do
sed -e '/^\s*const/! s/\(char.*os_get_option\)/const \1/g' -i "$file"
done
}
replace 'secure_getenv' 'os_get_option_secure'
replace 'getenv' 'os_get_option'
add_const_results
-----------------------------------------------------------------------
After this, the `#include "util/os_misc.h"` is also added in files where
`os_get_option()` was not used before.
And since the replacements from the script above generated some new
`-Wdiscarded-qualifiers` warnings, those have been addressed as well,
generally by declaring `os_get_option()` results as `const char *` and
adjusting some function declarations.
Finally some replacements caused new errors like:
-----------------------------------------------------------------------
../src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:127:31: error: no matching function for call to 'strtok'
127 | for (n = 0, option = strtok(env_llc_options, " "); option; n++, option = strtok(NULL, " ")) {
| ^~~~~~
/android-ndk-r27c/toolchains/llvm/prebuilt/linux-x86_64/bin/../sysroot/usr/include/string.h:124:17: note: candidate function not viable: 1st argument ('const char *') would lose const qualifier
124 | char* _Nullable strtok(char* _Nullable __s, const char* _Nonnull __delimiter);
| ^ ~~~~~~~~~~~~~~~~~~~
-----------------------------------------------------------------------
Those have been addressed too, copying the const string returned by
`os_get_option()` so that it could be modified.
In particular, the error above has been fixed by copying the `const
char *env_llc_options` variable in
`src/gallium/auxiliary/gallivm/lp_bld_misc.cpp` to a `char *` which can
be tokenized using `strtok()`.
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38128>
nir_parallel_copy_instr can be emulated using an intrinsic for each
entry and an array of arrays that is used by the pass to remember which
copies belong together.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36483>
This reduces indirect indexing of 1-element arrays to indexing with 0.
Without this, we fail an assertion later.
Discovered when writing a test.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38113>