we already have a perfectly good spiller and SSA... use it when it helps. yes,
this costs a bit of CPU time, but it's guarded behind enough checks that the
average time should be fine.
this was prompted by a shadertoy where we were losing waves due to way too
many constants pooled at the start of a chunky shader.
in GL shader-db, only affected shaders are in blender:
instrs HURT: shaders/blender/1020.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/981.shader_test FS: 3125 -> 3178 (1.70%)
instrs HURT: shaders/blender/729.shader_test FS: 3086 -> 3154 (2.20%)
instrs HURT: shaders/blender/1023.shader_test FS: 3085 -> 3153 (2.20%)
instrs HURT: shaders/blender/424.shader_test FS: 3085 -> 3153 (2.20%)
threads helped: shaders/blender/1020.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/1023.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/424.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/729.shader_test FS: 576 -> 640 (11.11%)
threads helped: shaders/blender/981.shader_test FS: 576 -> 640 (11.11%)
in VK fossils, there's a lot more high pressure shaders that benefit:
Totals from 113 (0.21% of 54019) affected shaders:
MaxWaves: 64448 -> 73088 (+13.41%)
Instrs: 388529 -> 391646 (+0.80%); split: -0.00%, +0.80%
CodeSize: 2750064 -> 2769106 (+0.69%); split: -0.00%, +0.69%
ALU: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
FSCIB: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
GPRs: 21297 -> 19289 (-9.43%)
Preamble instrs: 75703 -> 75911 (+0.27%)
notable improvement in Far Cry 5.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
this is asking for trouble, since divergence analysis doesn't handle stuff we
lower quickly. this fixes geometry shaders blowing up since the cited commit,
but since I was the one who r-b'd that change, I don't have anyone to blame but
myself C:
Fixes: d61edf079b ("nir: add nir_move_only_convergent/divergent")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the
replacement. Make that the default thing to be more ergonomic and to drop
parent_instr uses.
We leave nir_def_rewrite_uses_after_instr defined if you really want the old
signature with an arbitrary after point.
Via Coccinelle patch:
@@
expression a, b;
@@
-nir_def_rewrite_uses_after(a, b, b->parent_instr)
+nir_def_rewrite_uses_after_def(a, b)
Followed by a bunch of sed.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>
See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in
And this causes build errors when building for C23:
-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
123 | #define unreachable(str) \
| ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
456 | #define unreachable() (__builtin_unreachable ())
| ^~~~~~~~~~~
-----------------------------------------------------------------------
So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.
Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.
This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.
All the instances of the macro, including the definition, were updated
with the following command line:
git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
while read file; \
do \
sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
done && \
sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
Set to true everywhere except:
- spirv_to_nir used by Vulkan
- bindless handles in GLSL
- some internal shaders and driver-specific code
Acked-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36099>
prologs inflate register pressure, so this can help a lot in the monolithic case
(together with dynamic strides). eliminates spilling from some vertex shaders in
Control that read a ton of attributes per vertex.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36271>
for intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. texture instrs have even more craziness
involved, but we can do the same trick. this commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making it useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.
this series only implements a subset of possible texture operations for now, but
more generalizing could be added as people have need.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
The GL driver expects special sync handling when a buffer is newly
exported, and also requires that bo->prime_fd be set so the batch code
can use it later. Add a function to do this for the KMS export case,
which otherwise would not need a PRIME fd.
agx_bo_export() then becomes a simple dup of bo->prime_fd (which is
probably marginally faster than redoing drmPrimeHandleToFD() anyway).
The thread safety story here is that as long as we do all this the first
time a BO is exported (in any way), there is no way for another thread
to have gotten ahold of the BO already, so no need for extra locking.
This does not affect hk, since it doesn't rely on bo->prime_fd for
anything. It also doesn't affect the timestamp BO and other special
cases.
Fixes: 067d820c9d ("asahi: Mark KMS exported resource BOs as shared")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13563
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36241>
I'm guessing the hardware needs to prefetch the whole sampler heap, so if we're
not gonna use it, let's omit it. I don't know if this helps, but it can't hurt.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>