Commit graph

2498 commits

Author SHA1 Message Date
Alyssa Rosenzweig
31ecf16428 asahi: inline UVS indices
this lets us optimize VS for linked shaders (across APIs). less indirection,
less ALU in the VS, less loads in the preamble (Vulkan) / USC uniform pushes
(OpenGL). not the most critical thing, this was already optimized to make
unlinked shaders fast, but it can't hurt ;)

also optimizing linked shaders is less objectionable from an ESO
perspective than optimizing static state.

GL:

   total instrs in shared programs: 2866067 -> 2778519 (-3.05%)
   instrs in affected programs: 1041399 -> 953851 (-8.41%)

   total threads in shared programs: 27802944 -> 27803648 (<.01%)
   threads in affected programs: 1984 -> 2688 (35.48%)

   total uniforms in shared programs: 2064008 -> 2036112 (-1.35%)
   uniforms in affected programs: 978997 -> 951101 (-2.85%)

Vulkan:

   Totals from 20408 (37.78% of 54019) affected shaders:
   MaxWaves: 20342464 -> 20342976 (+0.00%)
   Instrs: 7262316 -> 6958468 (-4.18%); split: -4.18%, +0.00%
   CodeSize: 53744780 -> 51480354 (-4.21%); split: -4.22%, +0.00%
   ALU: 5691626 -> 5385049 (-5.39%); split: -5.39%, +0.00%
   FSCIB: 5691626 -> 5385049 (-5.39%); split: -5.39%, +0.00%
   IC: 1210560 -> 1210512 (-0.00%)
   GPRs: 1231162 -> 1252219 (+1.71%); split: -0.58%, +2.29%
   Uniforms: 3854892 -> 3759804 (-2.47%); split: -2.47%, +0.00%
   Preamble instrs: 3390251 -> 3238677 (-4.47%); split: -4.47%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
2025-08-03 21:57:26 +00:00
Alyssa Rosenzweig
8b5c800d1f asahi: use NIR gathered interpolation
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
2025-08-03 21:57:26 +00:00
Alyssa Rosenzweig
b8f50b6317 nir: gather info in opt_varyings_bulk
the info is all messed up so we need to do this right after. merge this
code.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36501>
2025-08-03 21:57:25 +00:00
Alyssa Rosenzweig
3e8575c037 nir,agx: pull lower_printf_buffer into backend
no other users now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36516>
2025-08-03 21:27:50 +00:00
Emma Anholt
d5826506ce nir,agx: Move AGX's loop (generalized) to shared NIR code.
When I went to use opt_reassociate for tu, I was advised that you want to
do this loop to get the best results.  If everyone needs it, let's make it
common code and explain what's going on.

In the process, also make it skip work appropriately when there's no
progress.
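
As a rough illustration of the "loop until no progress" pattern described above, here is a minimal sketch. It assumes a toy model rather than real NIR: `toy_shader`, `toy_opt_reassociate`, `toy_opt_constant_fold`, and `toy_optimize` are hypothetical names, and the actual shared helper may be structured differently.

```c
#include <stdbool.h>

/* Toy stand-in for a shader. This is NOT a NIR type. */
struct toy_shader {
   int reassociable; /* expressions that reassociation can still improve */
   int foldable;     /* constants exposed by reassociation, not yet folded */
};

/* Hypothetical pass: each successful run exposes one foldable constant. */
static bool
toy_opt_reassociate(struct toy_shader *s)
{
   if (s->reassociable == 0)
      return false;

   s->reassociable--;
   s->foldable++;
   return true;
}

/* Hypothetical cleanup pass: folds everything exposed so far. */
static bool
toy_opt_constant_fold(struct toy_shader *s)
{
   if (s->foldable == 0)
      return false;

   s->foldable = 0;
   return true;
}

/* Run to a fixed point; returns how many times the loop body executed. */
static unsigned
toy_optimize(struct toy_shader *s)
{
   unsigned iterations = 0;
   bool progress;

   do {
      progress = false;

      if (toy_opt_reassociate(s)) {
         progress = true;
         /* Only pay for the cleanup when reassociation changed something. */
         toy_opt_constant_fold(s);
      }

      iterations++;
   } while (progress);

   return iterations;
}
```

The cleanup pass only runs when reassociation reported progress, which is one way to read "skip work appropriately".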

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36342>
2025-08-03 20:58:28 +00:00
Alyssa Rosenzweig
c550cfce88 hk: use new reset query kernel
this avoids pathologically bad performance for large #s of writes. fixes
extremely bad performance in RDR2.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13603
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
43e0a2d3a5 libagx: port reset query helper to libagx
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
d2cb6ea0e1 libagx: factor out query_report
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
7f8ed2628b asahi: use 16-bit coordinates for bg program
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
8a8fe2ffc1 agx: handle 16-bit coordinates
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
0319bd0a84 agx: set register cache hints
impl cribbed from the Valhall compiler. that seems only fair, I wrote the
code either way (-:

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
35e70bf30a agx: lower export even later
so we can do reg cache opt as late as possible without losing this information.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
d9c0971e50 agx: plumb is_alu query for reg cache opt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
a27e51f3c1 agx: fix cache bit packing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
e97005e688 agx: fix simd reduce forcing no cache bit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
c6111cc43c agx: fix export instructions in the IR
so we can see thru them properly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
17f2a3af7a agx: fix reg cache printing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
d15bfdf0a7 agx: track block divergence
conservative for now. we'll need this for correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
fc9f3363fa agx: add foreach_reg_{src,dest}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
700a88233b asahi: rename compressed 1 to just compressed
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
74ed2b78e8 asahi,hk: optimize no-op FS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
626fa80c1b asahi: optimize pass type with depth-only passes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
7f2a6cdd26 hk: only enable image view min LOD for dx12
I don't really want random Vulkan apps using this. fixes Steam shader
precaching via fossilize.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
a0a18c084e hk: kill psiz writes via topology, not feature
this regresses DXVK fast link shaders, I guess, but fixes Proton shader
precompiles. per discussion with Hans-Kristian

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
9c987ee75e asahi: use native colour masking
seems to work now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
562377f01d agx: try to rematerialize to improve occupancy
we already have a perfectly good spiller and SSA... use it when it helps. yes,
this costs a bit of CPU time, but it's guarded behind enough checks that the
average time should be fine.

this was prompted by a shadertoy where we were losing waves due to way too
many constants pooled at the start of a chunky shader.

in GL shader-db, only affected shaders are in blender:

   instrs HURT:   shaders/blender/1020.shader_test FS:              3125 -> 3178 (1.70%)
   instrs HURT:   shaders/blender/981.shader_test FS:               3125 -> 3178 (1.70%)
   instrs HURT:   shaders/blender/729.shader_test FS:               3086 -> 3154 (2.20%)
   instrs HURT:   shaders/blender/1023.shader_test FS:              3085 -> 3153 (2.20%)
   instrs HURT:   shaders/blender/424.shader_test FS:               3085 -> 3153 (2.20%)

   threads helped:   shaders/blender/1020.shader_test FS:              576 -> 640 (11.11%)
   threads helped:   shaders/blender/1023.shader_test FS:              576 -> 640 (11.11%)
   threads helped:   shaders/blender/424.shader_test FS:               576 -> 640 (11.11%)
   threads helped:   shaders/blender/729.shader_test FS:               576 -> 640 (11.11%)
   threads helped:   shaders/blender/981.shader_test FS:               576 -> 640 (11.11%)

in VK fossils, there's a lot more high pressure shaders that benefit:

   Totals from 113 (0.21% of 54019) affected shaders:
   MaxWaves: 64448 -> 73088 (+13.41%)
   Instrs: 388529 -> 391646 (+0.80%); split: -0.00%, +0.80%
   CodeSize: 2750064 -> 2769106 (+0.69%); split: -0.00%, +0.69%
   ALU: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
   FSCIB: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
   GPRs: 21297 -> 19289 (-9.43%)
   Preamble instrs: 75703 -> 75911 (+0.27%)

notable improvement in Far Cry 5.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
6544a4f1ae asahi: drop sink/move in GS code
this is asking for trouble, since divergence analysis doesn't handle stuff we
lower quickly. this fixes geometry shaders blowing up since the cited commit,
but since I was the one who r-b'd that change, I don't have anyone to blame but
myself C:

Fixes: d61edf079b ("nir: add nir_move_only_convergent/divergent")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
5761213587 asahi: clang-format
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
bcf1a1c20b treewide: use nir_def_block
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -definition->parent_instr->block
    +nir_def_block(definition)
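
For readers unfamiliar with the pattern, the rewrite swaps an open-coded pointer chase for an accessor. The sketch below uses toy stand-in structs (`toy_block`, `toy_instr`, `toy_def`) and a hypothetical `toy_def_block`; it only mirrors the shape of the left-hand side of the patch and is not the real NIR definition.

```c
/* Toy stand-ins for illustration only; these are NOT the NIR structs. */
struct toy_block {
   int index;
};

struct toy_instr {
   struct toy_block *block;
};

struct toy_def {
   struct toy_instr *parent_instr;
};

/* Hypothetical accessor in the spirit of nir_def_block(): callers no longer
 * spell out def->parent_instr->block themselves, so the layout behind the
 * accessor can change without touching every call site.
 */
static inline struct toy_block *
toy_def_block(const struct toy_def *def)
{
   return def->parent_instr->block;
}
```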

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
82ae8b1d33 treewide: simplify nir_def_rewrite_uses_after
Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the
replacement. Make that the default thing to be more ergonomic and to drop
parent_instr uses.

We leave nir_def_rewrite_uses_after_instr defined if you really want the old
signature with an arbitrary after point.

Via Coccinelle patch:

    @@
    expression a, b;
    @@

    -nir_def_rewrite_uses_after(a, b, b->parent_instr)
    +nir_def_rewrite_uses_after_def(a, b)

Followed by a bunch of sed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
cc6e3b84cb treewide: use nir_def_as_*
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -nir_instr_as_alu(definition->parent_instr)
    +nir_def_as_alu(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_intrinsic(definition->parent_instr)
    +nir_def_as_intrinsic(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_phi(definition->parent_instr)
    +nir_def_as_phi(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_load_const(definition->parent_instr)
    +nir_def_as_load_const(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_deref(definition->parent_instr)
    +nir_def_as_deref(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_tex(definition->parent_instr)
    +nir_def_as_tex(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Marek Olšák
5531f01326 nir: move list.h outside the glsl directory
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36425>
2025-07-31 20:23:02 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c
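
To make the signature difference concrete, here is a minimal sketch of a message-taking macro next to the builtin it wraps. This is an assumption for illustration, not the exact definition in src/util/macros.h; it relies on the GCC/Clang `__builtin_unreachable()` builtin, and `channel_name` is a hypothetical caller.

```c
#include <assert.h>

/* Hypothetical sketch. Unlike C23's argument-less unreachable(), this takes
 * a message string, asserts in debug builds, and then tells the compiler
 * that the path cannot be reached.
 */
#define UNREACHABLE(str)       \
   do {                        \
      assert(!"" str);         \
      __builtin_unreachable(); \
   } while (0)

enum channel { CHANNEL_R, CHANNEL_G, CHANNEL_B };

static const char *
channel_name(enum channel c)
{
   switch (c) {
   case CHANNEL_R:
      return "red";
   case CHANNEL_G:
      return "green";
   case CHANNEL_B:
      return "blue";
   }

   UNREACHABLE("invalid channel");
}
```

Because the name is upper-case, it no longer collides with the `unreachable()` that a C23 `<stddef.h>` may define.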

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Marek Olšák
8d3e76c250 nir: split nir_move_load_frag_coord from nir_move_load_input
It's a pure system value on AMD, not an input.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36357>
2025-07-29 16:20:48 -04:00
Marek Olšák
688a639117 nir: add nir_tex_instr::can_speculate
Set to true everywhere except:
- spirv_to_nir used by Vulkan
- bindless handles in GLSL
- some internal shaders and driver-specific code

Acked-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36099>
2025-07-24 18:41:38 +00:00
Alyssa Rosenzweig
8a1a410389 treewide: use SWAP macro
Via Coccinelle patch + manual clean up:

    @@
    identifier temporary, a, b;
    type T;
    @@

    -T temporary = a;
    -a = b;
    -b = temporary;
    +SWAP(a, b);
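
For context, a swap macro of this kind can be sketched as follows. This is a hypothetical definition using the GNU `__typeof__` extension, not necessarily the one in Mesa's util headers; `sort_pair` is an illustrative caller.

```c
/* Hypothetical sketch; relies on the GNU __typeof__ extension so the
 * temporary automatically gets the type of the first operand.
 */
#define SWAP(a, b)                  \
   do {                             \
      __typeof__(a) swap_tmp = (a); \
      (a) = (b);                    \
      (b) = swap_tmp;               \
   } while (0)

/* Example caller: order two values so that *lo <= *hi. */
static void
sort_pair(int *lo, int *hi)
{
   if (*lo > *hi)
      SWAP(*lo, *hi);
}
```

The three-statement form on the left-hand side of the patch declares the temporary's type by hand; the macro removes that repetition.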

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36297>
2025-07-23 19:49:47 +00:00
Alyssa Rosenzweig
1f8b22a208 hk: optimize varyings
Stats are all over the place for some apps.

 PERCENTAGE DELTAS      Shaders  MaxWaves    Instrs  CodeSize    Spills     Fills   Scratch       ALU     FSCIB        IC      GPRs  Uniforms  Preamble instrs
 detroit_become_human       995    +0.12%    -1.21%    -1.27%   +77.42%   +71.67%   +78.82%    -1.11%    -1.11%    -1.03%    -3.22%    -1.40%           -2.07%
 god_of_war                1029    -0.04%    -5.05%    -4.73%         .         .         .    -4.32%    -4.32%    -2.48%    -1.64%    -4.61%           -5.13%
 total_warhammer_3          642    +7.28%   -14.66%   -13.45%   -49.23%   -49.04%   -34.78%   -14.25%   -14.25%   -10.50%    -8.72%    -7.98%           -8.57%

But probably a win overall. Improves Control fps by a few %.

Totals:
MaxWaves: 53068992 -> 53195520 (+0.24%); split: +0.39%, -0.16%
Instrs: 23793021 -> 22776261 (-4.27%); split: -4.41%, +0.14%
CodeSize: 169869654 -> 163433116 (-3.79%); split: -3.96%, +0.17%
Spills: 66100 -> 66124 (+0.04%); split: -0.67%, +0.71%
Fills: 43755 -> 43471 (-0.65%); split: -2.05%, +1.40%
Scratch: 403698 -> 403754 (+0.01%); split: -0.31%, +0.32%
ALU: 18510167 -> 17814743 (-3.76%); split: -3.95%, +0.19%
FSCIB: 18454849 -> 17759392 (-3.77%); split: -3.96%, +0.19%
IC: 5279176 -> 5184542 (-1.79%); split: -1.82%, +0.03%
GPRs: 3833103 -> 3751058 (-2.14%); split: -4.12%, +1.97%
Uniforms: 10528625 -> 10271771 (-2.44%); split: -2.52%, +0.08%
Preamble instrs: 10289152 -> 10008570 (-2.73%); split: -2.95%, +0.23%

Totals from 42844 (79.31% of 54019) affected shaders:
MaxWaves: 42014720 -> 42141248 (+0.30%); split: +0.50%, -0.20%
Instrs: 18406093 -> 17389333 (-5.52%); split: -5.70%, +0.18%
CodeSize: 131343714 -> 124907176 (-4.90%); split: -5.13%, +0.23%
Spills: 21479 -> 21503 (+0.11%); split: -2.06%, +2.17%
Fills: 5488 -> 5204 (-5.17%); split: -16.31%, +11.13%
Scratch: 332144 -> 332200 (+0.02%); split: -0.37%, +0.39%
ALU: 14476764 -> 13781340 (-4.80%); split: -5.05%, +0.25%
FSCIB: 14476671 -> 13781214 (-4.80%); split: -5.05%, +0.25%
IC: 3930760 -> 3836126 (-2.41%); split: -2.45%, +0.04%
GPRs: 3158947 -> 3076902 (-2.60%); split: -4.99%, +2.40%
Uniforms: 8660650 -> 8403796 (-2.97%); split: -3.06%, +0.09%
Preamble instrs: 8304828 -> 8024246 (-3.38%); split: -3.66%, +0.28%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36265>
2025-07-23 14:15:57 +00:00
Alyssa Rosenzweig
7701d2c986 agx/nir_lower_gs: handle XFB corner
exposed by next commits.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36265>
2025-07-23 14:15:57 +00:00
Alyssa Rosenzweig
803e61837e agx: reassociate ALU
GL:

   total instrs in shared programs: 2881862 -> 2801415 (-2.79%)
   instrs in affected programs: 2264277 -> 2183830 (-3.55%)

   total alu in shared programs: 2362306 -> 2281986 (-3.40%)
   alu in affected programs: 1882190 -> 1801870 (-4.27%)

   total fscib in shared programs: 2359848 -> 2279314 (-3.41%)
   fscib in affected programs: 1891013 -> 1810479 (-4.26%)

   total ic in shared programs: 661722 -> 661702 (<.01%)
   ic in affected programs: 1304 -> 1284 (-1.53%)

   total gprs in shared programs: 899341 -> 900319 (0.11%)
   gprs in affected programs: 48696 -> 49674 (2.01%)

   total uniforms in shared programs: 2069880 -> 2064570 (-0.26%)
   uniforms in affected programs: 426411 -> 421101 (-1.25%)

   total threads in shared programs: 27802432 -> 27802624 (<.01%)
   threads in affected programs: 5568 -> 5760 (3.45%)

   total preamble in shared programs: 1202295 -> 1222360 (1.67%)
   preamble in affected programs: 452890 -> 472955 (4.43%)

VK:

   Totals:
   MaxWaves: 53077184 -> 53075712 (-0.00%); split: +0.05%, -0.05%
   Instrs: 23845634 -> 23561020 (-1.19%); split: -1.22%, +0.02%
   CodeSize: 170339242 -> 168601666 (-1.02%); split: -1.04%, +0.02%
   Spills: 65594 -> 65784 (+0.29%); split: -1.43%, +1.72%
   Fills: 43190 -> 43178 (-0.03%); split: -2.21%, +2.18%
   Scratch: 404208 -> 403474 (-0.18%); split: -0.27%, +0.08%
   ALU: 18566800 -> 18288141 (-1.50%); split: -1.52%, +0.02%
   FSCIB: 18511881 -> 18230860 (-1.52%); split: -1.54%, +0.02%
   IC: 5260462 -> 5259748 (-0.01%); split: -0.02%, +0.00%
   GPRs: 3831837 -> 3838887 (+0.18%); split: -0.25%, +0.43%
   Uniforms: 10453510 -> 10443173 (-0.10%); split: -0.29%, +0.19%
   Preamble instrs: 10409287 -> 10496713 (+0.84%); split: -0.10%, +0.94%

   Totals from 32343 (59.87% of 54019) affected shaders:
   MaxWaves: 31027072 -> 31025600 (-0.00%); split: +0.08%, -0.08%
   Instrs: 19806186 -> 19521572 (-1.44%); split: -1.46%, +0.03%
   CodeSize: 141121024 -> 139383448 (-1.23%); split: -1.25%, +0.02%
   Spills: 65252 -> 65442 (+0.29%); split: -1.44%, +1.73%
   Fills: 42745 -> 42733 (-0.03%); split: -2.23%, +2.20%
   Scratch: 403096 -> 402362 (-0.18%); split: -0.27%, +0.08%
   ALU: 15544339 -> 15265680 (-1.79%); split: -1.82%, +0.03%
   FSCIB: 15491754 -> 15210733 (-1.81%); split: -1.84%, +0.03%
   IC: 4817376 -> 4816662 (-0.01%); split: -0.02%, +0.01%
   GPRs: 2735551 -> 2742601 (+0.26%); split: -0.35%, +0.61%
   Uniforms: 7717506 -> 7707169 (-0.13%); split: -0.39%, +0.25%
   Preamble instrs: 7713698 -> 7801124 (+1.13%); split: -0.14%, +1.27%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
Alyssa Rosenzweig
7a786a9c7a agx: run more opt passes
preparing for reassoc

GL mostly noise, Vulkan:

Totals from 32853 (60.82% of 54019) affected shaders:
MaxWaves: 31747776 -> 31758272 (+0.03%); split: +0.04%, -0.01%
Instrs: 18017616 -> 18016663 (-0.01%); split: -0.11%, +0.11%
CodeSize: 128159164 -> 128249442 (+0.07%); split: -0.13%, +0.20%
Spills: 63634 -> 62658 (-1.53%); split: -1.83%, +0.30%
Fills: 42547 -> 41669 (-2.06%); split: -2.51%, +0.44%
Scratch: 341914 -> 341748 (-0.05%); split: -0.09%, +0.04%
ALU: 13999432 -> 13998308 (-0.01%); split: -0.13%, +0.12%
FSCIB: 13979325 -> 13978584 (-0.01%); split: -0.13%, +0.12%
IC: 3953418 -> 3957996 (+0.12%); split: -0.03%, +0.14%
GPRs: 2621294 -> 2619432 (-0.07%); split: -0.13%, +0.06%
Uniforms: 7118591 -> 7040633 (-1.10%); split: -1.91%, +0.82%
Preamble instrs: 6800746 -> 6571058 (-3.38%); split: -3.76%, +0.39%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
Alyssa Rosenzweig
ecc51d9b9b agx: make sure denorm flushing really happens
Backport-to: 25.1
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
Alyssa Rosenzweig
bff6dff572 hk: support static vertex input state
prologs inflate register pressure, so this can help a lot in the monolithic case
(together with dynamic strides). eliminates spilling from some vertex shaders in
Control that read a ton of attributes per vertex.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36271>
2025-07-22 11:21:50 +00:00
Alyssa Rosenzweig
a85219f89f asahi: use tex builders
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:42 +00:00
Alyssa Rosenzweig
6b34e2174e nir: introduce ergonomic tex builder
for intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. texture instrs have even more craziness
involved, but we can do the same trick. this commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making it useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.

this series only implements a subset of possible texture operations for now, but
more generalizing could be added as people have need.
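
The "designated initializers + macros" trick mentioned above can be sketched in plain C as follows. All names here (`toy_tex_opts`, `toy_tex`, `toy_build_tex_impl`, `toy_build_tex`) are hypothetical and only illustrate the technique; the real builder's fields and signature may differ.

```c
#include <stdbool.h>

/* Hypothetical bag of optional parameters; anything the caller leaves out
 * is zero-initialized.
 */
struct toy_tex_opts {
   int texture_index;
   int sampler_index;
   float lod_bias;
   bool is_array;
};

/* Toy stand-in for a built texture instruction. */
struct toy_tex {
   struct toy_tex_opts opts;
   int coord;
};

static struct toy_tex
toy_build_tex_impl(int coord, struct toy_tex_opts opts)
{
   struct toy_tex tex = { .opts = opts, .coord = coord };
   return tex;
}

/* The macro packs the trailing arguments into a compound literal, so call
 * sites name only the options they care about. The leading 0 initializes
 * the first member and keeps the initializer list non-empty.
 */
#define toy_build_tex(coord, ...) \
   toy_build_tex_impl((coord), (struct toy_tex_opts){ 0, __VA_ARGS__ })
```

A call site then reads like `toy_build_tex(3, .sampler_index = 2, .is_array = true)`, with every unspecified option defaulting to zero.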

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:41 +00:00
Alyssa Rosenzweig
20e2267be5 hk: readvertise required bgra4 format
spec dealt with.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
ca255cb703 hk: always lower bindless samplers
oddly only a single CTS case hits this.

dEQP-VK.subgroups.uniform_descriptor_indexing.combined_image_sampler

Fixes: 642c6c6f62 ("hk,agx: promote bindless samplers")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
20bf4a28a2 hk: use amul instead of imul
lets us do our address arithmetic tricks. preambles helped.

Totals:
MaxWaves: 53066688 -> 53068928 (+0.00%); split: +0.01%, -0.01%
Instrs: 23846587 -> 23794159 (-0.22%); split: -0.24%, +0.02%
CodeSize: 170248964 -> 169876998 (-0.22%); split: -0.24%, +0.02%
Spills: 66570 -> 66401 (-0.25%); split: -0.53%, +0.27%
Fills: 44068 -> 43879 (-0.43%); split: -0.98%, +0.55%
Scratch: 404374 -> 403894 (-0.12%); split: -0.18%, +0.06%
ALU: 18567924 -> 18510790 (-0.31%); split: -0.33%, +0.02%
FSCIB: 18512622 -> 18455472 (-0.31%); split: -0.33%, +0.02%
IC: 5255884 -> 5279176 (+0.44%); split: -0.11%, +0.56%
GPRs: 3833699 -> 3833127 (-0.01%); split: -0.05%, +0.04%
Uniforms: 10531468 -> 10528625 (-0.03%); split: -0.03%, +0.00%
Preamble instrs: 10435998 -> 10289152 (-1.41%); split: -1.43%, +0.02%

Totals from 6482 (12.00% of 54019) affected shaders:
MaxWaves: 5819712 -> 5821952 (+0.04%); split: +0.09%, -0.05%
Instrs: 5777505 -> 5725077 (-0.91%); split: -1.01%, +0.10%
CodeSize: 42654844 -> 42282878 (-0.87%); split: -0.97%, +0.09%
Spills: 23065 -> 22896 (-0.73%); split: -1.53%, +0.79%
Fills: 7927 -> 7738 (-2.38%); split: -5.46%, +3.08%
Scratch: 310148 -> 309668 (-0.15%); split: -0.23%, +0.08%
ALU: 4424867 -> 4367733 (-1.29%); split: -1.39%, +0.10%
FSCIB: 4424651 -> 4367501 (-1.29%); split: -1.39%, +0.10%
IC: 1144594 -> 1167886 (+2.03%); split: -0.53%, +2.56%
GPRs: 620494 -> 619922 (-0.09%); split: -0.33%, +0.24%
Uniforms: 1622654 -> 1619811 (-0.18%); split: -0.20%, +0.02%
Preamble instrs: 2119640 -> 1972794 (-6.93%); split: -7.03%, +0.10%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
a60755c015 agx: use immediate load ts/ss forms
Honeykrisp appreciates this. Funny looking fossil stats:

Totals:
Preamble instrs: 10638975 -> 10435998 (-1.91%); split: -1.91%, +0.00%

Totals from 23612 (43.71% of 54019) affected shaders:
Preamble instrs: 5104103 -> 4901126 (-3.98%); split: -3.98%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
3a798e58e5 agx: add immediate load ts/ss encodings
TellowKrinkle found this by experimentation. Seems to work great.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
e48d1ca349 agx: optimize imgwblk uniform
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00