2489 commits

Author SHA1 Message Date
Alyssa Rosenzweig
8a8fe2ffc1 agx: handle 16-bit coordinates
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:41:11 -04:00
Alyssa Rosenzweig
0319bd0a84 agx: set register cache hints
impl cribbed from the Valhall compiler. that seems only fair, I wrote the
code either way (-:

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
35e70bf30a agx: lower export even later
so we can do reg cache opt as late as possible without losing this information.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
d9c0971e50 agx: plumb is_alu query for reg cache opt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
a27e51f3c1 agx: fix cache bit packing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
e97005e688 agx: fix simd reduce forcing no cache bit
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
c6111cc43c agx: fix export instructions in the IR
so we can see thru them properly.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
17f2a3af7a agx: fix reg cache printing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
d15bfdf0a7 agx: track block divergence
conservative for now. we'll need this for correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:54 -04:00
Alyssa Rosenzweig
fc9f3363fa agx: add foreach_reg_{src,dest}
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
700a88233b asahi: rename compressed 1 to just compressed
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
74ed2b78e8 asahi,hk: optimize no-op FS
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
626fa80c1b asahi: optimize pass type with depth-only passes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
7f2a6cdd26 hk: only enable image view min LOD for dx12
I don't really want random Vulkan apps using this. fixes Steam shader
precaching via fossilize.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
a0a18c084e hk: kill psiz writes via topology, not feature
this regresses DXVK fast link shaders, I guess, but fixes Proton shader
precompiles. per discussion with Hans-Kristian

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
9c987ee75e asahi: use native colour masking
seems to work now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
562377f01d agx: try to rematerialize to improve occupancy
we already have a perfectly good spiller and SSA... use it when it helps. yes,
this costs a bit of CPU time, but it's guarded behind enough checks that the
average time should be fine.

this was prompted by a shadertoy where we were losing waves due to way too
many constants pooled at the start of a chunky shader.

in GL shader-db, only affected shaders are in blender:

   instrs HURT:   shaders/blender/1020.shader_test FS:              3125 -> 3178 (1.70%)
   instrs HURT:   shaders/blender/981.shader_test FS:               3125 -> 3178 (1.70%)
   instrs HURT:   shaders/blender/729.shader_test FS:               3086 -> 3154 (2.20%)
   instrs HURT:   shaders/blender/1023.shader_test FS:              3085 -> 3153 (2.20%)
   instrs HURT:   shaders/blender/424.shader_test FS:               3085 -> 3153 (2.20%)

   threads helped:   shaders/blender/1020.shader_test FS:              576 -> 640 (11.11%)
   threads helped:   shaders/blender/1023.shader_test FS:              576 -> 640 (11.11%)
   threads helped:   shaders/blender/424.shader_test FS:               576 -> 640 (11.11%)
   threads helped:   shaders/blender/729.shader_test FS:               576 -> 640 (11.11%)
   threads helped:   shaders/blender/981.shader_test FS:               576 -> 640 (11.11%)

in VK fossils, there's a lot more high pressure shaders that benefit:

   Totals from 113 (0.21% of 54019) affected shaders:
   MaxWaves: 64448 -> 73088 (+13.41%)
   Instrs: 388529 -> 391646 (+0.80%); split: -0.00%, +0.80%
   CodeSize: 2750064 -> 2769106 (+0.69%); split: -0.00%, +0.69%
   ALU: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
   FSCIB: 292960 -> 295863 (+0.99%); split: -0.00%, +0.99%
   GPRs: 21297 -> 19289 (-9.43%)
   Preamble instrs: 75703 -> 75911 (+0.27%)

notable improvement in Far Cry 5.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
6544a4f1ae asahi: drop sink/move in GS code
this is asking for trouble, since divergence analysis doesn't handle stuff we
lower quickly. this fixes geometry shaders blowing up since the cited commit,
but since I was the one who r-b'd that change, I don't have anyone to blame but
myself C:

Fixes: d61edf079b ("nir: add nir_move_only_convergent/divergent")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36399>
2025-08-03 14:40:53 -04:00
Alyssa Rosenzweig
5761213587 asahi: clang-format
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
bcf1a1c20b treewide: use nir_def_block
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -definition->parent_instr->block
    +nir_def_block(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
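
For readers unfamiliar with the helper that bcf1a1c20b switches to, here is a minimal sketch of what such an accessor looks like. The three structs are simplified stand-ins, not the real NIR definitions (which carry many more fields), and the body is an assumption inferred from the Coccinelle rule rather than a copy of Mesa's header.

```c
/* Simplified stand-ins for the NIR types, for illustration only. */
struct nir_block { int index; };
struct nir_instr { struct nir_block *block; };
struct nir_def   { struct nir_instr *parent_instr; };

/* Accessor in the spirit of nir_def_block(): callers no longer spell out
 * the def -> parent_instr -> block chain themselves. */
static inline struct nir_block *
nir_def_block(const struct nir_def *def)
{
   return def->parent_instr->block;
}
```

Funnelling every lookup through one function means call sites stop depending on the parent_instr field directly, which is what the rest of the series is working towards.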
Alyssa Rosenzweig
82ae8b1d33 treewide: simplify nir_def_rewrite_uses_after
Most of the time with nir_def_rewrite_uses_after, you want to rewrite after the
replacement. Make that the default thing to be more ergonomic and to drop
parent_instr uses.

We leave nir_def_rewrite_uses_after_instr defined if you really want the old
signature with an arbitrary after point.

Via Coccinelle patch:

    @@
    expression a, b;
    @@

    -nir_def_rewrite_uses_after(a, b, b->parent_instr)
    +nir_def_rewrite_uses_after_def(a, b)

Followed by a bunch of sed.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
cc6e3b84cb treewide: use nir_def_as_*
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -nir_instr_as_alu(definition->parent_instr)
    +nir_def_as_alu(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_intrinsic(definition->parent_instr)
    +nir_def_as_intrinsic(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_phi(definition->parent_instr)
    +nir_def_as_phi(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_load_const(definition->parent_instr)
    +nir_def_as_load_const(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_deref(definition->parent_instr)
    +nir_def_as_deref(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_tex(definition->parent_instr)
    +nir_def_as_tex(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Marek Olšák
5531f01326 nir: move list.h outside the glsl directory
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36425>
2025-07-31 20:23:02 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the macro already extended
the behavior of __builtin_unreachable(), and it also had a different
signature: it accepts one argument, whereas the standard unreachable()
takes none.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
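
To illustrate the rename in ddf2aa3a4d, here is a rough sketch of a message-taking UNREACHABLE() macro and one call site. The macro body is an approximation based on the description in the commit message, not a copy of src/util/macros.h, and `channel_width` / `channel_bits` are hypothetical names made up for the example. It relies on `__builtin_unreachable()`, so it assumes GCC or Clang.

```c
#include <assert.h>

/* Upper-case name, so it cannot collide with C23's unreachable() from
 * <stddef.h>. Unlike the standard macro it takes a message argument:
 * the assert fires in debug builds, and the builtin tells the optimizer
 * the path is dead in release builds. */
#define UNREACHABLE(str)       \
   do {                        \
      assert(!str);            \
      __builtin_unreachable(); \
   } while (0)

/* Hypothetical enum and helper, only here to show a call site. */
enum channel_width { WIDTH_8, WIDTH_16, WIDTH_32 };

static int
channel_bits(enum channel_width w)
{
   switch (w) {
   case WIDTH_8:  return 8;
   case WIDTH_16: return 16;
   case WIDTH_32: return 32;
   }

   UNREACHABLE("invalid channel width");
}
```

Because the standard macro takes no arguments, keeping the old lower-case name would have made every existing call with a message string ill-formed under C23, which is why a plain rename is the least invasive fix.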
Marek Olšák
8d3e76c250 nir: split nir_move_load_frag_coord from nir_move_load_input
It's a pure system value on AMD, not an input.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36357>
2025-07-29 16:20:48 -04:00
Marek Olšák
688a639117 nir: add nir_tex_instr::can_speculate
Set to true everywhere except:
- spirv_to_nir used by Vulkan
- bindless handles in GLSL
- some internal shaders and driver-specific code

Acked-by: Job Noorman <job@noorman.info>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36099>
2025-07-24 18:41:38 +00:00
Alyssa Rosenzweig
8a1a410389 treewide: use SWAP macro
Via Coccinelle patch + manual clean up:

    @@
    identifier temporary, a, b;
    type T;
    @@

    -T temporary = a;
    -a = b;
    -b = temporary;
    +SWAP(a, b);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36297>
2025-07-23 19:49:47 +00:00
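
As a sketch of the pattern 8a1a410389 collapses, the block below defines a stand-alone SWAP macro and uses it once. The definition is hypothetical and written for illustration; Mesa's own SWAP may be implemented differently. It uses the GNU `__typeof__` extension, so it assumes GCC or Clang, and `sort_pair` is a made-up helper.

```c
/* Hypothetical definition for illustration only. */
#define SWAP(a, b)                  \
   do {                             \
      __typeof__(a) swap_tmp = (a); \
      (a) = (b);                    \
      (b) = swap_tmp;               \
   } while (0)

static void
sort_pair(unsigned *lo, unsigned *hi)
{
   /* Before the cleanup this would be three statements and a named
    * temporary, matching the removed lines in the Coccinelle rule. */
   if (*lo > *hi)
      SWAP(*lo, *hi);
}
```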
Alyssa Rosenzweig
1f8b22a208 hk: optimize varyings
Stats are all over the place for some apps.

 PERCENTAGE DELTAS                   Shaders  MaxWaves   Instrs   CodeSize   Spills    Fills    Scratch     ALU      FSCIB       IC       GPRs    Uniforms Preamble instrs
  detroit_become_human                995       +0.12%    -1.21%    -1.27%   +77.42%   +71.67%   +78.82%    -1.11%    -1.11%    -1.03%    -3.22%    -1.40%       -2.07%
  god_of_war                          1029      -0.04%    -5.05%    -4.73%      .         .         .       -4.32%    -4.32%    -2.48%    -1.64%    -4.61%       -5.13%
  total_warhammer_3                   642       +7.28%   -14.66%   -13.45%   -49.23%   -49.04%   -34.78%   -14.25%   -14.25%   -10.50%    -8.72%    -7.98%       -8.57%

But probably a win overall. Improves Control fps by a few %.

Totals:
MaxWaves: 53068992 -> 53195520 (+0.24%); split: +0.39%, -0.16%
Instrs: 23793021 -> 22776261 (-4.27%); split: -4.41%, +0.14%
CodeSize: 169869654 -> 163433116 (-3.79%); split: -3.96%, +0.17%
Spills: 66100 -> 66124 (+0.04%); split: -0.67%, +0.71%
Fills: 43755 -> 43471 (-0.65%); split: -2.05%, +1.40%
Scratch: 403698 -> 403754 (+0.01%); split: -0.31%, +0.32%
ALU: 18510167 -> 17814743 (-3.76%); split: -3.95%, +0.19%
FSCIB: 18454849 -> 17759392 (-3.77%); split: -3.96%, +0.19%
IC: 5279176 -> 5184542 (-1.79%); split: -1.82%, +0.03%
GPRs: 3833103 -> 3751058 (-2.14%); split: -4.12%, +1.97%
Uniforms: 10528625 -> 10271771 (-2.44%); split: -2.52%, +0.08%
Preamble instrs: 10289152 -> 10008570 (-2.73%); split: -2.95%, +0.23%

Totals from 42844 (79.31% of 54019) affected shaders:
MaxWaves: 42014720 -> 42141248 (+0.30%); split: +0.50%, -0.20%
Instrs: 18406093 -> 17389333 (-5.52%); split: -5.70%, +0.18%
CodeSize: 131343714 -> 124907176 (-4.90%); split: -5.13%, +0.23%
Spills: 21479 -> 21503 (+0.11%); split: -2.06%, +2.17%
Fills: 5488 -> 5204 (-5.17%); split: -16.31%, +11.13%
Scratch: 332144 -> 332200 (+0.02%); split: -0.37%, +0.39%
ALU: 14476764 -> 13781340 (-4.80%); split: -5.05%, +0.25%
FSCIB: 14476671 -> 13781214 (-4.80%); split: -5.05%, +0.25%
IC: 3930760 -> 3836126 (-2.41%); split: -2.45%, +0.04%
GPRs: 3158947 -> 3076902 (-2.60%); split: -4.99%, +2.40%
Uniforms: 8660650 -> 8403796 (-2.97%); split: -3.06%, +0.09%
Preamble instrs: 8304828 -> 8024246 (-3.38%); split: -3.66%, +0.28%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36265>
2025-07-23 14:15:57 +00:00
Alyssa Rosenzweig
7701d2c986 agx/nir_lower_gs: handle XFB corner
exposed by next commits.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36265>
2025-07-23 14:15:57 +00:00
Alyssa Rosenzweig
803e61837e agx: reassociate ALU
GL:

   total instrs in shared programs: 2881862 -> 2801415 (-2.79%)
   instrs in affected programs: 2264277 -> 2183830 (-3.55%)

   total alu in shared programs: 2362306 -> 2281986 (-3.40%)
   alu in affected programs: 1882190 -> 1801870 (-4.27%)

   total fscib in shared programs: 2359848 -> 2279314 (-3.41%)
   fscib in affected programs: 1891013 -> 1810479 (-4.26%)

   total ic in shared programs: 661722 -> 661702 (<.01%)
   ic in affected programs: 1304 -> 1284 (-1.53%)

   total gprs in shared programs: 899341 -> 900319 (0.11%)
   gprs in affected programs: 48696 -> 49674 (2.01%)

   total uniforms in shared programs: 2069880 -> 2064570 (-0.26%)
   uniforms in affected programs: 426411 -> 421101 (-1.25%)

   total threads in shared programs: 27802432 -> 27802624 (<.01%)
   threads in affected programs: 5568 -> 5760 (3.45%)

   total preamble in shared programs: 1202295 -> 1222360 (1.67%)
   preamble in affected programs: 452890 -> 472955 (4.43%)

VK:

   Totals:
   MaxWaves: 53077184 -> 53075712 (-0.00%); split: +0.05%, -0.05%
   Instrs: 23845634 -> 23561020 (-1.19%); split: -1.22%, +0.02%
   CodeSize: 170339242 -> 168601666 (-1.02%); split: -1.04%, +0.02%
   Spills: 65594 -> 65784 (+0.29%); split: -1.43%, +1.72%
   Fills: 43190 -> 43178 (-0.03%); split: -2.21%, +2.18%
   Scratch: 404208 -> 403474 (-0.18%); split: -0.27%, +0.08%
   ALU: 18566800 -> 18288141 (-1.50%); split: -1.52%, +0.02%
   FSCIB: 18511881 -> 18230860 (-1.52%); split: -1.54%, +0.02%
   IC: 5260462 -> 5259748 (-0.01%); split: -0.02%, +0.00%
   GPRs: 3831837 -> 3838887 (+0.18%); split: -0.25%, +0.43%
   Uniforms: 10453510 -> 10443173 (-0.10%); split: -0.29%, +0.19%
   Preamble instrs: 10409287 -> 10496713 (+0.84%); split: -0.10%, +0.94%

   Totals from 32343 (59.87% of 54019) affected shaders:
   MaxWaves: 31027072 -> 31025600 (-0.00%); split: +0.08%, -0.08%
   Instrs: 19806186 -> 19521572 (-1.44%); split: -1.46%, +0.03%
   CodeSize: 141121024 -> 139383448 (-1.23%); split: -1.25%, +0.02%
   Spills: 65252 -> 65442 (+0.29%); split: -1.44%, +1.73%
   Fills: 42745 -> 42733 (-0.03%); split: -2.23%, +2.20%
   Scratch: 403096 -> 402362 (-0.18%); split: -0.27%, +0.08%
   ALU: 15544339 -> 15265680 (-1.79%); split: -1.82%, +0.03%
   FSCIB: 15491754 -> 15210733 (-1.81%); split: -1.84%, +0.03%
   IC: 4817376 -> 4816662 (-0.01%); split: -0.02%, +0.01%
   GPRs: 2735551 -> 2742601 (+0.26%); split: -0.35%, +0.61%
   Uniforms: 7717506 -> 7707169 (-0.13%); split: -0.39%, +0.25%
   Preamble instrs: 7713698 -> 7801124 (+1.13%); split: -0.14%, +1.27%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
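
The message for 803e61837e gives only statistics, so as background, here is a generic illustration of what reassociating ALU operations means. It is not taken from the AGX compiler; the function names and constants are invented. The idea is to regroup a chain of associative operations so that constants end up next to each other and can be folded.

```c
#include <stdint.h>

/* As written: two dependent adds. */
static uint32_t
offset_naive(uint32_t x)
{
   return (x + 3u) + 5u;
}

/* After reassociation: the constants are grouped and fold to 8, leaving
 * a single add. This is exact for unsigned integers because they wrap
 * modulo 2^32; floating-point addition is not associative in general,
 * so the same rewrite needs more care there. */
static uint32_t
offset_reassociated(uint32_t x)
{
   return x + (3u + 5u);
}
```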
Alyssa Rosenzweig
7a786a9c7a agx: run more opt passes
preparing for reassoc

GL mostly noise, Vulkan:

Totals from 32853 (60.82% of 54019) affected shaders:
MaxWaves: 31747776 -> 31758272 (+0.03%); split: +0.04%, -0.01%
Instrs: 18017616 -> 18016663 (-0.01%); split: -0.11%, +0.11%
CodeSize: 128159164 -> 128249442 (+0.07%); split: -0.13%, +0.20%
Spills: 63634 -> 62658 (-1.53%); split: -1.83%, +0.30%
Fills: 42547 -> 41669 (-2.06%); split: -2.51%, +0.44%
Scratch: 341914 -> 341748 (-0.05%); split: -0.09%, +0.04%
ALU: 13999432 -> 13998308 (-0.01%); split: -0.13%, +0.12%
FSCIB: 13979325 -> 13978584 (-0.01%); split: -0.13%, +0.12%
IC: 3953418 -> 3957996 (+0.12%); split: -0.03%, +0.14%
GPRs: 2621294 -> 2619432 (-0.07%); split: -0.13%, +0.06%
Uniforms: 7118591 -> 7040633 (-1.10%); split: -1.91%, +0.82%
Preamble instrs: 6800746 -> 6571058 (-3.38%); split: -3.76%, +0.39%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
Alyssa Rosenzweig
ecc51d9b9b agx: make sure denorm flushing really happens
Backport-to: 25.1
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36147>
2025-07-22 23:17:01 +00:00
Alyssa Rosenzweig
bff6dff572 hk: support static vertex input state
prologs inflate register pressure, so this can help a lot in the monolithic case
(together with dynamic strides). eliminates spilling from some vertex shaders in
Control that read a ton of attributes per vertex.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36271>
2025-07-22 11:21:50 +00:00
Alyssa Rosenzweig
a85219f89f asahi: use tex builders
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:42 +00:00
Alyssa Rosenzweig
6b34e2174e nir: introduce ergonomic tex builder
for intrinsics, we have these really nice builders using designated initializers
+ macros to specify optional indices. texture instrs have even more craziness
involved, but we can do the same trick. this commit takes the existing "fixed
form" deref-centric tex builders and generalizes them to work with non-deref
textures, making it useful also for GL and late VK passes, while providing an
API that strives to be ergonomic and consistent.

this series only implements a subset of possible texture operations for now, but
more generalizing could be added as people have need.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36050>
2025-07-21 12:11:41 +00:00
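
The "designated initializers + macros" trick mentioned in 6b34e2174e can be sketched in isolation as follows. Every name here (`sketch_tex_opts`, `sketch_tex_instr`, `sketch_build_tex`, `sketch_tex`) is hypothetical and none of it is the real NIR builder API; the point is only to show how optional parameters fall out of a compound literal.

```c
/* Hypothetical option struct. Fields the caller leaves out of the
 * initializer are zero-initialized, which acts as the default. */
struct sketch_tex_opts {
   unsigned texture_index;
   unsigned sampler_index;
   int lod;
   int is_array;
};

struct sketch_tex_instr {
   int dim;
   struct sketch_tex_opts opts;
};

static struct sketch_tex_instr
sketch_build_tex(int dim, struct sketch_tex_opts opts)
{
   struct sketch_tex_instr tex = { .dim = dim, .opts = opts };
   return tex;
}

/* The macro wraps the trailing arguments in a compound literal, so call
 * sites name only the options they care about. This sketch expects at
 * least one option to be given. */
#define sketch_tex(dim, ...) \
   sketch_build_tex((dim), (struct sketch_tex_opts){ __VA_ARGS__ })
```

A call then reads like `sketch_tex(2, .texture_index = 3, .lod = 1)`, with the sampler index and array flag silently defaulting to zero. New options can be added to the struct without touching existing callers.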
Alyssa Rosenzweig
20e2267be5 hk: readvertise required bgra4 format
spec dealt with.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
ca255cb703 hk: always lower bindless samplers
oddly only a single CTS case hits this.

dEQP-VK.subgroups.uniform_descriptor_indexing.combined_image_sampler

Fixes: 642c6c6f62 ("hk,agx: promote bindless samplers")
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
20bf4a28a2 hk: use amul instead of imul
lets us do our address arithmetic tricks. preambles helped.

Totals:
MaxWaves: 53066688 -> 53068928 (+0.00%); split: +0.01%, -0.01%
Instrs: 23846587 -> 23794159 (-0.22%); split: -0.24%, +0.02%
CodeSize: 170248964 -> 169876998 (-0.22%); split: -0.24%, +0.02%
Spills: 66570 -> 66401 (-0.25%); split: -0.53%, +0.27%
Fills: 44068 -> 43879 (-0.43%); split: -0.98%, +0.55%
Scratch: 404374 -> 403894 (-0.12%); split: -0.18%, +0.06%
ALU: 18567924 -> 18510790 (-0.31%); split: -0.33%, +0.02%
FSCIB: 18512622 -> 18455472 (-0.31%); split: -0.33%, +0.02%
IC: 5255884 -> 5279176 (+0.44%); split: -0.11%, +0.56%
GPRs: 3833699 -> 3833127 (-0.01%); split: -0.05%, +0.04%
Uniforms: 10531468 -> 10528625 (-0.03%); split: -0.03%, +0.00%
Preamble instrs: 10435998 -> 10289152 (-1.41%); split: -1.43%, +0.02%

Totals from 6482 (12.00% of 54019) affected shaders:
MaxWaves: 5819712 -> 5821952 (+0.04%); split: +0.09%, -0.05%
Instrs: 5777505 -> 5725077 (-0.91%); split: -1.01%, +0.10%
CodeSize: 42654844 -> 42282878 (-0.87%); split: -0.97%, +0.09%
Spills: 23065 -> 22896 (-0.73%); split: -1.53%, +0.79%
Fills: 7927 -> 7738 (-2.38%); split: -5.46%, +3.08%
Scratch: 310148 -> 309668 (-0.15%); split: -0.23%, +0.08%
ALU: 4424867 -> 4367733 (-1.29%); split: -1.39%, +0.10%
FSCIB: 4424651 -> 4367501 (-1.29%); split: -1.39%, +0.10%
IC: 1144594 -> 1167886 (+2.03%); split: -0.53%, +2.56%
GPRs: 620494 -> 619922 (-0.09%); split: -0.33%, +0.24%
Uniforms: 1622654 -> 1619811 (-0.18%); split: -0.20%, +0.02%
Preamble instrs: 2119640 -> 1972794 (-6.93%); split: -7.03%, +0.10%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
a60755c015 agx: use immediate load ts/ss forms
Honeykrisp appreciates this. Funny looking fossil stats:

Totals:
Preamble instrs: 10638975 -> 10435998 (-1.91%); split: -1.91%, +0.00%

Totals from 23612 (43.71% of 54019) affected shaders:
Preamble instrs: 5104103 -> 4901126 (-3.98%); split: -3.98%, +0.00%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
3a798e58e5 agx: add immediate load ts/ss encodings
TellowKrinkle found this by experimentation. Seems to work great.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
e48d1ca349 agx: optimize imgwblk uniform
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
1ded5f55e8 agx: optimize txl LOD
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
2dd91b0d1c agx: simplify block image store offset
just make 32-bit offset.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Alyssa Rosenzweig
cbbc24a473 agx: fix dead phis
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36257>
2025-07-21 11:42:20 +00:00
Asahi Lina
140c625bda asahi: Ensure shared BOs have a prime_fd
The GL driver expects special sync handling when a buffer is newly
exported, and also requires that bo->prime_fd be set so the batch code
can use it later. Add a function to do this for the KMS export case,
which otherwise would not need a PRIME fd.

agx_bo_export() then becomes a simple dup of bo->prime_fd (which is
probably marginally faster than redoing drmPrimeHandleToFD() anyway).

The thread safety story here is that as long as we do all this the first
time a BO is exported (in any way), there is no way for another thread
to have gotten ahold of the BO already, so no need for extra locking.

This does not affect hk, since it doesn't rely on bo->prime_fd for
anything. It also doesn't affect the timestamp BO and other special
cases.

Fixes: 067d820c9d ("asahi: Mark KMS exported resource BOs as shared")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/13563
Signed-off-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36241>
2025-07-20 00:45:48 +09:00
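
The export path described in 140c625bda, where agx_bo_export() becomes a dup of bo->prime_fd, can be sketched roughly like this. `sketch_bo` and `sketch_bo_export` are hypothetical stand-ins, not the driver's types, and plain POSIX `dup()` is used here only for illustration; the real code may duplicate the descriptor with a different helper.

```c
#include <unistd.h>

/* Stand-in for the driver's BO; only the field discussed in the commit. */
struct sketch_bo {
   int prime_fd; /* -1 until the BO is first exported */
};

/* Assumes the first-export path already stored a valid PRIME fd in
 * bo->prime_fd. Each caller gets its own descriptor to own and close,
 * while the BO keeps the original for later use by the batch code. */
static int
sketch_bo_export(const struct sketch_bo *bo)
{
   if (bo->prime_fd < 0)
      return -1;

   return dup(bo->prime_fd);
}
```

Duplicating an existing descriptor avoids asking the kernel for a fresh PRIME handle on every export, which matches the commit's remark that this is probably marginally faster than redoing drmPrimeHandleToFD().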
Alyssa Rosenzweig
2308960bed treewide: use nir_mov_scalar
Via Coccinelle patch:

    @@
    expression builder, scalar;
    @@

    -nir_channel(builder, scalar.def, scalar.comp)
    +nir_mov_scalar(builder, scalar)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36142>
2025-07-16 18:59:16 +00:00
Alyssa Rosenzweig
a5e9669a78 hk: only pass sampler heap if needed
I'm guessing the hardware needs to prefetch the whole sampler heap, so if we're
not gonna use it, let's omit it. I don't know if this helps, but it can't hurt.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:22 +00:00
Alyssa Rosenzweig
74c32c2357 hk: optimize desc set addr push
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:22 +00:00
Alyssa Rosenzweig
642c6c6f62 hk,agx: promote bindless samplers
via the bindless_sampler_agx intrinsic.

Totals from 29771 (55.11% of 54019) affected shaders:
MaxWaves: 28934080 -> 28938304 (+0.01%); split: +0.02%, -0.00%
Instrs: 16623874 -> 16369120 (-1.53%); split: -1.54%, +0.01%
CodeSize: 117532138 -> 115994992 (-1.31%); split: -1.32%, +0.01%
Spills: 12721 -> 12652 (-0.54%); split: -0.72%, +0.17%
Fills: 6733 -> 6636 (-1.44%); split: -1.96%, +0.52%
Scratch: 132994 -> 132712 (-0.21%); split: -0.22%, +0.01%
ALU: 13054253 -> 12803059 (-1.92%); split: -1.93%, +0.01%
FSCIB: 13054138 -> 12802912 (-1.92%); split: -1.94%, +0.01%
IC: 3916012 -> 3915588 (-0.01%); split: -0.01%, +0.00%
GPRs: 2290907 -> 2289519 (-0.06%); split: -0.07%, +0.01%
Uniforms: 6794773 -> 6696943 (-1.44%); split: -1.44%, +0.00%
Preamble instrs: 6953594 -> 7024455 (+1.02%); split: -0.37%, +1.39%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:21 +00:00
Alyssa Rosenzweig
49f042c5e8 hk: plumb sampler state counts
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36127>
2025-07-16 18:27:21 +00:00