Commit graph

226 commits

Author SHA1 Message Date
Marek Olšák
e372365cf4 nir: rename nir_copy_prop -> nir_opt_copy_prop
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38411>
2025-11-15 02:16:38 +00:00
Konstantin Seurer
de32f9275f treewide: add & use parent instr helpers
We add a bunch of new helpers to avoid the need to touch >parent_instr,
including the full set of:

* nir_def_is_*
* nir_def_as_*_or_null
* nir_def_as_* [assumes the right instr type]
* nir_src_is_*
* nir_src_as_*
* nir_scalar_is_*
* nir_scalar_as_*

Plus nir_def_instr() where there's no more suitable helper.

Also an existing helper is renamed to unify all the names, while we're
churning the tree:

* nir_src_as_alu_instr -> nir_src_as_alu

..and then we port the tree to use the helpers as much as possible, using
nir_def_instr() where that does not work.

Acked-by: Marek Olšák <maraeo@gmail.com>

---

To eliminate nir_def::parent_instr we need to churn the tree anyway, so I'm
taking this opportunity to clean up a lot of NIR patterns.

Co-authored-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38313>
2025-11-12 21:22:13 +00:00
Daniel Schürmann
870616af34 nir/constant_folding: switch to nir_shader_lower_instructions()
Small differences due to implicit DCE.
Totals from 76 (0.10% of 79839) affected shaders: (Navi48)

Instrs: 168051 -> 168044 (-0.00%); split: -0.01%, +0.01%
CodeSize: 893284 -> 893256 (-0.00%); split: -0.01%, +0.01%
Latency: 1082007 -> 1082027 (+0.00%); split: -0.00%, +0.00%
InvThroughput: 155100 -> 155105 (+0.00%)
Copies: 9649 -> 9654 (+0.05%)
VALU: 92504 -> 92509 (+0.01%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37195>
2025-10-30 19:28:07 +00:00
Georg Lehmann
cf4ab485ea nir: remove manual nir_load_global_constant
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:39:53 +02:00
Georg Lehmann
2306cba65b nir: remove manual nir_store_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Georg Lehmann
77540cac8c nir: remove manual nir_load_global
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37959>
2025-10-21 12:37:58 +02:00
Georg Lehmann
714a149396 nir: remove unsigned upper bound config
All config information is now either in nir->info or nir->options.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37361>
2025-09-16 09:24:04 +00:00
Georg Lehmann
ce91b0be08 nir: define new subgroup size info
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37258>
2025-09-12 21:05:17 +00:00
Konstantin Seurer
951b187b95 nir: Use nir_def_block in more places
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:10 +00:00
Konstantin Seurer
9df7b48d2f nir: Use nir_def_as_* in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36746>
2025-08-24 14:03:09 +00:00
Marek Olšák
3aadae22ad nir: make nir_block::predecessors & dom_frontier sets non-malloc'd
We can just place the set structures inside nir_block.

This reduces the number of ralloc calls by 6.7% when compiling Heaven
shaders with radeonsi+ACO using a release build (i.e. not including
nir_validate set allocations, which are also removed).

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36728>
2025-08-21 06:13:48 +00:00
Job Noorman
cb773dec8c nir/opt_load_store_vectorize: add support for offset_shift
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
2025-08-20 07:51:30 +00:00
Qiang Yu
196569b1a4 all: rename gl_shader_stage to mesa_shader_stage
It's not only for GL, change to a generic name.

Use command:
  find . -type f -not -path '*/.git/*' -exec sed -i 's/\bgl_shader_stage\b/mesa_shader_stage/g' {} +

Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Yonggang Luo <luoyonggang@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36569>
2025-08-06 10:28:40 +08:00
Alyssa Rosenzweig
bcf1a1c20b treewide: use nir_def_block
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -definition->parent_instr->block
    +nir_def_block(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Alyssa Rosenzweig
cc6e3b84cb treewide: use nir_def_as_*
Via Coccinelle patch:

    @@
    expression definition;
    @@

    -nir_instr_as_alu(definition->parent_instr)
    +nir_def_as_alu(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_intrinsic(definition->parent_instr)
    +nir_def_as_intrinsic(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_phi(definition->parent_instr)
    +nir_def_as_phi(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_load_const(definition->parent_instr)
    +nir_def_as_load_const(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_deref(definition->parent_instr)
    +nir_def_as_deref(definition)

    @@
    expression definition;
    @@

    -nir_instr_as_tex(definition->parent_instr)
    +nir_def_as_tex(definition)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36489>
2025-08-01 15:34:24 +00:00
Antonio Ospite
ddf2aa3a4d build: avoid redefining unreachable() which is standard in C23
In the C23 standard unreachable() is now a predefined function-like
macro in <stddef.h>

See https://android.googlesource.com/platform/bionic/+/HEAD/docs/c23.md#is-now-a-predefined-function_like-macro-in

And this causes build errors when building for C23:

-----------------------------------------------------------------------
In file included from ../src/util/log.h:30,
                 from ../src/util/log.c:30:
../src/util/macros.h:123:9: warning: "unreachable" redefined
  123 | #define unreachable(str)    \
      |         ^~~~~~~~~~~
In file included from ../src/util/macros.h:31:
/usr/lib/gcc/x86_64-linux-gnu/14/include/stddef.h:456:9: note: this is the location of the previous definition
  456 | #define unreachable() (__builtin_unreachable ())
      |         ^~~~~~~~~~~
-----------------------------------------------------------------------

So don't redefine it with the same name, but use the name UNREACHABLE()
to also signify it's a macro.

Using a different name also makes sense because the behavior of the
macro was extending the one of __builtin_unreachable() anyway, and it
also had a different signature, accepting one argument, compared to the
standard unreachable() with no arguments.

This change improves the chances of building mesa with the C23 standard,
which for instance is the default in recent AOSP versions.

All the instances of the macro, including the definition, were updated
with the following command line:

  git grep -l '[^_]unreachable(' -- "src/**" | sort | uniq | \
  while read file; \
  do \
    sed -e 's/\([^_]\)unreachable(/\1UNREACHABLE(/g' -i "$file"; \
  done && \
  sed -e 's/#undef unreachable/#undef UNREACHABLE/g' -i src/intel/isl/isl_aux_info.c

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36437>
2025-07-31 17:49:42 +00:00
Rhys Perry
f45026751f nir/cf: have nir_remove_after_cf_node remove phis at the start too
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Backport-to: 25.1
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35975>
2025-07-23 09:06:36 +00:00
Marek Olšák
b0494f9485 nir/opt_varyings: optimize the consumer after constant propagation and dedupli.
A TF2 shader propagates 0 to the consumer, which eliminates 1 input
if we run algebraic opts and DCE before compaction.

This is a prerequisite for removing all IO var optimizations from the GLSL
linker that are redundant with nir_opt_varyings.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36091>
2025-07-15 13:38:29 +00:00
Alyssa Rosenzweig
3c2f46fcac treewide: use nir_break_if with named if
Some checks are pending
macOS-CI / macOS-CI (dri) (push) Waiting to run
macOS-CI / macOS-CI (xlib) (push) Waiting to run
Via Coccinelle patch:

    @@
    expression builder, condition;
    identifier nif;
    @@

    -nir_if *nif = nir_push_if(builder, condition);
    -{
    -nir_jump(builder, nir_jump_break);
    -}
    -nir_pop_if(builder, nif);
    +nir_break_if(builder, condition);

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Mel Henning <mhenning@darkrefraction.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35794>
2025-06-30 14:51:54 -04:00
Marek Olšák
069fdc6f71 nir: handle mov and bcsel in nir_def_bits_used
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Marek Olšák
e080833478 nir: handle iand/ior opcodes recursively in nir_def_bits_used
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Marek Olšák
a78ed8b8e8 nir: handle extract opcodes recursively in nir_def_bits_used
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Marek Olšák
e38a0b9a05 nir: handle u2u/i2i recursively in nir_def_bits_used
to get the number of bits actually used by the uses.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Marek Olšák
7e7ef7b8b7 nir: handle bit shifts by constants in nir_def_bits_used
useful for open-coded bitfield extracts that are not using ubfe/ibfe

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Marek Olšák
7d24a9b649 nir: handle ibfe/ubfe in nir_def_bits_used
it will be used by radeonsi

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34489>
2025-05-13 15:38:37 +00:00
Daniel Schürmann
3dab7b0a45 nir/tests: add tests for nir_move_terminate_out_of_loops
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33479>
2025-05-09 17:20:29 +00:00
Caio Oliveira
2ed79f80ba nir/load_store_vectorize: Skip new bit-sizes that are unaligned with high_offset
Otherwise this would require combining two values to produce a single
(new bit-size) channel, which vectorize_stores() don't handle.  The pass
can still keep trying smaller bit-sizes.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12946
Fixes: ce9205c03b ("nir: add a load/store vectorization pass")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34414>
2025-04-11 19:17:17 +00:00
Konstantin
e7a44de184 nir/tests: Do not rely on __LINE__
__LINE__ can be inconsistent when using different compilers. This patch
changes the test runner to do a simple string find/replace of the test
source file instead of looking for the line where the reference string
starts.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33980>
2025-04-04 19:01:01 +00:00
Konstantin Seurer
3a69b52d37 nir: Test nir_minimize_call_live_states
Adds a couple of tests for various instructions and controlflow
constructs.

Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33289>
2025-03-03 23:30:57 +00:00
Timur Kristóf
65139305e2 nir: Don't use deprecated NIR_PASS_V macro anymore.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33609>
2025-02-22 08:54:16 +01:00
Georg Lehmann
ca8147edbe nir/peephole_select: add options struct
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:16 +00:00
Daniel Schürmann
259b73a3ae nir/print: print phi sources sorted by predecessor blocks
We already print the predecessors sorted. Just do the same with
phi sources.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33574>
2025-02-20 14:22:14 +00:00
Daniel Schürmann
fbaabcfb0a nir/loop_analyze: store nir_loop_induction_variable hash table in loop_info
No need to create a separate array.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33131>
2025-01-30 03:48:36 +00:00
Konstantin Seurer
eb3ab68e5e nir/tests: Add reference shaders
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>
2025-01-18 11:02:25 +00:00
Konstantin Seurer
8838a0c595 nir/tests: Add a helper for comparing a shader against a string
This allows unit tests to compare against a reference nir shader instead
of implementing checks for interesting instructions/CF nodes.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>
2025-01-18 11:02:25 +00:00
Konstantin Seurer
6d1d15183f nir/tests: Improve shader creation
Sets some fields so they are not printed and allows specifying a stage.
This decreases the size of reference shaders.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32644>
2025-01-18 11:02:25 +00:00
Timur Kristóf
ec548fd37b Revert "nir/opt_varyings: Add workaround for RADV mesh shader multiview."
The workaround is not needed anymore, because RADV now implements
the FS layer ID input as a sysval.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32641>
2025-01-02 14:07:51 +00:00
Marek Olšák
c21bc65ba7 nir/opt_load_store_vectorize: make hole_size signed to indicate overlapping loads
A negative hole size means the loads overlap. This will be used by drivers
to handle overlapping loads in the callback easily.

Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32699>
2025-01-01 00:03:55 +00:00
Kenneth Graunke
5712fc48a9 nir: Allow large overfetching holes in the load store vectorizer
The load_*_uniform_block_intel intrinsics always load either 8x or 16x
32-bit components worth of data (so 32 byte increments).  This leads to
cases where we load a few components from one vec8, followed by a few
components of an adjacent vec8.  We want to combine those into a vec16
load, as that loads a whole cacheline at a time, and requires less hoops
to calculate addresses and request memory loads.

So, we allow 7 * 4 = 28 bytes of holes, which handles vec8+vec8 where
only the .x component is read.

Most drivers and intrinsics will not want such large holes.  I thought
about adding a per-intrinsic max_hole to the core code, but decided that
since we already have driver callbacks, we can just rely on them to
reject what makes sense to them.

No driver callbacks currently allow holes, so this should not currently
affect any drivers.  But any work in progress branches may need to be
updated to reject larger holes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32315>
2024-12-03 02:02:33 +00:00
Rhys Perry
9f3607de76 nir/tests: fix SSA dominance in opt_if_merge tests
It isn't necessary for these ALU instructions to be used in the next IF.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: c437f2e79c ("nir/tests: Add tests for opt_if_merge")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12211
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32391>
2024-12-02 09:38:22 +00:00
Timothy Arceri
6ca81adffc nir: allow loops with unknown induction var initialiser to unroll
If the condition of the loop terminator is based on an unsigned value we
can in some cases find the max number of possible loop trips. With the
max loop trips know a complex unroll can unroll the loop.

For example:

   uniform uint x;
   uint i = x;
   while (true) {
      if (i >= 4)
         break;

      i += 6;
   }

The above loop can be unrolled even though we don't know the initial
value of the induction variable because it can have at most 1 iteration.

There were no changes with my shader-db collection. Change was inspired
by MR #31312 where builtin shader code failed to unroll.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31701>
2024-12-02 11:44:33 +11:00
Rhys Perry
65a54b4ec4 nir/lcssa: fix premature exit of loop after rematerializing derefs
If we have NIR such as:

32x4  %48 = @load_vulkan_descriptor (%47) (desc_type=SSBO)
32x4  %76 = deref_cast (tint_symbol_11 *)%48 (ssbo tint_symbol_11)  (ptr_stride=0, align_mul=4, align_offset=0)
32x4  %77 = deref_struct &%76->tint_symbol_10 (ssbo int)  // &((tint_symbol_11 *)%48)->tint_symbol_10

A single nir_rematerialize_deref_in_use_blocks() will rematerialize the
deref_struct and then it's deref_cast. However,
nir_foreach_instr_reverse_safe is not safe if the next iteration's
instruction is removed. This can result in the instruction loop exiting
and the load_vulkan_descriptor never having an LCSSA phi.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 439e8c42cc ("nir/lcssa: Fix rematerializing derefs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11770
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32225>
2024-11-19 18:59:05 +00:00
Marek Olšák
a44e5cfccf nir/opt_load_store_vectorize: allow a 4-byte hole between 2 loads
If there is a 4-byte hole between 2 loads, drivers can now optionally
vectorize the loads by including the hole between them, e.g.:
    4B load + 4B hole + 8B load --> 16B load

All vectorize callbacks already reject all holes, but AMD will want to
allow it.

radeonsi+ACO with the new vectorization callback:

TOTALS FROM AFFECTED SHADERS (25248/58918)
  VGPRs: 871116 -> 871872 (0.09 %)
  Spilled SGPRs: 397 -> 407 (2.52 %)
  Code Size: 43074536 -> 42496352 (-1.34 %) bytes

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
80c156422d nir/opt_load_store_vectorize: allow overfetching, merge overfetched loads
New load merging transformations (first, second), examples:
    (vec4, vec3) ==> vec8(read=0x7f) (because NIR doesn't have vec7)
    (vec1, vec8(read=0x7f)) ==> vec8(read=0xff)
    - the unused component at the end of vec8 is dropped

Not merged:
    vec8(read=0xfe) + vec1
    - unused components at the beginning are kept

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
65ace5649b nir: reject unsupported component counts from all vectorize callbacks
If you allow an unsupported component count in the callback for loads,
nir_opt_load_store_vectorize will align num_components to the next supported
vector size, essentially overfetching.

This changes all callbacks to reject it. AMD will enable it in a later commit.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
02923e237d nir: add hole_size parameter into the vectorize callback
It will be used to allow merging loads with a hole between them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Rhys Perry
be64454710 nir/tests: test opt_loop_peel_initial_break with derefs in header block
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>
2024-10-01 12:24:22 +00:00
Rhys Perry
8410b4cdd6 nir/tests: add some loop peeling tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>
2024-09-12 23:36:58 +00:00
Timothy Arceri
bb426b7f3c nir/tests: add basic terminator merge test
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30862>
2024-08-29 10:26:30 +00:00
Timothy Arceri
85741c6a15 nir/tests: make add_loop_terminators more flexible
Here we update the helper to have an option to add the break to the else
blocks of the terminators.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30862>
2024-08-29 10:26:30 +00:00