Commit graph

45 commits

Author SHA1 Message Date
Mary Guillemard
e0be93d881 nir: Add Panfrost specific shader_output intrinsic
On Avalon, this is a bitfield that holds information on what
values a vertex shader should output.

Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com>
Reviewed-by: Benjamin Lee <benjamin.lee@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33910>
2025-03-10 07:38:16 +01:00
Alyssa Rosenzweig
9a58a8257e treewide: Switch to nir_progress
Via the Coccinelle patch at the end of the commit message, followed by

sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')
ninja -C ~/mesa/build clang-format
cd ~/mesa/src/compiler/nir && clang-format -i *.c
agxfmt

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -return true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -return false;
    -}
    +bool progress = prog_expr;
    +return nir_progress(progress, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, prog ? (metadata) : nir_metadata_all);
    +nir_progress(prog, impl, metadata);

    @@
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    -return true;
    +return nir_progress(true, impl, metadata);

    @@
    expression impl;
    @@

    -nir_metadata_preserve(impl, nir_metadata_all);
    -return false;
    +return nir_no_progress(impl);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    -other_prog |= prog;
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +nir_progress(prog, impl, metadata);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -other_prog = true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    identifier prog;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -prog = true;
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +prog = prog | nir_progress(impl_progress, impl, metadata);

    @@
    identifier other_prog, prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -other_prog = true;
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +other_prog = other_prog | nir_progress(prog, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    identifier prog;
    @@

    -if (prog_expr) {
    -prog = true;
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +prog = prog | nir_progress(impl_progress, impl, metadata);

    @@
    expression prog_expr, impl, metadata;
    @@

    -if (prog_expr) {
    -nir_metadata_preserve(impl, metadata);
    -} else {
    -nir_metadata_preserve(impl, nir_metadata_all);
    -}
    +bool impl_progress = prog_expr;
    +nir_progress(impl_progress, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    -prog = true;
    +prog = nir_progress(true, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -}
    -return prog;
    +return nir_progress(prog, impl, metadata);

    @@
    identifier prog;
    expression impl, metadata;
    @@

    -if (prog) {
    -nir_metadata_preserve(impl, metadata);
    -}
    +nir_progress(prog, impl, metadata);

    @@
    expression impl;
    @@

    -nir_metadata_preserve(impl, nir_metadata_all);
    +nir_no_progress(impl);

    @@
    expression impl, metadata;
    @@

    -nir_metadata_preserve(impl, metadata);
    +nir_progress(true, impl, metadata);

squashme! sed -ie 's/progress = progress | /progress |=/g' $(git grep -l 'progress = prog')

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33722>
2025-02-26 15:19:53 +00:00
Benjamin Lee
6f541e2016 panfrost: add intrinsic to load frag coord at a barycentric
This is needed for noperspective lowering, where we need to multiply the
varying value by gl_FragCoord.w at the same barycentric as the varying.
Normal nir_load_frag_coord_zw instructions are lowered to the new
intrinsic on bifrost with the pan_lower_frag_coord_zw pass.

Signed-off-by: Benjamin Lee <benjamin.lee@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32127>
2025-01-03 07:04:05 +00:00
Georg Lehmann
b8fa9daf0c nir: sink/move alu with two identical, non constant sources.
Foz-DB Navi21:
Totals from 32363 (40.76% of 79395) affected shaders:
MaxWaves: 787499 -> 787675 (+0.02%); split: +0.02%, -0.00%
Instrs: 28783404 -> 28783464 (+0.00%); split: -0.01%, +0.01%
CodeSize: 156763536 -> 156765148 (+0.00%); split: -0.01%, +0.02%
VGPRs: 1493304 -> 1492848 (-0.03%); split: -0.04%, +0.01%
Latency: 243022511 -> 243051994 (+0.01%); split: -0.08%, +0.09%
InvThroughput: 57827398 -> 57828129 (+0.00%); split: -0.05%, +0.05%
VClause: 582208 -> 582298 (+0.02%); split: -0.07%, +0.08%
SClause: 959634 -> 959312 (-0.03%); split: -0.07%, +0.04%
Copies: 1965821 -> 1965826 (+0.00%); split: -0.17%, +0.17%
Branches: 710593 -> 710596 (+0.00%); split: -0.00%, +0.01%
PreSGPRs: 1313513 -> 1313632 (+0.01%); split: -0.00%, +0.01%
PreVGPRs: 1210596 -> 1209103 (-0.12%); split: -0.12%, +0.00%
VALU: 19463445 -> 19463497 (+0.00%); split: -0.02%, +0.02%
SALU: 3319529 -> 3319500 (-0.00%); split: -0.01%, +0.01%

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32783>
2024-12-30 13:28:30 +00:00
Caterina Shablia
f4fcfa8016 pan,nir: introduce load_attribute_pan
load_attribute_pan is a panfrost-specific intrinsic for loading
vertex attributes. Takes explicit vertex and instance IDs which
we need in order to implement vertex attribute divisor with
non-zero base instance on v9+.

Passes which are used by panvk are modified to be aware of
load_attribute_pan.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32039>
2024-12-18 08:33:16 +00:00
Georg Lehmann
dbf63a0788 nir: remove nir_op_is_derivative
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>
2024-10-17 09:50:19 +00:00
Georg Lehmann
41e82b8b8e nir: sink is_subgroup_invocation_lt_amd
Having it closer to the branches means we can eliminate an exec copy.

Foz-DB Navi31:
Totals from 11615 (14.63% of 79395) affected shaders:
Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05%
CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05%
VGPRs: 578616 -> 578604 (-0.00%)
SpillSGPRs: 1506 -> 1304 (-13.41%)
Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05%
InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01%
VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00%
SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13%
Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10%
PreSGPRs: 442584 -> 421344 (-4.80%)
VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00%
SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
7fa7812219 nir: merge out of loop decision with nir_can_move_instr logic
One place to modify instead of two when adding new intrinsics here.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
91f8e32a85 nir/opt_sink: do not sink inverse_ballot out of loops
Inverse_ballot result is undefined if the input is not dynamically uniform.
And sinking out of loops might make the input divergent.

Fixes: 18a0ff137f ("nir: sink/move inverse_ballot like moves")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Georg Lehmann
1ec3cc2aed nir/opt_sink: do not sink load_ubo_vec4 out of loops
Same reason as for load_ubo.

Fixes: d199d65c3a ("nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.")

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>
2024-09-12 21:49:34 +00:00
Daniel Schürmann
50d416fe89 nir: add nir_block *nir_src_get_block(src) helper
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7710>
2024-08-29 09:42:55 +00:00
Marek Olšák
b2d32ae246 nir: add nir_intrinsic_load_per_primitive_input, split from io_semantics flag
Instead of having 1 bit in nir_io_semantics indicating a per-primitive
FS input, add a dedicated intrinsic for it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29895>
2024-07-23 16:13:16 +00:00
Daniel Schürmann
ffef3d1709 nir/opt_sink: ignore loops without backedge
Loops without backedge should not be considered loops.
For RADV, 2069 (2.61% of 79395) affected shaders.

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28783>
2024-07-16 12:29:08 +00:00
Karol Herbst
d5da434851 nir/opt_sink: add load_kernel_input
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25362>
2024-06-26 10:04:02 +00:00
Alyssa Rosenzweig
15257b65c6 treewide: use nir_metadata_control_flow
Via Coccinelle patch:

    @@
    @@

    -nir_metadata_block_index | nir_metadata_dominance
    +nir_metadata_control_flow

...plus some manual fixups for call sites missed by coccinelle.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez Romero <jasuarez@igalia.com> [broadcom]
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com> [lima]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29745>
2024-06-17 16:28:14 -04:00
Georg Lehmann
18a0ff137f nir: sink/move inverse_ballot like moves
It's just a copy for the backends that don't lower it.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29502>
2024-06-04 15:40:57 +00:00
Alyssa Rosenzweig
c39896b17b nir: Use getters for nir_src::parent_*
First, we need to give the parent_instr field a unique name to be able to
replace with a helper.  We have parent_instr fields for both nir_src and
nir_def, so let's rename nir_src::parent_instr in preparation for rework.

This was done with a combination of sed and manual fix-ups.

Then we use semantic patches plus manual fixups:

    @@
    expression s;
    @@

    -s->renamed_parent_instr
    +nir_src_parent_instr(s)

    @@
    expression s;
    @@

    -s.renamed_parent_instr
    +nir_src_parent_instr(&s)

    @@
    expression s;
    @@

    -s->parent_if
    +nir_src_parent_if(s)

    @@
    expression s;
    @@

    -s.renamed_parent_if
    +nir_src_parent_if(&s)

    @@
    expression s;
    @@

    -s->is_if
    +nir_src_is_if(s)

    @@
    expression s;
    @@

    -s.is_if
    +nir_src_is_if(&s)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24671>
2023-10-10 04:58:05 -04:00
Alyssa Rosenzweig
4bcb62d203 nir/opt_sink: Also consider load_preamble as const
Acts like constants, schedule them like constants. This lets us move lowered
frag coord code down. Results on dolphin ubers:

   total instructions in shared programs: 195144 -> 196633 (0.76%)
   instructions in affected programs: 175737 -> 177226 (0.85%)
   helped: 28
   HURT: 27
   Instructions are HURT.

   total bytes in shared programs: 1379980 -> 1388308 (0.60%)
   bytes in affected programs: 1244250 -> 1252578 (0.67%)
   helped: 28
   HURT: 27
   Bytes are HURT.

   total halfregs in shared programs: 13591 -> 13557 (-0.25%)
   halfregs in affected programs: 2176 -> 2142 (-1.56%)
   helped: 12
   HURT: 2
   Inconclusive result (%-change mean confidence interval includes 0).

   total threads in shared programs: 233728 -> 234112 (0.16%)
   threads in affected programs: 3264 -> 3648 (11.76%)
   helped: 6
   HURT: 0
   Threads are helped.

Results on Android shader-db:

   total instructions in shared programs: 1775324 -> 1775912 (0.03%)
   instructions in affected programs: 155305 -> 155893 (0.38%)
   helped: 353
   HURT: 548
   Instructions are HURT.

   total bytes in shared programs: 11676650 -> 11678454 (0.02%)
   bytes in affected programs: 1058924 -> 1060728 (0.17%)
   helped: 370
   HURT: 547
   Inconclusive result (value mean confidence interval includes 0).

   total halfregs in shared programs: 484143 -> 471212 (-2.67%)
   halfregs in affected programs: 98833 -> 85902 (-13.08%)
   helped: 2478
   HURT: 674
   Halfregs are helped.

Instr count changes due to losing the RA lottery.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
aead5316d2 nir/opt_sink: Move ALU with constant sources
In general, sinking ALU instructions can negatively impact register pressure,
since it extends the live ranges of the sources, although it does shrink the live range
of the destination.

However, constants do not usually contribute to register pressure. This is not a
totally true assumption, but it's pretty good in practice, since...

* constants can be rematerialized (backend-dependent)
* constants can often be inlined (ISA-dependent)
* constants can sometimes be promoted to free uniform registers (ISA-dependent)
* constants can live in scalar registers although the ALU destination might need
  a vector register (and vector registers are assumed to be much more expensive
  than scalar registers, again ISA-dependent)

So, assume that constants have zero effect on register pressure. Now consider an
ALU instruction where all but one source is a constant. Then there are two
cases:

1. The ALU instruction is moved past when its source was otherwise killed. Then
   there is no effect on register pressure, since the source live range is
   extended exactly as much as the destination live range shrinks.

2. The ALU instruction is moved down but its source is still alive where it's
   moved to. Then register pressure is improved, since the source live range is
   unchanged while the destination live range shrinks.

So, as a heuristic, we always move ALU instructions where n-1 sources are
constant. As an inevitable special case, this also (necessarily) moves unary ALU
ops, which should be beneficial by the same justification. This is not 100%
perfect but it is well-motivated. Results on AGX are decent:

   total instructions in shared programs: 1796101 -> 1795652 (-0.02%)
   instructions in affected programs: 326822 -> 326373 (-0.14%)
   helped: 800
   HURT: 371
   Inconclusive result (%-change mean confidence interval includes 0).

   total bytes in shared programs: 11805004 -> 11801424 (-0.03%)
   bytes in affected programs: 2610630 -> 2607050 (-0.14%)
   helped: 912
   HURT: 462
   Inconclusive result (%-change mean confidence interval includes 0).

   total halfregs in shared programs: 525818 -> 515399 (-1.98%)
   halfregs in affected programs: 118197 -> 107778 (-8.81%)
   helped: 2095
   HURT: 804
   Halfregs are helped.

   total threads in shared programs: 18916608 -> 18917056 (<.01%)
   threads in affected programs: 4800 -> 5248 (9.33%)
   helped: 7
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
561df40211 nir/opt_sink: Do not move derivatives
At the moment, this does nothing. It will prevent problems from the next patch,
however.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
469fd36fba nir/opt_sink: Sink frag coord instructions
load_input-like. ubershaders:

   instructions in affected programs: 72392 -> 72522 (0.18%)
   helped: 8
   HURT: 18
   Inconclusive result (value mean confidence interval includes 0).

   total bytes in shared programs: 1468550 -> 1469170 (0.04%)
   bytes in affected programs: 560486 -> 561106 (0.11%)
   helped: 10
   HURT: 17
   Inconclusive result (value mean confidence interval includes 0).

   total halfregs in shared programs: 13946 -> 13898 (-0.34%)
   halfregs in affected programs: 3642 -> 3594 (-1.32%)
   helped: 21
   HURT: 0
   Halfregs are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
c07a9dca65 nir/opt_sink: Sink load_local_pixel_agx
This is the AGX version of load_output, which shaders can use for framebuffer
fetch. It is beneficial to sink framebuffer fetch as late as possible, both to
reduce register pressure but also to reduce serialization of overlapping
fragments.

Results on a collection of ubershaders:

   total bytes in shared programs: 1468928 -> 1468550 (-0.03%)
   bytes in affected programs: 495300 -> 494922 (-0.08%)
   helped: 24
   HURT: 0
   Bytes are helped.

   total halfregs in shared programs: 14162 -> 13946 (-1.53%)
   halfregs in affected programs: 5148 -> 4932 (-4.20%)
   helped: 27
   HURT: 0
   Halfregs are helped.

   total threads in shared programs: 216896 -> 217664 (0.35%)
   threads in affected programs: 6912 -> 7680 (11.11%)
   helped: 12
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
596682ad4b nir/opt_sink: Sink load_constant_agx
By the time this runs, we will have already lowered load_ubo and load_vbo to
load_constant_agx so we need to handle the backend version.

This is very important for reducing register pressure in monolithic VS+GS
shaders on AGX. Since no other backend has _agx intrinsics, there's no need for
an option to gate this.

The additional instruction count is from more frequent wait instructions due to
fewer instructions grouped together. This should be mitigated in the future with
an ACO-style latency-reducing scheduler in the backend, after register pressure
is reduced by opt_sink.

total instructions in shared programs: 1793385 -> 1796101 (0.15%)
instructions in affected programs: 199816 -> 202532 (1.36%)
helped: 3
HURT: 941
Instructions are HURT.

total bytes in shared programs: 11799628 -> 11805004 (0.05%)
bytes in affected programs: 1345656 -> 1351032 (0.40%)
helped: 34
HURT: 919
Bytes are HURT.

total halfregs in shared programs: 533151 -> 525818 (-1.38%)
halfregs in affected programs: 40335 -> 33002 (-18.18%)
helped: 613
HURT: 42
Halfregs are helped.

total threads in shared programs: 18910464 -> 18916608 (0.03%)
threads in affected programs: 6144 -> 12288 (100.00%)
helped: 12
HURT: 0
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Faith Ekstrand
4e2830c9ef nir: Clean up nir_op_is_vec() and its callers
The nir_op_is_vec() helper I added in 842338e2f0 ("nir: Add a
nir_op_is_vec helper") treats nir_op_mov as a vec even though the
semanitcs of the two are different.  In retrospect, this was a mistake
as the previous three fixup commits show.  This commit splits the helper
into two: nir_op_is_vec() and nir_op_is_vec_or_mov() and uses the
appropriate helper at each call site.  Hopefully, this rename will
encurage any future users of these helpers to think about nir_op_mov as
separate from nir_op_vecN and we can avoid these bugs.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24704>
2023-08-16 21:42:30 +00:00
Faith Ekstrand
b64da56b1a nir: s/nir_instr_ssa_def/nir_instr_def/
Generated by sed:

    sed -i -e 's/nir_instr_ssa_def/nir_instr_def/g' src/**/*.h src/**/*.c src/**/*.cpp

Suggested-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>
2023-08-15 17:44:27 +00:00
Faith Ekstrand
65b6ac8aa4 nir: Rename nir_instr_type_ssa_undef to nir_instr_type_undef
We already renamed the type, we just need to rename the enum and the
casting helper functions.

Generated with sed:

    sed -i -e 's/nir_instr_type_ssa_undef/nir_instr_type_undef/g' src/**/*.h src/**/*.c src/**/*.cpp
    sed -i -e 's/nir_instr_as_ssa_undef/nir_instr_as_undef/g' src/**/*.h src/**/*.c src/**/*.cpp

and two tiny whitespace fixups in lima.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>
2023-08-15 17:44:27 +00:00
Alyssa Rosenzweig
09d31922de nir: Drop "SSA" from NIR language
Everything is SSA now.

   sed -e 's/nir_ssa_def/nir_def/g' \
       -e 's/nir_ssa_undef/nir_undef/g' \
       -e 's/nir_ssa_scalar/nir_scalar/g' \
       -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \
       -e 's/nir_gather_ssa_types/nir_gather_types/g' \
       -i $(git grep -l nir | grep -v relnotes)

   git mv src/compiler/nir/nir_gather_ssa_types.c \
          src/compiler/nir/nir_gather_types.c

   ninja -C build/ clang-format
   cd src/compiler/nir && find *.c *.h -type f -exec clang-format -i \{} \;

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>
2023-08-12 16:44:41 -04:00
Faith Ekstrand
777d336b1f nir: clang-format src/compiler/nir/*.[ch]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24382>
2023-08-12 19:27:28 +00:00
Alyssa Rosenzweig
190b1fdc64 nir: Convert to nir_foreach_function_impl
Done by hand at each call site but going very quickly with funny Vim motions and
common regexes. This is a very common idiom in NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23807>
2023-06-27 22:44:04 +00:00
Alyssa Rosenzweig
7f6491b76d nir: Combine if_uses with instruction uses
Every nir_ssa_def is part of a chain of uses, implemented with doubly linked
lists.  That means each requires 2 * 64-bit = 16 bytes per def, which is
memory intensive. Together they require 32 bytes per def. Not cool.

To cut that memory use in half, we can combine the two linked lists into a
single use list that contains both regular instruction uses and if-uses. To do
this, we augment the nir_src with a boolean "is_if", and reimplement the
abstract if-uses operations on top of that list. That boolean should fit into
the padding already in nir_src so should not actually affect memory use, and in
the future we sneak it into the bottom bit of a pointer.

However, this creates a new inefficiency: now iterating over regular uses
separate from if-uses is (nominally) more expensive. It turns out virtually
every caller of nir_foreach_if_use(_safe) also calls nir_foreach_use(_safe)
immediately before, so we rewrite most of the callers to instead call a new
single `nir_foreach_use_including_if(_safe)` which predicates the logic based on
`src->is_if`. This should mitigate the performance difference.

There's a bit of churn, but this is largely a mechanical set of changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>
2023-04-07 23:48:03 +00:00
Daniel Schürmann
2bb369dd8d nir: add assertions that loops don't have a Continue Construct
Hoping that I didn't miss any, this *should* add assertions
to all functions and passes which explicitly handle 'nir_loop'.

Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>
2023-02-21 10:41:11 +00:00
Iago Toral Quiroga
0a04468704 nir/nir_opt_move: allow to move uniform loads
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Emma Anholt
d199d65c3a nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.
We weren't doing much motion in nir-to-tgsi because we considered all our
lowered load-ubos as unmovable.

softpipe shader-db:

total temps in shared programs: 563942 -> 563136 (-0.14%)
temps in affected programs: 9833 -> 9027 (-8.20%)

r300 shader-db:

instructions in affected programs: 22858 -> 23575 (3.14%)
temps in affected programs: 2039 -> 1813 (-11.08%)

(NIR had given r300 -19% instrs for +40% temps, so this feels like a
worthwhile trade back).

Reivewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14138>
2021-12-11 02:12:27 +00:00
Rhys Perry
a6d92eaf4f nir/sink,nir/move: sink/move reorderable load_ssbo
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
2021-01-21 18:07:03 +00:00
Rhys Perry
870724d43b nir/opt_sink: use common instruction removal/insertion helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7880>
2020-12-03 11:00:16 +00:00
Daniel Schürmann
e60fcf0a87 nir/opt_sink: return early when trying to sink unused instructions
Fixes: 5f6c5e5b86 ('nir: don't sink instructions into loops')

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7874>
2020-12-02 17:12:44 +00:00
Daniel Schürmann
5f6c5e5b86 nir: don't sink instructions into loops
Repeatedly loading constants or evaluating ALU operations
in loops doesn't seem beneficial. This might increase the register
pressure, but the tradeoff seems worth it.

Totals from 13629 (9.77% of 139517) affected shaders (RAVEN):
SGPRs: 1179481 -> 1184697 (+0.44%); split: -0.03%, +0.47%
VGPRs: 978776 -> 978732 (-0.00%); split: -0.02%, +0.02%
SpillSGPRs: 51036 -> 50943 (-0.18%); split: -1.35%, +1.17%
CodeSize: 113775020 -> 113428812 (-0.30%); split: -0.34%, +0.04%
MaxWaves: 49877 -> 49881 (+0.01%); split: +0.02%, -0.01%
Instrs: 22295979 -> 22204936 (-0.41%); split: -0.42%, +0.02%
Cycles: 1637198832 -> 1626916048 (-0.63%); split: -0.64%, +0.01%
VMEM: 2403434 -> 2507645 (+4.34%); split: +4.76%, -0.42%
SMEM: 849676 -> 834576 (-1.78%); split: +0.60%, -2.38%
VClause: 412396 -> 398139 (-3.46%); split: -3.46%, +0.01%
SClause: 810480 -> 817349 (+0.85%); split: -0.19%, +1.04%
Copies: 2188260 -> 2166716 (-0.98%); split: -1.18%, +0.19%
Branches: 761204 -> 760475 (-0.10%); split: -0.15%, +0.05%
PreSGPRs: 972892 -> 981054 (+0.84%); split: -0.05%, +0.89%
PreVGPRs: 925390 -> 925420 (+0.00%); split: -0.02%, +0.02%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7694>
2020-12-01 17:31:06 +01:00
Daniel Schürmann
e4281dbecc nir: also move b2i in case of nir_move_copies
Booleans are often more efficient with register usage.
This also allows to move comparisons further.

Totals from affected shaders: (VEGA)
SGPRS: 451608 -> 450320 (-0.29 %)
VGPRS: 351448 -> 351256 (-0.05 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1008 -> 1008 (0.00 %) dwords per thread
Code Size: 26555596 -> 26551080 (-0.02 %) bytes
LDS: 10323 -> 10323 (0.00 %) blocks
Max Waves: 42850 -> 42934 (0.20 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>
2020-07-20 15:56:45 +00:00
Daniel Schürmann
9300a14ffb nir: refactor nir_can_move_instr
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5622>
2020-07-07 19:24:28 +02:00
Daniel Schürmann
09d0e06c5c nir: also move vecN in case of nir_move_copies
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5622>
2020-07-07 19:24:28 +02:00
Rhys Perry
d8e05edbd9 nir/sink,nir/move: move/sink nir_op_mov
Can uncover opportunities to move other instructions. This can increase
register usage, but that doesn't seem to actually happen.

This optimizes a pattern of a load_per_vertex_input followed by several
moves and then a store_output in a different block.

v2: add nir_move_copies to make it optional

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Acked-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
2020-01-14 13:56:45 +00:00
Rhys Perry
04fac72ec7 nir/sink,nir/move: move/sink load_per_vertex_input
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
2020-01-14 13:56:45 +00:00
Connor Abbott
5ac32b2954 nir/sink: Don't sink load_ubo to outside of its defining loop
Previously, this could have made the resource divergent in code like
that which is genereated by nir_lower_non_uniform_access.

Fixes: da8ed68a ('nir: replace nir_move_load_const() with nir_opt_sink()')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 17:55:25 +00:00
Connor Abbott
af9296b8c0 nir/sink: Rewrite loop handling logic
Previously, for code like:
loop {
    loop {
        a = load_ubo()
    }
    use(a)
}
adjust_block_for_loops() would return the block before the first loop.
Now we compute the range of allowed blocks and then walk the dominance
tree directly, guaranteeing directly that we always choose a block that
dominates all the uses and is dominated by the definition.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-09 17:55:25 +00:00
Rhys Perry
da8ed68aca nir: replace nir_move_load_const() with nir_opt_sink()
This is mostly the same as nir_move_load_const() but can also move
undef instructions, comparisons and some intrinsics (being careful with
loops).

v2: actually delete nir_move_load_const.c
v3: fix nir_opt_sink() usage in freedreno
v3: update Makefile.sources
v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
v4: handle if uses
v4: fix handling of nested loops
v5: re-write adjust_block_for_loops
v5: re-write setting of use_block for if uses

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00