Commit graph

4881 commits

Author SHA1 Message Date
Danylo Piliaiev
a5f0f7d4b1 turnip,ir3: Implement A7XX push consts load via preamble
New push consts loading consist of:
- Push consts are set for the entire pipeline via HLSQ_SHARED_CONSTS_IMM
  array which could fit up to 256b of push consts.
- For each shader stage that uses push consts READ_IMM_SHARED_CONSTS
  should be set in HLSQ_*_CNTL, otherwise push consts may get overwritten
  by new push consts that are set after the draw.
- Push consts are loaded into consts reg file in a shader preamble via
  stsc at the very start of the preamble.

OPC_PUSH_CONSTS_LOAD_MACRO is used instead of directly translating NIR
intrinsic into stsc because: we don't want to teach legalize pass how
to set (ss) between stores and loads of consts reg file, don't want for
stsc to be reordered, etc.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25086>
2023-10-04 15:51:54 +00:00
Georg Lehmann
bd16d3cdaf nir/lower_subgroups: use intrinsic builder more
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25501>
2023-10-03 12:49:28 +00:00
Georg Lehmann
289b369597 nir: make quad intrinsic dst bit size match src0
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25501>
2023-10-03 12:49:28 +00:00
Rhys Perry
7139a78959 nir/constant_folding: remove zero texel offset
fossil-db (navi31):
Totals from 7 (0.01% of 79330) affected shaders:
Instrs: 7001 -> 6993 (-0.11%)
CodeSize: 35736 -> 35692 (-0.12%)
InvThroughput: 3232 -> 3229 (-0.09%)
Copies: 552 -> 549 (-0.54%)
PreSGPRs: 277 -> 273 (-1.44%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25477>
2023-10-02 10:11:37 +00:00
Georg Lehmann
305db1af11 nir: scalarize masked_swizzle_amd created from shuffle_xor
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9901
Fixes: 0ef87f148d ("nir/lower_subgroups: Don't do multiple lowerings at once")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25468>
2023-10-02 09:01:18 +00:00
Alyssa Rosenzweig
10b9c2fa36 nir: Support arrays in block_image_store_agx
For layered rendering, runs once per layer.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-01 12:32:12 -04:00
Alyssa Rosenzweig
f4042afd57 nir: Add layer_id_written_agx sysval
We'll implement layer ID reads in the frag shader with a varying read, but if
the VS doesn't write the varying we need to return 0 per the spec. Add a sysval
to detect that case so we can handle it at runtime without keys.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2023-10-01 12:32:11 -04:00
Timothy Arceri
f2e87c5c28 nir: add used field to nir variables
Will be use in a following path by the glsl nir based linker.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25371>
2023-09-28 13:55:16 +00:00
Timothy Arceri
337c32cb3a nir: copy explicit_invariant flag to nir vars
This will be used in the following patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25371>
2023-09-28 13:55:16 +00:00
Caio Oliveira
af3eb80afa nir: Handle cooperative matrix in various passes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>
2023-09-28 07:35:02 +00:00
Caio Oliveira
3105d516d0 nir: Add new intrinsics for Cooperative Matrix
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>
2023-09-28 07:35:02 +00:00
Caio Oliveira
2d0f4f2c17 compiler/types: Add support for Cooperative Matrix types
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23825>
2023-09-28 07:35:02 +00:00
Timothy Arceri
1780102923 nir: fix typo in comment
The variable is unused or dead, not used.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25414>
2023-09-28 01:54:43 +00:00
Rhys Perry
65afc8bebf nir/algebraic: optimize u2u32(a >> 32)
fossil-db (navi21):
Totals from 352 (0.44% of 79330) affected shaders:
Instrs: 271816 -> 271240 (-0.21%); split: -0.28%, +0.07%
CodeSize: 1546520 -> 1544448 (-0.13%); split: -0.23%, +0.09%
SpillVGPRs: 832 -> 827 (-0.60%); split: -1.08%, +0.48%
Latency: 4037120 -> 4021748 (-0.38%); split: -0.41%, +0.03%
InvThroughput: 1369540 -> 1362066 (-0.55%); split: -0.59%, +0.04%
VClause: 6476 -> 6471 (-0.08%); split: -0.12%, +0.05%
SClause: 6798 -> 6794 (-0.06%)
Copies: 44828 -> 44630 (-0.44%); split: -0.89%, +0.45%
Branches: 8845 -> 8844 (-0.01%); split: -0.05%, +0.03%
PreSGPRs: 14684 -> 14659 (-0.17%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25409>
2023-09-27 22:13:01 +00:00
Rhys Perry
bcdac65ca3 nir/lower_int64: fix find_lsb(0)
If the high 32 bits were zero, this would be umin(find_lsb(lo), 31). This
evaluates to 31 if lo is also zero, instead of -1.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 9293d8e64b ("nir: Add find_lsb lowering to nir_lower_int64.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25409>
2023-09-27 22:13:01 +00:00
Karol Herbst
807ff7ed01 nir: add nir_lower_alu_vec8_16_srcs pass
This pass is useful for vector based backends as we might end up with alu
instructions referencing vec8/vec16 values even though being vec4 or
smaller themselves.

This new pass intents to clean up any use of vec8/vec16 sources other
passes won't.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25330>
2023-09-27 11:54:13 +00:00
Samuel Pitoiset
1ce80653b2 nir: rename atomic_add_gs_invocation_count_amd to make it more generic
It will be re-used to implement mesh/tash shader invocations queries.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25331>
2023-09-26 07:50:15 +00:00
Caio Oliveira
63ab985511 util: Use an opaque type for linear context
In the linear allocation only the parent (context) can be used
to allocate new children, so let's use an opaque type to identify
the linear context.  This is similar to what's done in GC allocator.

Update the documentation and a couple of function names to
refer to linear context instead of linear parent.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
2023-09-25 17:26:17 +00:00
Caio Oliveira
aec516ead6 util: Remove size from linear_parent creation
None of the callsites took advantage of this, so remove
the feature.  This will help to a next change that will
add an opaque type to represent a linear parent.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25280>
2023-09-25 17:26:17 +00:00
Caio Oliveira
3988d901ac meson: Remove unnecessary inc_compiler mentions
The inc_compiler should come as part of idep_compiler, idep_nir or
idep_nir_headers dependency.

Acked-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v3dv)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25314>
2023-09-22 14:52:50 +00:00
Caio Oliveira
ec835595f0 compiler: Use a meson dependency for libcompiler
That will make sure the include directories are passed on and also
make sure the generated headers are properly built before whoever code
depends on it. NIR dependency propagates that dependency too.

Since the right include directory is always propagated, we can remove
the extra "compiler/" prefix from the `#include`s in glsl_types.h.

Note: NIR has a special "header only" dependency, so include the
generated headers for compiler there too.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9843
Reviewed-by: Eric Engestrom <eric@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25314>
2023-09-22 14:52:50 +00:00
Konstantin Seurer
be8a73f40d nir/deref: Layer rematerialization helpers
nir_rematerialize_derefs_in_use_blocks_impl can be implemented on top of
nir_rematerialize_deref_in_use_blocks.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Konstantin Seurer
439e8c42cc nir/lcssa: Fix rematerializing derefs
This would pull derefs out of loops by emitting the pattern
`deref(phi(deref))` which is not allowed by nir_validate.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Konstantin Seurer
29dc1b193a nir: Add nir_rematerialize_deref_in_use_blocks
nir_rematerialize_deref_in_use_blocks can be used in passes that don't
run on the whole function.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Rhys Perry
ba809dccb8 nir/deref: remove rematerialize_deref_in_block cache
Nothing was ever inserted into this.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Konstantin Seurer
ab1310e84d nir: Add nir_foreach_block_in_cf_node_reverse
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Konstantin Seurer
70e497a2ac nir: Add nir_cf_node_cf_tree_prev
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23712>
2023-09-22 10:05:58 +00:00
Ian Romanick
2157f136d7 nir/rematerialize: Rematerialize ALUs used only by compares with zero
This was 4th on the list of things to try in 3ee2e84c60 ("nir:
Rematerialize compare instructions"). This is implemented as a separate
subpass that tries to find ALU instructions (with restrictions) that are
only used by comparisons with zero that are in turn only used as
conditions for bcsel or if-statements.

There are two restrictions implemented. One of the sources must be a
constant. This is done in an attempt to prevent increasing register
pressure. Additionally, the opcode of the instruction must be one that
has a high probablility of getting a conditional modifier on Intel
GPUs. Not all instructions can have a conditional modifiers (e.g., min
and max), so I don't think there is any benefit to moving these
instructions.

v2: Rebase on many, many recent NIR infrastructure changes.

v3: Make data in commit message more clear. Suggested by Matt. Rebase on
b5d6b7c402 ("nir: Drop most uses if nir_instr_rewrite_src()").

All of the affected shaders on ILK and G45 are in CS:GO. There is some
brief analysis of the changes in the MR.

Reviewed-by: Matt Tuner <mattst88@gmail.com>

Shader-db results:

DG2
total instructions in shared programs: 22824637 -> 22824258 (<.01%)
instructions in affected programs: 365742 -> 365363 (-0.10%)
helped: 190 / HURT: 97

total cycles in shared programs: 832186193 -> 832157290 (<.01%)
cycles in affected programs: 41245259 -> 41216356 (-0.07%)
helped: 208 / HURT: 117

total spills in shared programs: 4072 -> 4060 (-0.29%)
spills in affected programs: 366 -> 354 (-3.28%)
helped: 4 / HURT: 2

total fills in shared programs: 3601 -> 3607 (0.17%)
fills in affected programs: 708 -> 714 (0.85%)
helped: 4 / HURT: 2

LOST:   0
GAINED: 1

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 20320934 -> 20320689 (<.01%)
instructions in affected programs: 236592 -> 236347 (-0.10%)
helped: 176 / HURT: 29

total cycles in shared programs: 849846341 -> 849843856 (<.01%)
cycles in affected programs: 41277336 -> 41274851 (<.01%)
helped: 195 / HURT: 110

LOST:   0
GAINED: 1

Skylake
total instructions in shared programs: 18550811 -> 18550470 (<.01%)
instructions in affected programs: 233908 -> 233567 (-0.15%)
helped: 182 / HURT: 25

total cycles in shared programs: 835910983 -> 835889167 (<.01%)
cycles in affected programs: 38764359 -> 38742543 (-0.06%)
helped: 207/ HURT: 94

total spills in shared programs: 4522 -> 4506 (-0.35%)
spills in affected programs: 324 -> 308 (-4.94%)
helped: 4 / HURT: 0

total fills in shared programs: 5296 -> 5280 (-0.30%)
fills in affected programs: 324 -> 308 (-4.94%)
helped: 4 / HURT: 0

LOST:   0
GAINED: 1

Broadwell
total instructions in shared programs: 18199130 -> 18197920 (<.01%)
instructions in affected programs: 214664 -> 213454 (-0.56%)
helped: 191 / HURT: 0

total cycles in shared programs: 935131908 -> 934870248 (-0.03%)
cycles in affected programs: 75770568 -> 75508908 (-0.35%)
helped: 203 / HURT: 84

total spills in shared programs: 13896 -> 13734 (-1.17%)
spills in affected programs: 162 -> 0
helped: 3 / HURT: 0

total fills in shared programs: 16989 -> 16761 (-1.34%)
fills in affected programs: 228 -> 0
helped: 3 / HURT: 0

Haswell
total instructions in shared programs: 16969502 -> 16969085 (<.01%)
instructions in affected programs: 185498 -> 185081 (-0.22%)
helped: 121 / HURT: 1

total cycles in shared programs: 925290863 -> 924806827 (-0.05%)
cycles in affected programs: 30200863 -> 29716827 (-1.60%)
helped: 100 / HURT: 85

total spills in shared programs: 13565 -> 13533 (-0.24%)
spills in affected programs: 736 -> 704 (-4.35%)
helped: 8 / HURT: 0

total fills in shared programs: 15468 -> 15436 (-0.21%)
fills in affected programs: 740 -> 708 (-4.32%)
helped: 8 / HURT: 0

LOST:   0
GAINED: 1

Ivy Bridge
total instructions in shared programs: 15839127 -> 15838947 (<.01%)
instructions in affected programs: 77776 -> 77596 (-0.23%)
helped: 58 / HURT: 0

total cycles in shared programs: 459852774 -> 459739770 (-0.02%)
cycles in affected programs: 11970210 -> 11857206 (-0.94%)
helped: 79 / HURT: 53

Sandy Bridge
total instructions in shared programs: 14106847 -> 14106831 (<.01%)
instructions in affected programs: 1611 -> 1595 (-0.99%)
helped: 10 / HURT: 0

total cycles in shared programs: 775004024 -> 775007516 (<.01%)
cycles in affected programs: 2530686 -> 2534178 (0.14%)
helped: 55 / HURT: 48

Iron Lake
total cycles in shared programs: 257753356 -> 257754900 (<.01%)
cycles in affected programs: 2977374 -> 2978918 (0.05%)
helped: 12 / HURT: 106

GM45
total cycles in shared programs: 169711382 -> 169712816 (<.01%)
cycles in affected programs: 2402070 -> 2403504 (0.06%)
helped: 12 / HURT: 57

Fossil-db results:

All Intel platforms had similar results. (DG2 shown)
Totals:
Instrs: 193884596 -> 193465896 (-0.22%); split: -0.25%, +0.03%
Cycles: 14050193354 -> 14048194826 (-0.01%); split: -0.34%, +0.33%
Spill count: 114944 -> 100449 (-12.61%); split: -13.59%, +0.98%
Fill count: 201525 -> 179534 (-10.91%); split: -11.22%, +0.31%
Scratch Memory Size: 10028032 -> 8468480 (-15.55%)

Totals from 16912 (2.59% of 653124) affected shaders:
Instrs: 34173709 -> 33755009 (-1.23%); split: -1.41%, +0.19%
Cycles: 2945969110 -> 2943970582 (-0.07%); split: -1.62%, +1.55%
Spill count: 97753 -> 83258 (-14.83%); split: -15.98%, +1.15%
Fill count: 176355 -> 154364 (-12.47%); split: -12.82%, +0.35%
Scratch Memory Size: 8619008 -> 7059456 (-18.09%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20176>
2023-09-21 16:58:29 +00:00
Connor Abbott
4282386311 nir/spirv: Add inverse_ballot intrinsic
This is actually a no-op on AMD, so we really don't want to lower it to
something more complicated.  There may be a more efficient way to do
this on Intel too. In addition, in the future we'll want to use this for
lowering boolean reduce operations, where the inverse ballot will
operate on the backend's "natural" ballot type as indicated by
options->ballot_bit_size, instead of uvec4 as produced by SPIR-V. In
total, there are now three possible lowerings we may have to perform:

- inverse_ballot with source type of uvec4 from SPIR-V to inverse_ballot
with natural source type, when the backend supports inverse_ballot
natively.
- inverse_ballot with source type of uvec4 from SPIR-V to arithmetic,
when the backend doesn't support inverse_ballot.
- inverse_ballot with natural source type from reduce operation, when
the backend doesn't support inverse_ballot.

Previously we just did the second lowering unconditionally in vtn, but
it's just a combination of the first and third. We add support here for
the first and third lowerings in nir_lower_subgroups, instead of simply
moving the second lowering, to avoid unnecessary churn.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>
2023-09-20 14:41:18 +00:00
Connor Abbott
0ef87f148d nir/lower_subgroups: Don't do multiple lowerings at once
Since using nir_shader_lower_instructions(), instructions get revisited
before proceeding with the next one. This already guarantees that any
subsequent lowerings of those instructions happen during the same pass
of nir_lower_subgroups().

v2: use nir_shader_lower_instructions() instead of setting the cursor.

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25123>
2023-09-20 14:41:18 +00:00
Pavel Ondračka
1c72c71bdf nir/move_vec_src_uses_to_dest: allow to skip reuse of constant sources
And enable this for r300 and intel-vec4

crocus HSW (mostly helps few doplhin ubershaders):
total instructions in shared programs: 1576736 -> 1576589 (<.01%)
instructions in affected programs: 38235 -> 38088 (-0.38%)
helped: 12
HURT: 0
total cycles in shared programs: 111025838 -> 110944796 (-0.07%)
cycles in affected programs: 5646582 -> 5565540 (-1.44%)
helped: 15
HURT: 6
total spills in shared programs: 447 -> 432 (-3.36%)
spills in affected programs: 186 -> 171 (-8.06%)
helped: 12
HURT: 0
total fills in shared programs: 792 -> 774 (-2.27%)
fills in affected programs: 291 -> 273 (-6.19%)
helped: 12
HURT: 0

r300 RV530:
total instructions in shared programs: 96655 -> 96304 (-0.36%)
instructions in affected programs: 15020 -> 14669 (-2.34%)
helped: 79
HURT: 18
total temps in shared programs: 13027 -> 12952 (-0.58%)
temps in affected programs: 677 -> 602 (-11.08%)
helped: 41
HURT: 9
total cycles in shared programs: 147745 -> 147314 (-0.29%)
cycles in affected programs: 21831 -> 21400 (-1.97%)
helped: 84
HURT: 19

r300 RV370:
total instructions in shared programs: 63678 -> 63669 (-0.01%)
instructions in affected programs: 931 -> 922 (-0.97%)
helped: 12
HURT: 6
total temps in shared programs: 10028 -> 10013 (-0.15%)
temps in affected programs: 339 -> 324 (-4.42%)
helped: 33
HURT: 10
total cycles in shared programs: 101118 -> 101087 (-0.03%)
cycles in affected programs: 2659 -> 2628 (-1.17%)
helped: 22
HURT: 6

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>
2023-09-19 18:05:37 +02:00
Pavel Ondračka
dc60194599 nir/move_vec_src_uses_to_dest: skip reuse if vec is used only once in store_output
lima and etnaviv show no change in shader-db.

crocus HSW:
total instructions in shared programs: 1576762 -> 1576736 (<.01%)
instructions in affected programs: 485 -> 459 (-5.36%)
helped: 28
HURT: 1
total cycles in shared programs: 111025898 -> 111025838 (<.01%)
cycles in affected programs: 1248 -> 1188 (-4.81%)
helped: 29
HURT: 0

RV370:
total instructions in shared programs: 63889 -> 63558 (-0.52%)
instructions in affected programs: 9116 -> 8785 (-3.63%)
helped: 129
HURT: 0
total temps in shared programs: 10071 -> 10016 (-0.55%)
temps in affected programs: 285 -> 230 (-19.30%)
helped: 51
HURT: 0
total cycles in shared programs: 101344 -> 100997 (-0.34%)
cycles in affected programs: 9326 -> 8979 (-3.72%)
helped: 129
HURT: 0

RV530:
total instructions in shared programs: 93597 -> 93267 (-0.35%)
instructions in affected programs: 10309 -> 9979 (-3.20%)
helped: 166
HURT: 0
total temps in shared programs: 13019 -> 12955 (-0.49%)
temps in affected programs: 337 -> 273 (-18.99%)
helped: 61
HURT: 1
total cycles in shared programs: 144506 -> 144159 (-0.24%)
cycles in affected programs: 10662 -> 10315 (-3.25%)
helped: 165
HURT: 0

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24932>
2023-09-19 18:05:30 +02:00
Dave Airlie
51840bbdce nir: add a deref slot counter that handles compact
Conor suggested this, so we can mark slots properly
in the io marking.

This fixes a problem seen when rewriting llvmpipe to use
nir info instead of tgsi info.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24803>
2023-09-18 16:47:30 +00:00
Alyssa Rosenzweig
b318b3d520 nir: Remove nir_ssa_for_src
It is now unused and has no real use cases now that nir_register is gone.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>
2023-09-18 10:25:17 -04:00
Alyssa Rosenzweig
55333fce77 treewide: Remove remaining nir_ssa_for_src
Coccinelle missed these, a few manual changes here.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>
2023-09-18 10:25:17 -04:00
Alyssa Rosenzweig
d1eb17e92e treewide: Drop nir_ssa_for_src users
Via Coccinelle patch:

    @@
    expression b, s, n;
    @@

    -nir_ssa_for_src(b, *s, n)
    +s->ssa

    @@
    expression b, s, n;
    @@

    -nir_ssa_for_src(b, s, n)
    +s.ssa

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>
2023-09-18 10:25:17 -04:00
Alyssa Rosenzweig
4bcb62d203 nir/opt_sink: Also consider load_preamble as const
Acts like constants, schedule them like constants. This lets us move lowered
frag coord code down. Results on dolphin ubers:

   total instructions in shared programs: 195144 -> 196633 (0.76%)
   instructions in affected programs: 175737 -> 177226 (0.85%)
   helped: 28
   HURT: 27
   Instructions are HURT.

   total bytes in shared programs: 1379980 -> 1388308 (0.60%)
   bytes in affected programs: 1244250 -> 1252578 (0.67%)
   helped: 28
   HURT: 27
   Bytes are HURT.

   total halfregs in shared programs: 13591 -> 13557 (-0.25%)
   halfregs in affected programs: 2176 -> 2142 (-1.56%)
   helped: 12
   HURT: 2
   Inconclusive result (%-change mean confidence interval includes 0).

   total threads in shared programs: 233728 -> 234112 (0.16%)
   threads in affected programs: 3264 -> 3648 (11.76%)
   helped: 6
   HURT: 0
   Threads are helped.

Results on Android shader-db:

   total instructions in shared programs: 1775324 -> 1775912 (0.03%)
   instructions in affected programs: 155305 -> 155893 (0.38%)
   helped: 353
   HURT: 548
   Instructions are HURT.

   total bytes in shared programs: 11676650 -> 11678454 (0.02%)
   bytes in affected programs: 1058924 -> 1060728 (0.17%)
   helped: 370
   HURT: 547
   Inconclusive result (value mean confidence interval includes 0).

   total halfregs in shared programs: 484143 -> 471212 (-2.67%)
   halfregs in affected programs: 98833 -> 85902 (-13.08%)
   helped: 2478
   HURT: 674
   Halfregs are helped.

Instr count changes due to losing the RA lottery.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
aead5316d2 nir/opt_sink: Move ALU with constant sources
In general, sinking ALU instructions can negatively impact register pressure,
since it extends the live ranges of the sources, although it does shrink the live range
of the destination.

However, constants do not usually contribute to register pressure. This is not a
totally true assumption, but it's pretty good in practice, since...

* constants can be rematerialized (backend-dependent)
* constants can often be inlined (ISA-dependent)
* constants can sometimes be promoted to free uniform registers (ISA-dependent)
* constants can live in scalar registers although the ALU destination might need
  a vector register (and vector registers are assumed to be much more expensive
  than scalar registers, again ISA-dependent)

So, assume that constants have zero effect on register pressure. Now consider an
ALU instruction where all but one source is a constant. Then there are two
cases:

1. The ALU instruction is moved past when its source was otherwise killed. Then
   there is no effect on register pressure, since the source live range is
   extended exactly as much as the destination live range shrinks.

2. The ALU instruction is moved down but its source is still alive where it's
   moved to. Then register pressure is improved, since the source live range is
   unchanged while the destination live range shrinks.

So, as a heuristic, we always move ALU instructions where n-1 sources are
constant. As an inevitable special case, this also (necessarily) moves unary ALU
ops, which should be beneficial by the same justification. This is not 100%
perfect but it is well-motivated. Results on AGX are decent:

   total instructions in shared programs: 1796101 -> 1795652 (-0.02%)
   instructions in affected programs: 326822 -> 326373 (-0.14%)
   helped: 800
   HURT: 371
   Inconclusive result (%-change mean confidence interval includes 0).

   total bytes in shared programs: 11805004 -> 11801424 (-0.03%)
   bytes in affected programs: 2610630 -> 2607050 (-0.14%)
   helped: 912
   HURT: 462
   Inconclusive result (%-change mean confidence interval includes 0).

   total halfregs in shared programs: 525818 -> 515399 (-1.98%)
   halfregs in affected programs: 118197 -> 107778 (-8.81%)
   helped: 2095
   HURT: 804
   Halfregs are helped.

   total threads in shared programs: 18916608 -> 18917056 (<.01%)
   threads in affected programs: 4800 -> 5248 (9.33%)
   helped: 7
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
561df40211 nir/opt_sink: Do not move derivatives
At the moment, this does nothing. It will prevent problems from the next patch,
however.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
469fd36fba nir/opt_sink: Sink frag coord instructions
load_input-like. ubershaders:

   instructions in affected programs: 72392 -> 72522 (0.18%)
   helped: 8
   HURT: 18
   Inconclusive result (value mean confidence interval includes 0).

   total bytes in shared programs: 1468550 -> 1469170 (0.04%)
   bytes in affected programs: 560486 -> 561106 (0.11%)
   helped: 10
   HURT: 17
   Inconclusive result (value mean confidence interval includes 0).

   total halfregs in shared programs: 13946 -> 13898 (-0.34%)
   halfregs in affected programs: 3642 -> 3594 (-1.32%)
   helped: 21
   HURT: 0
   Halfregs are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
c07a9dca65 nir/opt_sink: Sink load_local_pixel_agx
This is the AGX version of load_output, which shaders can use for framebuffer
fetch. It is beneficial to sink framebuffer fetch as late as possible, both to
reduce register pressure but also to reduce serialization of overlapping
fragments.

Results on a collection of ubershaders:

   total bytes in shared programs: 1468928 -> 1468550 (-0.03%)
   bytes in affected programs: 495300 -> 494922 (-0.08%)
   helped: 24
   HURT: 0
   Bytes are helped.

   total halfregs in shared programs: 14162 -> 13946 (-1.53%)
   halfregs in affected programs: 5148 -> 4932 (-4.20%)
   helped: 27
   HURT: 0
   Halfregs are helped.

   total threads in shared programs: 216896 -> 217664 (0.35%)
   threads in affected programs: 6912 -> 7680 (11.11%)
   helped: 12
   HURT: 0
   Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
596682ad4b nir/opt_sink: Sink load_constant_agx
By the time this runs, we will have already lowered load_ubo and load_vbo to
load_constant_agx so we need to handle the backend version.

This is very important for reducing register pressure in monolithic VS+GS
shaders on AGX. Since no other backend has _agx intrinsics, there's no need for
an option to gate this.

The additional instruction count is from more frequent wait instructions due to
fewer instructions grouped together. This should be mitigated in the future with
an ACO-style latency-reducing scheduler in the backend, after register pressure
is reduced by opt_sink.

total instructions in shared programs: 1793385 -> 1796101 (0.15%)
instructions in affected programs: 199816 -> 202532 (1.36%)
helped: 3
HURT: 941
Instructions are HURT.

total bytes in shared programs: 11799628 -> 11805004 (0.05%)
bytes in affected programs: 1345656 -> 1351032 (0.40%)
helped: 34
HURT: 919
Bytes are HURT.

total halfregs in shared programs: 533151 -> 525818 (-1.38%)
halfregs in affected programs: 40335 -> 33002 (-18.18%)
helped: 613
HURT: 42
Halfregs are helped.

total threads in shared programs: 18910464 -> 18916608 (0.03%)
threads in affected programs: 6144 -> 12288 (100.00%)
helped: 12
HURT: 0
Threads are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:16 -04:00
Alyssa Rosenzweig
d628be082b nir/gather_info: Use nir_op_is_derivative
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:15 -04:00
Alyssa Rosenzweig
6d3425653a nir/opt_gcm: Use nir_op_is_derivative more
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:15 -04:00
Alyssa Rosenzweig
e0246ed8e4 nir/opt_preamble: Use nir_op_is_derivative
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:15 -04:00
Alyssa Rosenzweig
1a788a86c1 nir: Hoist nir_op_is_derivative
Redefine in terms of the algebraic property. This correctly handles the
Mali-specific derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:15 -04:00
Alyssa Rosenzweig
b77dc9f7d7 nir: Add NIR_OP_IS_DERIVATIVE property
Like IS_SELECTION.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24833>
2023-09-18 08:38:15 -04:00
Caio Oliveira
3890c60584 compiler/types: Remove unused GLSL_TYPE_FUNCTION and related functions
GLSL doesn't use that type.  SPIR-V used for a while but later started
relying on its own data structures and stopped using it.
See ca62e849d3 ("nir/spirv: Stop using glsl_type for function types")

If we were ever to add this one again, would be better to have a way to
grab a key for lookup that did not require allocations, right now that's
needed to inject return type as the first element in params array.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25160>
2023-09-12 23:18:12 +00:00
Illia Polishchuk
b8a54c50a6 nir: fix invalid sampler search by texture id
Sampler id cannot be mapped to a uniform object location

Fixes: 1a8dd84ec6 ("nir: Propagate the type sampler type change to the used variable.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9793

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25145>
2023-09-12 15:44:52 +00:00
Illia Polishchuk
5a7044d0bc zink: move find_sampler_var from zink to nir core
Avoid code duplication because it need to be used in following commits

Fixes: 1a8dd84ec6 ("nir: Propagate the type sampler type change to the used variable.")

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25145>
2023-09-12 15:44:52 +00:00