Commit graph

1790 commits

Author SHA1 Message Date
Daniel Schürmann
fcc5dec8d6 aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
This information is not required anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
cbb1b095ca aco/insert_exec_mask: remove some unnecessary WQM loop handling code
These workarounds are were necessary to prevent infinite loops
with helper lane registers containing wrong data.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
580a63b4ac aco/insert_exec_mask: remove Preserve_WQM flag
If WQM is needed anywhere after discard_if(), it will also
be flagged as WQM. We can rely on that to preserve the WQM mask.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
a5a40beefa aco: don't emit WQM for bool_to_scalar_condition
This was only necessary to ensure that the source is computed in WQM
if the current exec mask is in WQM.

Totals from 23170 (17.17% of 134913) affected shaders: (GFX10.3)
VGPRs: 1384464 -> 1383400 (-0.08%); split: -0.08%, +0.01%
SpillSGPRs: 7575 -> 7574 (-0.01%)
CodeSize: 146444752 -> 146317104 (-0.09%); split: -0.13%, +0.04%
MaxWaves: 429870 -> 429868 (-0.00%)
Instrs: 27202586 -> 27170316 (-0.12%); split: -0.17%, +0.05%
Latency: 379488313 -> 379335412 (-0.04%); split: -0.07%, +0.03%
InvThroughput: 69500561 -> 69487704 (-0.02%); split: -0.03%, +0.01%
VClause: 473080 -> 473038 (-0.01%); split: -0.02%, +0.01%
SClause: 1080576 -> 1080571 (-0.00%); split: -0.06%, +0.06%
Copies: 1492865 -> 1504604 (+0.79%); split: -0.40%, +1.19%
Branches: 711295 -> 716849 (+0.78%); split: -0.01%, +0.79%
PreSGPRs: 1331362 -> 1328402 (-0.22%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
f816dd1be7 aco: don't propagate WQM for p_as_uniform
This was needed, so that in case of active helper lanes,
these contain the correct value. It is now handled implicitly.

Totals from 1004 (0.74% of 134913) affected shaders: (GFX10.3)
CodeSize: 7581020 -> 7580892 (-0.00%); split: -0.00%, +0.00%
Instrs: 1454940 -> 1454908 (-0.00%); split: -0.00%, +0.00%
Latency: 12984953 -> 12984894 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3173037 -> 3173049 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 47498 -> 47273 (-0.47%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
825cd696dc aco/insert_exec_mask: stay in WQM while helper lanes are still needed
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.

Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Samuel Pitoiset
2451290bc4 radv: rewrite RADV_FORCE_VRS directly in NIR
This introduces a small NIR pass that exports
VARYING_SLOT_PRIMITIVE_SHADING_RATE if RADV_FORCE_VRS is used,
instead of doing this in both backend compilers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14907>
2022-02-09 17:40:34 +01:00
Daniel Schürmann
5e9df85b1a aco: optimize discard_if when WQM is not needed afterwards
Totals from 11560 (8.57% of 134913) affected shaders: (GFX10.3)
CodeSize: 12092560 -> 11997652 (-0.78%)
Instrs: 2205325 -> 2181598 (-1.08%)
Latency: 15376048 -> 15356958 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 3526105 -> 3525120 (-0.03%); split: -0.03%, +0.00%
Copies: 98543 -> 87601 (-11.10%)
Branches: 16919 -> 16873 (-0.27%)
PreSGPRs: 291584 -> 291532 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
13c3137960 aco: merge block_kind_uses_[demote|discard_if]
These serve the same purpose. The new name is
block_kind_uses_discard.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
e7d1c8cc5e aco: make Preserve_WQM independent from block_kind_uses_discard_if
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
08b8500dfb aco: remove block_kind_discard
This case doesn't seem to happen in practice.
No need to micro-optimize it.

This patch merges instruction selection for discard/discard_if.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
b67092e685 aco: emit nir_intrinsic_discard() as p_discard_if()
This simplifies the code and emits a slightly better
sequence in some cases.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Rhys Perry
0447a2303f aco: don't encode src2 for v_writelane_b32_e64
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.

https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832>
2022-02-03 16:52:00 +00:00
Rhys Perry
5e3b8eeac4 aco: add test for optimizations with casts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
6b1dfa7eac aco: fix neg(mul)/abs(mul) optimization with different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
13bbc7c882 aco: don't combine add/mul of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
3d8a8c6fc1 aco: don't apply omod/clamp of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
7e30f99b0a aco: don't combine fneg/fabs of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
27f1f5537d aco/tests: implement sub-dword program inputs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
e86b88f85b aco/tests: add a bunch more building helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
ba44634e4d aco: fix v_mac_legacy_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f68797ead7 ("aco: create v_mac_legacy_f32/v_fmac_legacy_f32")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5952
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14820>
2022-02-01 15:08:04 +00:00
Rhys Perry
16e0c312fa aco: preserve pass_flags during format conversions
This helps the "vopc() & exec" optimization.

fossil-db (Sienna Cichlid):
Totals from 1638 (1.21% of 134913) affected shaders:
CodeSize: 3331804 -> 3327520 (-0.13%); split: -0.19%, +0.06%
Instrs: 611807 -> 610096 (-0.28%)
Latency: 5579326 -> 5574874 (-0.08%)
InvThroughput: 936782 -> 936731 (-0.01%); split: -0.01%, +0.00%
Copies: 43324 -> 43302 (-0.05%); split: -0.06%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
1804c21fb5 aco: optimize abs(mul(a, b))
fossil-db (Sienna Cichlid):
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 173924 -> 173852 (-0.04%)
Instrs: 33864 -> 33846 (-0.05%)
Latency: 122233 -> 122211 (-0.02%)
InvThroughput: 22482 -> 22462 (-0.09%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
452975f257 aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
Previously, is_abs was just ignored if mul_instr->isVOP3()==false.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
f68797ead7 aco: create v_mac_legacy_f32/v_fmac_legacy_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00
Rhys Perry
43e32ad074 aco: consider legacy multiplications in optimizer
Optimize omod, -(a*b), b2f(a)*b, a*1, a*0 and create MAD/FMA.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00
Rhys Perry
e7f91b194a radv,aco,ac/llvm: implement fmulz and ffmaz
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00
Dave Airlie
d54c07b4c4 mesa/*: use an internal enum for tessellation primitive types.
To avoid dragging gl.h into places it has no business being,
defined tessellation primitive mode to an enum.

This has a lot of fallout all over the place.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
2022-01-19 21:54:58 +00:00
Rhys Perry
b59764a9fc aco: use p_extract for SGPR nir_op_unpack_half_2x16_split_y
fossil-db (Sienna Cichlid):
Totals from 7264 (5.40% of 134627) affected shaders:
VGPRs: 548152 -> 548184 (+0.01%)
SpillSGPRs: 7615 -> 7607 (-0.11%)
CodeSize: 71025152 -> 70993036 (-0.05%); split: -0.05%, +0.00%
Instrs: 13386799 -> 13298580 (-0.66%); split: -0.66%, +0.00%
Latency: 177464497 -> 177091693 (-0.21%); split: -0.21%, +0.00%
InvThroughput: 32185148 -> 32151873 (-0.10%); split: -0.10%, +0.00%
VClause: 233167 -> 233166 (-0.00%); split: -0.00%, +0.00%
SClause: 468423 -> 468426 (+0.00%); split: -0.00%, +0.01%
Copies: 950727 -> 942753 (-0.84%); split: -0.85%, +0.02%
Branches: 455919 -> 455901 (-0.00%); split: -0.01%, +0.00%

fossil-db (Vega):
Totals from 7264 (5.39% of 134762) affected shaders:
SGPRs: 738800 -> 738816 (+0.00%)
VGPRs: 550264 -> 550344 (+0.01%)
SpillSGPRs: 11149 -> 11157 (+0.07%)
CodeSize: 67487104 -> 67466772 (-0.03%); split: -0.03%, +0.00%
Instrs: 13142106 -> 13061767 (-0.61%)
Latency: 209278575 -> 208438854 (-0.40%); split: -0.40%, +0.00%
InvThroughput: 81486405 -> 81265773 (-0.27%); split: -0.27%, +0.00%
VClause: 222293 -> 222291 (-0.00%); split: -0.00%, +0.00%
SClause: 447783 -> 447800 (+0.00%); split: -0.00%, +0.01%
Copies: 1243760 -> 1238842 (-0.40%); split: -0.43%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14592>
2022-01-19 12:41:48 +00:00
Samuel Pitoiset
e6173ed1d2 radv: allow to disable anisotropic filtering for single level image with drirc
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14471>
2022-01-13 16:17:48 +00:00
Daniel Schürmann
8a78706643 nir: refactor nir_opt_move
This patch is a rewrite of nir_opt_move.
Differently from the previous version, each instruction is checked
if it can be moved downwards and then inserted before the first user
of the definition. The advantage is that less insert operations are
performed, the original order is kept if two movable instructions have
the same first user, and instructions without user in the same block
are moved towards the end.

v2: Only return true if an instruction really changed the position.
    Don't care for discards, this will be handled by another MR.
v3: fix self-referring phis and update according to nir_can_move_instr().
v4: use nir_can_move_instr() and nir_instr_ssa_def()
v5: deduplicate some code

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3657>
2022-01-12 13:41:54 +00:00
Daniel Schürmann
4e2b624c10 aco: validate VOP3P opsel correctly
Before RA, subdword operands must use .xx
After RA, opsel can either be .xx or .yy

Cc: mesa-stable
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14472>
2022-01-11 11:41:12 +00:00
Rhys Perry
60c711833f aco: remove pack_half_2x16(a, 0) optimization
This makes the compiler less predictable and should only have a very small
effect on performance.

fossil-db (Vega):
Totals from 2410 (1.79% of 134756) affected shaders:
CodeSize: 6911568 -> 6942840 (+0.45%)

Fixes Horizon Zero Dawn artifacts.

If a shader has:
   a = pack_half_2x16(a, 0) //rtne
   store(pack_half_2x16(0, b) | a) //rtne
   a = unpack_2x16(a).x
It will become:
   store(pack_half_2x16(a, b)) //rtz
   a = unpack_2x16(pack_half_2x16(a, 0)).x //rtne

So a later shader with "unpack_2x16(load()).x" will use "a" rounded to
zero, while the previous shader will use "a" rounded to the nearest even.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 2f125908b3 ("radv,aco: lower_pack_half_2x16")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14475>
2022-01-10 22:19:29 +00:00
Marek Olšák
116a05c721 ac: move ac_exp_param.h to ac_nir.h
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14266>
2022-01-05 12:46:31 +00:00
Timur Kristóf
bc94c2718a aco: Emit VRS rate when it's per-primitive.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14193>
2022-01-04 17:46:02 +00:00
Samuel Pitoiset
2bf25e6f6e radv,aco: keep track of the prolog disassembly if necessary
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376>
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
e836174077 aco: do not print prologs disassembly if no disassembler
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13376>
2022-01-04 07:50:07 +00:00
Samuel Pitoiset
3ef736c94e aco: fix a dynamic-stack-buffer-overflow when printing instructions
Detected by ASAN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14329>
2022-01-04 08:04:50 +01:00
Tatsuyuki Ishi
31d839aacc aco: lower masked swizzle to DPP8
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971>
2021-12-31 20:56:39 +00:00
Tatsuyuki Ishi
da0412e55b aco: support DPP8
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13971>
2021-12-31 20:56:39 +00:00
Daniel Schürmann
16a527deef aco: don't split VOP3P definitions
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
7e02787a54 aco: use p_create_vector(v2b,v2b) in get_alu_src_vop3p()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
e56d8b0b2e aco: use explicit zero-padding for 64bit image loads in expand_vector()
Previously, this only worked because of regClass mismatches
in the allocated vector.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
91f17e1c73 aco/optimizer: apply extract from subdword p_split_vector
Totals from 1345 (1.00% of 134572) affected shaders: (GFX10.3)
VGPRs: 76752 -> 76744 (-0.01%); split: -0.02%, +0.01%
SpillSGPRs: 1459 -> 1460 (+0.07%)
SpillVGPRs: 1776 -> 1784 (+0.45%); split: -0.39%, +0.84%
CodeSize: 13310964 -> 13309420 (-0.01%); split: -0.06%, +0.05%
Scratch: 178176 -> 179200 (+0.57%)
Instrs: 2516874 -> 2516860 (-0.00%); split: -0.05%, +0.05%
Latency: 23228506 -> 23230338 (+0.01%); split: -0.14%, +0.15%
InvThroughput: 6002384 -> 6000158 (-0.04%); split: -0.24%, +0.21%
VClause: 41115 -> 41117 (+0.00%); split: -0.28%, +0.29%
SClause: 104639 -> 104664 (+0.02%); split: -0.07%, +0.09%
Copies: 185121 -> 184862 (-0.14%); split: -0.69%, +0.55%
Branches: 100740 -> 100735 (-0.00%); split: -0.01%, +0.00%
PreVGPRs: 70119 -> 69968 (-0.22%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
fb622775b5 aco/optimizer: optimize extract(extract())
Totals from 53 (0.04% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1780 -> 1776 (-0.22%); split: -0.34%, +0.11%
CodeSize: 968352 -> 963196 (-0.53%); split: -0.55%, +0.02%
Scratch: 180224 -> 178176 (-1.14%)
Instrs: 169800 -> 169158 (-0.38%); split: -0.39%, +0.01%
Latency: 6186064 -> 6141408 (-0.72%); split: -1.16%, +0.44%
InvThroughput: 2605044 -> 2582967 (-0.85%); split: -1.37%, +0.52%
VClause: 4851 -> 4866 (+0.31%); split: -0.16%, +0.47%
SClause: 1744 -> 1746 (+0.11%)
Copies: 42874 -> 42325 (-1.28%); split: -1.40%, +0.12%
Branches: 5762 -> 5765 (+0.05%); split: -0.02%, +0.07%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
5ad9c20d4a aco/optimizer: apply extract from p_extract_vector
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
11712729eb aco/optimizer: keep instr_mod_labels after applying extract
This allows to use clamp on SDWA and VOP3 opsel instructions.

Totals from 32 (0.02% of 134572) affected shaders: (GFX10.3)
SpillVGPRs: 1783 -> 1780 (-0.17%); split: -0.50%, +0.34%
CodeSize: 881480 -> 881496 (+0.00%); split: -0.15%, +0.15%
Instrs: 154400 -> 154388 (-0.01%); split: -0.13%, +0.12%
Latency: 5021791 -> 5033485 (+0.23%); split: -0.67%, +0.90%
InvThroughput: 2486454 -> 2492312 (+0.24%); split: -0.67%, +0.91%
VClause: 4763 -> 4755 (-0.17%); split: -0.52%, +0.36%
Copies: 42866 -> 42965 (+0.23%); split: -0.25%, +0.48%
Branches: 5640 -> 5639 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Daniel Schürmann
1502c22e2c aco: don't allow SDWA on VOP3P instructions
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13576>
2021-12-31 14:52:14 +00:00
Timur Kristóf
8d238f5581 aco: Export per-primitive mesh shader output attributes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
2021-12-31 13:05:09 +00:00
Timur Kristóf
fc1424f1d8 aco: Use the correct outinfo for mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13580>
2021-12-31 13:05:09 +00:00