Commit graph

1814 commits

Author SHA1 Message Date
Rhys Perry
aa55ecc296 aco/insert_exec_mask: fix top-level to-exact with non-global exact mask
After transitioning to exact after a discard, the exec stack might be:
[exact|global, wqm, exact]

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244>
2022-03-08 12:49:59 +00:00
Rhys Perry
d28b6b6856 aco: rework removal of jumps over branches
Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.

Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.

fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f030b75b7d ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>
2022-03-04 12:32:36 +00:00
Samuel Pitoiset
9b113f1b6c aco: implement nir_op_pack_{uint,sint}_2x16
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>
2022-03-04 08:06:56 +00:00
Daniel Schürmann
ccf4bcd162 aco/ra: don't immediately assign a register for p_branch
These get now assigned after handling phis.

Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency: 11880452 -> 11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
c3070773f8 aco/tests: add test for branch definition RA
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
32d0bae8ec aco: fix branch definition validation
Like how they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
bed5a31005 aco: add validate_instr_defs()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
d5349a99c2 aco/ra: fix register allocation of branch definitions
fossil-db (Sienna Cichlid):
Totals from 704 (0.52% of 134913) affected shaders:
CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07%
Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09%
Latency: 17993572 -> 18001344 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01%
Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88%
Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
608d48b787 aco/ra: add get_reg_phi() helper
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Rhys Perry
ceca5e68c4 aco: remove vcc hint from branch definitions
This doesn't seem to have much benefit anymore.

fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>
2022-03-03 20:21:08 +00:00
Samuel Pitoiset
516aee64cc radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>
2022-03-03 14:54:12 +01:00
Timur Kristóf
11957d3863 aco: Remove superfluous code for mesh shader workgroup ID.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199>
2022-03-01 15:37:12 +00:00
Daniel Schürmann
f030b75b7d aco: relax condition to remove branches in case of few instructions
This patch relaxes the conditions under which
we remove branch instructions.

Totals from 27246 (20.20% of 134913) affected shaders: (GFX10.3)
CodeSize: 193413312 -> 192924928 (-0.25%)
Instrs: 36146788 -> 36024692 (-0.34%)
Latency: 528374112 -> 528469044 (+0.02%); split: -0.01%, +0.02%
InvThroughput: 106198759 -> 106216583 (+0.02%); split: -0.00%, +0.02%
Branches: 1040640 -> 918543 (-11.73%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8647>
2022-02-25 15:38:08 +00:00
Timur Kristóf
1ca6b2f216 aco: Support memory modes properly with load/store_buffer_amd.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 14:08:39 +01:00
Timur Kristóf
ba4b48e787 aco: Support task_payload with barriers, refactor allowed storage class.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 14:08:36 +01:00
Timur Kristóf
cd0dd5d6b7 aco: Add storage class for Task Shader payload.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161>
2022-02-25 13:20:08 +01:00
Timur Kristóf
9cc9cf77a8 aco: Fix multiview view index for mesh shaders.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf
082b691141 aco: Fix workgroup_id.y and .z for NV_mesh_shader.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Timur Kristóf
10ebfb3bf2 aco: Allow 1-byte loads and stores with load/store_buffer_amd
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034>
2022-02-25 06:31:33 +00:00
Georg Lehmann
d223b7f096 radv, aco: Add u_foreach_bit to .clang-format.
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15083>
2022-02-22 14:57:29 +00:00
Samuel Pitoiset
369b8cffea radv,aco,llvm: lower adjusting vertex alpha in NIR
Instead of duplicating the same lowering in both compiler backends.
This pass will be used to do more VS input lowering.

fossils-db (Polaris10):
Totals from 48 (0.04% of 135960) affected shaders:
VGPRs: 1692 -> 1684 (-0.47%)
CodeSize: 54016 -> 53964 (-0.10%); split: -0.11%, +0.01%
MaxWaves: 339 -> 341 (+0.59%)
Instrs: 11260 -> 11247 (-0.12%); split: -0.13%, +0.02%
Latency: 88165 -> 88113 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 36153 -> 36093 (-0.17%)
Copies: 583 -> 568 (-2.57%)

fossils-db (Pitcairn):
Totals from 43 (0.03% of 135960) affected shaders:
VGPRs: 1548 -> 1552 (+0.26%)
CodeSize: 47900 -> 47820 (-0.17%)
Instrs: 10751 -> 10731 (-0.19%)
Latency: 83029 -> 82873 (-0.19%)
VClause: 168 -> 164 (-2.38%)
SClause: 393 -> 391 (-0.51%)
Copies: 705 -> 685 (-2.84%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15076>
2022-02-22 08:08:42 +00:00
Samuel Pitoiset
cbd5724a6d aco: implement nir_intrinsic_load_vrs_rates_amd
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14713>
2022-02-16 08:11:19 +01:00
Daniel Schürmann
1bbbabedb7 aco/insert_exec_mask: refactor and remove some unnecessary WQM handling code
Some cases cannot happen and don't need to be handled anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
d7d7b9974a aco/insert_exec_mask: refactor and simplify get_block_needs()
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
fcc5dec8d6 aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
This information is not required anymore.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
cbb1b095ca aco/insert_exec_mask: remove some unnecessary WQM loop handling code
These workarounds are were necessary to prevent infinite loops
with helper lane registers containing wrong data.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
580a63b4ac aco/insert_exec_mask: remove Preserve_WQM flag
If WQM is needed anywhere after discard_if(), it will also
be flagged as WQM. We can rely on that to preserve the WQM mask.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
a5a40beefa aco: don't emit WQM for bool_to_scalar_condition
This was only necessary to ensure that the source is computed in WQM
if the current exec mask is in WQM.

Totals from 23170 (17.17% of 134913) affected shaders: (GFX10.3)
VGPRs: 1384464 -> 1383400 (-0.08%); split: -0.08%, +0.01%
SpillSGPRs: 7575 -> 7574 (-0.01%)
CodeSize: 146444752 -> 146317104 (-0.09%); split: -0.13%, +0.04%
MaxWaves: 429870 -> 429868 (-0.00%)
Instrs: 27202586 -> 27170316 (-0.12%); split: -0.17%, +0.05%
Latency: 379488313 -> 379335412 (-0.04%); split: -0.07%, +0.03%
InvThroughput: 69500561 -> 69487704 (-0.02%); split: -0.03%, +0.01%
VClause: 473080 -> 473038 (-0.01%); split: -0.02%, +0.01%
SClause: 1080576 -> 1080571 (-0.00%); split: -0.06%, +0.06%
Copies: 1492865 -> 1504604 (+0.79%); split: -0.40%, +1.19%
Branches: 711295 -> 716849 (+0.78%); split: -0.01%, +0.79%
PreSGPRs: 1331362 -> 1328402 (-0.22%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
f816dd1be7 aco: don't propagate WQM for p_as_uniform
This was needed, so that in case of active helper lanes,
these contain the correct value. It is now handled implicitly.

Totals from 1004 (0.74% of 134913) affected shaders: (GFX10.3)
CodeSize: 7581020 -> 7580892 (-0.00%); split: -0.00%, +0.00%
Instrs: 1454940 -> 1454908 (-0.00%); split: -0.00%, +0.00%
Latency: 12984953 -> 12984894 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3173037 -> 3173049 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 47498 -> 47273 (-0.47%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Daniel Schürmann
825cd696dc aco/insert_exec_mask: stay in WQM while helper lanes are still needed
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.

Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951>
2022-02-11 19:05:30 +00:00
Samuel Pitoiset
2451290bc4 radv: rewrite RADV_FORCE_VRS directly in NIR
This introduces a small NIR pass that exports
VARYING_SLOT_PRIMITIVE_SHADING_RATE if RADV_FORCE_VRS is used,
instead of doing this in both backend compilers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14907>
2022-02-09 17:40:34 +01:00
Daniel Schürmann
5e9df85b1a aco: optimize discard_if when WQM is not needed afterwards
Totals from 11560 (8.57% of 134913) affected shaders: (GFX10.3)
CodeSize: 12092560 -> 11997652 (-0.78%)
Instrs: 2205325 -> 2181598 (-1.08%)
Latency: 15376048 -> 15356958 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 3526105 -> 3525120 (-0.03%); split: -0.03%, +0.00%
Copies: 98543 -> 87601 (-11.10%)
Branches: 16919 -> 16873 (-0.27%)
PreSGPRs: 291584 -> 291532 (-0.02%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
13c3137960 aco: merge block_kind_uses_[demote|discard_if]
These serve the same purpose. The new name is
block_kind_uses_discard.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
e7d1c8cc5e aco: make Preserve_WQM independent from block_kind_uses_discard_if
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
08b8500dfb aco: remove block_kind_discard
This case doesn't seem to happen in practice.
No need to micro-optimize it.

This patch merges instruction selection for discard/discard_if.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Daniel Schürmann
b67092e685 aco: emit nir_intrinsic_discard() as p_discard_if()
This simplifies the code and emits a slightly better
sequence in some cases.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805>
2022-02-08 16:16:07 +00:00
Rhys Perry
0447a2303f aco: don't encode src2 for v_writelane_b32_e64
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.

https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832>
2022-02-03 16:52:00 +00:00
Rhys Perry
5e3b8eeac4 aco: add test for optimizations with casts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
6b1dfa7eac aco: fix neg(mul)/abs(mul) optimization with different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
13bbc7c882 aco: don't combine add/mul of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
3d8a8c6fc1 aco: don't apply omod/clamp of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
7e30f99b0a aco: don't combine fneg/fabs of different bit-size
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
27f1f5537d aco/tests: implement sub-dword program inputs
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
e86b88f85b aco/tests: add a bunch more building helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810>
2022-02-03 16:02:04 +00:00
Rhys Perry
ba44634e4d aco: fix v_mac_legacy_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f68797ead7 ("aco: create v_mac_legacy_f32/v_fmac_legacy_f32")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5952
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14820>
2022-02-01 15:08:04 +00:00
Rhys Perry
16e0c312fa aco: preserve pass_flags during format conversions
This helps the "vopc() & exec" optimization.

fossil-db (Sienna Cichlid):
Totals from 1638 (1.21% of 134913) affected shaders:
CodeSize: 3331804 -> 3327520 (-0.13%); split: -0.19%, +0.06%
Instrs: 611807 -> 610096 (-0.28%)
Latency: 5579326 -> 5574874 (-0.08%)
InvThroughput: 936782 -> 936731 (-0.01%); split: -0.01%, +0.00%
Copies: 43324 -> 43302 (-0.05%); split: -0.06%, +0.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
1804c21fb5 aco: optimize abs(mul(a, b))
fossil-db (Sienna Cichlid):
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 173924 -> 173852 (-0.04%)
Instrs: 33864 -> 33846 (-0.05%)
Latency: 122233 -> 122211 (-0.02%)
InvThroughput: 22482 -> 22462 (-0.09%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
452975f257 aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
Previously, is_abs was just ignored if mul_instr->isVOP3()==false.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773>
2022-01-31 13:45:01 +00:00
Rhys Perry
f68797ead7 aco: create v_mac_legacy_f32/v_fmac_legacy_f32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00
Rhys Perry
43e32ad074 aco: consider legacy multiplications in optimizer
Optimize omod, -(a*b), b2f(a)*b, a*1, a*0 and create MAD/FMA.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>
2022-01-20 22:54:42 +00:00