Rhys Perry
aa55ecc296
aco/insert_exec_mask: fix top-level to-exact with non-global exact mask
...
After transitioning to exact after a discard, the exec stack might be:
[exact|global, wqm, exact]
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15244 >
2022-03-08 12:49:59 +00:00
Rhys Perry
d28b6b6856
aco: rework removal of jumps over branches
...
Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.
Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.
fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f030b75b7d ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214 >
2022-03-04 12:32:36 +00:00
Samuel Pitoiset
9b113f1b6c
aco: implement nir_op_pack_{uint,sint}_2x16
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231 >
2022-03-04 08:06:56 +00:00
Daniel Schürmann
ccf4bcd162
aco/ra: don't immediately assign a register for p_branch
...
These get now assigned after handling phis.
Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency: 11880452 -> 11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
c3070773f8
aco/tests: add test for branch definition RA
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
32d0bae8ec
aco: fix branch definition validation
...
Like how they have to be register allocated differently, branch
definitions at merge block predecessors need to be validated differently.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
bed5a31005
aco: add validate_instr_defs()
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
d5349a99c2
aco/ra: fix register allocation of branch definitions
...
fossil-db (Sienna Cichlid):
Totals from 704 (0.52% of 134913) affected shaders:
CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07%
Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09%
Latency: 17993572 -> 18001344 (+0.04%); split: -0.00%, +0.04%
InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01%
Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88%
Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
608d48b787
aco/ra: add get_reg_phi() helper
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Rhys Perry
ceca5e68c4
aco: remove vcc hint from branch definitions
...
This doesn't seem to have much benefit anymore.
fossil-db (Sienna Cichlid):
Totals from 198 (0.15% of 134913) affected shaders:
CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02%
Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03%
Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02%
InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00%
SClause: 14760 -> 14722 (-0.26%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00
Samuel Pitoiset
516aee64cc
radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16
...
v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on
all generations.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215 >
2022-03-03 14:54:12 +01:00
Timur Kristóf
11957d3863
aco: Remove superfluous code for mesh shader workgroup ID.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15199 >
2022-03-01 15:37:12 +00:00
Daniel Schürmann
f030b75b7d
aco: relax condition to remove branches in case of few instructions
...
This patch relaxes the conditions under which
we remove branch instructions.
Totals from 27246 (20.20% of 134913) affected shaders: (GFX10.3)
CodeSize: 193413312 -> 192924928 (-0.25%)
Instrs: 36146788 -> 36024692 (-0.34%)
Latency: 528374112 -> 528469044 (+0.02%); split: -0.01%, +0.02%
InvThroughput: 106198759 -> 106216583 (+0.02%); split: -0.00%, +0.02%
Branches: 1040640 -> 918543 (-11.73%)
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8647 >
2022-02-25 15:38:08 +00:00
Timur Kristóf
1ca6b2f216
aco: Support memory modes properly with load/store_buffer_amd.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161 >
2022-02-25 14:08:39 +01:00
Timur Kristóf
ba4b48e787
aco: Support task_payload with barriers, refactor allowed storage class.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161 >
2022-02-25 14:08:36 +01:00
Timur Kristóf
cd0dd5d6b7
aco: Add storage class for Task Shader payload.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15161 >
2022-02-25 13:20:08 +01:00
Timur Kristóf
9cc9cf77a8
aco: Fix multiview view index for mesh shaders.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034 >
2022-02-25 06:31:33 +00:00
Timur Kristóf
082b691141
aco: Fix workgroup_id.y and .z for NV_mesh_shader.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034 >
2022-02-25 06:31:33 +00:00
Timur Kristóf
10ebfb3bf2
aco: Allow 1-byte loads and stores with load/store_buffer_amd
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15034 >
2022-02-25 06:31:33 +00:00
Georg Lehmann
d223b7f096
radv, aco: Add u_foreach_bit to .clang-format.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15083 >
2022-02-22 14:57:29 +00:00
Samuel Pitoiset
369b8cffea
radv,aco,llvm: lower adjusting vertex alpha in NIR
...
Instead of duplicating the same lowering in both compiler backends.
This pass will be used to do more VS input lowering.
fossils-db (Polaris10):
Totals from 48 (0.04% of 135960) affected shaders:
VGPRs: 1692 -> 1684 (-0.47%)
CodeSize: 54016 -> 53964 (-0.10%); split: -0.11%, +0.01%
MaxWaves: 339 -> 341 (+0.59%)
Instrs: 11260 -> 11247 (-0.12%); split: -0.13%, +0.02%
Latency: 88165 -> 88113 (-0.06%); split: -0.07%, +0.01%
InvThroughput: 36153 -> 36093 (-0.17%)
Copies: 583 -> 568 (-2.57%)
fossils-db (Pitcairn):
Totals from 43 (0.03% of 135960) affected shaders:
VGPRs: 1548 -> 1552 (+0.26%)
CodeSize: 47900 -> 47820 (-0.17%)
Instrs: 10751 -> 10731 (-0.19%)
Latency: 83029 -> 82873 (-0.19%)
VClause: 168 -> 164 (-2.38%)
SClause: 393 -> 391 (-0.51%)
Copies: 705 -> 685 (-2.84%)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15076 >
2022-02-22 08:08:42 +00:00
Samuel Pitoiset
cbd5724a6d
aco: implement nir_intrinsic_load_vrs_rates_amd
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14713 >
2022-02-16 08:11:19 +01:00
Daniel Schürmann
1bbbabedb7
aco/insert_exec_mask: refactor and remove some unnecessary WQM handling code
...
Some cases cannot happen and don't need to be handled anymore.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
d7d7b9974a
aco/insert_exec_mask: refactor and simplify get_block_needs()
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
fcc5dec8d6
aco/insert_exec_mask: remove ever_again_needs and Exact_Branch
...
This information is not required anymore.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
cbb1b095ca
aco/insert_exec_mask: remove some unnecessary WQM loop handling code
...
These workarounds are were necessary to prevent infinite loops
with helper lane registers containing wrong data.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
580a63b4ac
aco/insert_exec_mask: remove Preserve_WQM flag
...
If WQM is needed anywhere after discard_if(), it will also
be flagged as WQM. We can rely on that to preserve the WQM mask.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
a5a40beefa
aco: don't emit WQM for bool_to_scalar_condition
...
This was only necessary to ensure that the source is computed in WQM
if the current exec mask is in WQM.
Totals from 23170 (17.17% of 134913) affected shaders: (GFX10.3)
VGPRs: 1384464 -> 1383400 (-0.08%); split: -0.08%, +0.01%
SpillSGPRs: 7575 -> 7574 (-0.01%)
CodeSize: 146444752 -> 146317104 (-0.09%); split: -0.13%, +0.04%
MaxWaves: 429870 -> 429868 (-0.00%)
Instrs: 27202586 -> 27170316 (-0.12%); split: -0.17%, +0.05%
Latency: 379488313 -> 379335412 (-0.04%); split: -0.07%, +0.03%
InvThroughput: 69500561 -> 69487704 (-0.02%); split: -0.03%, +0.01%
VClause: 473080 -> 473038 (-0.01%); split: -0.02%, +0.01%
SClause: 1080576 -> 1080571 (-0.00%); split: -0.06%, +0.06%
Copies: 1492865 -> 1504604 (+0.79%); split: -0.40%, +1.19%
Branches: 711295 -> 716849 (+0.78%); split: -0.01%, +0.79%
PreSGPRs: 1331362 -> 1328402 (-0.22%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
f816dd1be7
aco: don't propagate WQM for p_as_uniform
...
This was needed, so that in case of active helper lanes,
these contain the correct value. It is now handled implicitly.
Totals from 1004 (0.74% of 134913) affected shaders: (GFX10.3)
CodeSize: 7581020 -> 7580892 (-0.00%); split: -0.00%, +0.00%
Instrs: 1454940 -> 1454908 (-0.00%); split: -0.00%, +0.00%
Latency: 12984953 -> 12984894 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3173037 -> 3173049 (+0.00%); split: -0.00%, +0.00%
PreSGPRs: 47498 -> 47273 (-0.47%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Daniel Schürmann
825cd696dc
aco/insert_exec_mask: stay in WQM while helper lanes are still needed
...
This patch flags all instructions WQM which don't require
Exact mode, but depend on the exec mask as long as WQM
is needed on any control flow path afterwards.
This will mostly prevent accidental copies of WQM values
within Exact mode, and also makes a lot of other workarounds
unnecessary.
Totals from 17374 (12.88% of 134913) affected shaders: (GFX10.3)
VGPRs: 526952 -> 527384 (+0.08%); split: -0.01%, +0.09%
CodeSize: 33740512 -> 33766636 (+0.08%); split: -0.06%, +0.14%
MaxWaves: 488166 -> 488108 (-0.01%); split: +0.00%, -0.02%
Instrs: 6254240 -> 6260557 (+0.10%); split: -0.08%, +0.18%
Latency: 66497580 -> 66463472 (-0.05%); split: -0.15%, +0.10%
InvThroughput: 13265741 -> 13264036 (-0.01%); split: -0.03%, +0.01%
VClause: 122962 -> 122975 (+0.01%); split: -0.01%, +0.02%
SClause: 334805 -> 334405 (-0.12%); split: -0.51%, +0.39%
Copies: 275728 -> 282341 (+2.40%); split: -0.91%, +3.31%
Branches: 92546 -> 90990 (-1.68%); split: -1.68%, +0.00%
PreSGPRs: 504119 -> 504352 (+0.05%); split: -0.00%, +0.05%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14951 >
2022-02-11 19:05:30 +00:00
Samuel Pitoiset
2451290bc4
radv: rewrite RADV_FORCE_VRS directly in NIR
...
This introduces a small NIR pass that exports
VARYING_SLOT_PRIMITIVE_SHADING_RATE if RADV_FORCE_VRS is used,
instead of doing this in both backend compilers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14907 >
2022-02-09 17:40:34 +01:00
Daniel Schürmann
5e9df85b1a
aco: optimize discard_if when WQM is not needed afterwards
...
Totals from 11560 (8.57% of 134913) affected shaders: (GFX10.3)
CodeSize: 12092560 -> 11997652 (-0.78%)
Instrs: 2205325 -> 2181598 (-1.08%)
Latency: 15376048 -> 15356958 (-0.12%); split: -0.12%, +0.00%
InvThroughput: 3526105 -> 3525120 (-0.03%); split: -0.03%, +0.00%
Copies: 98543 -> 87601 (-11.10%)
Branches: 16919 -> 16873 (-0.27%)
PreSGPRs: 291584 -> 291532 (-0.02%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805 >
2022-02-08 16:16:07 +00:00
Daniel Schürmann
13c3137960
aco: merge block_kind_uses_[demote|discard_if]
...
These serve the same purpose. The new name is
block_kind_uses_discard.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805 >
2022-02-08 16:16:07 +00:00
Daniel Schürmann
e7d1c8cc5e
aco: make Preserve_WQM independent from block_kind_uses_discard_if
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805 >
2022-02-08 16:16:07 +00:00
Daniel Schürmann
08b8500dfb
aco: remove block_kind_discard
...
This case doesn't seem to happen in practice.
No need to micro-optimize it.
This patch merges instruction selection for discard/discard_if.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805 >
2022-02-08 16:16:07 +00:00
Daniel Schürmann
b67092e685
aco: emit nir_intrinsic_discard() as p_discard_if()
...
This simplifies the code and emits a slightly better
sequence in some cases.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14805 >
2022-02-08 16:16:07 +00:00
Rhys Perry
0447a2303f
aco: don't encode src2 for v_writelane_b32_e64
...
Encoding src2 doesn't cause issues for print_asm() because we have a
workaround there, but it does for RGP and it seems the developers are not
interested in fixing it.
https://github.com/GPUOpen-Tools/radeon_gpu_profiler/issues/61
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Tatsuyuki Ishi <ishitatsuyuki@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14832 >
2022-02-03 16:52:00 +00:00
Rhys Perry
5e3b8eeac4
aco: add test for optimizations with casts
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
6b1dfa7eac
aco: fix neg(mul)/abs(mul) optimization with different bit-size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
13bbc7c882
aco: don't combine add/mul of different bit-size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
3d8a8c6fc1
aco: don't apply omod/clamp of different bit-size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
7e30f99b0a
aco: don't combine fneg/fabs of different bit-size
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
27f1f5537d
aco/tests: implement sub-dword program inputs
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
e86b88f85b
aco/tests: add a bunch more building helpers
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14810 >
2022-02-03 16:02:04 +00:00
Rhys Perry
ba44634e4d
aco: fix v_mac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: f68797ead7 ("aco: create v_mac_legacy_f32/v_fmac_legacy_f32")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5952
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14820 >
2022-02-01 15:08:04 +00:00
Rhys Perry
16e0c312fa
aco: preserve pass_flags during format conversions
...
This helps the "vopc() & exec" optimization.
fossil-db (Sienna Cichlid):
Totals from 1638 (1.21% of 134913) affected shaders:
CodeSize: 3331804 -> 3327520 (-0.13%); split: -0.19%, +0.06%
Instrs: 611807 -> 610096 (-0.28%)
Latency: 5579326 -> 5574874 (-0.08%)
InvThroughput: 936782 -> 936731 (-0.01%); split: -0.01%, +0.00%
Copies: 43324 -> 43302 (-0.05%); split: -0.06%, +0.01%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
1804c21fb5
aco: optimize abs(mul(a, b))
...
fossil-db (Sienna Cichlid):
Totals from 18 (0.01% of 134913) affected shaders:
CodeSize: 173924 -> 173852 (-0.04%)
Instrs: 33864 -> 33846 (-0.05%)
Latency: 122233 -> 122211 (-0.02%)
InvThroughput: 22482 -> 22462 (-0.09%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
452975f257
aco: fix neg(abs(mul(a, b))) if the mul is not VOP3
...
Previously, is_abs was just ignored if mul_instr->isVOP3()==false.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 93c8ebfa78 ("aco: Initial commit of independent AMD compiler")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14773 >
2022-01-31 13:45:01 +00:00
Rhys Perry
f68797ead7
aco: create v_mac_legacy_f32/v_fmac_legacy_f32
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00
Rhys Perry
43e32ad074
aco: consider legacy multiplications in optimizer
...
Optimize omod, -(a*b), b2f(a)*b, a*1, a*0 and create MAD/FMA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436 >
2022-01-20 22:54:42 +00:00