fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-20 20:20:18 +01:00

Author	SHA1	Message	Date
Rhys Perry	cf5fc4b973	aco: disallow SMEM offsets that are not multiples of 4 These can't be encoded on GFX6/7, and combining these additions causes CTS failures on GFX10.3. I think the low 2 MSBs are ignored before the addition, not after, so load(a + 3, 0) becomes load(a, 3), which is the same as load(a, 0). No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13755>	2021-12-17 22:14:36 +00:00
Rhys Perry	94603786c5	aco: fix check_vop3_operands() for f16vec2 ffma fneg combine For v_pk_fma_f16, we should consider all three operands, not the first two. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `15a375b4c8` ("radv,aco: don't lower some ffma instructions") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14229>	2021-12-17 11:16:12 +00:00
Rhys Perry	165ca5088b	radv,aco: implement nir_op_ffma Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>	2021-12-13 11:22:33 +00:00
Rhys Perry	f4f5d577fc	aco: swap operands if necessary to create v_madak/v_fmaak Also rewrite the check_literal logic to be more straightforward. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>	2021-12-13 11:22:33 +00:00
Rhys Perry	2665320c78	aco: create v_fmamk_f32/v_fmaak_f32 from nir_op_ffma Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>	2021-12-13 11:22:33 +00:00
Rhys Perry	a487747ebd	aco: use more predictable tiebreaker when forming MADs fossil-db (GFX10.3): Totals from 84981 (58.10% of 146267) affected shaders: VGPRs: 3829896 -> 3820480 (-0.25%); split: -0.33%, +0.08% CodeSize: 270860472 -> 270850132 (-0.00%); split: -0.08%, +0.08% MaxWaves: 2035822 -> 2042516 (+0.33%); split: +0.39%, -0.06% Instrs: 51285526 -> 51308869 (+0.05%); split: -0.03%, +0.08% Latency: 931503706 -> 932556231 (+0.11%); split: -0.19%, +0.30% InvThroughput: 217084232 -> 217070849 (-0.01%); split: -0.12%, +0.11% fossil-db (GFX10): Totals from 85520 (58.47% of 146267) affected shaders: VGPRs: 3729132 -> 3725344 (-0.10%); split: -0.21%, +0.10% CodeSize: 272796500 -> 272783084 (-0.00%); split: -0.09%, +0.08% MaxWaves: 2246410 -> 2249012 (+0.12%); split: +0.17%, -0.05% Instrs: 51643962 -> 51664865 (+0.04%); split: -0.04%, +0.08% Latency: 932331949 -> 933274979 (+0.10%); split: -0.19%, +0.29% InvThroughput: 214187040 -> 214130994 (-0.03%); split: -0.13%, +0.11% fossil-db (GFX9): Totals from 84619 (57.80% of 146401) affected shaders: SGPRs: 5366240 -> 5366944 (+0.01%); split: -0.09%, +0.10% VGPRs: 3765608 -> 3764972 (-0.02%); split: -0.23%, +0.22% CodeSize: 263634732 -> 263616320 (-0.01%); split: -0.08%, +0.08% MaxWaves: 546617 -> 547091 (+0.09%); split: +0.18%, -0.09% Instrs: 51426195 -> 51458334 (+0.06%); split: -0.03%, +0.10% Latency: 1164445660 -> 1161923480 (-0.22%); split: -0.46%, +0.24% InvThroughput: 542964697 -> 542329595 (-0.12%); split: -0.26%, +0.14% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9805>	2021-12-13 11:22:33 +00:00
Rhys Perry	65a78b2252	aco: properly update use counts if an extract is still used Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13909>	2021-11-29 18:52:12 +00:00
Samuel Pitoiset	add883bf9b	aco: fix right shift of exponent 32 detected by UBSAN src/amd/compiler/aco_optimizer.cpp:1316:17: runtime error: shift exponent 32 is too large for 32-bit type 'unsigned int' Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13951>	2021-11-25 16:15:30 +00:00
Rhys Perry	9bc0fc89c8	aco: disable mul(cndmask(0, 1, b), a) optimization sometimes This optimization doesn't work for SDWA or DPP multiplications and we can't do it if denormal flushing is required because v_cndmask_b32 doesn't do that and we can't do it if we can't assume operands are finite because 0.0 * inf is NaN, not 0. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13434>	2021-10-20 07:35:52 +00:00
Timur Kristóf	ca6ef505ff	aco/optimizer: Skip SDWA on v_lshlrev when unnecessary in apply_extract. In the following cases: - lower 16 bits are extracted and the shift amount is 16 or more - lower 8 bits are extracted and the shift amount is 24 or more the undesirable upper bits are already shifted out, and therefore there is no need to add SDWA to the v_lshlrev instruction. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 58239 (45.27% of 128647) affected shaders: CodeSize: 153498624 -> 153265616 (-0.15%); split: -0.15%, +0.00% Instrs: 29636304 -> 29578064 (-0.20%); split: -0.20%, +0.00% Latency: 136931496 -> 136876379 (-0.04%); split: -0.04%, +0.00% InvThroughput: 21134367 -> 21078861 (-0.26%); split: -0.26%, +0.00% Copies: 2777550 -> 2777548 (-0.00%); split: -0.00%, +0.00% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13121>	2021-10-12 16:27:50 +00:00
Daniel Schürmann	40a93e271c	aco: clang-format No changes, just formatting. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13087>	2021-09-28 19:48:00 +00:00
Timur Kristóf	d3e0cf3d32	aco: Omit p_extract after ds_read with matching bit size. Fossil DB stats on Sienna Cichlid: Totals from 135 (0.10% of 128647) affected shaders: CodeSize: 525184 -> 523704 (-0.28%) Instrs: 92835 -> 92684 (-0.16%) Latency: 311528 -> 311055 (-0.15%) InvThroughput: 86572 -> 86455 (-0.14%) Copies: 7666 -> 7650 (-0.21%) Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 58374 (45.38% of 128647) affected shaders: CodeSize: 160322912 -> 159622564 (-0.44%) Instrs: 30755822 -> 30639193 (-0.38%) Latency: 136713768 -> 136690360 (-0.02%) InvThroughput: 21739219 -> 21658151 (-0.37%) Copies: 3297969 -> 3297953 (-0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11560>	2021-09-28 17:59:27 +00:00
Timur Kristóf	f2e41eda9e	aco: Add ability to optimize v_lshl + v_sub into v_mad_i32_i24. Also change combine_add_lshl to use check_vop3_operands instead of its own checks of the operands. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>	2021-09-20 12:39:03 +02:00
Filip Gawin	6b5e9352ef	aco: cleanup assignment of unique_ptrs Reviewed-by: Joshua Ashton <joshua@froggi.es> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12903>	2021-09-18 11:09:24 +00:00
Daniel Schürmann	0988f7b9ba	aco: remove explicit dst_preserve flag Instead, we can rely on the fact that subdword definitions must preserve the unused bits while dword definitions either pad or sign-extend. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>	2021-09-02 20:39:17 +02:00
Daniel Schürmann	9e3ff06c38	aco: rewrite SDWA selector This commit introduces a new struct SubdwordSel in order to ease and clean up the usage of SDWA selections. This includes removing the distinction between register-allocated and fixed SDWA selections. Instead, SDWA selections can now also access the high bits of subdword variables. Alignment and sizes are validated accordingly. Size, offset and sign_extend can be evaluated via helper methods. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12640>	2021-09-02 20:39:17 +02:00
Rhys Perry	33ddbd220f	aco: remove DPP when applying constants/literals/sgprs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>	2021-08-31 16:58:20 +00:00
Rhys Perry	e27946ca11	aco: don't constant propagate to DPP instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12601>	2021-08-31 16:58:20 +00:00
Timur Kristóf	76b9dd6266	aco: Unset 16 and 24-bit flags from operands in apply_extract. Consider the following sequence in a shader: b = p_extract a c = v_mad_u32_u16 b, X, 0 The optimizer applies extract, resulting in: c = v_mad_u32_u16 a, X, 0 (correct) Then it mistakenly turns that into: c = v_mul_u32_u24 a, X, 0 (incorrect) In this case, the p_extract is applied to v_mad_u32_u16 by apply_extract. After this, we can no longer be sure that the operands are still 16 or 24-bit, so we have to remove this flag. No Fossil DB changes. Fixes: `54292e99c7` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12558>	2021-08-30 14:05:33 +00:00
Daniel Schürmann	2eeaaabb8e	aco/optimizer: combine v_pk_mul_u16 + v_pk_add_u16 -> v_pk_mad_u16 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Daniel Schürmann	be16ebc5ca	aco/optimizer: fuse v_mul_f64 + v_add_f64 -> v_fma_f64 No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Daniel Schürmann	8e27ca9953	aco/optimizer: combine v_mul_lo_u16 + v_add_u16 -> v_mad_u16 Totals from 192 (0.13% of 150170) affected shaders: (GFX10.3) CodeSize: 1027224 -> 1019872 (-0.72%) Instrs: 174784 -> 173863 (-0.53%) Latency: 4235742 -> 4232177 (-0.08%); split: -0.11%, +0.03% InvThroughput: 1777026 -> 1775945 (-0.06%); split: -0.09%, +0.03% Copies: 34098 -> 34099 (+0.00%); split: -0.03%, +0.03% PreVGPRs: 4920 -> 4850 (-1.42%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Daniel Schürmann	23d5865f42	aco: refactor nir_op_imul selection Previously, the optimization to use v_mul_lo_u16 for 32bit multiplications was done in instruction_selection. This was moved to the optimizer to ease some case distinctions. The mixed results are due to increased use of SDWA. Totals from 2616 (1.74% of 150170) affected shaders: (GFX10.3) VGPRs: 143888 -> 143872 (-0.01%); split: -0.02%, +0.01% CodeSize: 5604032 -> 5604080 (+0.00%); split: -0.01%, +0.01% Instrs: 1086798 -> 1083915 (-0.27%); split: -0.27%, +0.01% Latency: 8215793 -> 8213023 (-0.03%); split: -0.10%, +0.07% InvThroughput: 20765157 -> 20773766 (+0.04%); split: -0.02%, +0.06% VClause: 35256 -> 35260 (+0.01%); split: -0.02%, +0.03% SClause: 29021 -> 29024 (+0.01%); split: -0.00%, +0.01% Copies: 74163 -> 74306 (+0.19%); split: -0.05%, +0.24% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Daniel Schürmann	d8eef134d8	aco: only apply extract if not used more than 4 times Totals from 61 (0.04% of 150170) affected shaders: (GFX10.3) CodeSize: 1087732 -> 1087380 (-0.03%); split: -0.22%, +0.18% Instrs: 192343 -> 192205 (-0.07%); split: -0.16%, +0.09% Latency: 7231670 -> 7148073 (-1.16%); split: -1.19%, +0.04% InvThroughput: 3436715 -> 3394926 (-1.22%); split: -1.25%, +0.04% VClause: 4831 -> 4833 (+0.04%) Copies: 50130 -> 49934 (-0.39%); split: -0.67%, +0.28% Branches: 5945 -> 5948 (+0.05%) PreSGPRs: 3486 -> 3472 (-0.40%) PreVGPRs: 5154 -> 5152 (-0.04%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11678>	2021-08-27 19:57:59 +00:00
Timur Kristóf	abcc83e713	aco: Fix to_uniform_bool_instr when operands are not suitable. Don't attempt to transform uniform boolean instructions when their operands are unsuitable. This can happen eg. due to other optimizations that combine SALU instructions which clear out the uniform instruction labels. Cc: mesa-stable Fixes: `8a32f57fff` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11573>	2021-08-25 12:43:50 +00:00
Rhys Perry	2201f5a58c	aco: remove label_extract if the extract is used by a non-VALU If an extract is used by a non-VALU instruction, it can't be applied to all instructions, so it's not beneficial to try to apply it. This check isn't needed because can_apply_extract()/can_use_SDWA() should already handle non-VALU instructions. fossil-db (Sienna Cichlid): Totals from 1020 (0.68% of 150170) affected shaders: SpillSGPRs: 1577 -> 1571 (-0.38%) CodeSize: 7863668 -> 7858336 (-0.07%); split: -0.07%, +0.00% Instrs: 1431583 -> 1431083 (-0.03%); split: -0.04%, +0.01% Latency: 25891250 -> 25890916 (-0.00%); split: -0.01%, +0.01% InvThroughput: 7248683 -> 7248655 (-0.00%); split: -0.01%, +0.01% SClause: 49072 -> 49071 (-0.00%) Copies: 126649 -> 126580 (-0.05%); split: -0.11%, +0.06% Branches: 39129 -> 39120 (-0.02%); split: -0.03%, +0.01% PreSGPRs: 53071 -> 52943 (-0.24%); split: -0.26%, +0.02% PreVGPRs: 57437 -> 57435 (-0.00%); split: -0.01%, +0.01% fossil-db (Polaris10): Totals from 654 (0.43% of 151696) affected shaders: CodeSize: 5814552 -> 5811568 (-0.05%); split: -0.05%, +0.00% Instrs: 1105783 -> 1105049 (-0.07%); split: -0.07%, +0.00% Latency: 20261458 -> 20259744 (-0.01%); split: -0.01%, +0.00% InvThroughput: 9011785 -> 9011749 (-0.00%); split: -0.00%, +0.00% Copies: 104693 -> 103904 (-0.75%); split: -0.76%, +0.00% PreSGPRs: 36105 -> 36095 (-0.03%); split: -0.03%, +0.01% PreVGPRs: 43813 -> 43809 (-0.01%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12212>	2021-08-23 14:56:37 +01:00
Rhys Perry	2e6834d4f6	aco: combine DPP into VALU before RA Mostly helps a bunch of Cyberpunk 2077 shaders. Catches some of the cases that the post-RA can't optimize because of register assignment. fossil-db (Sienna Cichlid): Totals from 25 (0.02% of 150170) affected shaders: CodeSize: 78808 -> 75764 (-3.86%) Instrs: 14311 -> 13547 (-5.34%) Latency: 278697 -> 277885 (-0.29%) InvThroughput: 63428 -> 62754 (-1.06%) Copies: 1348 -> 1349 (+0.07%); split: -0.07%, +0.15% PreVGPRs: 1035 -> 1011 (-2.32%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>	2021-08-19 18:17:33 +00:00
Rhys Perry	b97cfd72af	aco: handle DPP in the optimizer There are a bunch of optimizations that are broken when DPP is involved. fossil-db (Sienna Cichlid): Totals from 100 (0.07% of 150170) affected shaders: CodeSize: 325204 -> 325192 (-0.00%); split: -0.06%, +0.05% Instrs: 62773 -> 62664 (-0.17%); split: -0.18%, +0.00% Latency: 295348 -> 295266 (-0.03%); split: -0.03%, +0.00% InvThroughput: 73990 -> 73946 (-0.06%); split: -0.06%, +0.01% Copies: 1650 -> 1609 (-2.48%); split: -2.55%, +0.06% PreSGPRs: 3554 -> 3520 (-0.96%) Fossil-db changes are probably because v_sub_f32_dpp(v_mul_f32) is no longer being combined into MAD and then split back into separate instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>	2021-08-19 18:17:33 +00:00
Rhys Perry	1d894a8c85	aco: move a bunch of helpers into aco_ir.h/aco_ir.cpp Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11924>	2021-08-19 18:17:33 +00:00
Rhys Perry	211d1dfd34	aco: don't create v_madmk_f32/v_madak_f32 from v_fma_legacy_f16 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5105 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12004>	2021-07-22 15:43:31 +00:00
Daniel Schürmann	9b1a296172	aco/optimizer: ensure to not erase high bits when propagating packed constants Packed constants with non-zero values in the high half might have been propagated as 16 bit, dropping the high half. Cc: mesa-stable Closes: #5070 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11954>	2021-07-20 07:48:39 +00:00
Timur Kristóf	60c5abf685	aco: Remove s_and with exec when all lanes are active. This helps NGG GS and culling shaders. No Fossil DB changes without NGG culling. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>	2021-07-16 14:31:54 +00:00
Tony Wasserka	66e51dc474	aco: Remove use of deprecated Operand constructors This migration was done with libclang-based automatic tooling, which performed these replacements: * Operand(uint8_t) -> Operand::c8 * Operand(uint16_t) -> Operand::c16 * Operand(uint32_t, false) -> Operand::c32 * Operand(uint32_t, bool) -> Operand::c32_or_c64 * Operand(uint64_t) -> Operand::c64 * Operand(0) -> Operand::zero(num_bytes) Casts that were previously used for constructor selection have automatically been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)). Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>	2021-07-13 17:43:26 +00:00
Daniel Schürmann	1e2639026f	aco: Format. Manually adjusted some comments for more intuitive line breaks. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>	2021-07-12 21:27:31 +00:00
Daniel Schürmann	3f9e986d33	aco: add missing Licenses and remove Authors from files Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>	2021-07-12 12:09:31 +00:00
Daniel Schürmann	59fdaa1985	aco: reorder and cleanup #includes Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>	2021-07-12 12:09:31 +00:00
Timur Kristóf	5713e059ea	aco: Add validation for v_permlane instructions. Previously there hasn't been any validation for these instructions, but after shooting myself in the leg with it a few times, I decided to add the validation now. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Rhys Perry	54292e99c7	aco: optimize 32-bit extracts and inserts using SDWA Still need to use dst_u=preserve field to optimize packs fossil-db (Sienna Cichlid): Totals from 15974 (10.66% of 149839) affected shaders: VGPRs: 1009064 -> 1008968 (-0.01%); split: -0.03%, +0.02% SpillSGPRs: 7959 -> 7964 (+0.06%) CodeSize: 101716436 -> 101159568 (-0.55%); split: -0.55%, +0.01% MaxWaves: 284464 -> 284490 (+0.01%); split: +0.02%, -0.01% Instrs: 19334216 -> 19224241 (-0.57%); split: -0.57%, +0.00% Latency: 375465295 -> 375230478 (-0.06%); split: -0.14%, +0.08% InvThroughput: 79006105 -> 78860705 (-0.18%); split: -0.25%, +0.07% fossil-db (Polaris): Totals from 11369 (7.51% of 151365) affected shaders: SGPRs: 787920 -> 787680 (-0.03%); split: -0.04%, +0.01% VGPRs: 681056 -> 681040 (-0.00%); split: -0.01%, +0.00% CodeSize: 68127288 -> 67664120 (-0.68%); split: -0.69%, +0.01% MaxWaves: 54370 -> 54371 (+0.00%) Instrs: 13294638 -> 13214109 (-0.61%); split: -0.62%, +0.01% Latency: 373515759 -> 373214571 (-0.08%); split: -0.11%, +0.03% InvThroughput: 166529524 -> 166275291 (-0.15%); split: -0.20%, +0.05% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:43 +00:00
Rhys Perry	2f94353735	aco: add p_extract/p_insert These will let us make the SDWA optimizer much simpler than if we were to recognize combinations of shift/and/bfe. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Rhys Perry	3013670dfd	aco: disallow SGPRs on DPP instructions They can't be encoded. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10841>	2021-05-19 14:25:37 +00:00
Rhys Perry	e3c283e0bc	aco: use -1.0x and 1.0\|x\| for fneg/fabs Besides -1.0*x being 1 dword smaller than x^0x80000000, this commit also improves generated code when the application requires that denormals are flushed. Future versions of DXVK will require that 32-bit denormals are flushed. fossil-db (GFX8): Totals from 21021 (14.22% of 147787) affected shaders: SGPRs: 1288960 -> 1288944 (-0.00%); split: -0.01%, +0.01% VGPRs: 792672 -> 792848 (+0.02%); split: -0.01%, +0.03% CodeSize: 62439228 -> 62403552 (-0.06%); split: -0.11%, +0.05% MaxWaves: 136182 -> 136181 (-0.00%); split: +0.00%, -0.00% Instrs: 12230882 -> 12239927 (+0.07%); split: -0.01%, +0.08% fossil-db (GFX10.3): Totals from 20191 (13.80% of 146267) affected shaders: VGPRs: 799992 -> 800032 (+0.01%) CodeSize: 59763656 -> 59715484 (-0.08%); split: -0.12%, +0.03% MaxWaves: 525378 -> 525376 (-0.00%) Instrs: 11511082 -> 11517419 (+0.06%); split: -0.00%, +0.06% fossil-db (GFX8, d3d float controls): Totals from 87160 (58.98% of 147787) affected shaders: SGPRs: 5395072 -> 5408480 (+0.25%); split: -0.06%, +0.31% VGPRs: 3596716 -> 3581592 (-0.42%); split: -0.55%, +0.13% CodeSize: 271347396 -> 266814460 (-1.67%); split: -1.67%, +0.00% MaxWaves: 539669 -> 540400 (+0.14%); split: +0.15%, -0.02% Instrs: 53395194 -> 52257505 (-2.13%); split: -2.13%, +0.00% fossil-db (GFX10.3, d3d float controls): Totals from 82306 (56.27% of 146267) affected shaders: VGPRs: 3572312 -> 3558848 (-0.38%); split: -0.44%, +0.06% CodeSize: 273494748 -> 269648968 (-1.41%); split: -1.41%, +0.00% MaxWaves: 2007156 -> 2009950 (+0.14%); split: +0.15%, -0.01% Instrs: 52251568 -> 51356424 (-1.71%); split: -1.71%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079>	2021-03-24 14:02:41 +00:00
Rhys Perry	561fcfb50f	aco: don't optimize min(a*1.0, ...) to min(a, ...) on GFX8 fossil-db (GFX8): Totals from 2 (0.00% of 147787) affected shaders: VMEM: 662 -> 642 (-3.02%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9079>	2021-03-24 14:02:41 +00:00
Daniel Schürmann	fc3606f29c	aco/optimizer: set VCC hint on new v_cmp_* definitions Totals from 11692 (7.99% of 146267) affected shaders (Navi10): CodeSize: 97419384 -> 97352560 (-0.07%); split: -0.07%, +0.00% Instrs: 18571138 -> 18570969 (-0.00%); split: -0.00%, +0.00% Cycles: 1431348400 -> 1431346296 (-0.00%); split: -0.00%, +0.00% SMEM: 696646 -> 696650 (+0.00%) SClause: 668511 -> 668490 (-0.00%); split: -0.00%, +0.00% Copies: 1279475 -> 1279474 (-0.00%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9531>	2021-03-18 17:15:00 +00:00
Rhys Perry	3d4c13f3b8	aco: add DeviceInfo Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:44:22 +00:00
Rhys Perry	0f178290ca	aco: don't affect isPrecise() after applying output modifiers fossil-db (GFX10.3): Totals from 26679 (19.14% of 139391) affected shaders: SGPRs: 1757155 -> 1757059 (-0.01%); split: -0.05%, +0.04% VGPRs: 1175932 -> 1173556 (-0.20%); split: -0.21%, +0.01% CodeSize: 86203592 -> 85572480 (-0.73%); split: -0.73%, +0.00% MaxWaves: 315513 -> 315805 (+0.09%); split: +0.10%, -0.00% Instrs: 16297785 -> 16143745 (-0.95%); split: -0.95%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8718>	2021-01-26 22:22:58 +00:00
Rhys Perry	2f0d480c73	aco: optimize out a*1.0 if it's used as a float fossil-db (GFX10): Totals from 370 (0.27% of 139391) affected shaders: CodeSize: 641436 -> 634156 (-1.13%); split: -1.14%, +0.00% Instrs: 117668 -> 115739 (-1.64%); split: -1.64%, +0.00% fossil-db (GFX10.3): Totals from 370 (0.27% of 139391) affected shaders: SGPRs: 26888 -> 26912 (+0.09%) VGPRs: 13964 -> 13916 (-0.34%) CodeSize: 692008 -> 679008 (-1.88%) MaxWaves: 4779 -> 4783 (+0.08%) Instrs: 134265 -> 132026 (-1.67%); split: -1.67%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	54a09545ec	aco: optimize a*0.0 fossil-db (GFX10): Totals from 1943 (1.39% of 139391) affected shaders: SGPRs: 99952 -> 99544 (-0.41%); split: -0.44%, +0.03% VGPRs: 60880 -> 60272 (-1.00%); split: -1.02%, +0.02% CodeSize: 5138488 -> 5107500 (-0.60%); split: -0.61%, +0.01% MaxWaves: 32193 -> 32380 (+0.58%) Instrs: 983178 -> 975684 (-0.76%); split: -0.77%, +0.01% fossil-db (GFX10.3): Totals from 1943 (1.39% of 139391) affected shaders: SGPRs: 99832 -> 99648 (-0.18%); split: -0.25%, +0.06% VGPRs: 64708 -> 63944 (-1.18%); split: -1.27%, +0.09% CodeSize: 5196732 -> 5157632 (-0.75%); split: -0.76%, +0.00% MaxWaves: 27478 -> 27486 (+0.03%); split: +0.06%, -0.03% Instrs: 1007222 -> 998737 (-0.84%); split: -0.84%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	0c3d8e8e2e	aco: disable a*1.0 optimization if the instruction is precise fossil-db (GFX10): Totals from 10370 (7.44% of 139391) affected shaders: SGPRs: 564072 -> 564016 (-0.01%); split: -0.01%, +0.00% VGPRs: 248312 -> 248532 (+0.09%); split: -0.02%, +0.11% CodeSize: 12866732 -> 13208904 (+2.66%); split: -0.00%, +2.66% MaxWaves: 190198 -> 190170 (-0.01%) Instrs: 2460818 -> 2545351 (+3.44%) fossil-db (GFX10.3): Totals from 10370 (7.44% of 139391) affected shaders: SGPRs: 563904 -> 564272 (+0.07%); split: -0.16%, +0.22% VGPRs: 289344 -> 295016 (+1.96%); split: -0.88%, +2.84% CodeSize: 13519204 -> 14197020 (+5.01%); split: -0.00%, +5.01% MaxWaves: 155946 -> 154566 (-0.88%) Instrs: 2719177 -> 2806919 (+3.23%); split: -0.00%, +3.23% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	e115b01948	aco: return references in instruction cast methods Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>	2021-01-22 14:12:33 +00:00
Rhys Perry	1d245cd18b	aco: use format-check methods Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>	2021-01-22 14:12:32 +00:00
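
Note on ca6ef505ff (skip SDWA on v_lshlrev in apply_extract): the message argues that an extract of the low 16 bits followed by a left shift of 16 or more, or an extract of the low 8 bits followed by a shift of 24 or more, needs no extra masking because the unwanted upper bits are shifted out anyway. The sketch below only checks that arithmetic on plain 32-bit integers. It is not ACO code: the function names are hypothetical, the extract is modelled as zero-extending, and shift amounts are assumed to be below 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of extracting the low `bits` bits (8 or 16), zero-extended.
uint32_t extract_low(uint32_t value, unsigned bits)
{
   assert(bits == 8 || bits == 16);
   return value & ((1u << bits) - 1u);
}

// True when a left shift by `shift` already discards every bit that the
// extract would have cleared: 16-bit extract with shift >= 16, or
// 8-bit extract with shift >= 24. Shifts of 32 or more are rejected here
// because they are undefined for a 32-bit type in C++.
bool extract_is_redundant(unsigned bits, unsigned shift)
{
   return shift < 32 && shift >= 32 - bits;
}

// Left shift applied to the extracted value (the unoptimized sequence).
uint32_t shl_with_extract(uint32_t value, unsigned bits, unsigned shift)
{
   return extract_low(value, bits) << shift;
}

// Left shift applied directly to the full register (the optimized sequence).
uint32_t shl_plain(uint32_t value, unsigned shift)
{
   return value << shift;
}
```

When `extract_is_redundant(bits, shift)` holds, `shl_with_extract` and `shl_plain` agree for every input; for smaller shift amounts they can differ, which is the case where the optimizer still has to keep the extract (for example via SDWA).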


307 commits