fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 05:00:09 +01:00

Author	SHA1	Message	Date
Rhys Perry	ede1c171c5	aco: fix outdated label_vec from p_create_vector labelling Fixes random dEQP-VK.transform_feedback.fuzz.* crashes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `2dc550202e` ('aco: copy-propagate p_create_vector copies of vectors') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4730>	2020-04-24 12:21:15 +00:00
Rhys Perry	665250e830	aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `d1621834f3` ('aco: combine VALU and SALU into various VOP3 instructions') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2822 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4717>	2020-04-24 08:55:19 +00:00
Rhys Perry	d1621834f3	aco: combine VALU and SALU into various VOP3 instructions shader-db (Navi): Totals from 2916 (2.28% of 127638) affected shaders: SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02% VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09% CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00% MaxWaves: 26034 -> 26012 (-0.08%) Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	2dc550202e	aco: copy-propagate p_create_vector copies of vectors Instead of copying the operands of the other p_create_vector and labelling the definition with label_vec, copy the operands and label it with label_temp so that it can be copy-propagated. This was found while removing a redundant copy in load_input_from_temps() which removed duplicate p_create_vector instructions. shader-db (Navi): Totals from 139 (0.11% of 127638) affected shaders: VGPRs: 8472 -> 7948 (-6.19%) CodeSize: 514592 -> 512368 (-0.43%) MaxWaves: 1089 -> 1195 (+9.73%) Instrs: 100214 -> 99658 (-0.55%) Cycles: 400856 -> 398632 (-0.55%) VMEM: 15545 -> 15338 (-1.33%) Copies: 5140 -> 4584 (-10.82%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	41ac44e1b3	aco: improve vector optimization with sub-dword vectors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>	2020-04-14 10:49:12 +00:00
Daniel Schürmann	28d36d26c2	aco: fix p_extract_vector optimization in presence of unequally sized vector operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4506>	2020-04-13 16:35:40 +00:00
Daniel Schürmann	a39df3bfce	aco: don't constant-propagate into subdword PSEUDO instructions PSEUDO instructions are lowered using SDWA, and thus, cannot take literals and before GFX9 cannot take constants at all. As the in-register representation differs between 32bit and 16bit floats, we first need to ensure correct behavior. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4492>	2020-04-10 07:19:27 +00:00
Rhys Perry	20a4b1461b	aco: zero-initialize Temp Fixes dEQP-VK.transform_feedback.* crashes from accessing garbage temporaries in emit_extract_vector(). Fixes: `85521061` ("aco: prepare helper functions for subdword handling") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4463>	2020-04-06 19:15:19 +00:00
Daniel Schürmann	0bb3537676	aco: don't assume split_vector(create_vector) has the same number of elements when optimizing Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>	2020-04-03 23:13:15 +01:00
Daniel Schürmann	c436743b0c	aco: don't propagate SGPRs into subdword PSEUDO instructions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4002>	2020-04-03 23:13:15 +01:00
Samuel Pitoiset	c953292630	aco: always optimize v_mad to v_madak in presence of literals v_mad and v_madak are both 64-bit instructions, so it doesn't increase code size to always apply a 32-bit literal instead of using v_mad and an sgpr which contains that literal. Found with some Youngblood shaders, but it helps some other games too. vkpipeline-db (VEGA10): Totals from affected shaders: SGPRS: 46168 -> 46016 (-0.33 %) VGPRS: 45576 -> 45564 (-0.03 %) Code Size: 5187208 -> 5179584 (-0.15 %) bytes Max Waves: 3297 -> 3297 (0.00 %) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>	2020-04-03 07:30:49 +00:00
Timur Kristóf	655c050119	aco: Fix combining DS additions in the optimizer. Previously, it was calculated incorrectly for 64-bit writes and reads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>	2020-03-11 08:34:10 +00:00
Rhys Perry	2d1ba86382	aco: handle v_add_co_u32_e64 in parse_base_offset() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3902>	2020-03-03 18:31:06 +00:00
Rhys Perry	483d4ec57c	aco: improve SCC handling in some SALU combines Add some checks and remove some unnecessary checks. Found by observation. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>	2020-02-12 19:18:45 +00:00
Rhys Perry	d45e9451cf	aco: disable some instruction combining if it could change an exec operand Found by observation. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3599>	2020-02-12 19:18:40 +00:00
Samuel Pitoiset	ddd767387f	aco: fix creating v_madak if v_mad_f32 has two sgpr literals Do not ignore that src1 can be a sgpr. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2435 Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3759>	2020-02-11 07:17:31 +00:00
Timur Kristóf	4d34abd15c	aco/optimizer: Don't combine uniform bool s_and to s_andn2. Fixes: `8a32f57fff` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3714>	2020-02-05 22:53:45 +00:00
Daniel Schürmann	71440ba0f5	aco: reorder VMEM operands in ACO IR For all VMEM instructions, the resource constant is now in operands[0]. For MIMG instructions, the sampler shares operands[1] with write data in case this instruction writes memory. Moving the VADDR to be the last operand for MIMG is the first step to support Navi NSA encoding. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Daniel Schürmann	396be00640	aco: fix combine_salu_not_bitwise() when SCC is used Previously, we didn't use the SCC bit, and thus, we didn't care about it. With 'aco: Transform uniform bitwise instructions to 32-bit if possible.' that changed, so that we have to handle it. Fixes: `8a32f57fff` ('aco: Transform uniform bitwise instructions to 32-bit if possible.') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3598>	2020-01-28 18:14:02 +01:00
Rhys Perry	2dc63d39d3	aco: fix literal application with v_cndmask_b32/v_addc_co_u32/etc No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `0be7409069` ('aco: rewrite literal combining') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Rhys Perry	827681f921	aco: always add sgprs to sgpr_ids when choosing literals Even if it's a literal, we should add this to sgpr_ids. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `0be7409069` ('aco: rewrite literal combining') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Timur Kristóf	8a32f57fff	aco: Transform uniform bitwise instructions to 32-bit if possible. This allows removing superfluous s_cselect instructions that come from turning booleans into 64-bit vector condition. v2 by Daniel Schürmann: - Make the code massively simpler v3 by Timur Kristóf: - Fix regressions, make it work in wave32 mode - Eliminate extra moves by not always using the SCC definition - Use s_absdiff_i32 for uniform XOR - Skip the transformation for uncommon or invalid instructions Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3450>	2020-01-24 14:40:45 +00:00
Timur Kristóf	23edcf6490	aco: Make a better guess at which instructions need the VCC hint. Previously, bool_to_vector_condition would always set the VCC hint on its result. This commit improves it by having the optimizer set the VCC hint only when the result really needs to be in the VCC. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3451>	2020-01-24 13:14:23 +00:00
Samuel Pitoiset	b8abfafe86	aco: fix constant folding of SMRD instructions on GFX6 SMRD instructions have an 8-bit dword offset on SI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3432>	2020-01-20 16:24:55 +00:00
Rhys Perry	e151398de6	aco: fix stack buffer overflow in apply_sgprs() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `cef7879719` ('aco: rewrite apply_sgprs()') Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2361 Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3442>	2020-01-20 11:13:11 +00:00
Samuel Pitoiset	a445cb35bd	aco: do not combine additions of DS instructions on GFX6 The offset field doesn't work as expected on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3412>	2020-01-16 14:06:06 +00:00
Timur Kristóf	dfaa3c0af6	aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible. When possible, get rid of an s_not when all it does is invert the SCC, and its successor s_cbranch / s_cselect can be inverted instead. Also modify some parts of instruction_selection to take advantage of this feature. Example: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 s2: %3902 = s_cselect_b64 -1, 0, %3900:scc s2: %407, s1: %3903:scc = s_not_b64 %3902 s2: %3906, s1: %3905:scc = s_and_b64 %407, %0:exec p_cbranch_z %3905:scc Can now be optimized to: s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406 p_cbranch_nz %3900:scc Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	c0f82165a7	aco: Optimize out s_and with exec, when used on uniform bitwise values. Previously all booleans needed an s_and with exec when they were turned into a scalar condition. However, this is not needed for uniform booleans. v2 by Daniel Schürmann: - Make the code more readable v3 by Timur Kristóf: - Fix regressions, make it work in wave32 mode Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	1c44129db3	aco: Don't skip combine_instruction when definitions[1] is used. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Timur Kristóf	d962bbd895	aco: Implement 64-bit constant propagation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2020-01-14 21:21:06 +01:00
Rhys Perry	f978e0e516	aco: add integer min/max to can_swap_operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	92ace0bb31	aco: replace extract_vector with copies Helps a small number of small shaders with situations like this: a = p_create_vector ... b = p_extract_vector a, 3 and copy propagation can't be done Totals from affected shaders: SGPRS: 14304 -> 14416 (0.78 %) VGPRS: 8716 -> 6592 (-24.37 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 184664 -> 176888 (-4.21 %) bytes Max Waves: 6260 -> 6260 (0.00 %) Instructions: 35561 -> 33617 (-5.47 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	e686e4765e	aco: add min(-max(), ) and max(-min(), ) optimization No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	fa8357eb70	aco: improve clamp optimization Not sure why it checked the use count, it doesn't apply the constants. pipeline-db (Navi): Totals from affected shaders: SGPRS: 269409 -> 269745 (0.12 %) VGPRS: 238120 -> 238132 (0.01 %) Spilled SGPRs: 305 -> 305 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 22908584 -> 22904672 (-0.02 %) bytes Max Waves: 20217 -> 20217 (0.00 %) Instructions: 4275312 -> 4263869 (-0.27 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 155409 -> 155233 (-0.11 %) VGPRS: 153072 -> 153072 (0.00 %) Spilled SGPRs: 269 -> 269 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 14650824 -> 14650396 (-0.00 %) bytes Max Waves: 9609 -> 9609 (0.00 %) Instructions: 2762802 -> 2755517 (-0.26 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	edc888ccb1	aco: fix clamp optimization We can't do the optimization if there are neg/abs in-between. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f664cb01ec	aco: improve creation of v_madmk_f32/v_madak_f32 Using needs_vop3 check was flawed because it would only combine the literal if the first operand is the literal. If the second or third operand is the literal, then needs_vop3 will be true and the literal will not be combined. pipeline-db (Navi): Totals from affected shaders: SGPRS: 782051 -> 782051 (0.00 %) VGPRS: 630048 -> 630048 (0.00 %) Spilled SGPRs: 195 -> 195 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 54743740 -> 54585548 (-0.29 %) bytes Max Waves: 67340 -> 67340 (0.00 %) Instructions: 10182030 -> 10182030 (0.00 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 701990 -> 699590 (-0.34 %) VGPRS: 566632 -> 566784 (0.03 %) Spilled SGPRs: 218 -> 218 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 49173564 -> 49007856 (-0.34 %) bytes Max Waves: 59650 -> 59612 (-0.06 %) Instructions: 9315135 -> 9293330 (-0.23 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	15e25da3e5	aco: take advantage of GFX10's constant bus limit and VOP3 literals pipeline-db (Navi): Totals from affected shaders: SGPRS: 2397159 -> 2392494 (-0.19 %) VGPRS: 1756036 -> 1753920 (-0.12 %) Spilled SGPRs: 461 -> 470 (1.95 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 110287304 -> 109946304 (-0.31 %) bytes Max Waves: 318341 -> 318475 (0.04 %) Instructions: 21019327 -> 20533618 (-2.31 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 0 -> 0 (0.00 %) bytes Max Waves: 0 -> 0 (0.00 %) Instructions: 0 -> 0 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	9c2d37308f	aco: allow an extra SGPR with multiple uses to be applied to VOP3 This is in a separate patch from the apply_sgprs() rewrite so that the rewrite can be more easily tested. pipeline-db (Navi): Totals from affected shaders: SGPRS: 3056 -> 3056 (0.00 %) VGPRS: 1632 -> 1632 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 156468 -> 156304 (-0.10 %) bytes Max Waves: 288 -> 288 (0.00 %) Instructions: 29510 -> 29469 (-0.14 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 1616 -> 1616 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 156132 -> 155968 (-0.11 %) bytes Max Waves: 289 -> 289 (0.00 %) Instructions: 29426 -> 29385 (-0.14 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	f4c2c90e1a	aco: allow applying two sgprs to an instruction We could create VALU instructions which read two sgprs, but only if isel created an instruction which already read one of them. This change is in a separate patch from the apply_sgprs() rewrite so that it can be tested if the rewrite affected anything. pipeline-db (Navi): Totals from affected shaders: SGPRS: 216 -> 216 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 1756 -> 1708 (-2.73 %) bytes Max Waves: 120 -> 120 (0.00 %) Instructions: 312 -> 300 (-3.85 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 216 -> 216 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 1784 -> 1736 (-2.69 %) bytes Max Waves: 120 -> 120 (0.00 %) Instructions: 319 -> 307 (-3.76 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	7da07ca3e4	aco: follow through temporary when merging tests into constant comparisons This can happen with v_mov_b32(s_mov_b32(literal)) pipeline-db (Navi): Totals from affected shaders: SGPRS: 632 -> 632 (0.00 %) VGPRS: 492 -> 492 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 77488 -> 76928 (-0.72 %) bytes Max Waves: 67 -> 67 (0.00 %) Instructions: 14426 -> 14332 (-0.65 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 632 -> 632 (0.00 %) VGPRS: 492 -> 492 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 77512 -> 76952 (-0.72 %) bytes Max Waves: 67 -> 67 (0.00 %) Instructions: 14432 -> 14338 (-0.65 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	dc6c35e1c3	aco: be more careful with literals in combine_salu_{n2,lshl_add} No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	fcf52eb42d	aco: add check_vop3_operands() This will be useful when taking advantage of GFX10 features. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	cef7879719	aco: rewrite apply_sgprs() This will make it easier to apply two different sgprs (for GFX10) or apply the same sgpr twice (just remove the break). No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	0be7409069	aco: rewrite literal combining Should make taking advantage of GFX10's increased constant bus limit and VOP3 literals easier. No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	84b9f3786b	aco: improve can_use_VOP3() No pipeline-db changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	3cb98ed939	aco: combine two sgprs into a VALU if they're the same This was supposed to be done before but it wasn't done correctly and everywhere. pipeline-db (Navi): Totals from affected shaders: SGPRS: 784680 -> 786128 (0.18 %) VGPRS: 574012 -> 573892 (-0.02 %) Spilled SGPRs: 461 -> 461 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 45477088 -> 45478172 (0.00 %) bytes Max Waves: 81294 -> 81277 (-0.02 %) Instructions: 8657970 -> 8622483 (-0.41 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 780664 -> 782072 (0.18 %) VGPRS: 573880 -> 573760 (-0.02 %) Spilled SGPRs: 629 -> 629 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 45445244 -> 45448340 (0.01 %) bytes Max Waves: 81178 -> 81161 (-0.02 %) Instructions: 8649902 -> 8614918 (-0.40 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	c240c1aecf	aco: apply literals to split mads Removing the return is also needed to apply literals to mads (which can be done on GFX10). pipeline-db (Navi): Totals from affected shaders: SGPRS: 368787 -> 367555 (-0.33 %) VGPRS: 312436 -> 312448 (0.00 %) Spilled SGPRs: 461 -> 461 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 26113388 -> 26098260 (-0.06 %) bytes Max Waves: 35982 -> 35982 (0.00 %) Instructions: 5038670 -> 5028941 (-0.19 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 369843 -> 368659 (-0.32 %) VGPRS: 317224 -> 317196 (-0.01 %) Spilled SGPRs: 629 -> 629 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 26310540 -> 26295156 (-0.06 %) bytes Max Waves: 36324 -> 36326 (0.01 %) Instructions: 5073957 -> 5064164 (-0.19 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>	2020-01-14 12:56:28 +00:00
Rhys Perry	809c8feb92	aco: check if multiplication/clamp is live when applying output modifier It's possible that a multiplication/clamp is dead code and the single use is from a different user. Fixes portal rendering in Path of Exile when global illumination is enabled. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	ef8abfa790	aco: disable add combining for ds_swizzle_b32 ds_bpermute_b32/ds_permute_b32 are fine, I think Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
Rhys Perry	69bed1c918	aco: don't DCE atomics with return values We don't create atomics with definitions if they are not used in NIR, but our own DCE can remove the uses if an export turns out to be null. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `93c8ebfa78` ('aco: Initial commit of independent AMD compiler') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>	2020-01-13 13:26:43 +00:00
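
Several messages above quote shader-db or pipeline-db totals in the form `before -> after (percent)`. The percentages appear consistent with a plain relative change against the "before" value. The sketch below only checks that arithmetic against the figures quoted for commit 2dc550202e. The function name `relative_change_percent` is hypothetical and is not part of shader-db or Mesa, and treating a zero baseline as 0.00 % is an assumption based on the all-zero Vega rows.

```python
def relative_change_percent(before: int, after: int) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    if before == 0:
        # Assumption: a zero baseline is reported as 0.00 %, as in the
        # all-zero Vega rows quoted in the log.
        return 0.0
    return (after - before) / before * 100.0


# Figures copied from the message of commit 2dc550202e (Navi totals).
stats = {
    "VGPRs": (8472, 7948),
    "MaxWaves": (1089, 1195),
    "Copies": (5140, 4584),
}

for name, (before, after) in stats.items():
    change = relative_change_percent(before, after)
    print(f"{name}: {before} -> {after} ({change:+.2f}%)")
```

Note that the sign alone does not say whether a change is good: fewer VGPRs and copies are improvements, while MaxWaves improves when it goes up.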

175 commits