fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-22 10:58:08 +02:00

Author	SHA1	Message	Date
Timur Kristóf	277f37d036	aco: Use 24-bit multiplication for NGG wave id and thread id. Both of them should always fit 24 bits anyway. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	eafc1e7365	aco: Use 24-bit multiplication in TCS I/O The TCS inputs and outputs must always fit into the LDS, which implies that their addresses also always fit 24 bits. On AMD GPUs, 24-bit multiplication is much faster than 32-bit multiplication, so we can take the opportunity to use that for TCS I/O instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	64332a0937	aco: Const correctness for aco_print_ir. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	0c0691d43e	aco: Const correctness for get_barrier_interaction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f321dc33c8	aco: Abort when RA can't find a register. Previously, it was just unreachable, which means it will generate invalid shaders when it encounters a situation where it can't allocate registers for e.g. a large load. This commit makes it slightly easier to notice such problems without triggering a GPU hang. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f2e7aee244	aco: Increase barrier_count to 7 to include barrier_barrier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	25775d346c	aco: Only store TCS outputs to VMEM when they are read by TES. Totals from affected shaders (GFX10): Code Size: 10832 -> 10736 (-0.89 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Rhys Perry	ede1c171c5	aco: fix outdated label_vec from p_create_vector labelling Fixes random dEQP-VK.transform_feedback.fuzz.* crashes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `2dc550202e` ('aco: copy-propagate p_create_vector copies of vectors') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4730>	2020-04-24 12:21:15 +00:00
Rhys Perry	665250e830	aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `d1621834f3` ('aco: combine VALU and SALU into various VOP3 instructions') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2822 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4717>	2020-04-24 08:55:19 +00:00
Rhys Perry	0d9fe0405f	aco: improve code for 32-bit isign No shader-db changes on Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	d1621834f3	aco: combine VALU and SALU into various VOP3 instructions shader-db (Navi): Totals from 2916 (2.28% of 127638) affected shaders: SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02% VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09% CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00% MaxWaves: 26034 -> 26012 (-0.08%) Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	607fb4153d	aco: move call to store_output_to_temps in store_ls_or_es_output earlier Skips get_intrinsic_io_basic_offset() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	b497b774a5	aco: remove copy in load_input_from_temps() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	2dc550202e	aco: copy-propagate p_create_vector copies of vectors Instead of copying the operands of the other p_create_vector and labelling the definition with label_vec, copy the operands and label it with label_temp so that it can be copy-propagated. This was found while removing a redundant copy in load_input_from_temps() which removed duplicate p_create_vector instructions. shader-db (Navi): Totals from 139 (0.11% of 127638) affected shaders: VGPRs: 8472 -> 7948 (-6.19%) CodeSize: 514592 -> 512368 (-0.43%) MaxWaves: 1089 -> 1195 (+9.73%) Instrs: 100214 -> 99658 (-0.55%) Cycles: 400856 -> 398632 (-0.55%) VMEM: 15545 -> 15338 (-1.33%) Copies: 5140 -> 4584 (-10.82%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	e4383b5c7f	aco: decrease the uses of other copy operations after splitting/removing For copies like v[7:8] = v[8:9], what currently happens is: - do_copy() will skip the second dword - the uses of the second dword will be reduced to 0 - the copy operation will be removed from the map and v8 will never be set to v9. So just decrease the uses of other operations after splitting or removing the current operation, so: "v8 = v9" will be split off, its uses reduced and then the new copy will be done in the next iteration. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>	2020-04-23 11:39:23 +00:00
Daniel Schürmann	36e0d2f39b	aco: coalesce v_mad's accumulator with definition's affinities Totals from affected shaders: Code Size: 8922676 -> 8915192 (-0.08 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d000d76f13	aco: use upper part of gap in register file if it is beneficial for striding Totals from affected shaders: SGPRS: 1717288 -> 1716984 (-0.02 %) VGPRS: 1305924 -> 1304904 (-0.08 %) Code Size: 138508892 -> 138420144 (-0.06 %) bytes Max Waves: 115726 -> 115735 (0.01 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d666d83be2	aco: try to always find a register with stride for even sizes Totals from affected shaders: SGPRS: 1162400 -> 1162400 (0.00 %) VGPRS: 947364 -> 946960 (-0.04 %) Code Size: 98399300 -> 98399004 (-0.00 %) bytes Max Waves: 74665 -> 74682 (0.02 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	5a3c1f4f0b	aco: stop get_reg_simple after reaching max_used_gpr Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	2796cb4c24	aco: refactor get_reg_simple() to return early on exact matches in the best fit algorithm Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	6792e134f3	aco: don't create vector affinities for operands which are not killed or are duplicates Totals from affected shaders: SGPRS: 825184 -> 825184 (0.00 %) VGPRS: 697640 -> 697240 (-0.06 %) Code Size: 79244104 -> 79201072 (-0.05 %) bytes Max Waves: 42388 -> 42386 (-0.00 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	edc2b57ac1	aco: allocate full register for subdword definitions if HW doesn't support it Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	97a870cf88	aco: move attempt to find strided register into get_reg_simple() This simplifies code and helps some shaders Totals from affected shaders: Code Size: 51227172 -> 51202216 (-0.05 %) bytes Max Waves: 19955 -> 19948 (-0.04 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	c7f97f110c	aco: use DefInfo in more places to simplify RA Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	734f86db6b	aco: create and use DefInfo struct in RA for maintaining all information necessary to find a register. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	5b2f628da3	aco: create pseudo dummy instruction in RA to be used for live-range splits Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d9f7d1d5cb	aco: refactor get_reg() to also handle affinities This simplifies definition handling and helps a few shaders Totals from affected shaders: Code Size: 659540 -> 659376 (-0.02 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	7c8f4ebca9	aco: refactor get_reg() to take Temp instead of RegClass This patch also moves get_reg_specified() and get_reg_vec() before get_reg() to make use of it later. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:22 +00:00
Daniel Schürmann	0a9ed98178	aco: simplify operand handling in RA Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:22 +00:00
Rhys Perry	f13049f48a	aco: implement 64-bit sgpr swaps In our pipeline-db, helps almost exclusively Detroit: Become Human. Totals from 6726 (5.36% of 125503) affected shaders: CodeSize: 74680952 -> 74102228 (-0.77%) Instrs: 14551507 -> 14406001 (-1.00%) Cycles: 1748272436 -> 1690173104 (-3.32%) VMEM: 964671 -> 964058 (-0.06%) Copies: 1993312 -> 1847806 (-7.30%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	2ab45f41e0	aco: implement sub-dword swaps Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	83fdb1ed3d	aco: add VOP3P_instruction The optimizer isn't yet updated to handle this, since lower_to_hw_instr will be the only user for now. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	8fc24f9a45	aco: fix copy statistic for 64-bit vgpr constant copy The statistic is in units of instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Daniel Schürmann	c3c1f4d6bc	aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel Is simpler and helps a couple of shaders. Totals from affected shaders: (Vega) Code Size: 16341296 -> 16335460 (-0.04 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>	2020-04-20 15:12:50 +00:00
Daniel Schürmann	be0bb7e101	aco: fix 64bit fsub Fixes: `425558bfd5` ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>	2020-04-20 15:12:50 +00:00
Daniel Schürmann	425558bfd5	aco: use v_subrev_f32 for fsub with an sgpr operand in src1 This fixes an accidentally introduced regression. Fixes: `9be4be515f` ('aco: implement 16-bit nir_op_fsub/nir_op_fadd') Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633>	2020-04-19 16:16:27 +00:00
Samuel Pitoiset	c4ca9e66dd	aco: fix exporting the viewport index if the fragment shader needs it It's like the layer, it has to be exported via the pos and also as a varying if the fragment shader reads it. Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_* Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>	2020-04-17 16:23:24 +00:00
Rhys Perry	839c886b34	aco: add missing scc clobber to nir_op_unpack_32_2x16_split_y The ISA doc is inconsistent whether this instruction writes SCC. It does. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>	2020-04-16 17:04:53 +01:00
Rhys Perry	ac74367bef	aco: implement various 8/16-bit conversions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4552>	2020-04-16 17:04:45 +01:00
Samuel Pitoiset	11faaf646d	aco: fix emitting stream output with tess eval shaders Fixes dEQP-VK.transform_feedback.simple.winding_patch_list_12. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4553>	2020-04-16 07:57:39 +00:00
Samuel Pitoiset	91aa596ca7	aco: implement nir_op_f2i8/nir_op_f2u8 I think we should really refactor the conversions path. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4551>	2020-04-16 08:47:49 +02:00
Rhys Perry	c818b5c089	aco: fix 1D textureGrad() on GFX9 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `6f718edced` ('aco: simplify gathering of MIMG address components') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4550>	2020-04-15 10:45:07 +00:00
Samuel Pitoiset	08a396033b	aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents v_frexp_exp_i16_f16 returns the two's complement for negative exponents. For example, with 0.333252 it returns 0.666504 for the mantissa and 65535 for the exponent (-1 in decimal). RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with the sign extension bit set. The latter is probably what we want to do in long term but for now RA doesn't support changing non-SDWA instructions to SDWA if useful/needed. Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>	2020-04-15 10:12:44 +02:00
Rhys Perry	fbd2be3f5d	aco: clear moved operands in get_reg_create_vector() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>	2020-04-14 10:49:12 +00:00
Rhys Perry	52cc1f8237	aco: improve p_create_vector RA for sub-dword operands There are still improvements needed for sub-dword definitions, but that's not as simple. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>	2020-04-14 10:49:12 +00:00
Rhys Perry	e18711cda3	aco: fix p_extract_vector validation Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>	2020-04-14 10:49:12 +00:00
Rhys Perry	41ac44e1b3	aco: improve vector optimization with sub-dword vectors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4507>	2020-04-14 10:49:12 +00:00
Daniel Schürmann	28d36d26c2	aco: fix p_extract_vector optimization in presence of unequally sized vector operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4506>	2020-04-13 16:35:40 +00:00
Samuel Pitoiset	fc1068de0d	aco: fix nir_op_pack_32_2x16_split if one operand is a constant Because 16-bit constants are represented with the s1 RegClass, we have to extract the low half. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509>	2020-04-13 11:51:17 +00:00
Samuel Pitoiset	4cfaef68d7	aco: implement 16-bit nir_op_f2i64/nir_op_f2u64 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4509>	2020-04-13 11:51:17 +00:00
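
Commit 08a396033b in the log above notes that v_frexp_exp_i16_f16 returns negative exponents as 16-bit two's complement, so the exponent of 0.333252 comes back as 65535 rather than -1. The arithmetic can be sketched in plain Python as below. This is only an illustration of the sign extension, not the ACO code: `sign_extend` is a hypothetical helper, and Python's `math.frexp` on a double stands in for the 16-bit hardware instruction.

```python
import math


def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer.

    Hypothetical helper for illustration; it mimics what a sign-extending
    bitfield extract would do with a 16-bit field.
    """
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1           # keep only the low `bits` bits
    return (value ^ sign_bit) - sign_bit


# math.frexp splits x into (mantissa, exponent) with 0.5 <= |mantissa| < 1.
mantissa, exponent = math.frexp(0.333252)

# What a 16-bit register would hold for that exponent, read as unsigned.
raw_exponent = exponent & 0xFFFF

# Sign-extending the 16-bit value recovers the signed exponent.
fixed_exponent = sign_extend(raw_exponent, 16)

print(mantissa, raw_exponent, fixed_exponent)
```

The raw value and the sign-extended value differ only for negative exponents, which matches the commit's remark that the bug shows up with negative exponents.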

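Commits eafc1e7365 and 277f37d036 in the log above replace 32-bit multiplication with 24-bit multiplication on the grounds that the operands always fit in 24 bits. The reasoning can be checked with a small model. The sketch below assumes a 24-bit multiply that uses only the low 24 bits of each operand and keeps the low 32 bits of the product; `mul_u32_u24` and `mul_lo_u32` are hypothetical Python stand-ins, not Mesa or hardware APIs.

```python
MASK24 = (1 << 24) - 1
MASK32 = (1 << 32) - 1


def mul_u32_u24(a, b):
    """Assumed model of a 24-bit multiply: low 24 bits in, low 32 bits out."""
    return ((a & MASK24) * (b & MASK24)) & MASK32


def mul_lo_u32(a, b):
    """Model of a full 32-bit multiply that keeps the low 32 bits."""
    return (a * b) & MASK32


# Operands that fit in 24 bits: both multiplies agree.
small_a, small_b = 1000, 4096
same = mul_u32_u24(small_a, small_b) == mul_lo_u32(small_a, small_b)

# An operand that needs bit 24: the 24-bit multiply drops it and the
# results diverge, which is why the "fits 24 bits" guarantee matters.
big_a, big_b = 1 << 24, 3
differs = mul_u32_u24(big_a, big_b) != mul_lo_u32(big_a, big_b)

print(same, differs)
```

Under this model the substitution is safe exactly when both operands are below 2^24, which is the condition the commit messages state for LDS addresses and for NGG wave and thread ids.
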

582 commits