fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-20 22:30:12 +01:00

Author	SHA1	Message	Date
Rhys Perry	48b7beb7b0	aco: add and use get_buffer_store_op() helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	936b70c8cf	aco: refactor visit_store_scratch() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	18817041f7	aco: refactor visit_store_global() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	c7bd69b3ae	aco: refactor visit_store_ssbo() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	f75c830433	aco: refactor store_vmem_mubuf() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	98b4cc7110	aco: refactor store_lds() to use new helpers It should also work correctly for 8/16-bit stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	562353e1f1	aco: add helpers for splitting stores split_store_data() splits a vector and p_as_uniforms it if needed. scan_write_mask()/advance_write_mask() are similar to u_bit_scan_consecutive_range(), but makes it easier to only clear part of the range and will also give ranges for zero'd bits. split_buffer_store() is a helper for splitting VMEM/SMEM stores. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	211a9f2057	aco: use emit_load helper for VMEM/SMEM loads Also implements 8/16-bit loads for scratch/global. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	57e6886f98	aco: refactor load_lds to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	542733dbbf	aco: add emit_load helper This helper is used for recombining split loads, passing the result to p_as_uniform, aligning the offset down and shifting it right if needed and handling large constant offsets. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	b77d638e1b	aco: add and use RegClass::get() helper Eventually, we'll probably want to replace the current RegClass(type, size) constructor with this. This has a functional change in that get_reg_class() now creates v1/v2 instead of v4b/v8b. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	69b92db131	aco: be more careful about using SMEM for load_global Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Timur Kristóf	62ff2ff808	aco: Move s_setprio to correct place after the gs_alloc_req. Previously the setprio was inside the branch, so it would only reset the priority on the first wave, but not the others. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	277f37d036	aco: Use 24-bit multiplication for NGG wave id and thread id. Both of them should always fit 24 bits anyway. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	eafc1e7365	aco: Use 24-bit multiplication in TCS I/O The TCS inputs and outputs must always fit into the LDS, which implies that their addresses also always fit 24 bits. On AMD GPUs, 24-bit multiplication is much faster than 32-bit multiplication, so we can take the opportunity to use that for TCS I/O instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	64332a0937	aco: Const correctness for aco_print_ir. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	0c0691d43e	aco: Const correctness for get_barrier_interaction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f321dc33c8	aco: Abort when RA can't find a register. Previously, it was just unreachable, which means it will generate invalid shaders when it encounters a situation when it can't allocate registers for eg. a large load. This commit makes it slightly easier to notice such problems without triggering a GPU hang. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f2e7aee244	aco: Increase barrier_count to 7 to include barrier_barrier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	25775d346c	aco: Only store TCS outputs to VMEM when they are read by TES. Totals from affected shaders (GFX10): Code Size: 10832 -> 10736 (-0.89 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Rhys Perry	ede1c171c5	aco: fix outdated label_vec from p_create_vector labelling Fixes random dEQP-VK.transform_feedback.fuzz.* crashes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `2dc550202e` ('aco: copy-propagate p_create_vector copies of vectors') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4730>	2020-04-24 12:21:15 +00:00
Rhys Perry	665250e830	aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `d1621834f3` ('aco: combine VALU and SALU into various VOP3 instructions') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2822 Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4717>	2020-04-24 08:55:19 +00:00
Rhys Perry	0d9fe0405f	aco: improve code for 32-bit isign No shader-db changes on Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	d1621834f3	aco: combine VALU and SALU into various VOP3 instructions shader-db (Navi): Totals from 2916 (2.28% of 127638) affected shaders: SGPRs: 184427 -> 184283 (-0.08%); split: -0.10%, +0.02% VGPRs: 143520 -> 143640 (+0.08%); split: -0.00%, +0.09% CodeSize: 14913548 -> 14913288 (-0.00%); split: -0.00%, +0.00% MaxWaves: 26034 -> 26012 (-0.08%) Instrs: 2935435 -> 2930960 (-0.15%); split: -0.15%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	607fb4153d	aco: move call to store_output_to_temps in store_ls_or_es_output earlier Skips get_intrinsic_io_basic_offset() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	b497b774a5	aco: remove copy in load_input_from_temps() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	2dc550202e	aco: copy-propagate p_create_vector copies of vectors Instead of copying the operands of the other p_create_vector and labelling the definition with label_vec, copy the operands and label it with label_temp so that it can be copy-propagated. This was found while removing a redundant copy in load_input_from_temps() which removed duplicate p_create_vector instructions. shader-db (Navi): Totals from 139 (0.11% of 127638) affected shaders: VGPRs: 8472 -> 7948 (-6.19%) CodeSize: 514592 -> 512368 (-0.43%) MaxWaves: 1089 -> 1195 (+9.73%) Instrs: 100214 -> 99658 (-0.55%) Cycles: 400856 -> 398632 (-0.55%) VMEM: 15545 -> 15338 (-1.33%) Copies: 5140 -> 4584 (-10.82%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>	2020-04-23 12:39:33 +00:00
Rhys Perry	e4383b5c7f	aco: decrease the uses of other copy operations after splitting/removing For copies like v[7:8] = v[8:9], what currently happens is: - do_copy() will skip the second dword - the uses of the second dword will be reduced to 0 - the copy operation will be removed from the map and v8 will never be set to v9. So just decrease the uses of other operations after splitting or removing the current operation, so: "v8 = v9" will be split off, it's uses reduced and then the new copy will be done in the next iteration. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>	2020-04-23 11:39:23 +00:00
Daniel Schürmann	36e0d2f39b	aco: coalesce v_mad's accumulator with definition's affinities Totals from affected shaders: Code Size: 8922676 -> 8915192 (-0.08 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d000d76f13	aco: use upper part of gap in register file if it is beneficial for striding Totals from affected shaders: SGPRS: 1717288 -> 1716984 (-0.02 %) VGPRS: 1305924 -> 1304904 (-0.08 %) Code Size: 138508892 -> 138420144 (-0.06 %) bytes Max Waves: 115726 -> 115735 (0.01 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d666d83be2	aco: try to always find a register with stride for even sizes Totals from affected shaders: SGPRS: 1162400 -> 1162400 (0.00 %) VGPRS: 947364 -> 946960 (-0.04 %) Code Size: 98399300 -> 98399004 (-0.00 %) bytes Max Waves: 74665 -> 74682 (0.02 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	5a3c1f4f0b	aco: stop get_reg_simple after reaching max_used_gpr Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	2796cb4c24	aco: refactor get_reg_simple() to return early on exact matches in the best fit algorithm Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	6792e134f3	aco: don't create vector affinities for operands which are not killed or are duplicates Totals from affected shaders: SGPRS: 825184 -> 825184 (0.00 %) VGPRS: 697640 -> 697240 (-0.06 %) Code Size: 79244104 -> 79201072 (-0.05 %) bytes Max Waves: 42388 -> 42386 (-0.00 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	edc2b57ac1	aco: allocate full register for subdword definitions if HW doesn't support it Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	97a870cf88	aco: move attempt to find strided register into get_reg_simple() This simplifies code and helps some shaders Totals from affected shaders: Code Size: 51227172 -> 51202216 (-0.05 %) bytes Max Waves: 19955 -> 19948 (-0.04 %) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	c7f97f110c	aco: use DefInfo in more places to simplify RA Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	734f86db6b	aco: create and use DefInfo struct in RA for maintaining all information necessary to find a register. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	5b2f628da3	aco: create pseudo dummy instruction in RA to be used for live-range splits Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	d9f7d1d5cb	aco: refactor get_reg() to also handle affinities This simplifies definition handling and helps a few shaders Totals from affected shaders: Code Size: 659540 -> 659376 (-0.02 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:23 +00:00
Daniel Schürmann	7c8f4ebca9	aco: refactor get_reg() to take Temp instead of RegClass This patch also moves get_reg_specified() and get_reg_vec() before get_reg() to make use of it later. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:22 +00:00
Daniel Schürmann	0a9ed98178	aco: simplify operand handling in RA Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4573>	2020-04-22 18:23:22 +00:00
Rhys Perry	f13049f48a	aco: implement 64-bit sgpr swaps In our pipeline-db, helps almost exclusively Detroit: Become Human. Totals from 6726 (5.36% of 125503) affected shaders: CodeSize: 74680952 -> 74102228 (-0.77%) Instrs: 14551507 -> 14406001 (-1.00%) Cycles: 1748272436 -> 1690173104 (-3.32%) VMEM: 964671 -> 964058 (-0.06%) Copies: 1993312 -> 1847806 (-7.30%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	2ab45f41e0	aco: implement sub-dword swaps Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	83fdb1ed3d	aco: add VOP3P_instruction The optimizer isn't yet updated to handle this, since lower_to_hw_instr will be the only user for now. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Rhys Perry	8fc24f9a45	aco: fix copy statistic for 64-bit vgpr constant copy The statistic is in units of instructions. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4469>	2020-04-22 13:25:17 +00:00
Daniel Schürmann	c3c1f4d6bc	aco: move src1 to vgpr instead of using VOP3 for VOP2 instructions during isel Is simpler and helps a couple of shaders. Totals from affected shaders: (Vega) Code Size: 16341296 -> 16335460 (-0.04 %) bytes Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>	2020-04-20 15:12:50 +00:00
Daniel Schürmann	be0bb7e101	aco: fix 64bit fsub Fixes: `425558bfd5` ('aco: use v_subrev_f32 for fsub with an sgpr operand in src1') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4642>	2020-04-20 15:12:50 +00:00
Daniel Schürmann	425558bfd5	aco: use v_subrev_f32 for fsub with an sgpr operand in src1 This fixes an accidentally introduced regression. Fixes: `9be4be515f` ('aco: implement 16-bit nir_op_fsub/nir_op_fadd') Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4633>	2020-04-19 16:16:27 +00:00
Samuel Pitoiset	c4ca9e66dd	aco: fix exporting the viewport index if the fragment shader needs it It's like the layer, it has to be exported via the pos and also as a varying if the fragment shader reads it. Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_* Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>	2020-04-17 16:23:24 +00:00


845 commits