fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 18:00:13 +01:00

Author	SHA1	Message	Date
James Park	93094b8c5e	aco: Remove nonstandard parentheses Remove parentheses in cases where a parenthesized type is followed by an initializer list. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7785>	2020-12-01 11:08:21 +00:00
Tony Wasserka	2bb8874320	aco: Fix -Wshadow warnings Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>	2020-11-20 09:29:19 +00:00
Rhys Perry	e54c111c45	aco: always use p_parallelcopy for pre-RA copies Most fossil-db changes are because literals are applied earlier (in label_instruction), so use counts are more accurate and more literals are applied. fossil-db (Navi): Totals from 79551 (57.89% of 137413) affected shaders: SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04% VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03% SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02% CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05% MaxWaves: 1078919 -> 1079281 (+0.03%); split: +0.04%, -0.01% Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01% fossil-db (Polaris): Totals from 98463 (70.90% of 138881) affected shaders: SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01% VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03% SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04% CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03% MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01% Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Samuel Pitoiset	408195ec53	aco: remove useless occurences of radv_nir_compiler_options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Rhys Perry	ec2185c598	aco: keep track of temporaries' regclasses in the Program A future change will switch the liveness sets to bit vectors, which don't contain regclass information. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6733>	2020-09-21 13:47:28 +00:00
Rhys Perry	e8ac14527a	aco: keep loop live-through variables spilled fossil-db (Navi): Totals from 3149 (2.32% of 135946) affected shaders: VGPRs: 280928 -> 280932 (+0.00%) SpillSGPRs: 51133 -> 30042 (-41.25%) CodeSize: 43063076 -> 41377252 (-3.91%); split: -3.92%, +0.00% Instrs: 8278435 -> 8037133 (-2.91%); split: -2.92%, +0.00% Cycles: 709575456 -> 683366172 (-3.69%); split: -3.69%, +0.00% VMEM: 542887 -> 542937 (+0.01%); split: +0.05%, -0.04% SMEM: 210255 -> 206368 (-1.85%); split: +0.12%, -1.97% SClause: 258847 -> 258019 (-0.32%); split: -0.52%, +0.20% Copies: 731836 -> 684784 (-6.43%); split: -6.44%, +0.01% Branches: 305422 -> 292844 (-4.12%); split: -4.12%, +0.00% PreSGPRs: 333103 -> 332701 (-0.12%) PreVGPRs: 280086 -> 280089 (+0.00%) Helps mostly Detroit: Become Human and the single spilling Doom Eternal shader. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	75d6c30572	aco: fix spills_entry heuristic for branch blocks in init_live_in_vars() fossil-db (Navi): Totals from 222 (0.16% of 135946) affected shaders: SpillSGPRs: 9121 -> 9117 (-0.04%) SpillVGPRs: 2820 -> 1821 (-35.43%) CodeSize: 5134264 -> 5053336 (-1.58%); split: -1.63%, +0.05% Instrs: 953435 -> 938761 (-1.54%); split: -1.59%, +0.05% Cycles: 100567688 -> 97252432 (-3.30%); split: -3.34%, +0.04% VMEM: 40752 -> 39219 (-3.76%); split: +0.04%, -3.80% SMEM: 15416 -> 15509 (+0.60%); split: +0.64%, -0.03% VClause: 20120 -> 19091 (-5.11%) SClause: 23540 -> 23544 (+0.02%); split: -0.11%, +0.12% Copies: 125912 -> 122017 (-3.09%); split: -3.36%, +0.26% Branches: 31131 -> 30009 (-3.60%) Mostly affects parallel-rdp ubershaders. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	1a5444b900	aco: don't consider the first partial spill if it's the wrong type Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	8f6a900d5e	aco: consider branch definitions in spiller Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: 20.2 <mesa-stable> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>	2020-08-26 13:26:58 +00:00
Rhys Perry	d1f992f3c2	aco: rework barriers and replace can_reorder fossil-db (Navi): Totals from 273 (0.21% of 132058) affected shaders: CodeSize: 937472 -> 936556 (-0.10%) Instrs: 158874 -> 158648 (-0.14%) Cycles: 13563516 -> 13562612 (-0.01%) VMEM: 85246 -> 85244 (-0.00%) SMEM: 21407 -> 21310 (-0.45%); split: +0.05%, -0.50% VClause: 9321 -> 9317 (-0.04%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>	2020-07-28 16:56:34 +00:00
Rhys Perry	d169f09e37	aco: be more careful combining additions that could wrap into loads/stores SMEM does the addition with 64-bits, not 32. So if the original code relied on wrapping around (for example, for subtraction), it would break. Apparently swizzled MUBUF accesses also have issues with combining additions that could overflow. Normal MUBUF accesses seem fine. fossil-db (Navi): Totals from 27219 (20.02% of 135946) affected shaders: CodeSize: 128303256 -> 131062756 (+2.15%); split: -0.00%, +2.15% Instrs: 24818911 -> 25280558 (+1.86%); split: -0.01%, +1.87% VMEM: 162311926 -> 177226874 (+9.19%); split: +9.36%, -0.17% SMEM: 18182559 -> 20218734 (+11.20%); split: +11.53%, -0.34% VClause: 423635 -> 424398 (+0.18%); split: -0.02%, +0.20% SClause: 865384 -> 1104986 (+27.69%); split: -0.00%, +27.69% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2748 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2720>	2020-07-21 18:25:35 +00:00
Rhys Perry	b85ef04324	aco: add add_interference() helper This won't add interferences between spill ids of different types and will exit early if there's already an interference. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>	2020-07-16 16:22:57 +00:00
Rhys Perry	2c7554fe01	aco: use unordered_set for spill id interferences Seems to be faster. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>	2020-07-16 16:22:57 +00:00
Rhys Perry	47d7e1e662	aco: rewrite graph coloring in spiller I don't think this is much of an optimization in the typical case, but for very complex shaders this should work much better. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>	2020-07-16 16:22:57 +00:00
Rhys Perry	5a941f4d6d	aco: fix underestimated pressure in spiller when a phi has a killed def Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5805>	2020-07-16 16:22:57 +00:00
Samuel Pitoiset	ca51f75f9d	aco: fix more validation errors from vgpr spill/restore code It looks like the attempt to fix this in `1e791e51a6` was incomplete. This fixes crashes with Devil May Cry 5 with a debug build. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5828>	2020-07-10 08:28:33 +02:00
Rhys Perry	1e791e51a6	aco: fix validation error from vgpr spill/restore code Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5504>	2020-06-17 10:57:17 +00:00
Daniel Schürmann	2ae27b96ef	aco: change live_out variables to std::unordered_set Improves performance of live_var_analysis for larger shaders Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4130>	2020-04-09 15:08:57 +00:00
Rhys Perry	c51348bd9b	aco: move some register demand helpers into aco_live_var_analysis.cpp Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>	2020-03-16 16:09:02 +00:00
Albert Astals Cid	2521c81c9e	aco: Minor optimization in spill_ctx constructor 'register_demand' is passed by value and only copied once; consider moving it to avoid unnecessary copies Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3968> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3968>	2020-03-02 12:21:03 +00:00
Daniel Schürmann	71440ba0f5	aco: reorder VMEM operands in ACO IR For all VMEM instructions, the resource constant is now in operands[0]. For MIMG instructions, the sampler shares operands[1] with write data in case this instruction writes memory. Moving the VADDR to be the last operand for MIMG is the first step to support Navi NSA encoding. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Rhys Perry	517fc3abc4	aco: fill reg_demand with sensible information in add_coupling_code() process_block() will use this to determine the register demand of the before the current instruction. Previously, it was filled with zeroes which could result in process_block() only using the register demand of after the current instruction. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	26d2511bcb	aco: improve assertion at the end of spiller Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	4537b97410	aco: don't update demand in add_coupling_code() for loop headers We don't need to update it since it won't be used later. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	521525fc0a	aco: don't consider loop header blocks branch blocks in add_coupling_code Loops without continues create header blocks with only 1 predecessor. CC: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Rhys Perry	590c26beab	aco: fix target calculation when vgpr spilling introduces sgpr spilling A shader might require vgpr spilling but not require sgpr spilling. In that case, the spiller lowers the sgpr target by 5 which could mean sgpr spilling is then required. Then the vgpr target has to be lowered to make space for the linear vgprs. Previously, space wasn't made for the linear vgprs. Found while testing the spiller on the pipeline-db with a lowered limit Fixes: `a7ff1bb5b9` ('aco: simplify calculation of target register pressure when spilling') Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>	2020-01-29 18:02:27 +00:00
Samuel Pitoiset	13b4e9adcf	ac: declare an enum for the OOB select field on GFX10 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>	2019-12-19 15:15:32 +01:00
Timur Kristóf	07754a9c9e	aco/wave32: Replace hardcoded numbers in spiller with wave size. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-12-04 10:36:01 +00:00
Daniel Schürmann	746b9380bd	aco: rematerialize s_movk instructions Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-12 15:59:48 +00:00
Daniel Schürmann	a2a6880743	aco: fix invalid access on Pseudo_instructions Fixes: `93c8ebfa78` aco: Initial commit of independent AMD compiler Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-12 15:59:48 +00:00
Daniel Schürmann	5c7dcb15e0	aco: only use single-dword loads/stores for spilling Fixes: `8678699918` "aco: implement VGPR spilling" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-04 20:14:14 +01:00
Daniel Schürmann	d97c0bdd55	aco: fix immediate offset for spills if scratch is used Fixes: `8678699918` "aco: implement VGPR spilling" Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-04 20:14:14 +01:00
Daniel Schürmann	8678699918	aco: implement VGPR spilling VGPR spilling is implemented via MUBUF instructions and scratch memory. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	b0de16b7de	aco: omit linear VGPRs as spill variables Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	aded548e66	aco: ensure that spilled VGPR reloads are done after p_logical_start Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	a7ff1bb5b9	aco: simplify calculation of target register pressure when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Rhys Perry	e73de4e1d8	aco: fix new_demand calculation for first instructions Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	93b42a1907	aco: don't add interferences between spilled phi operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	fdf8ad0256	aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	d48d72e98a	aco: don't insert the exec mask into set of live-out variables when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	cd20e29de1	aco: fix transitive affinities of spilled variables Variables spilled on both branch legs need to be assigned to the same spilling slot. These affinities can be transitive through multiple merge blocks. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	0b8216b2cd	aco: Lower to CSSA Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes. Previously, it was possible that phi operands have intersecting live-ranges, and thus, couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to spill phis, even if it was beneficial. This patch implements a conversion pass which is currently only called if spilling is necessary. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:32 +00:00
Rhys Perry	fc04a2fc31	aco: take LDS into account when calculating num_waves pipeline-db (Vega): SGPRS: 344 -> 344 (0.00 %) VGPRS: 424 -> 524 (23.58 %) Spilled SGPRs: 84 -> 80 (-4.76 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 52812 -> 52484 (-0.62 %) bytes LDS: 135 -> 135 (0.00 %) blocks Max Waves: 56 -> 53 (-5.36 %) v2: consider WGP, rework to be clearer and apply the "maximum 16 workgroups per CU" limit properly v2: use "SIMD" instead of "EU" v2: fix spiller by introducing "Program::max_waves" v2: rename "lds_size" to "lds_limit" v3: make max_waves actually independant of register usage v3: fix issue where max_waves was way too high v3: use DIV_ROUND_UP(a, b) instead of max(a / b, 1) v3: rename "workgroups_per_cu" to "workgroups_per_cu_wgp" v4: fix typo from "workgroups_per_cu" rename Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> (v3)	2019-10-23 19:11:21 +01:00
Rhys Perry	08d510010b	aco: increase accuracy of SGPR limits SGPRs are allocated in groups of 16 on GFX8/GFX9. GFX10 allocates a fixed number of SGPRs and has 106 addressable SGPRs. pipeline-db (Vega): SGPRS: 5912 -> 6232 (5.41 %) VGPRS: 1772 -> 1780 (0.45 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 88228 -> 87904 (-0.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 559 -> 571 (2.15 %) pipeline-db (Navi): SGPRS: 341256 -> 363384 (6.48 %) VGPRS: 171536 -> 170960 (-0.34 %) Spilled SGPRs: 832 -> 581 (-30.17 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 14207332 -> 14190872 (-0.12 %) bytes LDS: 33 -> 33 (0.00 %) blocks Max Waves: 18072 -> 18251 (0.99 %) v2: unconditionally count vcc as an extra sgpr on GFX10+ v3: pass SGPRs rounded to 8 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-23 19:11:21 +01:00
Daniel Schürmann	93c8ebfa78	aco: Initial commit of independent AMD compiler ACO (short for AMD Compiler) is a new compiler backend with the goal to replace LLVM for Radeon hardware for the RADV driver. ACO currently supports only VS, PS and CS on VI and Vega. There are some optimizations missing because of unmerged NIR changes which may decrease performance. Full commit history can be found at https://github.com/daniel-schuermann/mesa/commits/backend Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com> Co-authored-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00

95 commits