fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-24 19:40:10 +01:00

Author	SHA1	Message	Date
Rhys Perry	502a073552	aco: fix NSA following writelane No fossil-db changes on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c353895c92` ("aco: use non-sequential addressing") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>	2021-03-17 12:31:05 +00:00
Rhys Perry	298d400e5c	aco/tests: add test for NSAToVMEMBug Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>	2021-03-17 12:31:05 +00:00
Rhys Perry	194f3e4c69	aco: fix NSA MIMG followed by MUBUF/MTBUF No fossil-db changes on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c353895c92` ("aco: use non-sequential addressing") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>	2021-03-17 12:31:05 +00:00
Timur Kristóf	8205cce007	aco: Use ASSERTED to avoid unused variable warning. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9632>	2021-03-16 21:46:52 +00:00
Michel Dänzer	d411691965	aco/tests: Use _exit in child process Since the child process doesn't call exec(), exit() attempted to run atexit handlers registered by the parent process. This could result in the child process hanging in exit() if there were still disk cache threads alive when the parent process called fork(). (The CI runners hit this multiple times when running tests in strace) Fixes: `6a246f5c6d` "aco/tests: Fix deadlock for too large test lists" Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9472>	2021-03-16 15:32:33 +00:00
Rhys Perry	38b2e13766	aco: remove vmem/smem score statistics Replaced by the Latency statistic. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	a0243f5c47	aco: add ACO_DEBUG=perfinfo This prints the program with each instruction's contribution to its latency and various factors for the calculation of the Inverse Throughput statistic. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	5d6a1095bf	aco: add print option to print program without temporary IDs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	23ecceb160	aco: add latency and inverse throughput statistics Latency is the estimated duration of a single wave, ignoring others in the CU. It is similar to the old cycles statistic except it's more accurate and considers memory operations. The InvThroughput statistic is a combination of MaxWaves, Latency and the portion of the wave's execution which does not use various resources. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	83ce9407f2	aco: add instruction classes These should mostly match LLVM. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	0af7ff49fd	aco: lower p_constaddr into separate instructions earlier This allows them to be scheduled properly and simplifies the assembler a little. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	ab957bb899	aco: move wait_imm to aco_ir.h Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:34 +00:00
Rhys Perry	7d5643c0fe	aco: track divergent and uniform branch depth Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:30 +00:00
Rhys Perry	8f71be0a7b	aco: simplify loop_nest_depth tracking in isel Keep track of the current loop depth in Program and set the depth inside Program::insert_block() instead of repeating it every time we insert one. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:24 +00:00
Rhys Perry	341dd9d834	aco: set compr for fp16 exports Obviously this didn't affect correctness. Not sure about performance. It also changes enabled_channels to match radeonsi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `f29c81f863` ("aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9459>	2021-03-11 13:54:18 +00:00
Tony Wasserka	97c97781f6	aco: Fix vector::reserve() being called with the wrong size The container is moved from before and hence returns size 0. To get the correct value, the new instruction container must be used instead. This was flagged by clang-tidy. The fixed call still triggers the corresponding diagnostic, hence this change silences it by adding a redundant clear() after move. Fixes: `7f1b537304` ("aco: add new NOP insertion pass for GFX6-9") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>	2021-03-08 10:44:20 +01:00
Rhys Perry	7c7e8942f8	radv,aco: remove aco_compiler_statistics This removes a pointer from radv_shader_binary_legacy::data. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Rhys Perry	3a72044ece	aco: add missing usable_read2 check A Hitman 2 shader does: read64(local_invocation_index() * 4 - 4). This was likely emitting a ds_read2_b32 on GFX6. For local_invocation_index()=0, because the first dword was out-of-bounds, the second was likely also considered out-of-bounds (even though it's not, at offset 0). Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3882 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `57e6886f98` ("aco: refactor load_lds to use new helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rhys Perry	941739619e	Revert "radv,aco: allow unaligned LDS access on GFX9+" This reverts commit `1a0b0e8460`. The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128 make this feature very difficult to use safely. This fixes a blocking artifact in Hitman 2. Previously, it contained: ds_read_b64(local_invocation_index() * 4 - 4) For local_invocation_index()=0, the second dword would be considered out-of-bounds, even though it's at offset 0. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rob Clark	a9618e7c42	util: Add accessor for util_cpu_caps In release builds, there should be no change, but in debug builds the assert will help us catch undefined behavior resulting from using util_cpu_caps before it is initialized. With fix for u_half_test for MSVC from Jesse Natalie squashed in. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>	2021-02-26 18:31:19 +00:00
Rhys Perry	c3af0c2079	aco: use p_as_uniform for get_sampler_desc and convert_pointer_to_64_bit Since value-numbering no longer works across loops, we no longer need to use v_readfirstlane_b32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Rhys Perry	5f1b354472	aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM We should avoid a situation where a v_readfirstlane_b32 is in WQM but its source is calculated in Exact. Fixes hang when running Assassin's Creed: Valhalla benchmark. fossil-db (GFX10.3): Totals from 1021 (0.70% of 146267) affected shaders: CodeSize: 7835228 -> 7842992 (+0.10%); split: -0.00%, +0.10% Instrs: 1519208 -> 1521149 (+0.13%); split: -0.00%, +0.13% SClause: 78921 -> 78920 (-0.00%) Copies: 44456 -> 45421 (+2.17%); split: -0.05%, +2.22% Branches: 12987 -> 13933 (+7.28%) PreSGPRs: 47599 -> 47813 (+0.45%) Cycles: 10037540 -> 10045304 (+0.08%); split: -0.00%, +0.08% VMEM: 538381 -> 538777 (+0.07%); split: +0.11%, -0.03% SMEM: 84553 -> 84554 (+0.00%); split: +0.01%, -0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9288>	2021-02-26 13:33:56 +00:00
Daniel Schürmann	690ac7409a	aco/value_numbering: use can_eliminate() function to avoid unnecessary hashmap lookups No fossil-db changes. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>	2021-02-25 11:35:42 +01:00
Daniel Schürmann	fbf791e70c	aco: value number VOPC instructions with different exec masks This becomes possible as long as we do val = s_and_b32/64 exec, val before any subgroup operations. This precautional instruction can be removed by the optimizer if 'val' was computed by a VOPC instruction using the same exec mask. Totals from 59 (0.04% of 146267) affected shaders (Navi10): VGPRs: 2808 -> 2816 (+0.28%) CodeSize: 340888 -> 340852 (-0.01%); split: -0.20%, +0.19% Instrs: 61733 -> 61625 (-0.17%); split: -0.18%, +0.01% Cycles: 470636 -> 469112 (-0.32%); split: -0.33%, +0.01% VMEM: 8091 -> 7993 (-1.21%) SMEM: 2736 -> 2719 (-0.62%); split: +0.29%, -0.91% VClause: 1745 -> 1741 (-0.23%) SClause: 2394 -> 2392 (-0.08%); split: -0.25%, +0.17% Copies: 3249 -> 3253 (+0.12%); split: -0.62%, +0.74% Branches: 1210 -> 1206 (-0.33%) PreSGPRs: 3126 -> 3176 (+1.60%); split: -0.16%, +1.76% Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9195>	2021-02-25 11:35:42 +01:00
Daniel Schürmann	ffebe48013	aco: don't rematerialize exec Since exec is not considered a temporary anymore, we accidentally allowed to rematerialize it. Fixes: `a56ddca4e8` ('aco: make all exec accesses non-temporaries') Closes: #4327 Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9215>	2021-02-23 14:41:31 +00:00
Rhys Perry	75c9adf039	aco/lower_phis: fix all_preds_uniform with continue_or_break Found in a Death Stranding shader with loop unrolling disabled. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `9a089baff1` ("aco: optimize boolean phis with uniform selections") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9193>	2021-02-23 10:46:56 +00:00
Vinson Lee	7cc83f237e	aco: Initialize ds_state.front.writeMask. Fix defect reported by Coverity Scan. Uninitialized scalar variable (UNINIT) uninit_use: Using uninitialized value ds_state.front. Field ds_state.front.writeMask is uninitialized. Fixes: `d488d0fd7b` ("aco: add framework for testing isel and integration tests") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9033>	2021-02-21 19:33:00 -08:00
Timur Kristóf	a6e1178f91	aco: Disallow LSHS temp-only I/O when VS output is written indirectly. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9100>	2021-02-18 12:10:56 +00:00
Timur Kristóf	48f349971f	aco: Fix LDS statistics of tess control shaders. The calculate_tess_lds_size function already returns the size in blocks of the encoding granule, but we forgot to adjust config->lds_size. This variable is not used to actually set the LDS size used for TCS, but by ACO to make scheduling decisions. Fossil DB stats on Sienna Cichlid: Please note that the +3729.43% is NOT a regression. The real LDS size used didn't change, it was just reported incorrectly. Totals from 1342 (0.96% of 139391) affected shaders: VGPRs: 60880 -> 80240 (+31.80%); split: -0.05%, +31.85% CodeSize: 3378456 -> 3381224 (+0.08%); split: -0.23%, +0.31% LDS: 687104 -> 26312192 (+3729.43%) MaxWaves: 29794 -> 23962 (-19.57%) Instrs: 644194 -> 644610 (+0.06%); split: -0.32%, +0.39% Cycles: 2675068 -> 2676804 (+0.06%); split: -0.31%, +0.38% VMEM: 428840 -> 517418 (+20.66%); split: +22.53%, -1.88% SMEM: 91831 -> 88587 (-3.53%); split: +5.70%, -9.23% VClause: 22740 -> 19384 (-14.76%); split: -16.18%, +1.42% SClause: 19116 -> 18373 (-3.89%); split: -4.34%, +0.46% Copies: 66662 -> 63448 (-4.82%); split: -5.55%, +0.73% Fixes: `cf89bdb9ba` "radv: align the LDS size in calculate_tess_lds_size()" Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9098>	2021-02-18 11:57:22 +00:00
Daniel Schürmann	29b866fef6	aco: remove special handling of load_helper_invocation These should now behave the same as is_helper_invocation. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9058>	2021-02-17 21:53:52 +00:00
Rhys Perry	1a0b0e8460	radv,aco: allow unaligned LDS access on GFX9+ fossil-db (GFX10.3): Totals from 223 (0.16% of 139391) affected shaders: SGPRs: 10032 -> 10096 (+0.64%) VGPRs: 7480 -> 7592 (+1.50%) CodeSize: 853960 -> 821920 (-3.75%); split: -3.76%, +0.01% MaxWaves: 5916 -> 5908 (-0.14%) Instrs: 154935 -> 150281 (-3.00%); split: -3.01%, +0.01% Cycles: 3202496 -> 3080680 (-3.80%); split: -3.81%, +0.00% VMEM: 48187 -> 46671 (-3.15%); split: +0.29%, -3.44% SMEM: 13869 -> 13850 (-0.14%); split: +1.52%, -1.66% VClause: 3110 -> 3085 (-0.80%); split: -1.03%, +0.23% SClause: 4376 -> 4381 (+0.11%) Copies: 12132 -> 12065 (-0.55%); split: -2.61%, +2.06% Branches: 5204 -> 5203 (-0.02%) PreVGPRs: 6304 -> 6359 (+0.87%); split: -0.10%, +0.97% See https://reviews.llvm.org/D82788 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8762>	2021-02-17 12:57:12 +00:00
Daniel Schürmann	fc6b5be666	aco: fix assertion in insert_exec_mask pass Fixes: `a56ddca4e8` ('aco: make all exec accesses non-temporaries ') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>	2021-02-15 19:50:16 +00:00
Rhys Perry	ddce1ec5f5	aco: fix transition_to_{WQM,Exact} if exec.back() is not in exec This can happen at merge blocks. fossil-db (GFX10.3): Totals from 25229 (17.25% of 146267) affected shaders: CodeSize: 58575920 -> 58571376 (-0.01%); split: -0.01%, +0.00% Instrs: 10979245 -> 10978109 (-0.01%); split: -0.01%, +0.00% SClause: 591817 -> 591816 (-0.00%) Copies: 604987 -> 603851 (-0.19%); split: -0.19%, +0.00% Cycles: 96088796 -> 96084252 (-0.00%); split: -0.00%, +0.00% VMEM: 10470372 -> 10470368 (-0.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `a56ddca4e8` ("aco: make all exec accesses non-temporaries") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4299 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9047>	2021-02-15 19:50:16 +00:00
Rhys Perry	3d4c13f3b8	aco: add DeviceInfo Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:44:22 +00:00
Rhys Perry	b759557cac	aco: consider that GFX10.3 allocates LDS in 1024 byte blocks fossil-db (GFX10.3): Totals from 3 (0.00% of 139391) affected shaders: VMEM: 513 -> 511 (-0.39%) SMEM: 94 -> 92 (-2.13%) VClause: 31 -> 30 (-3.23%) fossil-db (GFX10.3, wave32): Totals from 4 (0.00% of 139391) affected shaders: VClause: 82 -> 81 (-1.22%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:35:38 +00:00
Rhys Perry	7ff805a19d	radv,aco: add radv_nir_compiler_options::wgp_mode Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:35:36 +00:00
Rhys Perry	f520f4c299	aco: add Program::wgp_mode Instead of assuming WGP mode on GFX10+ in different places, add a member to Program that can be used instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:35:14 +00:00
Rhys Perry	592d64611c	aco: fix waves calculation for wave32 fossil-db (GFX10.3, wave32): Totals from 176 (0.13% of 139391) affected shaders: SGPRs: 16648 -> 16640 (-0.05%) VGPRs: 18920 -> 19076 (+0.82%); split: -0.30%, +1.12% CodeSize: 2354172 -> 2354288 (+0.00%); split: -0.01%, +0.01% MaxWaves: 1618 -> 1627 (+0.56%); split: +0.68%, -0.12% Instrs: 435756 -> 435761 (+0.00%); split: -0.02%, +0.02% Cycles: 8858360 -> 8869960 (+0.13%); split: -0.01%, +0.14% VMEM: 55899 -> 57220 (+2.36%); split: +2.53%, -0.17% SMEM: 10323 -> 10374 (+0.49%); split: +0.73%, -0.23% VClause: 8307 -> 8290 (-0.20%); split: -0.24%, +0.04% SClause: 16573 -> 16577 (+0.02%); split: -0.01%, +0.03% Copies: 24641 -> 24652 (+0.04%); split: -0.24%, +0.28% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:35:13 +00:00
Daniel Schürmann	8b793f9567	aco: remove dead code for the handling of exec temporaries Totals from 26026 (18.67% of 139391) affected shaders (Navi10): PreSGPRs: 370993 -> 326177 (-12.08%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>	2021-02-12 22:41:31 +00:00
Daniel Schürmann	a56ddca4e8	aco: make all exec accesses non-temporaries So that they are not counted into the register demand. Totals from 107336 (77.00% of 139391) affected shaders (Navi10): VGPRs: 4023452 -> 4023248 (-0.01%); split: -0.01%, +0.01% SpillSGPRs: 14088 -> 12571 (-10.77%); split: -11.03%, +0.26% CodeSize: 266816164 -> 266765528 (-0.02%); split: -0.04%, +0.02% MaxWaves: 1553339 -> 1553374 (+0.00%); split: +0.00%, -0.00% Instrs: 50977701 -> 50973093 (-0.01%); split: -0.02%, +0.01% Cycles: 1733911128 -> 1733605320 (-0.02%); split: -0.05%, +0.03% VMEM: 40867650 -> 40900204 (+0.08%); split: +0.13%, -0.05% SMEM: 6835980 -> 6829073 (-0.10%); split: +0.10%, -0.20% VClause: 1032783 -> 1032788 (+0.00%); split: -0.01%, +0.01% SClause: 2103705 -> 2104115 (+0.02%); split: -0.09%, +0.11% Copies: 3195658 -> 3193656 (-0.06%); split: -0.30%, +0.24% Branches: 1140213 -> 1140120 (-0.01%); split: -0.05%, +0.04% PreSGPRs: 3603785 -> 3437064 (-4.63%); split: -5.13%, +0.50% PreVGPRs: 3321996 -> 3321850 (-0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>	2021-02-12 22:41:31 +00:00
Daniel Schürmann	5d7b3bf1a7	aco: handle non-temp phi definitions and operands This will be necessary as we make exec non-temp. No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>	2021-02-12 22:41:31 +00:00
Daniel Schürmann	e663a15098	aco: don't create unnecessary exec phi on merge blocks No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8870>	2021-02-12 22:41:31 +00:00
Daniel Schürmann	44a76ba16d	aco: use VCC as regular SGPR pair on GFX10 There is no need to reserve it for special purposes, only. Totals from 139391 (100.00% of 139391) affected shaders (Navi10): VGPRs: 4738296 -> 4738156 (-0.00%); split: -0.01%, +0.00% SpillSGPRs: 16188 -> 14968 (-7.54%); split: -7.60%, +0.06% CodeSize: 294204472 -> 294118048 (-0.03%); split: -0.04%, +0.01% MaxWaves: 2119584 -> 2119619 (+0.00%); split: +0.00%, -0.00% Instrs: 56075079 -> 56056235 (-0.03%); split: -0.05%, +0.01% Cycles: 1757781564 -> 1755354032 (-0.14%); split: -0.16%, +0.02% VMEM: 52995887 -> 52996319 (+0.00%); split: +0.07%, -0.07% SMEM: 9005338 -> 9004858 (-0.01%); split: +0.16%, -0.17% VClause: 1178436 -> 1178331 (-0.01%); split: -0.02%, +0.01% SClause: 2403649 -> 2404542 (+0.04%); split: -0.14%, +0.18% Copies: 3447073 -> 3432417 (-0.43%); split: -0.66%, +0.23% Branches: 1166542 -> 1166422 (-0.01%); split: -0.11%, +0.10% PreSGPRs: 4229322 -> 4235538 (+0.15%) PreVGPRs: 3817111 -> 3817040 (-0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	112f389261	aco: don't abort() if disassembly fails We used that to catch assembly errors in the past, but now, there are too many hardware features we use in ACO that are not supported by the LLVM disassembler, that it is not really suited anymore as a debugging tool. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	171fbe3ae1	aco: check get_reg_specified() on register hints This ensures that max_used_sgpr is adjusted accordingly. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	dd16e21e97	aco: also consider VCC in get_reg_specified() This allows split_vector and others to keep their VCC position. Totals from 4573 (3.28% of 139391) affected shaders (Navi10): CodeSize: 54292268 -> 54289324 (-0.01%); split: -0.03%, +0.03% Instrs: 10327645 -> 10326941 (-0.01%); split: -0.04%, +0.04% Cycles: 744410748 -> 744034732 (-0.05%); split: -0.07%, +0.02% VMEM: 749093 -> 749092 (-0.00%); split: +0.00%, -0.00% SMEM: 269306 -> 269322 (+0.01%) SClause: 358746 -> 358744 (-0.00%) Copies: 826051 -> 823910 (-0.26%); split: -0.55%, +0.29% Branches: 355074 -> 356493 (+0.40%); split: -0.01%, +0.41% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	947bf0bd67	aco: don't decrease the vgpr_limit when encountering bpermute Instead we recalculate vgpr_limit on demand, depending on the number of needed shared VGPRs. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	b98a4d4dd7	aco: refactor GPR limit calculation This patch delays the calculation of GPR limits in order to precisely incorporate extra registers (VCC etc.) and shared VGPRs. Additionally, the allocation granularity is used to set the config. This has some effect on the reported SGPR stats. Totals (Navi10): SGPRs: 6971787 -> 17753642 (+154.65%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	eaf681724e	aco: change gpr_alloc_granule to full alignment This also switches the alloc_granule of Tonga and Iceland to 96, so that the calculation is consistent. Also changes the granularity for RDNA to 16 to keep better stats with the upcoming patch. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
Daniel Schürmann	bacc3b36f5	aco: fix shared VGPR allocation on RDNA2 VGPRs are now allocated in blocks of 8 normal or 16 shared VGPRs, respectively. Fixes: `14a5021aff` ('aco/gfx10: Refactor of GFX10 wave64 bpermute.') Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8921>	2021-02-12 19:00:18 +00:00
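
Commit 97c97781f6 above describes calling vector::reserve() with the size of a container that had already been moved from, which yields 0. The pattern can be sketched as follows. This is a minimal illustration, not ACO's code: `Block`, `rebuild`, and the use of `int` in place of real instructions are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block whose instruction list a pass rebuilds.
struct Block {
    std::vector<int> instructions;
};

// Rebuilds block.instructions and returns the capacity reserved up front.
std::size_t rebuild(Block& block)
{
    // Take ownership of the old list; block.instructions is now moved-from.
    std::vector<int> old_instructions = std::move(block.instructions);

    // Redundant on common implementations, but it makes the empty state
    // explicit, which is what keeps use-after-move diagnostics quiet.
    block.instructions.clear();
    assert(block.instructions.empty());

    // Buggy variant: reserve(block.instructions.size()) would reserve 0 here.
    // Fixed variant: size the new list from the container holding the elements.
    block.instructions.reserve(old_instructions.size());
    const std::size_t reserved = block.instructions.capacity();

    // Copy the old instructions back; a real pass would insert or drop some.
    for (int instr : old_instructions)
        block.instructions.push_back(instr);

    return reserved;
}
```

Calling `rebuild` on a block holding four elements reserves room for at least four before any element is pushed, so the loop does not reallocate.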

1510 commits