fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-28 09:58:22 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	989e9867a6	aco: fix additional register requirements for spilling It could happen that VGPR spilling without SGPR spilling calculated a negative spills_to_vgpr number and then increasing the VGPR target demand above the limit. Cc: mesa-stable Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10756>	2021-05-12 14:13:24 +00:00
Timur Kristóf	bb127c2130	radv: Use new NIR lowering of NGG GS when ACO is used. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	9732881729	radv: Use new NGG NIR lowering for VS/TES when ACO is used. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	89a76ff786	aco: Implement new NGG specific NIR intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	75a002f809	aco: Split ngg_emit_sendmsg_gs_alloc_req from the wave0 check. This allows us to emit the gs_alloc_req independently of the wave ID check, which is what the NIR lowering will need. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	ad8dd39bd3	aco: Fixup the NIR metadata after sanitize_cf_list. sanitize_cf_list can in fact invalidate the dominance metadata, which we need to use eg. nir_unsigned_upper_bound. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	00fd087f0a	aco: Allow workgroup barrier and shared scope for NGG shaders. NGG already needs to use workgroup barriers, but this commit allows them to come from NIR as opposed to just emitting it in ACO instruction selection. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Rhys Perry	a54f111831	radv,aco: compact vertex buffer descriptors It seems common for there to be holes. fossil-db (GFX10.3, robustBufferAccess enabled): Totals from 33791 (23.10% of 146267) affected shaders: (no statistics changed) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>	2021-05-10 12:09:14 +00:00
Rhys Perry	20a0744e22	Revert "radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2" This reverts commit `a8a6b9fb2f`. This is no longer necessary now that we fixup the size when creating the descriptors. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>	2021-05-10 12:09:14 +00:00
Rhys Perry	157c6b0f33	radv,aco: use per-attribute vertex descriptors for robustness We have to use a different num_records for each attribute to correctly implement robust buffer access. fossil-db (GFX10.3, robustBufferAccess enabled): Totals from 60059 (41.06% of 146267) affected shaders: VGPRs: 2169040 -> 2169024 (-0.00%); split: -0.02%, +0.02% CodeSize: 79473128 -> 81156016 (+2.12%); split: -0.00%, +2.12% MaxWaves: 1635360 -> 1635258 (-0.01%); split: +0.00%, -0.01% Instrs: 15559040 -> 15793205 (+1.51%); split: -0.01%, +1.52% Latency: 90954792 -> 91308768 (+0.39%); split: -0.30%, +0.69% InvThroughput: 14937873 -> 14958761 (+0.14%); split: -0.04%, +0.18% VClause: 444280 -> 412074 (-7.25%); split: -9.22%, +1.97% SClause: 588545 -> 644141 (+9.45%); split: -0.54%, +9.99% Copies: 1010395 -> 1011232 (+0.08%); split: -0.44%, +0.53% Branches: 274279 -> 274282 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 1431171 -> 1405056 (-1.82%); split: -2.89%, +1.07% PreVGPRs: 1575253 -> 1575259 (+0.00%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>	2021-05-10 12:09:14 +00:00
Rhys Perry	dfa38fa0c7	aco: group loads from the same vertex binding into the same clause In the future, we might have vertex attribute loads from the same binding but with different descriptors. Since they will be loading from the same buffer, we should continue grouping them into clauses. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7871>	2021-05-10 12:09:14 +00:00
Tony Wasserka	741e84f554	aco/spill: Fix improper handling of exec phis The "continue" was placed in the wrong loop, leading to exec being counted as a spilled register when it wasn't. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `a56ddca4e8` ('aco: make all exec accesses non-temporaries') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4533 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10486>	2021-05-03 10:31:07 +00:00
Rhys Perry	ee9b744cb5	radv,aco: use nir_address_format_vec2_index_32bit_offset The vec2 index helps the compiler make use of SMEM's SOFFSET field when loading descriptors. fossil-db (GFX10.3): Totals from 126326 (86.37% of 146267) affected shaders: VGPRs: 4898704 -> 4899088 (+0.01%); split: -0.02%, +0.03% SpillSGPRs: 13490 -> 14404 (+6.78%); split: -1.10%, +7.87% CodeSize: 306442996 -> 302277700 (-1.36%); split: -1.36%, +0.01% MaxWaves: 3277108 -> 3276624 (-0.01%); split: +0.01%, -0.02% Instrs: 58301101 -> 57469370 (-1.43%); split: -1.43%, +0.01% VClause: 1208270 -> 1199264 (-0.75%); split: -1.02%, +0.28% SClause: 2517691 -> 2432744 (-3.37%); split: -3.75%, +0.38% Copies: 3518643 -> 3161097 (-10.16%); split: -10.45%, +0.29% Branches: 1228383 -> 1228254 (-0.01%); split: -0.12%, +0.11% PreSGPRs: 3973880 -> 4031099 (+1.44%); split: -0.19%, +1.63% PreVGPRs: 3831599 -> 3831707 (+0.00%) Cycles: 1785250712 -> 1778222316 (-0.39%); split: -0.42%, +0.03% VMEM: 52873776 -> 50663317 (-4.18%); split: +0.18%, -4.36% SMEM: 8534270 -> 8361666 (-2.02%); split: +1.79%, -3.82% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>	2021-04-27 15:56:07 +00:00
Samuel Pitoiset	4c2add8cba	aco: adjust NGG if provoking vertex mode is last Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10449>	2021-04-27 07:31:03 +00:00
James Park	1351fcf3c3	amd: Fix warnings around variable sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6162>	2021-04-23 10:37:22 +00:00
Timur Kristóf	74c467d988	aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7. On GFX6-7, the 8 and 16-bit integer add reductions use the 32-bit v_add instruction, which clobbers the VCC register. Cc: mesa-stable Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10346>	2021-04-22 11:29:49 +00:00
Rhys Perry	776ba40115	aco: add and use Program::progress This is used when printing the program and to avoid updating register demand during post-RA liveness analysis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	2d36232e62	aco: allow SDWA sels smaller than the operand size p_extract_vector copy-propagation can create byte sels for v2b operands. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	655ba1e3a9	aco: don't update register demand during RA validation It isn't intended to be accurate after RA, so num_waves can become zero, breaking the sgpr_limit calculation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10315>	2021-04-21 11:09:33 +00:00
Rhys Perry	0eaa5dfac0	aco: remove image parameter from get_sampler_desc() We can just check whether tex_instr is NULL instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Rhys Perry	3cbe9894f7	aco: set TRUNC_COORD=0 for nir_texop_tg4 Fixes black squares in Assassin's Creed: Valhalla and rendering of FidelityFX-CACAO demo. fossil-db (sienna cichlid): Totals from 3052 (2.09% of 146267) affected shaders: SpillSGPRs: 8437 -> 8646 (+2.48%) CodeSize: 30993832 -> 31116916 (+0.40%); split: -0.00%, +0.40% Instrs: 5869934 -> 5886783 (+0.29%); split: -0.00%, +0.29% Latency: 250330521 -> 250463770 (+0.05%); split: -0.00%, +0.05% InvThroughput: 59797617 -> 59814584 (+0.03%); split: -0.00%, +0.03% VClause: 92114 -> 92132 (+0.02%) SClause: 197373 -> 197338 (-0.02%); split: -0.02%, +0.01% Copies: 479482 -> 482394 (+0.61%); split: -0.01%, +0.61% Branches: 219629 -> 219635 (+0.00%) PreSGPRs: 248970 -> 249366 (+0.16%) fossil-db (polaris10): Totals from 3050 (2.06% of 147787) affected shaders: SGPRs: 282864 -> 282912 (+0.02%); split: -0.01%, +0.02% VGPRs: 242572 -> 242612 (+0.02%) SpillSGPRs: 10387 -> 10675 (+2.77%) CodeSize: 31872460 -> 31996128 (+0.39%) MaxWaves: 10924 -> 10925 (+0.01%) Instrs: 6222217 -> 6239072 (+0.27%) Latency: 317482545 -> 317773685 (+0.09%); split: -0.00%, +0.09% InvThroughput: 156149624 -> 156242072 (+0.06%); split: -0.00%, +0.06% VClause: 92295 -> 92254 (-0.04%); split: -0.05%, +0.01% SClause: 243342 -> 243321 (-0.01%); split: -0.01%, +0.00% Copies: 678902 -> 681700 (+0.41%); split: -0.00%, +0.41% Branches: 219698 -> 219703 (+0.00%) PreSGPRs: 244251 -> 244644 (+0.16%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `58f25098a0` ("radv: Use TRUNC_COORD on samplers") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3110 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10036>	2021-04-20 17:42:21 +00:00
Samuel Pitoiset	9434675d60	aco: fix opquantize2f16 on GFX6-7 Make sure to preserve signed zeroes. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.flush_to_zero on GFX6 (Pitcairn). Untested on GFX7. Fixes: `54a09545ec` ("aco: optimize a*0.0") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10319>	2021-04-19 16:33:37 +00:00
Marek Olšák	ec1ddb976a	amd/registers: rename IMG_FORMAT to GFX10_FORMAT to disambiguate the meaning Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>	2021-04-17 02:37:49 +00:00
Marek Olšák	b878444c3a	amd: drop support for LLVM 10 It doesn't support RDNA 2. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Samuel Pitoiset	936b58378c	amd: drop support for LLVM 8 It doesn't support Navi1x and the removal enables this nice code cleanup. v2: rebase - mareko Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1) Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10199>	2021-04-16 09:25:19 +00:00
Michel Dänzer	d200f45875	Use explicit break instead of fall-through to break-only case clang generates a warning if there's no explicit break or fall-through annotation. The latter would be kind of silly in this case, and not robust against any future changes turning the fall-through invalid. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Michel Dänzer	2928c21eb7	Convert most remaining free-form fall-through comments to FALLTHROUGH One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is explicitly disabled. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>	2021-04-15 16:01:22 +00:00
Rhys Perry	5b8a4516e6	aco/ra: remove live-in temporary from live_out_per_block when moving it Otherwise, handle_loop_phis() might pass it to handle_live_in() and then we could have two phis for this variable. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Rhys Perry	11fde1247c	aco/ra: use original names when renaming loop carried phi operands Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `7c64623e94` ("aco/ra: refactor SSA repairing during register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>	2021-04-14 19:04:08 +00:00
Timur Kristóf	f3e004cb56	aco: Add a simple heuristic to decide early or late primitive export. Late export is theoretically better if used with LATE_ALLOC, but in practice, the early export has an advantage of lower register usage, therefore more concurrent waves. The idea of this commit is that "small" shaders benefit from early primitive export more, due to being able to launch much more waves. Let's consider a NIR shader "small" when it has only 1 block. This yields both better performance, and better stats, than always using late export. Fossil DB on Sienna: Totals from 12807 (8.76% of 146265) affected shaders: VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83% SpillSGPRs: 1458 -> 1538 (+5.49%) CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14% MaxWaves: 282902 -> 278516 (-1.55%) Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18% VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30% SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16% Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26% Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26% PreSGPRs: 434701 -> 447396 (+2.92%) PreVGPRs: 527783 -> 540527 (+2.41%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	5dbab03a80	aco: Emit fewer branches for NGG VS/TES with late primitive export. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	af7d5f5b86	aco: Set block_kind_export_end in create_vs/fs_exports. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	2b312a4fd7	aco: Extract ngg_nogs_export_prim_id to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	231ef14b3d	aco: Use s_setprio 3 at the beginning of every VS and TES. The user-set priority of shaders matters very little, but we hope this might still help speed up VS input loads especially. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	4c86c7aa15	aco: Remove useless s_setprio near gs_alloc_req. We learned that the gs_alloc_req is not actually when the export space allocation happens. So it makes no sense to prioritize it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>	2021-04-14 14:25:10 +00:00
Timur Kristóf	75cd43741a	aco: Align NGG scratch size to 16 so a single ds_read can always read it. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Timur Kristóf	c1346e5c22	aco: Optimize workgroup exclusive scan to better avoid bank conflicts. Previously, every wave had multiple active lanes read the LDS, and the data was processed by VALU DPP instructions. Now, only the first lane reads the LDS in order to avoid bank conflicts, and the results are processed by SALU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>	2021-04-14 14:05:24 +00:00
Daniel Schürmann	b6a28aaa8b	aco/cssa: don't create parallelcopies for constants and exec if we are able to spill these directly. Totals from 4913 (3.60% of 136546) affected shaders (Raven): SpillSGPRs: 16021 -> 15451 (-3.56%); split: -3.87%, +0.31% CodeSize: 58102020 -> 57371464 (-1.26%); split: -1.26%, +0.00% Instrs: 11411454 -> 11230105 (-1.59%); split: -1.59%, +0.00% Latency: 555706331 -> 550058635 (-1.02%); split: -1.07%, +0.05% InvThroughput: 273023354 -> 271854469 (-0.43%); split: -0.44%, +0.01% SClause: 385168 -> 385371 (+0.05%); split: -0.01%, +0.06% Copies: 1342084 -> 1175762 (-12.39%); split: -12.40%, +0.01% Branches: 392619 -> 378662 (-3.55%); split: -3.56%, +0.00% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	18ba93e673	aco/cssa: rewrite lower_to_cssa pass The previous pass was based on misconceptions and rounded up with bug fixes. The new pass is entirely rewritten and basically just one-to-one from the paper: "Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" by B. Boissinot et al. It also incorporates the value-equality testing. The regressions are mainly due to creating parallelcopies for exec phis at loop headers (mitigated in the next commit). Totals from 4933 (3.61% of 136546) affected shaders (Raven): SpillSGPRs: 16249 -> 16527 (+1.71%); split: -0.28%, +1.99% SpillVGPRs: 1771 -> 1595 (-9.94%) CodeSize: 57544436 -> 58280304 (+1.28%); split: -0.00%, +1.28% Scratch: 176128 -> 179200 (+1.74%) Instrs: 11265783 -> 11445884 (+1.60%); split: -0.00%, +1.60% Latency: 552596156 -> 555880540 (+0.59%); split: -0.53%, +1.13% InvThroughput: 271431862 -> 273097423 (+0.61%); split: -0.18%, +0.79% VClause: 160240 -> 160241 (+0.00%); split: -0.02%, +0.02% SClause: 386863 -> 386685 (-0.05%); split: -0.07%, +0.02% Copies: 1180801 -> 1345633 (+13.96%); split: -0.02%, +13.98% Branches: 379129 -> 393052 (+3.67%); split: -0.01%, +3.69% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	9d73a4a412	aco: add new reindex_ssa() pass Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	d75c73e6a6	aco: fix kill flags on phi operands Fossil-db changes are likely due to how the CSSA pass works. Totals from 1782 (1.31% of 136546) affected shaders (Raven): CodeSize: 25333292 -> 25294020 (-0.16%); split: -0.16%, +0.00% Instrs: 4916059 -> 4908218 (-0.16%); split: -0.16%, +0.00% Latency: 282860167 -> 282707176 (-0.05%); split: -0.08%, +0.03% InvThroughput: 136487564 -> 136394958 (-0.07%); split: -0.12%, +0.05% VClause: 74791 -> 74795 (+0.01%) Copies: 542115 -> 534280 (-1.45%); split: -1.48%, +0.04% Branches: 168977 -> 168966 (-0.01%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	13e4fed01f	aco: lower p_spill with constants correctly Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	4a57787006	aco/spill: use correct next_use_distances at loop header To decide which variables to spill, we must use the distances at the beginning of the loop-header, and not the distances at the end of the loop-preheader. The difference is that the former includes phis which are viable to be spilled as opposed to the phi operands which would be reloaded by add_coupling_code(), ending up in potentially too high register pressure before the loop. Totals from 206 (0.15% of 136546) affected shaders (Raven): SpillSGPRs: 5154 -> 5000 (-2.99%) CodeSize: 3654072 -> 3647184 (-0.19%); split: -0.19%, +0.00% Instrs: 701482 -> 700526 (-0.14%); split: -0.14%, +0.00% Latency: 40988780 -> 40872506 (-0.28%); split: -0.29%, +0.00% InvThroughput: 20364560 -> 20306006 (-0.29%) SClause: 20192 -> 20198 (+0.03%) Copies: 77732 -> 77688 (-0.06%); split: -0.08%, +0.03% Branches: 24204 -> 24050 (-0.64%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	b56ea19111	aco/spill: refactor live-in registerDemand calculation This also fixes some hypothetical issue for loops without phis and for loops with higher register pressure at the end of the loop preheader. No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	282eacc3e0	aco/spill: refactor some more spill decision taking No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	dfb10e4f4b	aco/spill: don't count phis as variable access This increases the chance of evicting phis if these have longer next-use distances. Totals from 6 (0.00% of 146267) affected shaders (Navi10): CodeSize: 476992 -> 464388 (-2.64%) Instrs: 81785 -> 79952 (-2.24%) VClause: 2380 -> 2374 (-0.25%) Copies: 26836 -> 25131 (-6.35%) Branches: 2494 -> 2492 (-0.08%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	b2a6346df7	aco/spill: spill phi constants and exec directly to VGPR This lets us avoid some CSSA copies. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	99936d7142	aco/spill: reload spilled exec masks directly to exec This handles the case of exec = p_linear_phi %a, %b where %a or %b might have been spilled. By directly reloading these variables into the exec mask register, we can avoid additional CSSA parallelcopies. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Daniel Schürmann	beb292343a	aco/spill: refactor spill decision taking No fossil-db changes. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>	2021-04-13 18:40:57 +00:00
Rhys Perry	d8f12fd421	aco: fix 16-bit f2{u8,i8} on GFX6/7 Not really tested. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>	2021-04-12 16:19:46 +00:00

1404 commits