fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 19:48:08 +02:00

Author	SHA1	Message	Date
Samuel Pitoiset	64748a2be2	aco/tests: add some tests for combining s_add+s_lshl to s_lshl<n>_add Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7539>	2020-11-12 07:36:07 +00:00
Samuel Pitoiset	ec347ee9bc	aco: fix combining add/sub to b2i if a new dest needs to be allocated The uses vector needs to be expanded to avoid out of bounds access and to make sure the number of uses is initialized to 0. This fixes combining more v_and(a, v_subbrev_co_u32). fossilds-db (Vega10): Totals from 4574 (3.28% of 139517) affected shaders: SGPRs: 291625 -> 292217 (+0.20%); split: -0.01%, +0.21% VGPRs: 276368 -> 276188 (-0.07%); split: -0.07%, +0.01% SpillSGPRs: 455 -> 533 (+17.14%) SpillVGPRs: 76 -> 78 (+2.63%) CodeSize: 23327500 -> 23304152 (-0.10%); split: -0.17%, +0.07% MaxWaves: 22044 -> 22066 (+0.10%) Instrs: 4583064 -> 4576301 (-0.15%); split: -0.15%, +0.01% Cycles: 47925276 -> 47871968 (-0.11%); split: -0.13%, +0.01% VMEM: 1599363 -> 1597473 (-0.12%); split: +0.08%, -0.19% SMEM: 331461 -> 331126 (-0.10%); split: +0.08%, -0.18% VClause: 80639 -> 80696 (+0.07%); split: -0.02%, +0.09% SClause: 155992 -> 155993 (+0.00%); split: -0.02%, +0.02% Copies: 333482 -> 333318 (-0.05%); split: -0.12%, +0.07% Branches: 70967 -> 70968 (+0.00%) PreSGPRs: 187078 -> 187711 (+0.34%); split: -0.01%, +0.35% PreVGPRs: 244918 -> 244785 (-0.05%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7513>	2020-11-10 10:25:00 +01:00
Rhys Perry	5b81e80fb6	aco: implement 64-bit images Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7234>	2020-11-09 18:28:59 +00:00
Samuel Pitoiset	bae5487659	aco: optimize v_and(a, v_subbrev_co(0, 0, vcc)) -> v_cndmask(0, a, vcc) fossils-db (Vega10): Totals from 7786 (5.70% of 136546) affected shaders: SGPRs: 517778 -> 518626 (+0.16%); split: -0.01%, +0.17% VGPRs: 488252 -> 488084 (-0.03%); split: -0.04%, +0.01% CodeSize: 42282068 -> 42250152 (-0.08%); split: -0.16%, +0.09% MaxWaves: 35697 -> 35716 (+0.05%); split: +0.06%, -0.01% Instrs: 8319309 -> 8304792 (-0.17%); split: -0.18%, +0.00% Cycles: 88619440 -> 88489636 (-0.15%); split: -0.16%, +0.01% VMEM: 2788278 -> 2780431 (-0.28%); split: +0.06%, -0.35% SMEM: 570364 -> 569370 (-0.17%); split: +0.12%, -0.30% VClause: 144906 -> 144908 (+0.00%); split: -0.05%, +0.05% SClause: 302143 -> 302055 (-0.03%); split: -0.04%, +0.01% Copies: 579124 -> 578779 (-0.06%); split: -0.14%, +0.08% PreSGPRs: 327695 -> 328845 (+0.35%); split: -0.00%, +0.35% PreVGPRs: 434280 -> 433954 (-0.08%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7438>	2020-11-09 17:36:42 +00:00
Jason Ekstrand	21b1b91549	nir,spirv: Add support for the ShaderCallKHR scope It's currently entirely trivial. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6479>	2020-11-05 23:36:46 +00:00
Tony Wasserka	1a1099c54f	aco: Fix format string used when raising validation errors Validation errors mention the pretty-printed instruction including operands with the reserved % character, which caused vasprintf to expect more format arguments than aco provided. Fixes: `c2b1978aa4` ("aco: rework the way various compilation/validation errors are reported") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7442>	2020-11-05 17:56:18 +00:00
Tony Wasserka	456beb40b8	aco/ra: Fix counting of subdword variables in get_reg_create_vector The loop variable "k" shadowed another variable in the outer scope, so this loop had no actual effect. Fixes: `52cc1f8237` ("aco: improve p_create_vector RA for sub-dword operands") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7427>	2020-11-04 12:08:49 +00:00
Rhys Perry	786828131a	aco: implement 8/16-bit instructions which can be trivially widened When nir_lower_bit_size becomes more capable, we might want to revert some of this. fossil-db (parallel-rdp, Navi): Totals from 217 (31.77% of 683) affected shaders: SGPRs: 11320 -> 10200 (-9.89%) VGPRs: 7156 -> 7364 (+2.91%) CodeSize: 1453948 -> 1430136 (-1.64%); split: -1.66%, +0.02% Instrs: 258530 -> 254840 (-1.43%); split: -1.44%, +0.01% Cycles: 37334360 -> 37247936 (-0.23%); split: -0.26%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>	2020-11-04 11:50:37 +00:00
Rhys Perry	ef95ba8cdd	aco: implement some 16-bit arithmetic instead of lowering fossil-db (parallel-rdp, Navi): Totals from 210 (30.75% of 683) affected shaders: SGPRs: 9704 -> 10248 (+5.61%) VGPRs: 5884 -> 5368 (-8.77%) CodeSize: 1155564 -> 1098752 (-4.92%) Instrs: 199927 -> 189940 (-5.00%) Cycles: 20438392 -> 19860124 (-2.83%) v2: use divergence analysis to determine which instructions to lower. Co-Authored-by: Daniel Schürmann <daniel@schuermann.dev> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4791>	2020-11-04 11:50:37 +00:00
Samuel Pitoiset	57c152af9c	aco: select v_mul_{hi}_u32_u24 for 24-bit multiplications This is based on the NIR range analysis. v_mul_u32_u24 is VOP2, while v_mul_lo_u32 is VOP3, so that should reduce codesize. fossils-db (Vega10): Totals from 12590 (9.22% of 136546) affected shaders: SGPRs: 680207 -> 677271 (-0.43%); split: -0.47%, +0.04% VGPRs: 620840 -> 620856 (+0.00%); split: -0.02%, +0.02% CodeSize: 37930200 -> 37774088 (-0.41%); split: -0.41%, +0.00% Instrs: 7463550 -> 7458120 (-0.07%); split: -0.07%, +0.00% Cycles: 133487628 -> 133427532 (-0.05%); split: -0.05%, +0.00% VMEM: 2514729 -> 2513426 (-0.05%); split: +0.02%, -0.08% SMEM: 1533579 -> 1532795 (-0.05%); split: +0.05%, -0.10% VClause: 231391 -> 231389 (-0.00%); split: -0.01%, +0.00% SClause: 255352 -> 255294 (-0.02%); split: -0.04%, +0.02% Copies: 605821 -> 600352 (-0.90%); split: -0.92%, +0.02% Branches: 133739 -> 133743 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 351092 -> 348048 (-0.87%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>	2020-11-03 13:47:40 +00:00
Samuel Pitoiset	3a72021d7c	aco: store NIR range analysis data to the isel context It will be used to optimize some ALU instructions. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7405>	2020-11-03 13:47:40 +00:00
James Park	4bd18e772a	amd/llvm,aco: Replace VLA with alloca MSVC will never support VLA, so use alloca instead. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7157>	2020-11-03 07:44:02 +00:00
Samuel Pitoiset	03f260cb27	radv,aco: optimize computing the sample mask for per-sample shading I don't know why these values were introduced for but it seems like we can optimize this by just doing: gl_SampleMaskIn[0] = (SampleCoverage & (1 << gl_SampleID)) AMDGPU-PRO and AMDVLK apply the same formula to compute the sample mask when per-sample shading is enabled. No fossils-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>	2020-11-02 08:05:47 +01:00
Samuel Pitoiset	c63bcda22c	radv,aco: adjust the sample mask only if per-sample shading is enabled When per-sample shading isn't enabled, we can just load the samplemask from the hardware which is always the coverage of the entire pixel/fragment. fossilds-db (VEGA10): Totals from 131 (0.10% of 136546) affected shaders: SGPRs: 5056 -> 5048 (-0.16%) VGPRs: 2600 -> 2372 (-8.77%) CodeSize: 115788 -> 112560 (-2.79%) MaxWaves: 1266 -> 1274 (+0.63%) Instrs: 20620 -> 20071 (-2.66%) Cycles: 82416 -> 80220 (-2.66%) VMEM: 51567 -> 35532 (-31.10%); split: +0.24%, -31.34% SMEM: 8952 -> 8258 (-7.75%); split: +0.11%, -7.86% SClause: 1223 -> 1199 (-1.96%); split: -2.62%, +0.65% Copies: 1247 -> 1124 (-9.86%); split: -10.18%, +0.32% PreVGPRs: 2112 -> 1981 (-6.20%) Helps Britannia, Shadow of the Tomb Raider, Warhammer II and Control. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7377>	2020-11-02 08:05:43 +01:00
James Park	6d058ac6c9	aco: Fix accidental copies, attempt two Use auto to avoid mistyping the constness of the pair key, which triggers implicit conversions rather than compilation errors. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7346>	2020-10-30 09:55:53 +00:00
Rhys Perry	1761379481	aco: handle SDWA in the optimizer Apply SGPRs/modifiers when possible and try not to break when SDWA instructions are encountered. No shader-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>	2020-10-29 18:08:31 +00:00
Rhys Perry	ecc5b59a70	aco: don't allow destination opsel for v_cvt_pknorm It doesn't make sense to do this. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>	2020-10-29 18:08:31 +00:00
Rhys Perry	bb890f2e7c	aco: fix combine_inverse_comparison() fossil-db (Navi): Totals from 16 (0.01% of 137413) affected shaders: CodeSize: 6788 -> 6724 (-0.94%) Instrs: 1250 -> 1234 (-1.28%) Cycles: 4984 -> 4920 (-1.28%) fossil-db (Polaris): Totals from 16 (0.01% of 138881) affected shaders: CodeSize: 7024 -> 6960 (-0.91%) Instrs: 1337 -> 1321 (-1.20%) Cycles: 5332 -> 5268 (-1.20%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>	2020-10-29 18:08:31 +00:00
Rhys Perry	7e4aa8c8e9	aco: fix printing of some sdwa sels Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>	2020-10-29 18:08:31 +00:00
Rhys Perry	70320f4117	aco: assert a label only uses one of the members in ssa_info's union Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7349>	2020-10-29 18:08:31 +00:00
Rhys Perry	3dfbed2a87	aco: create s_clause on GFX10+ This seems to give no measurable benefit to Strange Brigade or Shadow of Mordor, but it's simple to do, helps in theory and all other compilers do it. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5919>	2020-10-29 15:08:05 +00:00
Daniel Schürmann	f4c090a3b3	aco: refactor split_store_data() to always split into evenly sized elements This fixes a couple of issues on GFX67 and has no negative impact on newer hardware Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7105>	2020-10-29 14:32:59 +00:00
Timur Kristóf	f74ef15879	aco/ngg: Incorporate GS invocations into workgroup size calculation. If the workgroup_size variable is lower than the actual workgroup size, that means it's possible that ACO won't emit some s_barrier instructions when in fact it should. This can possibly cause a GPU hang. This is just for the sake of general correctness, currently this can't cause a real problem because the maximum vertex count is always greater than (or equal to) the primitive count in GS, and already takes into account the number of GS invocations. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:55:54 +01:00
Timur Kristóf	09b9e52c0d	aco/ngg: Export a zero-area triangle when primitive count is 0. This is a workaround for a bug in Navi 1x NGG HW. Very rarely, the Navi 1x PA can hang when an NGG workgroup exports 0 total primitives. According to AMD, we always need this workaround when it is possible that the number of primitives is 0. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:55:47 +01:00
Timur Kristóf	73449f9a62	aco: Add a few assertions about LDS usage. This is to make sure we don't compile a shader which doesn't fit the available LDS space. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	b6654adc0e	aco: Make emitting reduction instructions a bit more convenient. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	8d6246205a	aco: Add some validation for PSEUDO_REDUCTION instructions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	260f9c503a	aco/ngg: Put shader query reduction operand into a VGPR. The p_reduce instruction only works if this operand is in a VGPR, and otherwise gets lowered to incorrect code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	9757c3cb6b	aco: Assert that workgroup barriers are not used inappropriately. Example: It is possible for some NGG GS waves to have 0 ES and/or GS invocations, and in that case having an s_barrier inside divergent control flow can very possibly hang the GPU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:19 +01:00
Rhys Perry	ecdcf22d5d	aco: switch aco_print_asm to a FILE * Streams are really stateful and (IMO) difficult to read for non-trivial usage. This is also more consistent with NIR and the rest of ACO. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>	2020-10-28 17:32:32 +00:00
Rhys Perry	a293fad4ef	aco: refactor repeated instruction disassembly This seems simpler to me. It should also work correctly when repeated instructions cross blocks. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>	2020-10-28 17:32:32 +00:00
Rhys Perry	ed2449d55b	aco: move individual instruction disassembly to its own helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7166>	2020-10-28 17:32:32 +00:00
Rhys Perry	483657de32	aco: use mubuf helper in select_gs_copy_shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>	2020-10-28 14:59:49 +00:00
Rhys Perry	ec7ecfe9cb	aco: use control flow creation helpers in select_gs_copy_shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>	2020-10-28 14:59:49 +00:00
Rhys Perry	57d977a23f	aco: round bytes_written to dwords if larger than 4 bytes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7276>	2020-10-28 10:56:27 +00:00
Rhys Perry	41839d38cf	aco: default to a definition size of 32 For non-arithmetic opcodes such as buffer_load_dword and buffer_load_short, default to a definition size of 32. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7276>	2020-10-28 10:56:27 +00:00
Daniel Schürmann	543f50789a	aco: implement nir_op_unpack_[64/32]_* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>	2020-10-28 10:14:26 +00:00
Rhys Perry	26e53e3afa	aco: ignore the ACO-inserted continue in create_continue_phis() Otherwise, for loops without continue_or_break, create_continue_phis() always returns an undef operand. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `638cbc21a1` ("aco: handle when ACO adds new continue edges") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2848 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7148>	2020-10-27 19:53:38 +00:00
Rhys Perry	437995bb70	aco: remove all-undef phi opt This doesn't look like it would create correct IR for 8/16-bit phis and doesn't seem to help anything. If we ever want to do this, it's probably better done in nir_opt_remove_phis(). No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	70ff262cda	aco: use v_mov_b32_sdwa for some 16-bit constants Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	b882598ee1	aco: remove some unused optimizations These are unused now that we almost always use p_parallelcopy for simple copies. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	d20a752c0d	aco: use Builder::copy more fossil-db (Navi): Totals from 6973 (5.07% of 137413) affected shaders: SGPRs: 381768 -> 381776 (+0.00%) VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00% CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01% MaxWaves: 86581 -> 86583 (+0.00%) Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00% Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05% fossil-db (Polaris): Totals from 8154 (5.87% of 138881) affected shaders: VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00% CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00% MaxWaves: 49090 -> 49091 (+0.00%) Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00% Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00% Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp shaders. They appear to be improved because the p_create_vector from lower_subdword_phis() was blocking constant propagation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	e54c111c45	aco: always use p_parallelcopy for pre-RA copies Most fossil-db changes are because literals are applied earlier (in label_instruction), so use counts are more accurate and more literals are applied. fossil-db (Navi): Totals from 79551 (57.89% of 137413) affected shaders: SGPRs: 4549610 -> 4542802 (-0.15%); split: -0.19%, +0.04% VGPRs: 3326764 -> 3324172 (-0.08%); split: -0.10%, +0.03% SpillSGPRs: 38886 -> 34562 (-11.12%); split: -11.14%, +0.02% CodeSize: 240143456 -> 240001008 (-0.06%); split: -0.11%, +0.05% MaxWaves: 1078919 -> `1079281` (+0.03%); split: +0.04%, -0.01% Instrs: 46627073 -> 46528490 (-0.21%); split: -0.22%, +0.01% fossil-db (Polaris): Totals from 98463 (70.90% of 138881) affected shaders: SGPRs: 5164689 -> 5164353 (-0.01%); split: -0.02%, +0.01% VGPRs: 3920936 -> 3921856 (+0.02%); split: -0.00%, +0.03% SpillSGPRs: 56298 -> 52259 (-7.17%); split: -7.22%, +0.04% CodeSize: 258680092 -> 258692712 (+0.00%); split: -0.02%, +0.03% MaxWaves: 620863 -> 620823 (-0.01%); split: +0.00%, -0.01% Instrs: 50776289 -> 50757577 (-0.04%); split: -0.04%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	6db5fbf9f2	aco: allow literals on sub-dword p_parallelcopy Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	74e2e9b682	aco: don't use bld.copy() in handle_operands() No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	a834d9ef86	aco: expand vectors passed as copy operands Most copies which hit this case use p_create_vector, but in the future p_parallelcopy will be used instead. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	e092f34dfa	aco: copy-propgate through p_create_vector during value numbering fossil-db (Navi): Totals from 182 (0.13% of 137413) affected shaders: SGPRs: 9304 -> 9312 (+0.09%) VGPRs: 7636 -> 7620 (-0.21%); split: -0.26%, +0.05% CodeSize: 733516 -> 733092 (-0.06%); split: -0.07%, +0.01% MaxWaves: 2478 -> 2479 (+0.04%) Instrs: 139664 -> 139561 (-0.07%); split: -0.09%, +0.02% Cycles: 3215104 -> 3214080 (-0.03%); split: -0.04%, +0.01% fossil-db (Polaris): Totals from 161 (0.12% of 138881) affected shaders: VGPRs: 5608 -> 5596 (-0.21%); split: -0.29%, +0.07% CodeSize: 605336 -> 605120 (-0.04%); split: -0.05%, +0.02% Instrs: 117957 -> 117902 (-0.05%); split: -0.07%, +0.02% Cycles: 3105008 -> 3103876 (-0.04%); split: -0.04%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	0f31fa1b64	aco: skip value numbering of copies Instead, copy-propagate through and remove them. This improves value numbering in this situation: a = ... b = copy a c = copy a use(b) use(c) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	72b307a338	aco: don't do divergent break+discard If the shader does: loop { if (divergent) discard else a() b() } then a()'s block will dominate b()'s block in the logical CFG, but not the linear CFG. This will cause value numbering to try to combine SLAU from a() and b(). This didn't happen with break/continue because sanitize_if() would move a() out of the branch. Using sanitize_if() to fix this doesn't look easy, because discards are not control flow instructions in NIR. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	d4503a9020	aco: update phi_map in add_subdword_operand() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `56345b8c61` ("aco: allow reading/writing upper halves/bytes when possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00

1 2 3 4 5 ...

1051 commits