fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 09:18:10 +02:00

Author	SHA1	Message	Date
Marek Olšák	edc8a4a037	ac/surface: enable DCC image stores for all displayable DCC on gfx10.3 Co-authored-by: Joshua Ashton <joshua@froggi.es> Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13153>	2021-10-02 22:56:48 +00:00
Joshua Ashton	e6fcf65578	ac/surface: Add helper for checking if a surface supports DCC Image stores We need to keep RADV and RadeonSI on the same page about this due to modifiers. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13153>	2021-10-02 22:56:48 +00:00
Marek Olšák	923c535ee8	ac/surface: don't overwrite DCC settings for imported buffers Fixes: `0f6251b31f` - ac/surface: use DCC compatible with image stores for < 4K resolutions Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13120>	2021-10-01 16:15:40 -04:00
Marek Olšák	279cd5821c	ac/gpu_info: fix the comment for the NGG->legacy transition bug Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13048>	2021-09-28 17:30:06 +00:00
Marek Olšák	a198c6b7dd	ac/surface: correct a comment about DCC image stores Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13013>	2021-09-25 08:49:05 +00:00
Marek Olšák	0f6251b31f	ac/surface: use DCC compatible with image stores for < 4K resolutions We don't have to use the special DCC settings for lower resolutions. This will cause corruption if X and a windowed app use different Mesa versions. The fix is to restart the X server. I expect to get false bug reports due to this. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13013>	2021-09-25 08:49:05 +00:00
Timur Kristóf	09f89d15e4	ac/nir/nggc: Don't reuse uniform values from divergent control flow. With NGG culling, the shaders are split into two parts: the top part that computes just the position output, and the bottom part which produces the other outputs. To reduce redundancy between the two, I added some code to reuse uniform variables between them. However, there is an edge case I didn't think about: because of vertex repacking, it is possible for the bottom part to process a different vertex. Therefore it can take a different divergent code path (though it must still take the same uniform code path). Due to this, when a uniform value comes from divergent control flow, this may be undefined in the bottom part. This commit stops reusing uniform variables from divergent control flow, to fix issues that arise from this. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 1723 (1.34% of 128647) affected shaders: VGPRs: 89312 -> 89184 (-0.14%); split: -0.15%, +0.01% SpillSGPRs: 4575 -> 120 (-97.38%) CodeSize: 10846424 -> 10873836 (+0.25%); split: -0.68%, +0.93% MaxWaves: 34582 -> 34602 (+0.06%); split: +0.06%, -0.01% Instrs: 2124471 -> 2128835 (+0.21%); split: -0.51%, +0.72% Latency: 7274569 -> 7293899 (+0.27%); split: -0.22%, +0.48% InvThroughput: 1637130 -> 1635490 (-0.10%); split: -0.17%, +0.07% VClause: 25141 -> 25414 (+1.09%); split: -0.02%, +1.10% SClause: 56367 -> 59503 (+5.56%); split: -1.36%, +6.93% Copies: 230704 -> 219313 (-4.94%); split: -5.49%, +0.55% Branches: 72781 -> 72681 (-0.14%); split: -0.21%, +0.07% PreSGPRs: 118766 -> 100176 (-15.65%); split: -15.70%, +0.05% PreVGPRs: 76876 -> 76833 (-0.06%) Fixes: `0bb543bb60` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>	2021-09-24 17:32:53 +00:00
Timur Kristóf	cb19ebe7ba	ac/nir/nggc: Refactor save_reusable_variables. This makes the code more elegant and also fixes the mistake of skipping the blocks that come before loops. Fossil DB changes on Sienna Cichlid with NGGC on: Totals from 1026 (0.80% of 128647) affected shaders: SpillSGPRs: 3817 -> 4035 (+5.71%) CodeSize: 5582856 -> 5538732 (-0.79%); split: -0.89%, +0.10% Instrs: 1106907 -> 1100180 (-0.61%); split: -0.68%, +0.07% Latency: 10084948 -> 10052197 (-0.32%); split: -0.37%, +0.05% InvThroughput: 1567012 -> 1564949 (-0.13%); split: -0.16%, +0.03% SClause: 39789 -> 39075 (-1.79%); split: -2.33%, +0.54% Copies: 95184 -> 96456 (+1.34%); split: -0.19%, +1.53% Branches: 44087 -> 44093 (+0.01%); split: -0.01%, +0.02% PreSGPRs: 47584 -> 51009 (+7.20%); split: -0.61%, +7.80% Fixes: `0bb543bb60` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13001>	2021-09-24 17:32:53 +00:00
Timur Kristóf	a7f2faea46	ac/nir: Emit edge flag instructions conditionally. They are not needed by RADV but will be needed by RadeonSI. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: VGPRs: 1982664 -> 1975936 (-0.34%); split: -0.43%, +0.09% CodeSize: 152790880 -> 149510316 (-2.15%); split: -2.15%, +0.00% MaxWaves: 1617984 -> 1621900 (+0.24%) Instrs: 29272825 -> 28907038 (-1.25%); split: -1.26%, +0.01% Latency: 128744182 -> 127565678 (-0.92%); split: -1.14%, +0.22% InvThroughput: 20125915 -> 19805168 (-1.59%); split: -1.63%, +0.03% VClause: 521312 -> 519804 (-0.29%); split: -0.77%, +0.48% SClause: 688861 -> 688897 (+0.01%); split: -0.04%, +0.05% Copies: 3205421 -> 3177799 (-0.86%); split: -1.68%, +0.82% Branches: 1181457 -> 1183147 (+0.14%); split: -0.03%, +0.17% PreVGPRs: 1626681 -> 1595406 (-1.92%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12998>	2021-09-23 16:57:56 +02:00
Bas Nieuwenhuizen	1ca4fd31e6	radv: Add support for ray launch size. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12592>	2021-09-21 01:53:39 +00:00
Timur Kristóf	13e467a147	ac/nir: Fix match_mask to work correctly for VS outputs. match_mask checks the intrinsic type and decides whether it's per-patch or not. VS don't have per-patch outputs, so this causes wrong behaviour there. Found using the GCC undefined behavior sanitizer. Fixes the following error: runtime error: shift exponent 18446744073709551584 is too large for 64-bit type 'long unsigned int' Closes: #5319 Fixes: `bf966d1c1d` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12719>	2021-09-20 18:08:16 +00:00
Timur Kristóf	75dbb40439	ac/nir: Remove byte permute from prefix sum of the repack sequence. The byte-permute instruction v_perm_b32 is not exposed by older LLVM releases (only available on LLVM 13 and later), therefore a new sequence is needed which we can use with these LLVM versions too. The prefix sum is replaced by two alternatives: 1. For GPUs that support v_dot, we shift 0x01 to the wanted byte positions and then use v_dot to sum the results. 2. For older GPUs (Navi 10), we simply shift out the unwanted bytes and use v_sad_u8 to produce the sum. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12786>	2021-09-20 12:39:03 +02:00
Joshua Ashton	92ade3df05	ac/surface: Add ac_modifier_supports_dcc_image_stores helper Helper function to check if a modifier supports DCC image stores. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>	2021-09-18 00:01:01 +00:00
Joshua Ashton	fd08758bb1	ac/surface: Add modifiers capable of DCC image stores Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12862>	2021-09-18 00:01:01 +00:00
Samuel Pitoiset	c952655693	ac/rgp, radv: report wave size for shaders Fills the "Wave mode" in "Pipelines" for GPUs that support Wave32. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>	2021-09-17 08:05:36 +00:00
Samuel Pitoiset	d29c381c64	ac/rgp, radv: report scratch memory size for shaders Fills the "Scratch Mem" with "Yes/No" in "Pipelines"; this requires instruction timing to be enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12896>	2021-09-17 08:05:36 +00:00
Rhys Perry	40a0935899	ac/gpu_info: add has_accelerated_dot_product Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Timur Kristóf	fe6e4484ab	ac/nir/nggc: Move gs_alloc_req up in NGG culling shaders. This is the first part of a refactor to make vertex compaction optional. Additionally, it may yield a very small benefit to allocate the PC space slightly sooner. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 58239 (45.27% of 128647) affected shaders: CodeSize: 160502348 -> 160502340 (-0.00%) Instrs: 30722664 -> 30722662 (-0.00%) Latency: 137627419 -> 137782218 (+0.11%); split: -0.00%, +0.11% InvThroughput: 21698587 -> 21699068 (+0.00%); split: -0.00%, +0.00% Copies: 3288263 -> 3288261 (-0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	f4a65e5628	ac/nir/nggc: Only repack arguments that are needed. Don't repack everything, only what is actually used. The goal of this commit is primarily to remove unnecessary LDS stores and loads. In addition to that, it also gets rid of a few VALU instructions and reduces VGPR use. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 6951 (5.40% of 128647) affected shaders: VGPRs: 206056 -> 205360 (-0.34%); split: -0.79%, +0.45% CodeSize: 12344568 -> 12269312 (-0.61%); split: -0.62%, +0.01% MaxWaves: 211206 -> 212196 (+0.47%) Instrs: 2319459 -> 2308483 (-0.47%); split: -0.50%, +0.03% Latency: 7220829 -> 7164721 (-0.78%); split: -1.21%, +0.43% InvThroughput: 1051450 -> 1049191 (-0.21%); split: -0.36%, +0.15% VClause: 25794 -> 25445 (-1.35%); split: -1.97%, +0.61% SClause: 39192 -> 39277 (+0.22%); split: -0.21%, +0.43% Copies: 315756 -> 313404 (-0.74%); split: -1.17%, +0.42% Branches: 127878 -> 127879 (+0.00%); split: -0.00%, +0.00% PreVGPRs: 168029 -> 160162 (-4.68%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	02bba6aab5	ac/nir/nggc: Don't stop applying reusable variables at prim export. This was a mistake that prevented reusing variables in shaders with late primitive export. Fossil DB stats on Sienna Cichlid with NGGC on: Totals from 6547 (5.09% of 128647) affected shaders: VGPRs: 323368 -> 323824 (+0.14%); split: -0.03%, +0.18% SpillSGPRs: 45 -> 4865 (+10711.11%) CodeSize: 34208732 -> 33855952 (-1.03%); split: -1.21%, +0.18% MaxWaves: 142538 -> 142456 (-0.06%); split: +0.04%, -0.09% Instrs: 6654252 -> 6606432 (-0.72%); split: -0.89%, +0.17% Latency: 30527770 -> 30452769 (-0.25%); split: -0.42%, +0.18% InvThroughput: 5604540 -> 5609450 (+0.09%); split: -0.04%, +0.13% VClause: 121531 -> 120448 (-0.89%); split: -1.17%, +0.27% SClause: 195388 -> 177902 (-8.95%); split: -9.14%, +0.19% Copies: 617949 -> 636397 (+2.99%); split: -0.44%, +3.42% Branches: 228184 -> 228281 (+0.04%); split: -0.09%, +0.13% PreSGPRs: 271395 -> 343555 (+26.59%); split: -0.01%, +26.60% PreVGPRs: 277650 -> 277710 (+0.02%); split: -0.01%, +0.03% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Timur Kristóf	59de9620b4	ac/nir/ngg: Delete unused struct. This was left there by accident after a rebase mistake. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12246>	2021-09-01 14:45:14 +00:00
Marek Olšák	3362da2c53	ac/gpu_info: fix detection of smart access memory chip_class was 0. Move the code after setting chip_class. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5282 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>	2021-09-01 00:42:58 +00:00
Marek Olšák	46cb3bb4d1	ac/debug: add an option to disable colors for printed IBs Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>	2021-09-01 00:42:58 +00:00
Marek Olšák	0aed2d0cd3	radeonsi: stop using AC_EXP_PARAM_UNDEFINED because it's not useful Just use AC_EXP_PARAM_DEFAULT_VAL_0000 to keep things simple. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>	2021-09-01 00:42:57 +00:00
Timur Kristóf	395c0c52c7	ac: Calculate workgroup sizes of HW stages that operate in workgroups. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12321>	2021-08-26 09:46:18 +00:00
Timur Kristóf	5b7446d74c	radv, ac, aco: Use indices 0-2 of gs_vtx_offset argument array on GFX9+. Previously, indices 0, 2, 4 were used. This worked, but it was somewhat unintuitive. This commit changes it to use indices 0, 1, 2 instead, which makes the code easier to understand. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12511>	2021-08-26 05:20:15 +00:00
Marek Olšák	556c10c02c	ac/surface: allow arbitrary swizzle modes for displayable DCC Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>	2021-08-20 14:28:36 +00:00
Eric Engestrom	f1eae2f8bb	python: drop python2 support Signed-off-by: Eric Engestrom <eric@engestrom.ch> Acked-by: Jose Fonseca <jfonseca@vmware.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3674>	2021-08-14 21:44:32 +00:00
Michel Zou	e4c0a34bfe	radv: fix build with mingw Cc: 21.2 mesa-stable Reviewed-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes #5092 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12178>	2021-08-13 12:13:21 +02:00
Samuel Pitoiset	16793c8efa	ac/surface: implement CmaskAddrFromCoord in NIR on GFX10+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12182>	2021-08-05 06:37:09 +00:00
Samuel Pitoiset	1d67fa4d73	ac/surface: add tests for CmaskAddrFromCoord on GFX10+ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12182>	2021-08-05 06:37:09 +00:00
Timur Kristóf	8918a809ce	ac: Remove deprecated use_late_alloc field as nobody uses it anymore. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11905>	2021-08-04 15:37:05 +00:00
Samuel Pitoiset	a49b397041	ac/surface: implement CmaskAddrFromCoord in NIR It's similar to DCC, only GFX9 is currently supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12140>	2021-08-03 07:02:48 +00:00
Samuel Pitoiset	eedc0b59b7	ac/surface: copy the CMASK equation to radeon_surf Only GFX9 is currently supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12140>	2021-08-03 07:02:48 +00:00
Samuel Pitoiset	1f12c3ccc1	ac/surface: store CMASK pitch and height to radeon_surf Only GFX9+ is currently supported. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12140>	2021-08-03 07:02:48 +00:00
Samuel Pitoiset	132b205566	ac/surface: add tests for CmaskAddrFromCoord prototype outside of addrlib Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12140>	2021-08-03 07:02:48 +00:00
Samuel Pitoiset	501db87779	ac: introduce a structure to store DCC address equations for GFX9 CMASK addr equations will use the same struct. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12140>	2021-08-03 07:02:48 +00:00
Timur Kristóf	a6110f3c3a	ac/nir: Remove unhelpful nir_opt_cse from ac_nir_lower_ngg_nogs. This CSE call adds to our compile time without adding any real benefit to the compiled code. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 1580 (1.23% of 128647) affected shaders: CodeSize: 4563912 -> 4562312 (-0.04%); split: -0.07%, +0.03% Instrs: 870722 -> 870338 (-0.04%); split: -0.09%, +0.04% Latency: 3349863 -> 3351458 (+0.05%); split: -0.10%, +0.14% InvThroughput: 617796 -> 617971 (+0.03%); split: -0.01%, +0.03% VClause: 22604 -> 22568 (-0.16%); split: -0.75%, +0.59% SClause: 16285 -> 16327 (+0.26%); split: -0.07%, +0.33% Copies: 83472 -> 83599 (+0.15%); split: -0.07%, +0.22% PreSGPRs: 62340 -> 62334 (-0.01%) No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	4b540aef2c	ac/nir: Don't count vertices and primitives in wave after culling. These are not needed anymore, because the EXEC mask doesn't depend on them. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 58239 (45.27% of 128647) affected shaders: Latency: 138113669 -> 138285372 (+0.12%) InvThroughput: 22404840 -> 22405245 (+0.00%) No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	a2d02c0c11	ac/nir: Use gs_accepted variable after culling. This prevents us from recalculating the EXEC mask later in the shader, and removes the requirement for counting the number of primitives. The stats are better than expected because they also show that some code that is still there is now DCE'd by ACO. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 58239 (45.27% of 128647) affected shaders: SpillSGPRs: 330 -> 340 (+3.03%) CodeSize: 166356072 -> 162805724 (-2.13%) Instrs: 31920041 -> 31089256 (-2.60%) Latency: 138815742 -> 138113669 (-0.51%); split: -0.54%, +0.03% InvThroughput: 22459553 -> 22404840 (-0.24%); split: -0.26%, +0.02% SClause: 753746 -> 753765 (+0.00%); split: -0.00%, +0.01% Copies: 3226647 -> 3268973 (+1.31%); split: -0.45%, +1.76% Branches: 1223441 -> 1223440 (-0.00%); split: -0.00%, +0.00% PreSGPRs: 2025339 -> 2091013 (+3.24%) No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	8159868699	ac/nir: Use es_accepted variable after culling. This avoids re-calculating the exec mask for ES vertices, and makes it unnecessary to count the number of vertices left. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 58239 (45.27% of 128647) affected shaders: CodeSize: 166521108 -> 166356072 (-0.10%); split: -0.10%, +0.00% Instrs: 31961308 -> 31920041 (-0.13%); split: -0.13%, +0.00% Latency: 138820463 -> 138815742 (-0.00%); split: -0.04%, +0.04% InvThroughput: 22460177 -> 22459553 (-0.00%); split: -0.00%, +0.00% SClause: 753744 -> 753746 (+0.00%) Copies: 3093140 -> 3226647 (+4.32%); split: -0.03%, +4.34% No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	1bbea90f50	aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags. Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags at once of every vertex. This takes fewer VALU instructions than previously. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: CodeSize: 161028288 -> 158751628 (-1.41%) Instrs: 30917985 -> 30519571 (-1.29%) Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01% InvThroughput: 21280238 -> 20927401 (-1.66%) Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00% No Fossil DB changes with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	8341af5109	radv, aco, ac/nir: Tweak position export scheduling for NGG culling. The result is about +5-ish fps in Doom Eternal. It turns out that the location of position exports matters more than we thought, and it's actually better to keep them at the bottom for culling shaders rather than schedule it up to the top. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	0bb543bb60	ac/nir: Reuse uniforms from top part of culling shaders. Uniforms have the same value in all invocations, therefore they can safely be reused by invocations even after repacking. This saves several instructions from culling shaders, mainly UBO loads and such. We exclude uniform floats, because those would harm the VGPR usage of the shaders too much. Fossil DB results on Sienna Cichlid (with NGG culling on): Totals from 55379 (43.05% of 128647) affected shaders: VGPRs: 1926472 -> 1925360 (-0.06%); split: -0.07%, +0.01% SpillSGPRs: 139 -> 330 (+137.41%) CodeSize: 159472988 -> 157462856 (-1.26%); split: -1.27%, +0.00% MaxWaves: 1571492 -> 1571412 (-0.01%) Instrs: 30665685 -> 30302076 (-1.19%); split: -1.21%, +0.02% Latency: 127385148 -> 126723891 (-0.52%); split: -0.55%, +0.03% InvThroughput: 21096298 -> 20773069 (-1.53%); split: -1.53%, +0.00% VClause: 514792 -> 511231 (-0.69%); split: -0.83%, +0.13% SClause: 713959 -> 679556 (-4.82%); split: -4.84%, +0.02% Copies: 2975106 -> 2828185 (-4.94%); split: -5.39%, +0.45% Branches: 1201921 -> 1152766 (-4.09%) PreSGPRs: 1753786 -> 1892848 (+7.93%); split: -0.00%, +7.93% PreVGPRs: 1590522 -> 1583574 (-0.44%); split: -0.44%, +0.00% Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	fc1fabbabf	ac/nir: Analyze culling shaders to remember which inputs are used when. These will be useful for some optimizations. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	faf766b864	ac/nir: Reuse the repacked output positions of culling shaders. The position outputs are stored into LDS and reloaded after repacking, therefore the repacked position values can be reused in the bottom part of the shader. Fossil DB results on Sienna Cichlid (with NGG culling on): Totals from 9016 (7.01% of 128647) affected shaders: VGPRs: 372472 -> 347560 (-6.69%); split: -6.82%, +0.13% SpillSGPRs: 437 -> 87 (-80.09%) CodeSize: 32359340 -> 30441692 (-5.93%); split: -5.93%, +0.00% MaxWaves: 222030 -> 238970 (+7.63%); split: +7.83%, -0.20% Instrs: 6207833 -> 5834149 (-6.02%); split: -6.02%, +0.00% Latency: 27626263 -> 27890632 (+0.96%); split: -5.34%, +6.29% InvThroughput: 4792958 -> 4361336 (-9.01%); split: -9.01%, +0.00% VClause: 144385 -> 139586 (-3.32%); split: -9.29%, +5.97% SClause: 141350 -> 129875 (-8.12%); split: -8.57%, +0.45% Copies: 580017 -> 568916 (-1.91%); split: -3.60%, +1.68% Branches: 209067 -> 209154 (+0.04%); split: -0.24%, +0.28% PreSGPRs: 281320 -> 277814 (-1.25%) PreVGPRs: 290040 -> 273861 (-5.58%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	e97f0463a8	ac/nir: Implement NGG deferred attribute culling in NIR. Culling is traditionally done by the rasterizer, but that can be a bottleneck when an app creates a large number of primitives. E.g. a lot of tiny triangles reduce the rasterization efficiency. NGG makes it possible for the shader to check primitives and delete those that it can prove are not needed. After this is done, we have to repack the surviving invocations so they remain compact. This also saves bandwidth, because some memory loads are only executed by those vertices that survived the culling. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	556a690bac	ac/nir: Use a ballot that matches the wave size during NGG lowering. This generates slightly more efficient code in Wave32 mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Timur Kristóf	651a3da1b5	ac/nir: Add a NIR port of ac_llvm_cull. The algorithms were originally implemented by Marek Olšák, hence the copyright to AMD. This commit just ports the LLVM based implementation to NIR, using the new intrinsics added earlier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Samuel Pitoiset	29f264f258	ac,radv: implement the cs_regalloc_hang HW bug workaround Might fix spurious failures on GFX6 and some GFX7 chips. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11675>	2021-07-09 13:37:37 +00:00
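
The two prefix-sum alternatives described in commit 75dbb40439 above can be illustrated on the CPU. This is only a rough sketch, not the code from that commit. It assumes the per-wave survivor counts are packed one byte per wave into a 32-bit value, which the commit message does not state. The function names (`udot_4x8`, `sad_u8x4`, `prefix_sum_dot`, `prefix_sum_sad`) are hypothetical software stand-ins for what v_dot and v_sad_u8 compute.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4x8-bit unsigned dot product (v_dot style). */
static uint32_t udot_4x8(uint32_t a, uint32_t b)
{
   uint32_t sum = 0;
   for (int i = 0; i < 4; i++)
      sum += ((a >> (8 * i)) & 0xff) * ((b >> (8 * i)) & 0xff);
   return sum;
}

/* Hypothetical stand-in for v_sad_u8: sum of absolute byte differences. */
static uint32_t sad_u8x4(uint32_t a, uint32_t b)
{
   uint32_t sum = 0;
   for (int i = 0; i < 4; i++) {
      int32_t x = (int32_t)((a >> (8 * i)) & 0xff);
      int32_t y = (int32_t)((b >> (8 * i)) & 0xff);
      sum += (uint32_t)(x > y ? x - y : y - x);
   }
   return sum;
}

/* Alternative 1: shift 0x01 into the wanted byte positions, then use a
 * dot product to sum the selected bytes.
 * Assumption: byte i of packed_counts holds the count of wave i. */
static uint32_t prefix_sum_dot(uint32_t packed_counts, unsigned wave_id)
{
   assert(wave_id < 4);
   /* 64-bit shift so that wave_id == 0 (shift by 32) is well defined. */
   uint32_t ones = (uint32_t)(0x01010101ull >> (8 * (4 - wave_id)));
   return udot_4x8(packed_counts, ones);
}

/* Alternative 2: shift out the unwanted bytes, then sum the remaining
 * bytes with a SAD against zero. */
static uint32_t prefix_sum_sad(uint32_t packed_counts, unsigned wave_id)
{
   assert(wave_id < 4);
   /* Bytes at index >= wave_id fall off the top after truncation. */
   uint32_t kept = (uint32_t)((uint64_t)packed_counts << (8 * (4 - wave_id)));
   return sad_u8x4(kept, 0);
}
```

With counts 1, 2, 3, 4 packed as `0x04030201`, both functions return the exclusive prefix sums 0, 1, 3 and 6 for waves 0 to 3. Neither uses a byte permute, which is the point of the change.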


1732 commits