fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 13:48:06 +02:00

Author	SHA1	Message	Date
Rhys Perry	4a909068ad	aco: look at p_{extract,split}_vector's definitions in pred_by_exec_mask() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>	2020-03-30 17:34:46 +00:00
Jason Ekstrand	16a80ff18a	aco: Implement b2b32 and b2b1 The implementations here just clone i2b32 and i2b1. This means that b2b32 doesn't technically generate true NIR 0/-1 booleans but it should be fine as it's only ever generated for shared variable writes which will always be consumed by something which will then run it through an i2b again. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>	2020-03-30 15:46:19 +00:00
Timur Kristóf	0f847b18bc	aco: Don't store LS VS outputs to LDS when TCS doesn't need them. Totals: Code Size: 254764624 -> 254745104 (-0.01 %) bytes Totals from affected shaders: VGPRS: 12132 -> 12112 (-0.16 %) Code Size: 573364 -> 553844 (-3.40 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	798dd98d6e	aco: When LS and HS invocations are the same, pass LS outputs in temps. We know that in this case, the LS and HS invocations are working on the exact same vertex, so it's safe to skip the LDS. Totals: VGPRS: 3960744 -> 3961844 (0.03 %) Code Size: 254824300 -> 254764624 (-0.02 %) bytes Max Waves: 1053748 -> 1053574 (-0.02 %) Totals from affected shaders: VGPRS: 26152 -> 27252 (4.21 %) Code Size: 1496600 -> 1436924 (-3.99 %) bytes Max Waves: 4860 -> 4686 (-3.58 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0a91c086b8	aco: Extract store_output_to_temps into a separate function. Will be used by LS output stores. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0f35b3795d	aco: Fix workgroup size calculation. Clear the workgroup size for all supported shader stages. Also, unify the workgroup size calculation across various places. As a result, insert_waitcnt can use the proper workgroup size which means that some waits can be dropped from tessellation shaders. Also, in cases where the previous calculation was wrong, we now insert s_barrier instructions. Totals from affected shaders (GFX10): Code Size: 340116 -> 338484 (-0.48 %) bytes Fixes: `a8d15ab6da` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	99ad62ff27	aco: Extract setup_tcs_info to a separate function. Will be required by the workgroup size calculation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	0ad65f2c55	aco: Zero-fill undefined elements in create_vec_from_array. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	50634ad4a0	aco: Change isel inputs/outputs to a flat array. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e4a1b246a4	aco: Treat outputs of the previous stage as inputs of the next stage. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e7d733fdab	aco: Use more optimal sequence at the beginning of merged shaders. It can be further optimized in the future, but the new sequence already has a few advantages: * Uses fewer instructions * Uses even fewer instructions in wave32 mode * Doesn't use the VALU at all Totals from affected shaders (GFX10): VGPRS: 43504 -> 43496 (-0.02 %) Code Size: 2436000 -> 2423688 (-0.51 %) bytes Max Waves: 8704 -> 8705 (0.01 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	17c779ab9e	aco: Skip 2nd read of merged wave info when TCS in/out vertices are equal. When TCS has an equal number of input and output vertices, it means that the number of VS and TCS invocations (LS and HS) are the same; and that the HS invocations operate on the same vertices as the LS. When this is the case, this commit removes the else-if between the merged VS and TCS halves, making it possible to schedule and optimize the code across the two halves. Totals: SGPRS: 5577367 -> 5581735 (0.08 %) VGPRS: 3958592 -> 3960752 (0.05 %) Code Size: 254867144 -> 254838244 (-0.01 %) bytes Max Waves: 1053887 -> 1053747 (-0.01 %) Totals from affected shaders: SGPRS: 29032 -> 33400 (15.05 %) VGPRS: 35664 -> 37824 (6.06 %) Code Size: 1979028 -> 1950128 (-1.46 %) bytes Max Waves: 7310 -> 7170 (-1.92 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	4ec48440a0	aco: Allow combining LDS loads when loading tess factors. Previously the tess factors were loaded individually, but now they can be loaded using a single LDS load instruction. Note that the inner and outer tess factors are not yet combined. Totals (GFX10): Code Size: 254896008 -> 254879212 (-0.01 %) bytes Totals from affected shaders (GFX10): Code Size: 2028352 -> 2011556 (-0.83 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	ace3833293	aco: Allow combining TCS output VMEM stores. Some copypasta may have stuck in the code. This was left on false by mistake. Totals (GFX10): Code Size: 254939248 -> 254896008 (-0.02 %) bytes Totals from affected shaders (GFX10): VGPRS: 16196 -> 16212 (0.10 %) Code Size: 1126332 -> 1083092 (-3.84 %) bytes Max Waves: 2336 -> 2334 (-0.09 %) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e2b1d749b1	aco: Fix handling of tess factors. There is no need to check whether they are written using indirect indices, because all tess factors should be written to VMEM only at the end of the shader. No pipeline db changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	d3f6adcaed	aco: Extract tcs_driver_location_matches_api_mask to separate function. Also clear up should_write_tcs_output_to_lds a little bit. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Timur Kristóf	e0dff5fd86	aco: Create null exports in instruction selection instead of assembler. This allows the passes after isel to assume that the exports are always correct, and also allows to schedule these null exports later. Additionally, it ensures that the correct exec mask is used for these exports. Totals from affected shaders (GFX10): SGPRS: 84224 -> 84344 (0.14 %) VGPRS: 23088 -> 23076 (-0.05 %) Code Size: 882892 -> 894368 (1.30 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>	2020-03-30 13:09:08 +00:00
Eric Engestrom	79af30768d	meson: inline `inc_common` Let's make it clear what includes are being added everywhere, so that they can be cleaned up. Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>	2020-03-28 21:36:54 +01:00
Marek Olšák	013b65635f	radv: stop including files from mesa/main Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4324>	2020-03-27 21:00:10 +00:00
Samuel Pitoiset	ba2ec1f369	ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv() Instead of emitting 1.0 / x which includes a slow division that LLVM doesn't always optimize even if the metadata is correctly set. No pipeline-db changes with VEGA10/LLVM 9. pipeline-db (VEGA10/LLVM 10): Totals from affected shaders: SGPRS: 6672 -> 6672 (0.00 %) VGPRS: 6652 -> 6652 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 561780 -> 561692 (-0.02 %) bytes Max Waves: 1043 -> 1043 (0.00 %) pipeline-db (VEGA10/LLVM 11 - 92744f62478): Totals from affected shaders: SGPRS: 84608 -> 83768 (-0.99 %) VGPRS: 106768 -> 106636 (-0.12 %) Spilled SGPRs: 1625 -> 1713 (5.42 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 10850936 -> 10726712 (-1.14 %) bytes Max Waves: 3152 -> 3180 (0.89 %) LLVM 11 (master) is more affected than previous versions, but based on the small impact with LLVM 9/10, I decided to emit it unconditionally. Cc: 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>	2020-03-27 08:05:43 +01:00
Samuel Pitoiset	d548384fc6	ac/nir: use llvm.amdgcn.rsq for nir_op_frsq Instead of emitting 1.0 / sqrt(x) which includes a slow division that LLVM doesn't always optimize even if the metadata is correctly set. pipeline-db (VEGA10/LLVM 9): Totals from affected shaders: SGPRS: 16872 -> 16864 (-0.05 %) VGPRS: 15320 -> 15464 (0.94 %) Spilled SGPRs: 2021 -> 2133 (5.54 %) Code Size: 1915464 -> 1917476 (0.11 %) bytes Max Waves: 641 -> 639 (-0.31 %) pipeline-db (VEGA10/LLVM 10): Totals from affected shaders: SGPRS: 43936 -> 44120 (0.42 %) VGPRS: 41776 -> 41972 (0.47 %) Spilled SGPRs: 875 -> 875 (0.00 %) Code Size: 4468164 -> 4468120 (-0.00 %) bytes Max Waves: 2412 -> 2414 (0.08 %) pipeline-db (VEGA10/LLVM 11 - 92744f62478): Totals from affected shaders: SGPRS: 60096 -> 60096 (0.00 %) VGPRS: 63552 -> 63648 (0.15 %) Spilled SGPRs: 6135 -> 6117 (-0.29 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 6252996 -> 6249772 (-0.05 %) bytes Max Waves: 2324 -> 2337 (0.56 %) LLVM 11 (master) is more affected than previous versions, but based on the small impact with LLVM 9/10, I decided to emit it unconditionally. Cc: 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>	2020-03-27 07:45:47 +01:00
Samuel Pitoiset	66426ce119	ac/nir: use llvm.amdgcn.rcp for nir_op_frcp Instead of emitting 1.0 / x which includes a slow division that LLVM doesn't always optimize even if the metadata is correctly set. pipeline-db (VEGA10/LLVM 9): Totals from affected shaders: SGPRS: 50384 -> 50312 (-0.14 %) VGPRS: 42572 -> 42696 (0.29 %) Spilled SGPRs: 1372 -> 1372 (0.00 %) Code Size: 5692040 -> 5691428 (-0.01 %) bytes Max Waves: 3954 -> 3951 (-0.08 %) pipeline-db (VEGA10/LLVM 10): Totals from affected shaders: SGPRS: 78512 -> 78464 (-0.06 %) VGPRS: 62408 -> 62484 (0.12 %) Spilled SGPRs: 1502 -> 1502 (0.00 %) Code Size: 8106188 -> 8103372 (-0.03 %) bytes Max Waves: 7759 -> 7753 (-0.08 %) pipeline-db (VEGA10/LLVM 11 - 92744f62478): Totals from affected shaders: SGPRS: 112760 -> 113232 (0.42 %) VGPRS: 111132 -> 110568 (-0.51 %) Spilled SGPRs: 5870 -> 5940 (1.19 %) Spilled VGPRs: 650 -> 652 (0.31 %) Code Size: 11887232 -> 11561744 (-2.74 %) bytes Max Waves: 8964 -> 9015 (0.57 %) LLVM 11 (master) is more affected than previous versions, but based on the small impact with LLVM 9/10, I decided to emit it unconditionally. Cc: 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>	2020-03-27 07:45:43 +01:00
Pierre-Eric Pelloux-Prayer	5533c41541	ac: fix ac_build_is_helper_invocation when postponed_kill is null If there was no demote() in the shader, ac_build_is_helper_invocation behaves exactly the same as ac_build_load_helper_invocation, i.e. the helper lanes are the same as they were at the beginning of the shader. Fixes: `de57ea2a3d` ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>	2020-03-25 08:19:38 +01:00
Samuel Pitoiset	238e2ed210	radv: enable VK_KHR_8bit_storage on GFX6-GFX7 Enabling a Vulkan extension doesn't mean that all features need to be implemented. DOOM Eternal crashes at launch if that ext is not supported but it doesn't matter if the features are enabled or not. Let's enable it like we did for VK_KHR_16bit_storage. Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>	2020-03-24 16:34:21 +00:00
Rhys Perry	43918c9a7f	aco: implement 64-bit VGPR constant copies in handle_operands() 64-bit VGPR constant copies can happen because of 64-bit constant copy propagation. Since this optimization is beneficial and more annoying to deal with in the optimizer, I've implemented 64-bit VGPR constant copies in handle_operands(). This also sets copy_operation::size correctly for 64-bit constant copies. Cc: 20.0 <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>	2020-03-24 11:28:55 +00:00
Rhys Perry	21ba2bc595	aco: remove dead code in handle_operands() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>	2020-03-24 11:28:55 +00:00
Rhys Perry	17c7f4e30e	aco: fix boolean undef regclass Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>	2020-03-23 19:43:09 +00:00
Rhys Perry	9d56ed199b	aco: emit IR in IF's merge block instead if the other side ends in a jump Fixes NIR such as: if (divergent) { a = sgpr() } else { break; } use(a) Previously we would have emitted: if (divergent) { a = sgpr() } if (!divergent) { break; } use(a) But "a" isn't available at its use. Now we emit: if (divergent) { } if (!divergent) { break; } a = sgpr() use(a) pipeline-db (Navi): Totals from affected shaders: SGPRS: 1936 -> 1936 (0.00 %) VGPRS: 1264 -> 1264 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 159408 -> 159152 (-0.16 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 81 -> 81 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2557 Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Rhys Perry	8d8c864beb	aco: improve check for unreachable loop continue blocks The old code would have previously caught: loop { ... break } when it was meant to just catch: loop { if (...) break else break } Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Rhys Perry	46e94fd854	aco: skip NIR in unreachable merge blocks NIR removes most of this but undef instructions for loop header phis can remain. These were harmless because ACO would DCE them itself. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Rhys Perry	638cbc21a1	aco: handle when ACO adds new continue edges Usually a loop ends with a uniform continue. If it doesn't and we end up adding our own continue edges (because of continue_or_break or divergent breaks at the end), we have to add extra operands to the loop header phis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Rhys Perry	f2c4878de9	aco: handle missing second predecessors at merge block phis Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Rhys Perry	f1a2e1df78	aco: set has_divergent_branch for discards in loops Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> CC: <mesa-stable@lists.freedesktop.org> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>	2020-03-23 15:55:12 +00:00
Samuel Pitoiset	7ac8bb33cd	radv/llvm: fix subgroup shuffle for chips without bpermute bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead we have to use readlane with a waterfall loop to defeat the LLVM backend. This fixes DOOM Eternal which requires subgroup shuffle. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>	2020-03-23 14:19:03 +00:00
Samuel Pitoiset	de550805c5	radv/winsys: spoof some values for num_render_backends in the null winsys To avoid crashes when RADV_FORCE_FAMILY is set to GFX9+ because num_render_backends is used to compute binning state. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>	2020-03-23 09:50:53 +01:00
Samuel Pitoiset	b911af06cd	radv/winsys: fix wrong PCI ID for Vega10 in the null winsys Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>	2020-03-23 09:50:51 +01:00
Marek Olšák	303842b2db	ac: fix fast division This stopped working with LLVM 11 and might occasionally have been broken on older LLVM, because the metadata was set on the mul, not on the rcp. Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>	2020-03-21 22:34:17 +00:00
Rhys Perry	500842399a	radv/winsys: set has_syncobj_wait_for_submit in the null winsys Needed for Vulkan 1.1+ Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249>	2020-03-20 09:51:06 +00:00
Samuel Pitoiset	2d3223ca90	radv: fix optional pSizes parameter when binding streamout buffers The Vulkan spec 1.2.135 says: "pSizes is an optional array of buffer sizes, specifying the maximum number of bytes to capture to the corresponding transform feedback buffer. If pSizes is NULL, or the value of the pSizes array element is VK_WHOLE_SIZE, then the maximum bytes captured will be the size of the corresponding buffer minus the buffer offset." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2650 Fixes: `b4eb029062` ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>	2020-03-20 09:25:14 +01:00
John Stultz	511c6408f4	Android.mk: Tweak MESA_ENABLE_LLVM checks Change the MESA_ENABLE_LLVM checks in Android.mk files in order to get mesa3d to build w/ AOSP using mmma. This tries to re-create a change that was introduced in the following merge in the AOSP branch: 69f2c0128d2b Merge branch 'aosp/upstream-18.0' Acked-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4175>	2020-03-19 21:20:08 +00:00
Rhys Perry	cf62c2b2ac	radv: call nir_shader_gather_info again pipeline-db (Navi, ACO): Totals from affected shaders: SGPRS: 11840 -> 11840 (0.00 %) VGPRS: 19012 -> 19124 (0.59 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 3696 -> 3696 (0.00 %) dwords per thread Code Size: 998680 -> 921388 (-7.74 %) bytes LDS: 19646 -> 19646 (0.00 %) blocks Max Waves: 3398 -> 3401 (0.09 %) pipeline-db (Navi, LLVM): Totals from affected shaders: SGPRS: 17016 -> 17128 (0.66 %) VGPRS: 19564 -> 14876 (-23.96 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 3872 -> 3872 (0.00 %) dwords per thread Code Size: 820416 -> 743576 (-9.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3367 -> 3534 (4.96 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>	2020-03-19 15:37:07 +00:00
Samuel Pitoiset	56de6f698e	radv: remove wrong assert that checks compute subgroup size Ooops. For some reason, I have been confused with Wave32 on GFX10, but it's still possible to require a specific subgroup size if only Wave64 is supported. Fixes: `672d106199` ("radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227>	2020-03-18 21:31:47 +00:00
Samuel Pitoiset	94e37859a9	radv: fix random depth range unrestricted failures due to a cache issue The shader module name is used to compute the pipeline key. The driver used to load the wrong pipelines because the shader names were similar. This should fix random failures of dEQP-VK.pipeline.depth_range_unrestricted.* Fixes: `f11ea22666` ("radv: fix a performance regression with graphics depth/stencil clears") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>	2020-03-18 11:36:24 +00:00
Bas Nieuwenhuizen	8e4e2cedcf	amd/llvm: Fix divergent descriptor regressions with radeonsi. piglit/bin/arb_bindless_texture-limit -auto -fbo: Needed to deal with non-NULL dynamic_index without deref in tex instructions. piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto: Need to deal with non-deref images in enter_waterfall_imae. Fixes: `b83c9aca4a` "amd/llvm: Fix divergent descriptor indexing. (v3)" Acked-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>	2020-03-17 22:53:16 +01:00
Marek Olšák	8dc5e174c7	ac: don't set old denormals flags with LLVM >= 11 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>	2020-03-17 20:47:48 +00:00
Marek Olšák	63a5051ea6	ac: set new LLVM denormal flags See: https://reviews.llvm.org/D71358 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>	2020-03-17 20:47:48 +00:00
Marek Olšák	56cc10bd27	ac: unify denorm setting enforcement Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>	2020-03-17 20:47:48 +00:00
Samuel Pitoiset	c923de68dd	radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control If compute shaders require a specific subgroup size (i.e. Wave32), we have to use the correct ballot size. Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize. Fixes: `fb07fd4e6c` ("radv: implement VK_EXT_subgroup_size_control") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>	2020-03-17 12:45:01 +00:00
Samuel Pitoiset	672d106199	radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control If compute shaders require a specific subgroup size (i.e. Wave32), we have to return the correct one. Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*. Fixes: `fb07fd4e6c` ("radv: implement VK_EXT_subgroup_size_control") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>	2020-03-17 12:45:01 +00:00
Samuel Pitoiset	46e8ba1344	radv: only inject implicit subpass dependencies if necessary The Vulkan 1.2.134 spec update clarified when implicit subpass dependencies should be injected by the driver. They only make sense if automatic layout transitions are performed. This should fix a performance regression with RPCS3 (although they added a workaround for RADV since the regression has been found). Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502 Fixes: `e60de08547` ("radv: handle missing implicit subpass dependencies") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>	2020-03-17 13:24:36 +01:00
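
Note on commit `2d3223ca90`: the spec text it quotes (a NULL pSizes, or an element equal to VK_WHOLE_SIZE, means "buffer size minus buffer offset") can be illustrated with a minimal sketch. This is not RADV's actual code. The helper name `sketch_xfb_capture_size`, its parameters, and the `SKETCH_WHOLE_SIZE` macro are hypothetical stand-ins introduced here, and the sketch assumes the offset never exceeds the buffer size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for VK_WHOLE_SIZE, which Vulkan defines as (~0ULL).
 * Declared here so the sketch builds without the Vulkan headers. */
#define SKETCH_WHOLE_SIZE (~0ULL)

/* Hypothetical helper (not a Mesa function): returns the maximum number
 * of bytes captured for one transform feedback buffer binding.
 *
 * sizes       - optional array of sizes, may be NULL (like pSizes)
 * index       - which binding to look up in sizes
 * buffer_size - total size of the bound buffer in bytes
 * offset      - byte offset of the binding into the buffer
 */
static uint64_t
sketch_xfb_capture_size(const uint64_t *sizes, uint32_t index,
                        uint64_t buffer_size, uint64_t offset)
{
   /* Assumption of this sketch: the offset lies inside the buffer. */
   assert(offset <= buffer_size);

   /* No array at all, or an explicit "whole size" element: capture up
    * to the end of the buffer, starting at the offset. */
   if (sizes == NULL || sizes[index] == SKETCH_WHOLE_SIZE)
      return buffer_size - offset;

   /* Otherwise the caller-provided size is used as-is. */
   return sizes[index];
}
```

The point of the sketch is that the NULL check has to come before any dereference of the array, which is the case the commit title describes as the "optional pSizes parameter".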


4843 commits