fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 00:48:07 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	d887eb141b	aco: propagate SGPRs into VOP1 instructions early. This helps DCE. We should reconsider our optimization order or maybe do the dead code analysis twice Totals from 106 (0.08% of 136546) affected shaders (RAVEN): SGPRs: 7184 -> 7152 (-0.45%) CodeSize: 736912 -> 736052 (-0.12%) Instrs: 145739 -> 145509 (-0.16%) Cycles: 2085344 -> 2084268 (-0.05%) VMEM: 14819 -> 14807 (-0.08%) SMEM: 7109 -> 7100 (-0.13%); split: +0.04%, -0.17% SClause: 5383 -> 5385 (+0.04%) Copies: 13290 -> 13189 (-0.76%) PreSGPRs: 5265 -> 5221 (-0.84%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Samuel Pitoiset	20d73a9049	aco: adjust an assertion about the wavesize in emit_gfx10_wave64_bpermute() This gets rid of one more use of radv_shader_info. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	112e66fa09	aco: compute the CS workgroup size from the shader NIR info cs.block_size is copied from cs.local_size during the shader info pass. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	e3e8d13ada	radv: move compiler statistics to ACO They are really specific to ACO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	97afb2a0a9	aco: remove unused radv_shader.h includes Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	408195ec53	aco: remove useless occurrences of radv_nir_compiler_options Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	8a6f60fc6b	aco: remove stub lower_wqm() prototype Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>	2020-10-14 15:09:34 +00:00
Samuel Pitoiset	48b988e35f	radv: fix ignoring the vertex attribute stride if set as dynamic The vertex attribute stride should be ignored, so make sure it's initialized to zero if dynamic to avoid computing a wrong offset. The fact that each element of pStrides must be greater than or equal to the maximum extent of all vertex input attributes fetched saves us one user SGPR for the dynamic stride. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3627 Cc: 20.2 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7101>	2020-10-14 12:29:39 +00:00
James Park	28d02b9d3e	ac,amd/llvm,radv: Initialize structs with {0} Necessary to compile with MSVC. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7123>	2020-10-14 12:15:23 +00:00
Samuel Pitoiset	b84d1a0c42	radv/aco: disable NGG GS support because it randomly hangs the GPU Disable ACO NGG GS until the random GPU hangs are fixed (one CTS run == one GPU hang here). No hangs so far after 5 full CTS runs with this disabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7108>	2020-10-14 13:52:42 +02:00
James Park	7758664788	radv: Only close local_fd when valid Necessary when drm_device is bypassed. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>	2020-10-13 22:56:31 +00:00
James Park	1026e2ac0f	radv: Increased const usage Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>	2020-10-13 22:56:31 +00:00
James Park	1b551857f9	amd/addrlib: Fix warning list for msvc Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7119>	2020-10-13 22:56:31 +00:00
Rhys Perry	c122315702	aco: fix get_ssbo_size with a vgpr resource The result of load_vulkan_descriptor is passed directly to get_ssbo_size. This caused convert_pointer_to_64_bit() to skip creating a v_readfirstlane_b32 if it was necessary. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `05b6612b4e` ('radv: do not lower UBO/SSBO access to offsets') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3628 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7095>	2020-10-13 14:20:28 +00:00
Rhys Perry	bcf7a70008	aco: use nir_opt_uniform_atomics Significantly improves performance of a Control compute shader. Also seems to increase FPS at the very start of the game by ~9% (RX 580, 1080p, medium settings, no MSAA). fossil-db (Navi): Totals from 315 (0.23% of 135946) affected shaders: SGPRs: 18296 -> 18336 (+0.22%); split: -0.26%, +0.48% VGPRs: 11856 -> 11844 (-0.10%); split: -0.81%, +0.71% CodeSize: 2233800 -> 2457508 (+10.01%) MaxWaves: 4506 -> 4497 (-0.20%); split: +0.04%, -0.24% Instrs: 438766 -> 486215 (+10.81%); split: -0.00%, +10.81% Cycles: 7880180 -> 8963340 (+13.75%); split: -0.00%, +13.75% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	e1120f274f	nir: move divergence analysis options to nir_shader_compiler_options Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	bb5c0ba0d2	aco: implement last_invocation Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	8850a63161	radv/aco,nir/lower_subgroups: don't lower elect ACO can implement this better. fossil-db (Navi): Totals from 33 (0.02% of 135946) affected shaders: SGPRs: 1736 -> 1744 (+0.46%) VGPRs: 1680 -> 1656 (-1.43%) CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04% MaxWaves: 449 -> 461 (+2.67%) Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05% Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Rhys Perry	36da9c4aa2	aco: implement elect Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Rhys Perry	bf77f539ee	aco: optimize more uniform reductions/scans Uniform atomic optimization will create these. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Samuel Pitoiset	b9ca4923d6	aco: implement missing nir_op_unpack_half_2x16_split_{x,y}_flush_to_zero SPIRV->NIR emits nir_op_unpack_half_2x16_flush_to_zero instead of nir_op_unpack_half_2x16 if the shader enables denorm flush to zero for 16-bit floating point. This doesn't fix anything known and CTS doesn't have tests. Fixes: `56d9bcdded` ("radv: enable more float_controls features") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6939>	2020-10-13 08:35:22 +02:00
Eric Engestrom	c02e933de4	radv: add missing u_atomic.h include Fixes: `7568c97df1` ("radv: Use atomics to read query results.") Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7050>	2020-10-12 23:24:18 +02:00
Bas Nieuwenhuizen	1fb3e1fb70	radv: Fix mipmap extent adjustment on GFX9+. With arrays we really have to use the correct size for the base mipmap to get the right array pitch. In particular, using surf_pitch results in pitch that is bigger than the base mipmap and hence results in wrong pitches computed by the HW. It seems that on GFX9 this has mostly been hidden by the epitch provided in the descriptor but this is not something we do on GFX10 anymore. Now this has some draw-backs: 1. normalized coordinates don't work 2. Bounds checking uses slightly bigger bounds. 2 mostly is not an issue as we still ensure that they're within the texture memory and not overlapping other layers/mips, but we can't properly ignore writes. 1 is kinda dead in the water ... On the other hand I'd argue that using normalized coords & a filter for sampling a block view of a compressed format is extraordinarily useless. The old method we employed already had these drawbacks for everything except the base miplevel of the imageview. AFAICT this is the same tradeoff AMDVLK makes and no CTS test hits this. (once it does I think the HW is dead in the water ... Only workaround I can think of is shader processing which is hard because we don't know texture formats at compile time.) I also removed the extra calculations when the image has only 1 mip level because they ended up being a no-op in that case. CC: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2292 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2266 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2483 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2906 Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3607 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7090>	2020-10-12 21:00:00 +00:00
Samuel Pitoiset	66d7bb0f23	radv: fix adjusting vertex alpha AC_FETCH_FORMAT_NONE is not zero... Oops. Fixes: `b0829c6af7` ("radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7103>	2020-10-12 19:48:35 +02:00
Samuel Pitoiset	b32a8f83dc	radv: move lower_io_arrays_to_elements before lower_io_to_scalar_early nir_lower_io_arrays_to_elements lowers arrays or matrices to elements, which ends up to vectors for matrices, but a bunch of IO optimizations only work for scalars. Calling it before lower_io_to_scalar_early allows nir_link_opt_varyings to remove duplicated inputs and replace constant inputs. fossils-db (Navi10): Totals from 294 (0.22% of 136546) affected shaders: CodeSize: 861356 -> 860224 (-0.13%); split: -0.13%, +0.00% Instrs: 161972 -> 161832 (-0.09%); split: -0.09%, +0.00% Cycles: 1185680 -> 1185120 (-0.05%); split: -0.05%, +0.00% SMEM: 31422 -> 31424 (+0.01%) Copies: 9065 -> 9068 (+0.03%) Only Talos and Dark Souls 3 are affected. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7041>	2020-10-12 15:01:05 +00:00
Samuel Pitoiset	b0829c6af7	radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065>	2020-10-12 13:13:40 +00:00
Samuel Pitoiset	5000c344cc	ac/llvm: move AC_FETCH_FORMAT to non-LLVM code While we are at it, give it a name. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065>	2020-10-12 13:13:40 +00:00
Rhys Perry	cf3b638f47	radv: remove RDR2 discard workaround The game appears to use HLSL, so this workaround now lives in SPIR-V -> NIR. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7062>	2020-10-12 11:07:39 +00:00
Bas Nieuwenhuizen	875ff8414f	radv/winsys: Expand scope of allbos lock. With us not creating a bo_list anymore, there is a problem if we delete a buffer between enumerating all buffers and doing the submission. Also changes this to a rwlock given the wider scope of the things under lock. (especially some of the syncobj stuff is now under the lock) Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7091>	2020-10-12 10:55:08 +00:00
Bas Nieuwenhuizen	ea778693bf	radv: Fix event write cmdbuffer allocation when tracing. The trace emit is another 7 words. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7091>	2020-10-12 10:55:08 +00:00
Samuel Pitoiset	98f538dfca	radv: remove one leftover TODO in the shader info pass Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	cec12d4f98	radv/llvm: reduce LDS size for tess by using NIR IO assigned locations To match ACO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	47e26bf334	radv/llvm: reduce the ESGS itemsize by using NIR IO assigned locations There are no longer gaps in the ESGS ring. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:26 +02:00
Samuel Pitoiset	569b894835	radv/llvm: switch to NIR IO assigned locations To match ACO. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:25 +02:00
Samuel Pitoiset	6387341cce	ac/nir: pass the variable location to store_tcs_outputs It's actually simpler for the backend to know the variable location. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7022>	2020-10-12 09:23:25 +02:00
Samuel Pitoiset	8f8ee5b95b	ac,radv,radeonsi: stop multiplying driver_location by 4 It's no longer needed to do that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7010>	2020-10-12 08:34:02 +02:00
Samuel Pitoiset	0a90dab6b4	radv/llvm: stop assigning driver_location in NIR->LLVM It's already assigned just after NIR linking shaders. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7010>	2020-10-12 08:33:57 +02:00
Marek Olšák	a4e4644eff	ac/surface: fix valgrind warnings in DCC retile tile lookups ==12920== Conditional jump or move depends on uninitialised value(s) ==12920== at 0x8F39391: util_fast_urem32 (fast_urem_by_const.h:71) ==12920== by 0x8F39391: hash_table_search (hash_table.c:285) ==12920== by 0x8B06D5D: ac_compute_dcc_retile_tile_indices (ac_surface.c:136) Fixes: `a37aeb128d` "amd/common: Cache intra-tile addresses for retile map." Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7055>	2020-10-09 23:13:40 +00:00
Rhys Perry	8e981453ed	radv: use radv_optimize_nir() less in radv_link_shaders() fossil-db (Navi): Totals from 11 (0.01% of 137413) affected shaders: CodeSize: 99372 -> 99480 (+0.11%) Instrs: 19119 -> 19110 (-0.05%) Cycles: 222144 -> 222000 (-0.06%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6891>	2020-10-09 15:48:00 +00:00
Rhys Perry	55254f241f	radv: move optimizations in shader_compile_to_nir() to after io_to_scalar This results in at least one less radv_optimize_nir() iteration. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6891>	2020-10-09 15:47:59 +00:00
Bas Nieuwenhuizen	da132d802b	radv: Set fce metadata correctly on DCC initialization. The fce metadata can always be set to false as we don't care about the compressed clear color. Avoiding useless fast clear eliminates improves basemark performance by 1%-1.5%. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7005>	2020-10-09 13:46:49 +00:00
Timur Kristóf	5ae3656890	aco/ngg: Calculate workgroup size of NGG shaders. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	61280bb4b6	aco/ngg: Allocate NGG GS space early for const vertex/primitive counts. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	e8a0409d01	aco/ngg: Use more efficient LDS layout to help reduce bank conflicts. The LLVM backend has a trick which helps reduce LDS bank conflicts by swizzling the LDS address where each vertex is emitted. This commit implements the same thing for ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	9bf92d4357	radv/aco: Enable NGG GS by default. ACO NGG GS now supports everything we need except streamout (aka. transform feedback), but we don't use NGG anyway when streamout is needed. Also add a note to the new features txt. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	dd73719856	aco/ngg: Add shader query support to NGG GS. In each GS thread, we calculate the number of "real" primitives that were emitted (points, lines, triangles, not strips). Then we accumulate the number of "real" primitives emitted by the entire threadgroup in GDS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	df62c8fbea	aco/ngg: Place workgroup barrier outside control flow for NGG GS. Merged shaders have a workgroup barrier which makes sure that the first half is completed in every wave before the 2nd half is started. This barrier is located in divergent control flow, so that waves that don't have any invocations in the 2nd half can finish as early as possible. This is problematic for NGG GS because it has more workgroup barriers after the 2nd half. So, for NGG GS we need to put the barrier outside control flow because otherwise the waves that have 0 GS threads won't be able to wait for the waves which have non-zero GS threads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	1129575d5e	aco/ngg: Implement NGG GS output. We store emitted GS vertices in LDS. Then, at the end of the shader, the emitted vertices are compacted and each thread loads a single vertex from LDS in order to export a primitive as needed, and the vertex attributes. The reason this is done is because there is an impedance mismatch between how API GS and the NGG HW works. API GS can emit an arbitrary number of vertices and primitives in each thread, but NGG HW can only export one vertex per thread. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	62b5012ec3	aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS. This function calculates two things at once: 1. The total number of vertices emitted by the threadgroup. 2. Exclusive scan of emitted vertex count across the threadgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	c29e288fb5	aco/ngg: Create LDS layout for NGG GS. For NGG GS, we need to store the following in LDS: 1. The ESGS ring, similarly to legacy ESGS. 2. Emitted vertices from the GS threads. 3. Temporary space used by the workgroup scan. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
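
Commit 62b5012ec3 above describes computing, in one step, the total number of vertices emitted by the threadgroup and an exclusive scan of the per-thread counts. As a rough illustration of what those two results mean, here is a minimal sequential sketch in Python. It is not the ACO implementation (which runs in parallel across waves and uses LDS); the function name and the example counts are hypothetical.

```python
from itertools import accumulate


def workgroup_reduce_and_exclusive_scan(emitted_counts):
    """Hypothetical sequential model of a workgroup reduce + exclusive scan.

    emitted_counts[i] is the number of vertices emitted by thread i.
    Returns (total, exclusive), where exclusive[i] is the sum of the
    counts of all threads before thread i.
    """
    # Inclusive running sum: inclusive[i] = counts[0] + ... + counts[i].
    inclusive = list(accumulate(emitted_counts))
    # The reduction result is the last inclusive value (0 for no threads).
    total = inclusive[-1] if inclusive else 0
    # Subtracting each thread's own count turns the inclusive scan
    # into an exclusive one.
    exclusive = [inc - own for inc, own in zip(inclusive, emitted_counts)]
    return total, exclusive


if __name__ == "__main__":
    total, offsets = workgroup_reduce_and_exclusive_scan([3, 0, 4, 2])
    print(total, offsets)
```

In this model, each thread's exclusive-scan value can be read as the index of its first vertex in a compacted output, which is one plausible use given the compaction step described in commit 1129575d5e.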

5996 commits