fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 22:20:14 +01:00

Author	SHA1	Message	Date
Timur Kristóf	b6654adc0e	aco: Make emitting reduction instructions a bit more convenient. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	260f9c503a	aco/ngg: Put shader query reduction operand into a VGPR. The p_reduce instruction only works if this operand is in a VGPR, and otherwise gets lowered to incorrect code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:22 +01:00
Timur Kristóf	9757c3cb6b	aco: Assert that workgroup barriers are not used inappropriately. Example: It is possible for some NGG GS waves to have 0 ES and/or GS invocations, and in that case having an s_barrier inside divergent control flow can very possibly hang the GPU. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7232>	2020-10-28 21:47:19 +01:00
Rhys Perry	483657de32	aco: use mubuf helper in select_gs_copy_shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>	2020-10-28 14:59:49 +00:00
Rhys Perry	ec7ecfe9cb	aco: use control flow creation helpers in select_gs_copy_shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6103>	2020-10-28 14:59:49 +00:00
Daniel Schürmann	543f50789a	aco: implement nir_op_unpack_[64/32]_* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6527>	2020-10-28 10:14:26 +00:00
Rhys Perry	26e53e3afa	aco: ignore the ACO-inserted continue in create_continue_phis() Otherwise, for loops without continue_or_break, create_continue_phis() always returns an undef operand. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `638cbc21a1` ("aco: handle when ACO adds new continue edges") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2848 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7148>	2020-10-27 19:53:38 +00:00
Rhys Perry	437995bb70	aco: remove all-undef phi opt This doesn't look like it would create correct IR for 8/16-bit phis and doesn't seem to help anything. If we ever want to do this, it's probably better done in nir_opt_remove_phis(). No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	d20a752c0d	aco: use Builder::copy more fossil-db (Navi): Totals from 6973 (5.07% of 137413) affected shaders: SGPRs: 381768 -> 381776 (+0.00%) VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00% CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01% MaxWaves: 86581 -> 86583 (+0.00%) Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00% Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05% fossil-db (Polaris): Totals from 8154 (5.87% of 138881) affected shaders: VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00% CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00% MaxWaves: 49090 -> 49091 (+0.00%) Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00% Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00% Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp shaders. They appear to be improved because the p_create_vector from lower_subdword_phis() was blocking constant propagation. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	72b307a338	aco: don't do divergent break+discard If the shader does: loop { if (divergent) discard else a() b() } then a()'s block will dominate b()'s block in the logical CFG, but not the linear CFG. This will cause value numbering to try to combine SLAU from a() and b(). This didn't happen with break/continue because sanitize_if() would move a() out of the branch. Using sanitize_if() to fix this doesn't look easy, because discards are not control flow instructions in NIR. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>	2020-10-27 15:24:38 +00:00
Rhys Perry	27ce5d921e	aco: remove isel_context::allocated Now that we have Program::temp_rc, we can replace it with the first temporary id allocated for NIR's ssa defs. No fossil-db changes on Navi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7067>	2020-10-26 15:14:32 +00:00
Samuel Pitoiset	4e2fe34aa9	aco: fix determining if LOD is zero for nir_texop_txf/nir_texop_txs txf/txs expect LOD to be a 32-bit unsigned integer while other texture operations expect a float. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3668 Fixes: `93c8ebfa78` ("aco: Initial commit of independent AMD compiler") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7256>	2020-10-22 11:30:43 +00:00
Samuel Pitoiset	eb6877d3af	radv,aco: fix use of texop_samples_identical in the resolve meta path The return value of this texture intrinsic should be a NIR 1-bit bool. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7236>	2020-10-21 13:06:53 +02:00
Tony Wasserka	fd038132de	aco/isel: Miscellaneous cleanups using the new Stage API Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>	2020-10-21 09:49:38 +00:00
Tony Wasserka	34bc9477de	aco: Clean up symbol names and comments related to NGG Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>	2020-10-21 09:49:38 +00:00
Tony Wasserka	86c227c10c	aco: Use strong typing to model SW<->HW stage mappings Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>	2020-10-21 09:49:38 +00:00
Bas Nieuwenhuizen	76421667ec	aco: Add VK_KHR_shader_terminate_invocation support. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7226>	2020-10-20 22:53:08 +00:00
Timur Kristóf	d8435c1628	aco/ngg: Add assertion to make sure we always know the vertex count. Just a sanity check to avoid hangs caused by missing this in the future. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7213>	2020-10-20 07:11:29 +00:00
James Park	af8d488ea5	util,ac,aco,radv: Cross-platform memstream API POSIX memstream is not available on Windows. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7143>	2020-10-19 03:37:42 -07:00
Rhys Perry	fdb65b8b23	aco: add missing SCC clobber in get_buffer_size Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `fcd6d83245` ("aco: fix imageSize()/textureSize() with large buffers on GFX8") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7162>	2020-10-15 21:11:45 +00:00
Tony Wasserka	d5a72319d6	aco/isel: Remove now unused VS-related code from create_null_export Also replaced a hardcoded constant with the appropriate register macro. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>	2020-10-14 16:22:51 +00:00
Tony Wasserka	c22c702f35	aco/isel: Remove some dead code exported_pos was always initialized to true (due to the is_pos argument of the first export_vs_varying call being true), so none of this code has any effect. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>	2020-10-14 16:22:51 +00:00
Tony Wasserka	bf51b11c04	aco/isel: Always export position data from VS/NGG AMD ISA docs explicitly require this for VS, and this likely extends to NGG too. Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3615 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7102>	2020-10-14 16:22:51 +00:00
Daniel Schürmann	f29c81f863	aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible This patch also does a slight rework of export_fs_mrt_color() to avoid setting of enabled channels which are not used. Totals from 52404 (38.38% of 136546) affected shaders (NAVI): SGPRs: 3097443 -> 3097435 (-0.00%) CodeSize: 189151600 -> 188546200 (-0.32%) Instrs: 36445061 -> 36445104 (+0.00%); split: -0.00%, +0.00% Cycles: 1739388020 -> 1739388192 (+0.00%); split: -0.00%, +0.00% VMEM: 21071501 -> 21071665 (+0.00%); split: +0.00%, -0.00% SMEM: 3470983 -> 3470982 (-0.00%); split: +0.00%, -0.00% PreSGPRs: 2058965 -> 2058962 (-0.00%) PreVGPRs: 1860294 -> 1860295 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Daniel Schürmann	7240edec2a	aco: use VOP2 version of v_cvt_pkrtz_f16_f32 on GFX_6_7_10 Totals from 767 (0.56% of 136546) affected shaders (NAVI): CodeSize: 2862208 -> 2850036 (-0.43%) Instrs: 561572 -> 561574 (+0.00%) Cycles: 6455420 -> 6455428 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Daniel Schürmann	2f125908b3	radv,aco: lower_pack_half_2x16 This patch also optimizes pack_half_2x16(a, 0.0). Totals from 1949 (1.43% of 136546) affected shaders (RAVEN): SGPRs: 83376 -> 83336 (-0.05%) CodeSize: 3532144 -> 3512352 (-0.56%) Instrs: 660746 -> 660682 (-0.01%); split: -0.01%, +0.00% Cycles: 6780716 -> 6780472 (-0.00%); split: -0.00%, +0.00% VMEM: 990886 -> 990883 (-0.00%); split: +0.00%, -0.00% SMEM: 150506 -> 150538 (+0.02%); split: +0.05%, -0.03% SClause: 30595 -> 30594 (-0.00%); split: -0.01%, +0.00% Copies: 40801 -> 40729 (-0.18%) PreSGPRs: 52335 -> 52341 (+0.01%); split: -0.03%, +0.04% PreVGPRs: 45104 -> 45097 (-0.02%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Daniel Schürmann	dae1e6f756	aco: use v_cvt_pkrtz_f16_f32 for pack_half_2x16 Apparently, we forgot to remove some debug code. This patch also fixes the round mode check to consider the destination bit width. Totals from 2218 (1.62% of 136546) affected shaders (RAVEN): SGPRs: 100848 -> 100280 (-0.56%) VGPRs: 68536 -> 66044 (-3.64%); split: -3.68%, +0.05% CodeSize: 4882296 -> 4837220 (-0.92%); split: -0.94%, +0.01% MaxWaves: 18990 -> 19019 (+0.15%); split: +0.19%, -0.04% Instrs: 938150 -> 930388 (-0.83%); split: -0.83%, +0.00% Cycles: 8699824 -> 8667648 (-0.37%); split: -0.38%, +0.01% VMEM: 1144502 -> 1059680 (-7.41%); split: +0.06%, -7.48% SMEM: 170076 -> 167999 (-1.22%); split: +0.22%, -1.44% VClause: 18428 -> 18422 (-0.03%) SClause: 41375 -> 41353 (-0.05%); split: -0.06%, +0.00% Copies: 60008 -> 60054 (+0.08%); split: -0.31%, +0.39% PreVGPRs: 56163 -> 56142 (-0.04%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Daniel Schürmann	aec872cda0	aco: use p_split_vector for nir_op_unpack_half_* This enables the use of SDWA if possible Totals from 9933 (7.27% of 136546) affected shaders (RAVEN): VGPRs: 731764 -> 731772 (+0.00%); split: -0.00%, +0.00% CodeSize: 90944852 -> 90671472 (-0.30%); split: -0.30%, +0.00% Instrs: 17881885 -> 17867831 (-0.08%); split: -0.08%, +0.00% Cycles: 1597904072 -> 1597771260 (-0.01%); split: -0.01%, +0.00% VMEM: 1702328 -> 1697383 (-0.29%); split: +0.13%, -0.42% SMEM: 659583 -> 659049 (-0.08%); split: +0.01%, -0.09% VClause: 318024 -> 318025 (+0.00%); split: -0.00%, +0.00% SClause: 631670 -> 631707 (+0.01%); split: -0.01%, +0.01% Copies: 1504107 -> 1504626 (+0.03%); split: -0.01%, +0.04% PreVGPRs: 683153 -> 683180 (+0.00%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Daniel Schürmann	a38a497b86	aco: use p_create_vector for nir_op_pack_half_2x16 This enables the use of SDWA if possible Totals from 2218 (1.62% of 136546) affected shaders (RAVEN): VGPRs: 68508 -> 68516 (+0.01%) CodeSize: 4897024 -> 4881068 (-0.33%); split: -0.33%, +0.00% MaxWaves: 18992 -> 18990 (-0.01%) Instrs: 946942 -> 939161 (-0.82%); split: -0.82%, +0.00% Cycles: 8737668 -> 8705704 (-0.37%); split: -0.37%, +0.00% VMEM: 1155362 -> 1145245 (-0.88%); split: +0.00%, -0.88% SMEM: 170435 -> 170165 (-0.16%); split: +0.01%, -0.16% VClause: 18426 -> 18425 (-0.01%) SClause: 41376 -> 41375 (-0.00%) Copies: 59813 -> 59787 (-0.04%); split: -0.15%, +0.10% PreVGPRs: 56126 -> 56136 (+0.02%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Rhys Perry	c122315702	aco: fix get_ssbo_size with a vgpr resource The result of load_vulkan_descriptor is passed directly to get_ssbo_size. This caused convert_pointer_to_64_bit() to skip creating a v_readfirstlane_b32 if it was necessary. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `05b6612b4e` ('radv: do not lower UBO/SSBO access to offsets') Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3628 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7095>	2020-10-13 14:20:28 +00:00
Rhys Perry	bb5c0ba0d2	aco: implement last_invocation Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:21 +00:00
Rhys Perry	36da9c4aa2	aco: implement elect Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Rhys Perry	bf77f539ee	aco: optimize more uniform reductions/scans Uniform atomic optimization will create these. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>	2020-10-13 12:47:20 +00:00
Samuel Pitoiset	b9ca4923d6	aco: implement missing nir_op_unpack_half_2x16_split_{x,y}_flush_to_zero SPIRV->NIR emits nir_op_unpack_half_2x16_flush_to_zero instead of nir_op_unpack_half_2x16 if the shader enables denorm flush to zero for 16-bit floating point. This doesn't fix anything known and CTS doesn't have tests. Fixes: `56d9bcdded` ("radv: enable more float_controls features") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6939>	2020-10-13 08:35:22 +02:00
Samuel Pitoiset	b0829c6af7	radv: replace RADV_ALPHA_ADJUST by AC_FETCH_FORMAT Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7065>	2020-10-12 13:13:40 +00:00
Timur Kristóf	61280bb4b6	aco/ngg: Allocate NGG GS space early for const vertex/primitive counts. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	e8a0409d01	aco/ngg: Use more efficient LDS layout to help reduce bank conflicts. The LLVM backend has a trick which helps reduce LDS bank conflicts by swizzling the LDS address where each vertex is emitted. This commit implements the same thing for ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	dd73719856	aco/ngg: Add shader query support to NGG GS. In each GS thread, we calculate the number of "real" primitives that were emitted (points, lines, triangles, not strips). Then we accumulate the number of "real" primitives emitted by the entire threadgroup in GDS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	df62c8fbea	aco/ngg: Place workgroup barrier outside control flow for NGG GS. Merged shaders have a workgroup barrier which makes sure that the first half is completed in every wave before the 2nd half is started. This barrier is located in divergent control flow, so that waves that don't have any invocations in the 2nd half can finish as early as possible. This is problematic for NGG GS because it has more workgroup barriers after the 2nd half. So, for NGG GS we need to put the barrier outside control flow because otherwise the waves that have 0 GS threads won't be able to wait for the waves which have non-zero GS threads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	1129575d5e	aco/ngg: Implement NGG GS output. We store emitted GS vertices in LDS. Then, at the end of the shader, the emitted vertices are compacted and each thread loads a single vertex from LDS in order to export a primitive as needed, and the vertex attributes. The reason this is done is because there is an impedance mismatch between how API GS and the NGG HW work. API GS can emit an arbitrary number of vertices and primitives in each thread, but NGG HW can only export one vertex per thread. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	62b5012ec3	aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS. This function calculates two things at once: 1. The total number of vertices emitted by the threadgroup. 2. Exclusive scan of emitted vertex count across the threadgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	c29e288fb5	aco/ngg: Create LDS layout for NGG GS. For NGG GS, we need to store the following in LDS: 1. The ESGS ring, similarly to legacy ESGS. 2. Emitted vertices from the GS threads. 3. Temporary space used by the workgroup scan. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	9c3d8404de	aco/ngg: Allow NGG GS to create VS exports. NGG GS needs to use the same instructions to export vertex attributes at the end. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b67878f328	aco/ngg: Allow NGG GS to load per-vertex GS inputs. They work the same way as in legacy GS, so we can reuse that. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	8f25d9f821	aco/ngg: Allow NGG GS to store ES outputs. We can reuse the existing ES output code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b57b1a06e4	aco/ngg: Clean up and reorganize NGG VS/TES code. Make the NGG VS/TES code easier to follow, give better names to some functions and make ngg_nogs_early_prim_export a variable. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	3645a3106a	aco/ngg: Make primitive export packing less prone to error. Use lshl_or instead of lshl_add, which makes it more robust in handling -1 and -2 indices which will now just become null exports, which is what we want. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	0bfe0495c1	aco/ngg: Refactor ngg_emit_prim_export in preparation for NGG GS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b08ced08a2	aco/ngg: Refactor gs_alloc_req in preparation for NGG GS. Previously, this function inferred the vertex and primitive counts from the gs_tg_info shader argument, but in the case of NGG GS, they will need to be calculated at runtime. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	57d8799284	aco: Optimize thread_id_in_threadgroup when there is just one wave. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
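
The 62b5012ec3 entry above describes a function that produces both the threadgroup's total emitted vertex count and an exclusive scan of the per-thread counts. The following is only a minimal sequential sketch of those two results in Python, not ACO's implementation (which runs across waves on the GPU and, per c29e288fb5, uses temporary LDS space). The function name `reduce_and_exclusive_scan` and the sample counts are hypothetical.

```python
from itertools import accumulate


def reduce_and_exclusive_scan(counts):
    """Return (total, exclusive_scan) for a list of per-thread counts.

    Hypothetical helper for illustration only: it models the two values
    named in the commit message, not how ACO computes them on hardware.
    """
    # Inclusive prefix sums: element i is the sum of counts[0..i].
    inclusive = list(accumulate(counts))
    if not inclusive:
        return 0, []
    # The reduction is the last inclusive sum.
    total = inclusive[-1]
    # Exclusive scan: element i is the sum of counts[0..i-1].
    exclusive = [0] + inclusive[:-1]
    return total, exclusive


# Assumed example: four threads emitting 3, 0, 2 and 4 vertices.
total, offsets = reduce_and_exclusive_scan([3, 0, 2, 4])
print(total)    # 9
print(offsets)  # [0, 3, 3, 5]
```

In this model, each thread's exclusive-scan value is the number of vertices emitted by the threads before it. That is presumably the kind of offset needed when the emitted vertices are compacted, as the 1129575d5e entry describes, though the sketch does not claim to match the real layout.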


677 commits