fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 20:10:14 +01:00

Author	SHA1	Message	Date
Timur Kristóf	e8a0409d01	aco/ngg: Use more efficient LDS layout to help reduce bank conflicts. The LLVM backend has a trick which helps reduce LDS bank conflicts by swizzling the LDS address where each vertex is emitted. This commit implements the same thing for ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	dd73719856	aco/ngg: Add shader query support to NGG GS. In each GS thread, we calculate the number of "real" primitives that were emitted (points, lines, triangles, not strips). Then we accumulate the number of "real" primitives emitted by the entire threadgroup in GDS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	df62c8fbea	aco/ngg: Place workgroup barrier outside control flow for NGG GS. Merged shaders have a workgroup barrier which makes sure that the first half is completed in every wave before the 2nd half is started. This barrier is located in divergent control flow, so that waves that don't have any invocations in the 2nd half can finish as early as possible. This is problematic for NGG GS because it has more workgroup barriers after the 2nd half. So, for NGG GS we need to put the barrier outside control flow because otherwise the waves that have 0 GS threads won't be able to wait for the waves which have non-zero GS threads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	1129575d5e	aco/ngg: Implement NGG GS output. We store emitted GS vertices in LDS. Then, at the end of the shader, the emitted vertices are compacted and each thread loads a single vertex from LDS in order to export a primitive as needed, and the vertex attributes. The reason this is done is because there is an impedance mismatch between how API GS and the NGG HW work. API GS can emit an arbitrary number of vertices and primitives in each thread, but NGG HW can only export one vertex per thread. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	62b5012ec3	aco/ngg: Implement workgroup reduce / exclusive scan for NGG GS. This function calculates two things at once: 1. The total number of vertices emitted by the threadgroup. 2. Exclusive scan of emitted vertex count across the threadgroup. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	c29e288fb5	aco/ngg: Create LDS layout for NGG GS. For NGG GS, we need to store the following in LDS: 1. The ESGS ring, similarly to legacy ESGS. 2. Emitted vertices from the GS threads. 3. Temporary space used by the workgroup scan. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:15 +02:00
Timur Kristóf	9c3d8404de	aco/ngg: Allow NGG GS to create VS exports. NGG GS needs to use the same instructions to export vertex attributes at the end. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b67878f328	aco/ngg: Allow NGG GS to load per-vertex GS inputs. They work the same way as in legacy GS, so we can reuse that. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	8f25d9f821	aco/ngg: Allow NGG GS to store ES outputs. We can reuse the existing ES output code. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b57b1a06e4	aco/ngg: Clean up and reorganize NGG VS/TES code. Make the NGG VS/TES code easier to follow, give better names to some functions and make ngg_nogs_early_prim_export a variable. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	3645a3106a	aco/ngg: Make primitive export packing less prone to error. Use lshl_or instead of lshl_add, which makes it more robust in handling -1 and -2 indices which will now just become null exports, which is what we want. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	0bfe0495c1	aco/ngg: Refactor ngg_emit_prim_export in preparation for NGG GS. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b08ced08a2	aco/ngg: Refactor gs_alloc_req in preparation for NGG GS. Previously, this function inferred the vertex and primitive counts from the gs_tg_info shader argument, but in case of NGG GS, they will need to be calculated at runtime. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	57d8799284	aco: Optimize thread_id_in_threadgroup when there is just one wave. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	5e31fb49a3	aco: Use thread_id_in_threadgroup helper for ES outputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	924f816fe1	aco: Extract thread_id_in_threadgroup to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	b1964ad4d6	aco: Extract lanecount_to_mask to a separate function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Rhys Perry	c1d11bb92c	aco: Add loop creation helpers. Will be useful for NGG GS and probably testing. The helpers take care of divergence but not creating correct phis. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Timur Kristóf	2be99012e9	nir: Add ability to count emitted GS primitives. Add an option to nir_lower_gs_intrinsics which tells it to track the number of emitted primitives, not just vertices. Additionally, also make it per-stream. Also rename the set_vertex_count intrinsic to set_vertex_and_primitive_count. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>	2020-10-09 15:26:14 +02:00
Tony Wasserka	5f7810dcb2	aco/isel: Fix out-of-bounds write in visit_load_input Shaders may read out components past the attributes provided by the application, so the read mask can indicate a larger component count than were actually reserved in the array. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6728>	2020-10-07 19:50:01 +00:00
Samuel Pitoiset	3c5eb1f761	aco: more uses of nir_get_io_offset_src() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7003>	2020-10-07 13:31:36 +02:00
Samuel Pitoiset	1211d05bef	aco: bail out if the NIR IO base offset isn't zero nir_io_add_const_offset_to_base takes care of this. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7003>	2020-10-07 13:31:25 +02:00
Jason Ekstrand	9750164c09	nir: Rename get_buffer_size to get_ssbo_size This makes it explicit that this intrinsic is only for SSBOs. For the v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be able to distinguish between the two. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812>	2020-09-22 13:34:12 +00:00
Rhys Perry	f100cf0d30	aco: stop multiplying driver_location by 4 This didn't really serve any purpose, doesn't match how FS inputs are currently done, and prevented us from using nir_io_add_const_offset_to_base in the future. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Rhys Perry	fd872c3cf7	aco: remove dead indirect fs input loading It's asserted that the visit_load_input code isn't reached. It also didn't handle divergent indexing and this situation should have been lowered anyway. I think this used to be needed to pass a dEQP-VK.glsl.indexing.* test, but it doesn't seem needed anymore. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Rhys Perry	7f51a0c670	aco: use nir's constant source helpers more Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Rhys Perry	430cc90071	aco: use nir_get_io_offset_src() in visit_load_input() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Rhys Perry	9bba79088d	aco: use io semantics to get an intrinsic's slot Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Timur Kristóf	d58a1a87cc	aco: Use NIR IO semantics for tess factor IO locations. Previously we relied on looping over the NIR output variables to remember the driver location of the tess factors, now use the new NIR IO semantics instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6689>	2020-09-22 12:38:43 +00:00
Samuel Pitoiset	05b6612b4e	radv: do not lower UBO/SSBO access to offsets Use nir_lower_explicit_io instead of lowering to offsets. Extra (useless) additions are removed by lowering load_vulkan_descriptor to vec2(src.x, 0). fossils-db (Navi): Totals from 18236 (13.21% of 138013) affected shaders: SGPRs: 1172766 -> 1168278 (-0.38%); split: -0.89%, +0.50% VGPRs: 940156 -> 952232 (+1.28%); split: -0.08%, +1.37% SpillSGPRs: 30286 -> 31109 (+2.72%); split: -0.78%, +3.50% SpillVGPRs: 1893 -> 1909 (+0.85%) CodeSize: 87910396 -> 88113592 (+0.23%); split: -0.35%, +0.58% Scratch: 819200 -> 823296 (+0.50%) MaxWaves: 205535 -> 202102 (-1.67%); split: +0.05%, -1.72% Instrs: 17052527 -> 17113484 (+0.36%); split: -0.32%, +0.67% Cycles: 670794876 -> 669084540 (-0.25%); split: -0.38%, +0.13% VMEM: 5274728 -> 5388556 (+2.16%); split: +3.10%, -0.94% SMEM: 1196146 -> 1165850 (-2.53%); split: +2.06%, -4.60% VClause: 381463 -> 399217 (+4.65%); split: -1.08%, +5.73% SClause: 666216 -> 631135 (-5.27%); split: -5.44%, +0.18% Copies: 1292720 -> 1289318 (-0.26%); split: -1.28%, +1.01% Branches: 467336 -> 473028 (+1.22%); split: -0.67%, +1.89% PreSGPRs: 766459 -> 772175 (+0.75%); split: -0.53%, +1.28% PreVGPRs: 819746 -> 825327 (+0.68%); split: -0.05%, +0.73% Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6202>	2020-09-21 15:37:11 +00:00
Rhys Perry	ec2185c598	aco: keep track of temporaries' regclasses in the Program A future change will switch the liveness sets to bit vectors, which don't contain regclass information. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6733>	2020-09-21 13:47:28 +00:00
Rhys Perry	2228835fb5	radv,aco: fix reading primitive ID in FS after TES Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Cc: mesa-stable Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3530 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6760>	2020-09-21 11:54:53 +00:00
Rhys Perry	4ac4cdb5bf	aco: fix incorrect assertion in emit_vop3a_instruction() Fixes some float controls tests on Polaris10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `0b6448bbe7` ('aco/isel: refactor emit_vop3a_instruction() to handle 2 operand instructions') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6744>	2020-09-17 09:52:22 +00:00
Timur Kristóf	26299c87f8	aco: Add base argument to emit_mbcnt. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6699>	2020-09-14 12:19:24 +00:00
Timur Kristóf	f3780e7b8c	aco: Clean up emit_mbcnt. Make it less error-prone and more consistent with other helpers. Pass the masks as a single argument rather than two. In wave64 mode, split the argument into low and high halves in emit_mbcnt rather than where it is called. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6699>	2020-09-14 12:19:24 +00:00
Timur Kristóf	efa1c760d1	aco: Fix emit_boolean_exclusive_scan in wave32 mode. Use the lane mask instead of s2 for the register class. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6699>	2020-09-14 12:19:24 +00:00
Rhys Perry	834b449a46	aco: fix value numbering of reductions Non-ssa definitions caused an assertion in value numbering. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6662>	2020-09-09 15:00:45 +00:00
Tony Wasserka	fefeaeef06	aco/isel: Compile all helper functions with static linkage Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	793dc668ea	aco/isel: Move add_startpgm to aco_instruction_selection.cpp Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	47de553283	aco/isel: Move context initialization code to a dedicated file aco_instruction_selection_setup.cpp (previously used as a header) has been split into a header and an implementation file. The latter "only" implements init_context and setup_isel_context, but since these files carry a long trail of helper functions, this cleans up the isel header a lot. Reduces library size by 3.1% due to more functions being compiled with static linkage. Makes aco_instruction_selection.cpp compile 3% faster. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	150de6358d	aco/isel: Consistently use references for input parameters in emit_load Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	dab0af0616	aco/isel: Simplify nested branching code Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Tony Wasserka	757de68a43	aco/isel: Turn the function template emit_load into a proper function Statically known values were encoded using template parameters previously, causing specializations for each of the 5 sets of template arguments to be generated. Since emit_load is not performance critical (the inner loop never runs more often than twice), it's better for build time to use runtime arguments everywhere. Reduces build time of this file by 9% (17.3s -> 15.7s on my machine) and reduces libaco's size by 2.6%. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6504>	2020-09-08 20:13:51 +00:00
Daniel Schürmann	0b6448bbe7	aco/isel: refactor emit_vop3a_instruction() to handle 2 operand instructions Only AC:O has been affected. Totals from 4 (0.00% of 136546) affected shaders (RAVEN): CodeSize: 16428 -> 16420 (-0.05%) Instrs: 3294 -> 3292 (-0.06%) Cycles: 14208 -> 14200 (-0.06%) VMEM: 936 -> 978 (+4.49%) VClause: 80 -> 77 (-3.75%) Copies: 211 -> 209 (-0.95%) PreVGPRs: 127 -> 126 (-0.79%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6635>	2020-09-08 16:20:44 +00:00
Daniel Schürmann	5b31056257	aco/isel: refactor code and remove unnecessary v_mov Changes mainly due to avoided v_movs for fmin/fmax/fadd/fmul. Totals from 12783 (9.36% of 136546) affected shaders (RAVEN): SGPRs: 1097752 -> 1098264 (+0.05%); split: -0.09%, +0.14% VGPRs: 856920 -> 850800 (-0.71%); split: -0.82%, +0.11% SpillSGPRs: 49494 -> 49496 (+0.00%); split: -0.00%, +0.01% CodeSize: 99997916 -> 99989948 (-0.01%); split: -0.04%, +0.03% MaxWaves: 53895 -> 54448 (+1.03%) Instrs: 19634960 -> 19632626 (-0.01%); split: -0.05%, +0.04% Cycles: 1620601696 -> 1620900712 (+0.02%); split: -0.02%, +0.04% VMEM: 3334181 -> 3299626 (-1.04%); split: +1.62%, -2.66% SMEM: 865573 -> 865876 (+0.04%); split: +0.84%, -0.81% VClause: 337100 -> 335224 (-0.56%); split: -0.88%, +0.32% SClause: 696813 -> 697267 (+0.07%); split: -0.14%, +0.21% Copies: 1549897 -> 1548023 (-0.12%); split: -0.52%, +0.40% Branches: 682118 -> 682108 (-0.00%); split: -0.01%, +0.00% PreSGPRs: 893524 -> 895129 (+0.18%); split: -0.00%, +0.18% PreVGPRs: 790180 -> 783036 (-0.90%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6635>	2020-09-08 16:20:44 +00:00
Rhys Perry	6049dc1a9d	aco: improve fsign selection Idea from https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6284 fossil-db (Navi): Totals from 4053 (2.95% of 137413) affected shaders: SGPRs: 305810 -> 305906 (+0.03%); split: -0.01%, +0.04% VGPRs: 249000 -> 249144 (+0.06%); split: -0.01%, +0.07% CodeSize: 29967092 -> 29885768 (-0.27%); split: -0.27%, +0.00% Instrs: 5749494 -> 5737971 (-0.20%); split: -0.20%, +0.00% Cycles: 255028584 -> 254955444 (-0.03%); split: -0.04%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6583>	2020-09-08 12:17:43 +00:00
Samuel Pitoiset	73eb24ab31	aco: handle unaligned loads on GFX10.3 Same as GFX10. Cc: 20.2 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6594>	2020-09-04 13:19:45 +00:00
Rhys Perry	8faf85f687	aco: fix byte_align_scalar for 3 dword vectors Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `fe08f0ccf9` ('aco: add byte_align_scalar() & trim_subdword_vector() helper functions') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4710>	2020-09-04 13:03:50 +00:00
Samuel Pitoiset	8076c7596d	aco: fix wrong source position for constant with nir_op_cube_face_coord Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6480>	2020-08-28 08:03:55 +02:00
Rhys Perry	d2cf6a8399	aco: sink get_alu_src() in bfe lowering Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6424>	2020-08-26 13:46:23 +00:00
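
Commit 62b5012ec3 above describes a helper that produces two results at once: the total number of vertices emitted by the threadgroup, and an exclusive scan of the per-thread emitted vertex counts. The following is only a minimal host-side sketch of what those two results mean, assuming one integer count per thread. It is not the ACO implementation, which runs on the GPU and uses LDS scratch space. The function name is hypothetical.

```python
from itertools import accumulate


def workgroup_reduce_and_exclusive_scan(counts):
    """Return (total, exclusive_scan) for a list of per-thread counts.

    Hypothetical illustration only: the real shader code computes this
    in parallel across waves, not sequentially on the host.
    """
    # Inclusive prefix sums: element i is the sum of counts[0..i].
    inclusive = list(accumulate(counts))
    # The reduction result is the last inclusive sum (0 for no threads).
    total = inclusive[-1] if inclusive else 0
    # The exclusive scan shifts the inclusive sums right by one, so
    # element i is the sum of counts[0..i-1], starting from 0.
    exclusive = [0] + inclusive[:-1] if inclusive else []
    return total, exclusive


# Four threads emitting 3, 0, 2 and 4 vertices respectively.
total, offsets = workgroup_reduce_and_exclusive_scan([3, 0, 2, 4])
print(total, offsets)
```

In this sketch, each entry of the exclusive scan can be read as the position at which that thread's first vertex would land once all emitted vertices are packed together, which is the kind of information the compaction step described in commit 1129575d5e needs.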

391 commits