fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-19 04:58:08 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	aded548e66	aco: ensure that spilled VGPR reloads are done after p_logical_start Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	a7ff1bb5b9	aco: simplify calculation of target register pressure when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Rhys Perry	e73de4e1d8	aco: fix new_demand calculation for first instructions Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	93b42a1907	aco: don't add interferences between spilled phi operands Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	fdf8ad0256	aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	d48d72e98a	aco: don't insert the exec mask into set of live-out variables when spilling Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	cd20e29de1	aco: fix transitive affinities of spilled variables Variables spilled on both branch legs need to be assigned to the same spilling slot. These affinities can be transitive through multiple merge blocks. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	8023dcd71e	aco: fix live-range splits of phis Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	655a703349	aco: remove potential critical edge on loops. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:33 +00:00
Daniel Schürmann	78bca0d0ce	aco: improve live variable analysis This patch makes the live variable analysis more precise w.r.t. killed phi operands and the block's register pressure. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:32 +00:00
Daniel Schürmann	0b8216b2cd	aco: Lower to CSSA Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes. Previously, it was possible for phi operands to have intersecting live-ranges, so they couldn't be spilled to the same spilling slot. For this reason, ACO tried to avoid spilling phis, even if it was beneficial. This patch implements a conversion pass which is currently only called if spilling is necessary. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 19:48:32 +00:00
Bas Nieuwenhuizen	780c937a5d	radv: Start signalling semaphores in WSI acquire. Winsys semaphores without signal operation get silently ignored. Not so for syncobjs, so actually signal them. Fixes: `84d9551b23` "radv: Always enable syncobj when supported for all fences/semaphores." Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 19:42:10 +01:00
Rhys Perry	e1bcc7a828	aco: rename README to README.md Closes: #1974 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 18:16:00 +00:00
Rhys Perry	d4684a294b	aco: a couple loop handling fixes for GFX10 hazard pass It was joining from the wrong blocks and block.kind is a bitmask instead of an enum. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-30 18:13:53 +00:00
Timur Kristóf	f53811aeac	radv: Enable ACO on Navi. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 16:54:41 +00:00
Rhys Perry	8235bc6411	aco: try to group together VMEM loads of the same resource v2: remove accidental shaderInt16 change v2: simplify can_move_down initialization v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-30 17:23:49 +01:00
Daniel Schürmann	8b5aee78cc	aco: don't schedule instructions through depending VMEM instructions Previously, the scheduler tried to move up instructions from below depending VMEM instructions only to move them down again when scheduling the VMEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	636d45e46a	aco: add can_reorder flags to load_ubo and load_constant These got lost due to some refactoring. Due to the way our scheduler works currently, for now we add back the reorder flag for divergent loads only. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	576f92d900	aco: only skip RAR dependencies if the variable is killed somewhere This patch changes VMEM scheduling in a way that they can only be moved upwards by previous VMEM instructions but not downwards. This way, it improves the order of VMEM instructions in relation to their users. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	703ce617ca	aco: restrict scheduling depending on max_waves Previously, we allowed all shaders to reduce the number of max_waves to as low as 5. Restricting this for shaders with low register demand increases the total number of waves while the VMEM def-use distances hardly change. This patch also changes the max number of move operations per MEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Samuel Pitoiset	5a9d777f5a	radv: fix perftest options RADV_PERFTEST=outooforder has been removed a while ago. This fixes dumping the options into hang reports. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:30 +01:00
Samuel Pitoiset	c895e08281	radv: move nomemorycache debug option to the right place Fixes: `6571000071` ("radv: add debug option to turn off in memory cache") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:28 +01:00
Samuel Pitoiset	d4e0bef1bb	radv: fix dumping SPIR-V into hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 13:02:08 +00:00
Bas Nieuwenhuizen	396195e8f1	radv: Enable VK_KHR_timeline_semaphore. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	4aa75bb3bd	radv: Add wait-before-submit support for timelines. This is actually a non-threaded implementation. I'd summarize this as event-based submission. When a submit happens we walk a tree of submissions that depend on the syncobj signal operations to be submitted, and if those submissions have no other dependencies we start to execute them immediately. Or, well, I still use a list to avoid issues with long chains and the stack size when using recursion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	88d41367b8	radv: Add timelines with a VK_KHR_timeline_semaphore impl. This does not fully do wait-before-submit, to be done in a follow up patch. For kernels without support for timeline syncobjs, this adds an implementation of non-shareable timelines using legacy syncobjs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2117c53b72	radv: Add temporary datastructure for submissions. So we can defer them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	c3eae659e7	radv: Split semaphore into two parts as enum+union. This is in preparation to adding more types. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	84d9551b23	radv: Always enable syncobj when supported for all fences/semaphores. This simplifies code for timeline semaphores by needing to support less configurations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	45f4a639a8	radv: Improve fence signalling in QueueSubmit. Only signalling it once. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	a9c8424e08	radv: Do sparse binding in queue submission. So we have one place to do queue things if we end up deferring submissions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	915e9178fa	radv: Split out commandbuffer submission. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	43ba44357c	radv: Clean up unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2e3a635ee6	radv: Add an early exit in the secure compile if we already have the cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen	d78809632f	radv: Compute hashes in secure process for secure compilation. To prevent poisoning arbitrary cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:37:41 +01:00
Timothy Arceri	cf25664686	radv: make use of radv_sc_read() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	28fff3efbc	radv: add radv_sc_read() helper This is a function with timeout support for reading from the pipe between processes used for secure compile. Initially we hardcode the timeout to 5 seconds. We can adjust the timeout limit in future if needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	23a6827e4d	radv: allow select() calls in secure compile This will be used in the following patch to support timeouts for reading the pipe between processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Marek Olšák	9edcce2a32	ac: get tcc_harvested from the kernel Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 21:38:01 -04:00
Timur Kristóf	c52ebbcea4	aco: Introduce vgpr_limit to keep track of available VGPRs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Timur Kristóf	d59f702e26	aco: Implement subgroup shuffle in GFX10 wave64 mode. Previously subgroup shuffle was implemented using the bpermute instruction, which only works across half-waves, so by itself it's not suitable for implementing subgroup shuffle when the shader is running in wave64 mode. This commit adds a trick using shared VGPRs that still allows subgroup shuffle to be implemented relatively efficiently in this mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	c2eebfe3ea	aco: Remove dead code in reduction lowering. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	3865448012	aco: Fix reductions on GFX10. Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan with all reduction operations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	c578600489	util: remove LIST_DEL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	255de06c59	util: remove LIST_ADDTAIL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	7ae1be1028	util: remove LIST_INITHEAD macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Samuel Pitoiset	5912792501	radv: fix OpQuantizeToF16 for NaN on GFX6-7 Do not flush NaN to 0. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.propagated_nans Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 09:31:52 +01:00
Samuel Pitoiset	d82dfca872	radv: enable fast depth/stencil clears with separate aspects on GFX8 It's similar to GFX9+. Shadow of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 07:54:11 +00:00
Eric Engestrom	c2430f3edc	radv: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00

4182 commits