Commit graph

4340 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
344ba56b0f radv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.

CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 09:47:56 +00:00
Mauro Rossi
d688e4166c android: aco: fix Lower to CSSA
Fixes the following building error:

external/mesa/src/amd/compiler/aco_spill.cpp:1768:
error: undefined reference to 'aco::lower_to_cssa(aco::Program*, aco::live&, radv_nir_compiler_options const*)'

Fixes: 0b8216b ("aco: Lower to CSSA")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
2019-10-31 07:38:46 +00:00
Samuel Pitoiset
1e36a8f41d radv: declare NGG scratch for VS or TES and only on GFX10
Do not need to declare it for other stages because this is for
streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-31 06:51:01 +00:00
Bas Nieuwenhuizen
ec770085c2 radv: Fix timeout handling in syncobj wait.
libdrm returns -errno instead of directly the ioctl ret of -1.

Fixes: 1c3cda7d27 "radv: Add syncobj signal/reset/wait to winsys."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-31 00:48:17 +01:00
Bas Nieuwenhuizen
ae454a03b7 radv: Allocate space for temp. semaphore parts.
Calculated the number for allocation and did not
reserve space ....

Fixes: 2117c53b72 "radv: Add temporary datastructure for submissions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 20:51:39 +01:00
Daniel Schürmann
8678699918 aco: implement VGPR spilling
VGPR spilling is implemented via MUBUF instructions and scratch memory.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
c79972b604 aco: always set scratch_offset in startpgm
This patch also moves private_segment_buffer and
scratch_offset to Program to easily access it.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
b0de16b7de aco: omit linear VGPRs as spill variables
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
aded548e66 aco: ensure that spilled VGPR reloads are done after p_logical_start
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
a7ff1bb5b9 aco: simplify calculation of target register pressure when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Rhys Perry
e73de4e1d8 aco: fix new_demand calculation for first instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
93b42a1907 aco: don't add interferences between spilled phi operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
fdf8ad0256 aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
d48d72e98a aco: don't insert the exec mask into set of live-out variables when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
cd20e29de1 aco: fix transitive affinities of spilled variables
Variables spilled on both branch legs need to be assigned to the same spilling slot.
These affinities can be transitive through multiple merge blocks.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
8023dcd71e aco: fix live-range splits of phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
655a703349 aco: remove potential critical edge on loops.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:33 +00:00
Daniel Schürmann
78bca0d0ce aco: improve live variable analysis
This patch makes the live variable analysis more precise
w.r.t. killed phi operands and the block's register pressure.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Daniel Schürmann
0b8216b2cd aco: Lower to CSSA
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to
spill phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 19:48:32 +00:00
Bas Nieuwenhuizen
780c937a5d radv: Start signalling semaphores in WSI acquire.
Winsys semaphores without signal operation get silently ignored.

Not so for syncobjs, so actually signal them.

Fixes: 84d9551b23 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 19:42:10 +01:00
Rhys Perry
e1bcc7a828 aco: rename README to README.md
Closes: #1974
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 18:16:00 +00:00
Rhys Perry
d4684a294b aco: a couple loop handling fixes for GFX10 hazard pass
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.

Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-30 18:13:53 +00:00
Timur Kristóf
f53811aeac radv: Enable ACO on Navi.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 16:54:41 +00:00
Rhys Perry
8235bc6411 aco: try to group together VMEM loads of the same resource
v2: remove accidental shaderInt16 change
v2: simplify can_move_down initialization
v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-30 17:23:49 +01:00
Daniel Schürmann
8b5aee78cc aco: don't schedule instructions through depending VMEM instructions
Previously, the scheduler tried to move up instructions from below depending
VMEM instructions only to move them down again when scheduling the VMEM
instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
636d45e46a aco: add can_reorder flags to load_ubo and load_constant
These got lost due to some refactoring.
Due to the way our scheduler works currently, for now
we add back the reorder flag for divergent loads only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
576f92d900 aco: only skip RAR dependencies if the variable is killed somewhere
This patch changes VMEM scheduling in a way that they can only
be moved upwards by previous VMEM instructions but not downwards.
This way, it improves the order of VMEM instructions in relation
to their users.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Daniel Schürmann
703ce617ca aco: restrict scheduling depending on max_waves
Previously, we allowed all shaders to reduce the number of max_waves to as low as 5.
Restricting this on shaders with low register demand, increases the total number of waves
while the VMEM def-use distances hardly change.
This patch also changes the max number of move operations per MEM instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
2019-10-30 16:12:10 +00:00
Samuel Pitoiset
5a9d777f5a radv: fix perftest options
RADV_PERFTEST=outooforder has been removed a while ago. This fixes
dumping the options into hang reports.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:30 +01:00
Samuel Pitoiset
c895e08281 radv: move nomemorycache debug option at the right palce
Fixes: 6571000071 ("radv: add debug option to turn off in memory cache")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 14:49:28 +01:00
Samuel Pitoiset
d4e0bef1bb radv: fix dumping SPIR-V into hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 13:02:08 +00:00
Bas Nieuwenhuizen
396195e8f1 radv: Enable VK_KHR_timeline_semaphore.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
4aa75bb3bd radv: Add wait-before-submit support for timelines.
This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When submit happens we walk a tree of submissions that depend on
the syncobj signal operations to be submitted and if those submission
we no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
88d41367b8 radv: Add timelines with a VK_KHR_timeline_semaphore impl.
This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2117c53b72 radv: Add temporary datastructure for submissions.
So we can defer them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
c3eae659e7 radv: Split semaphore into two parts as enum+union.
This is in preparation to adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
84d9551b23 radv: Always enable syncobj when supported for all fences/semaphores.
This simplifies code for timeline semaphores by needing to support
less configurations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
45f4a639a8 radv: Improve fence signalling in QueueSubmit.
Only signalling it once.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
a9c8424e08 radv: Do sparse binding in queue submission.
So we have one place to do queue things if we end up deferring
submissions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
915e9178fa radv: Split out commandbuffer submission.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
43ba44357c radv: Clean up unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2e3a635ee6 radv: Add an early exit in the secure compile if we already have the cache entries.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen
d78809632f radv: Compute hashes in secure process for secure compilation.
To prevent poisoning arbitrary cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:37:41 +01:00
Timothy Arceri
cf25664686 radv: make use of radv_sc_read()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
28fff3efbc radv: add radv_sc_read() helper
This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
23a6827e4d radv: allow select() calls in secure compile
This will be used in the following patch to support timeouts for
reading the pipe between processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Marek Olšák
9edcce2a32 ac: get tcc_harvested from the kernel
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 21:38:01 -04:00
Timur Kristóf
c52ebbcea4 aco: Introduce vgpr_limit to keep track of available VGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Timur Kristóf
d59f702e26 aco: Implement subgroup shuffle in GFX10 wave64 mode.
Previously subgroup shuffle was implemented using the bpermute
instruction, which only works accross half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that allows to implement
subgroup shuffle still relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00
Rhys Perry
c2eebfe3ea aco: Remove dead code in reduction lowering.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-28 23:52:50 +00:00