Commit graph

2763 commits

Author SHA1 Message Date
Bas Nieuwenhuizen
4aa75bb3bd radv: Add wait-before-submit support for timelines.
This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When submit happens we walk a tree of submissions that depend on
the syncobj signal operations to be submitted and if those submissions
have no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
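The event-based walk described in 4aa75bb3bd can be sketched roughly as follows. This is a minimal illustration and not the radv code: `struct submission`, its fields, and `submit_ready()` are hypothetical names. It only shows the idea of using an explicit worklist in place of recursion, so a long dependency chain cannot exhaust the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical deferred submission. It waits on `pending_deps` signal
 * operations and, once submitted, may unblock the entries in `waiters`. */
struct submission {
    int pending_deps;             /* dependencies not yet satisfied */
    int submitted;                /* set to 1 once executed */
    struct submission **waiters;  /* submissions that depend on this one */
    size_t num_waiters;
    struct submission *next;      /* worklist link */
};

/* Execute `first` and every submission that becomes ready as a result.
 * Returns the number of submissions executed. */
static size_t submit_ready(struct submission *first)
{
    assert(first != NULL && first->pending_deps == 0);

    size_t executed = 0;
    struct submission *head = first, *tail = first;
    first->next = NULL;

    while (head != NULL) {
        /* Pop the front of the worklist. */
        struct submission *s = head;
        head = s->next;
        if (head == NULL)
            tail = NULL;

        s->submitted = 1; /* stand-in for the real queue submission */
        executed++;

        /* Signal the waiters; queue those with no remaining dependencies. */
        for (size_t i = 0; i < s->num_waiters; i++) {
            struct submission *w = s->waiters[i];
            if (--w->pending_deps == 0) {
                w->next = NULL;
                if (tail != NULL)
                    tail->next = w;
                else
                    head = w;
                tail = w;
            }
        }
    }
    return executed;
}
```

A submission that still has unsatisfied dependencies stays unqueued until a later signal drops its count to zero.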
Bas Nieuwenhuizen
88d41367b8 radv: Add timelines with a VK_KHR_timeline_semaphore impl.
This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2117c53b72 radv: Add temporary datastructure for submissions.
So we can defer them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
c3eae659e7 radv: Split semaphore into two parts as enum+union.
This is in preparation for adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
84d9551b23 radv: Always enable syncobj when supported for all fences/semaphores.
This simplifies code for timeline semaphores by needing to support
fewer configurations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
45f4a639a8 radv: Improve fence signalling in QueueSubmit.
Only signalling it once.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
a9c8424e08 radv: Do sparse binding in queue submission.
So we have one place to do queue things if we end up deferring
submissions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
915e9178fa radv: Split out commandbuffer submission.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
43ba44357c radv: Clean up unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen
2e3a635ee6 radv: Add an early exit in the secure compile if we already have the cache entries.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen
d78809632f radv: Compute hashes in secure process for secure compilation.
To prevent poisoning arbitrary cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-10-30 11:37:41 +01:00
Timothy Arceri
cf25664686 radv: make use of radv_sc_read()
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
28fff3efbc radv: add radv_sc_read() helper
This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
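A timeout-capable pipe read of the kind 28fff3efbc describes can be built from POSIX select() and read(). The sketch below is an assumption about the general shape, not the actual radv_sc_read() implementation; `read_with_timeout()` and its parameters are hypothetical. Note that the timeout here applies to each wait, not to the whole transfer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `size` bytes from `fd`, waiting at most `timeout_sec`
 * seconds each time for data to arrive. Returns false on timeout,
 * end of file, or error (an interrupted call is treated as an error). */
static bool read_with_timeout(int fd, void *buf, size_t size, int timeout_sec)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    char *p = buf;
    while (size > 0) {
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(fd, &fds);

        /* select() may modify the timeval, so rebuild it every time. */
        struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

        int ready = select(fd + 1, &fds, NULL, NULL, &tv);
        if (ready <= 0)
            return false; /* 0 means timeout, -1 means error */

        ssize_t n = read(fd, p, size);
        if (n <= 0)
            return false; /* 0 means EOF, -1 means error */

        p += n;
        size -= (size_t)n;
    }
    return true;
}
```

A caller matching the commit message would pass 5 as `timeout_sec`.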
Timothy Arceri
23a6827e4d radv: allow select() calls in secure compile
This will be used in the following patch to support timeouts for
reading the pipe between processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Timothy Arceri
7f106a2b5d util: rename list_empty() to list_is_empty()
This makes it clear that it's a boolean test and not an action
(e.g. "empty the list").

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
c578600489 util: remove LIST_DEL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
255de06c59 util: remove LIST_ADDTAIL macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Timothy Arceri
7ae1be1028 util: remove LIST_INITHEAD macro
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa54.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
2019-10-28 11:24:38 +00:00
Samuel Pitoiset
d82dfca872 radv: enable fast depth/stencil clears with separate aspects on GFX8
It's similar to GFX9+. Shadow of Mordor (Vulkan beta) hits that
path and it works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-28 07:54:11 +00:00
Eric Engestrom
c2430f3edc radv: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Timothy Arceri
cff53da374 radv: enable secure compile support
Can be enabled via the environment variable which tells the
driver how many compilation threads are expected to be called,
and therefore how many forked processes the driver should
create.

For example we would expect to call fossilize replay with
something like this:

RADV_SECURE_COMPILE_THREADS=8 ./fossilize-replay --num-threads 8 \
--shader-cache-size 0 --ignore-derived-pipelines pipeline_cache.foz

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
57c95d2ce2 radv: add support for a secure compile fork at device creation
This adds support for the fork, the installation of the seccomp
filter, and the main loop from which the actual compilation is
called, i.e. run_secure_compile_device().

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
3f2283b3e2 radv: add radv_secure_compile()
This function will be called by the parent process when doing a
secure compile. It first selects a free process to work with then
passes it all the information it needs to compile the pipeline.

Once the pipeline information has been passed to the secure
process, it then waits around to read/write any disk cache entries
required before exiting.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
07692f703f radv: for secure compile exit early from radv_shader_variant_create()
We don't have permission to be creating shared memory etc.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
5cd437b1ed radv: allow the secure process to read and write from disk cache
This allows the secure process to read and write to the disk cache
via the parent process. This commit just adds the functionality
needed for the secure process, the following commit will add the
functionality for the parent process.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
5d25aee005 radv: add radv_device_use_secure_compile() helper
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
d33f2165c9 radv: add some new members to radv device and instance for secure compile
These will be used by the following commits to hold information about
the forked secure compile processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
e8cb13d499 radv: add radv_secure_compile_type enum
This will be used to identify information being passed between the
parent and secure process during a secure compile.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
2d2b113e86 radv: add radv_create_shaders() to radv_shader.h
In a following commit we want to be able to call this for secure
compiles from radv_device.c

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
6571000071 radv: add debug option to turn off in memory cache
This can be useful for debugging the on disk cache, but is also
useful in the following patch for secure compiles which will be
used to compile huge pipeline collections. These pipeline
collections can be multiple GBs and the in memory cache grows to
multiple GBs very quickly when they are compiled so we want to
be able to turn off the in memory cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri
637776629d radv: get topology from pipeline key rather than VkGraphicsPipelineCreateInfo
This is cleaner and avoids having to read/write an additional copy of
topology for use with secure compile.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Samuel Pitoiset
2bf8a9b337 radv: fix VK_KHR_shader_float_controls dependency on GFX6-7
From the Vulkan spec 1.1.126:
   "VK_SHADER_FLOAT_CONTROLS_INDEPENDENCE_32_BIT_ONLY_KHR specifies
    that shader float controls for 32-bit floating point can be set
    independently; other bit widths must be set identically to each
    other."

Forgot to update this when I enabled that extension recently.

Fixes dEQP-VK.spirv_assembly.instruction.compute.float_controls.independence_settings.independence_setting

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-25 07:49:20 +02:00
Samuel Pitoiset
4b17311e52 radv: compute the number of records correctly for vertex buffers
On GFX8 the number of records is in bytes while on other chips
it's in units of "stride".

Fixes dEQP-VK.robustness.vertex_access.*.draw.vertex_* on RAVEN.

Tested on GFX6, GFX8, GFX10 and RAVEN.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-24 17:14:43 +02:00
Rhys Perry
7453c1adff radv: round vgprs/sgprs before calculating max_waves
Note that ACO doesn't correctly round SGPR counts on GFX8/GFX9.

pipeline-db (ACO/Vega):
SGPRS: 11000 -> 11000 (0.00 %)
VGPRS: 3120 -> 3120 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 164328 -> 164328 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1125 -> 1000 (-11.11 %)

v2: consider wave32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-23 19:11:20 +01:00
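The effect of 7453c1adff (rounding register counts before computing occupancy) can be illustrated with a small sketch. The helper names and the numbers in the usage note are illustrative assumptions, not values or code taken from radv; the real allocation granularity and register file size depend on the chip and wave size.

```c
#include <assert.h>

/* Round `value` up to the next multiple of `granule`. */
static unsigned align_up(unsigned value, unsigned granule)
{
    return (value + granule - 1) / granule * granule;
}

/* Hypothetical helper: how many waves fit when each wave uses `regs_used`
 * registers, the hardware allocates registers in multiples of `granule`,
 * `regs_total` registers are available, and at most `hw_max_waves` waves
 * can be resident. */
static unsigned max_waves_for_regs(unsigned regs_used, unsigned granule,
                                   unsigned regs_total, unsigned hw_max_waves)
{
    assert(granule > 0 && regs_total > 0);

    /* Round first: the hardware reserves whole granules, so dividing by
     * the unrounded count would overestimate occupancy. */
    unsigned allocated = align_up(regs_used > 0 ? regs_used : 1, granule);
    unsigned waves = regs_total / allocated;
    return waves < hw_max_waves ? waves : hw_max_waves;
}
```

With illustrative inputs of 25 registers, a granule of 4, 256 registers in total and a cap of 10 waves, rounding gives 256 / 28 = 9 waves, where dividing by the unrounded 25 would report 10. This is the kind of drop the Max Waves line in the pipeline-db output reflects.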
Samuel Pitoiset
f11ea22666 radv: fix a performance regression with graphics depth/stencil clears
I recently changed the slow depth/stencil clear path to make sure
depth values are explicitly exported by the fragment shader. This
is actually only useful when VK_EXT_depth_range_unrestricted is
enabled.

While this path is correct, it introduced a performance regression
with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and
probably more titles. This is because it prevents the hardware
from doing some optimizations like discarding fragments.

This commit re-introduces the previous (a bit faster) slow
depth/stencil clear path and it selects the unrestricted path
only if VK_EXT_depth_range_unrestricted is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 10:23:47 +02:00
Samuel Pitoiset
7562a2cbe3 radv: fix vkUpdateDescriptorSets with inline uniform blocks
descriptorCount is the number of bytes into the descriptor, so
it shouldn't be used as an index. srcArrayElement/dstArrayElement
specify the starting byte offset within the binding to copy from/to.

This fixes new CTS tests:
dEQP-VK.binding_model.descriptor_copy.*.inline_uniform_block_*
dEQP-VK.binding_model.descriptor_copy.*.mix_3
dEQP-VK.binding_model.descriptor_copy.*.mix_array1

Fixes: 8d2654a419 ("radv: Support VK_EXT_inline_uniform_block.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:59:22 +02:00
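For inline uniform blocks, 7562a2cbe3 notes that the copy parameters are byte quantities and not descriptor indices. A minimal sketch of that interpretation, using plain byte buffers in place of real descriptor set storage (the function name and buffer layout are hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy part of one inline uniform block binding into another.
 * `src_array_element` and `dst_array_element` are starting byte offsets
 * within the bindings, and `descriptor_count` is the number of bytes to
 * copy. None of them index whole descriptors. */
static void copy_inline_uniform_block(uint8_t *dst_binding,
                                      uint32_t dst_array_element,
                                      const uint8_t *src_binding,
                                      uint32_t src_array_element,
                                      uint32_t descriptor_count)
{
    assert(dst_binding != NULL && src_binding != NULL);
    memcpy(dst_binding + dst_array_element,
           src_binding + src_array_element,
           descriptor_count);
}
```

Using `descriptor_count` as a loop bound over descriptors would instead touch `descriptor_count` separate entries, which is the mistake the commit describes.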
Samuel Pitoiset
9c92a21fe5 radv/gfx10: fix 3D images
GFX10 does act like GFX9 actually.

This fixes
dEQP-VK.glsl.texture_functions.query.texturesize.*sampler3d_*.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:45:49 +02:00
Samuel Pitoiset
41ace1d939 radv/gfx10: re-enable fast depth/stencil clears with separate aspects
It used to cause weird issues on GFX10 in the past with vkmark and
Wreckfest, and they can't be reproduced now. Shadow Of Mordor
(Vulkan beta) hits that path and it works fine.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 09:18:06 +02:00
Samuel Pitoiset
956d825ed8 radv: do not emit rbplus if attachments are undefined
Fixes some crashes with dEQP-VK.geometry.layered.*.secondary_cmd_buffer
on Raven and other chips that allow rbplus.

This just prevents a crash and rbplus probably needs more work.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:57:31 +02:00
Samuel Pitoiset
411ad8e7c5 radv: add an assertion in radv_gfx10_compute_bin_size()
To prevent out of bounds access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:33:12 +02:00
Samuel Pitoiset
f4ab58c1a0 radv: do not create meta pipelines with 16 samples
The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.

This fixes a conditional jump reported by Valgrind on GFX10:

==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080)

This is caused by an out-of-bounds access of 'fmask_array' (i.e. the
index is 4 for 16 samples).

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:33:08 +02:00
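The out-of-bounds index mentioned in f4ab58c1a0 follows from indexing a per-sample-count table by log2 of the sample count. The sketch below assumes a table with four slots (for 1, 2, 4 and 8 samples); the constants and function names are hypothetical and only illustrate why stopping at 8 samples keeps the index in range.

```c
#include <assert.h>

#define MAX_SUPPORTED_SAMPLES 8u /* the driver supports up to 8 samples */
#define NUM_SAMPLE_SLOTS 4u      /* assumed table size: 1, 2, 4, 8 samples */

/* Map a power-of-two sample count to its table index (log2). */
static unsigned sample_count_to_index(unsigned samples)
{
    assert(samples != 0 && (samples & (samples - 1)) == 0);

    unsigned index = 0;
    while (samples > 1) {
        samples >>= 1;
        index++;
    }
    return index;
}

/* Walk only the supported sample counts, as when creating one meta
 * pipeline per count. Returns how many counts were visited. */
static unsigned count_meta_pipelines(void)
{
    unsigned created = 0;
    for (unsigned samples = 1; samples <= MAX_SUPPORTED_SAMPLES; samples *= 2) {
        unsigned index = sample_count_to_index(samples);
        assert(index < NUM_SAMPLE_SLOTS); /* 16 samples would give index 4 */
        created++;
    }
    return created;
}
```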
Samuel Pitoiset
a13320370e radv: fix updating bound fast ds clear values with different aspects
On GFX9, the driver is able to do an optimized fast depth/stencil
clear with only one aspect (ie. clear the stencil part of a
depth/stencil image). When this happens, the driver should only
update the clear values of the given aspect.

Note that it's currently only supported on GFX9 but I have some
local patches that extend this optimized path for other gens.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-22 11:16:13 +02:00
Samuel Pitoiset
b72205a4c1 radv: advertise VK_KHR_spirv_1_4
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 09:21:40 +02:00
Samuel Pitoiset
b139198b06 radv: do not dump descriptors twice in hang reports
If a pipeline has both graphics and compute, the descriptors are the same.
While we are at it, use queue->device for simplicity.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
cf5e55558e radv: dump trace files earlier if a GPU hang is detected
To make sure a trace file is generated in case the driver crashes
during the hang report generation (which happens sometimes).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
bc2319deb2 radv: print which ring is dumped in hang reports
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
076f9dce7c radv: do not print useless descriptors info in hang reports
This information has never been useful. All descriptors are
already dumped with colors etc, and it's more useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:50:39 +02:00
Samuel Pitoiset
9da94e510c radv: enable VK_KHR_shader_float_controls on GFX6-GFX7
Disable 16-bit features because fp16 isn't exposed on these chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-21 08:47:28 +02:00
Samuel Pitoiset
7c50214aab radv: implement VK_KHR_shader_float_controls
This exposes what's required for DX and this is what we already
configure. The driver flushes denorms for FP32 and preserves them
for FP16/FP64. Note that we can't allow both preserving and
flushing denorms because this won't work for merged shaders. This
will require LLVM to update the float mode register to make it work.

Only enabled on GFX8+ with the LLVM path because it's untested on
previous chips and ACO doesn't support it.

This extension is required for SPIRV 1.4.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-18 16:55:58 +02:00
Bas Nieuwenhuizen
fd21ee8b52 radv: Fix single stage constant flush with merged shaders.
e.g. a VERTEX only flush with tess on Vega should look at the TCS
to see which bits are needed.

CC: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1953
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-18 10:49:29 +00:00