fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-18 18:08:15 +02:00

Author	SHA1	Message	Date
Daniel Schürmann	576f92d900	aco: only skip RAR dependencies if the variable is killed somewhere This patch changes VMEM scheduling in a way that they can only be moved upwards by previous VMEM instructions but not downwards. This way, it improves the order of VMEM instructions in relation to their users. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Daniel Schürmann	703ce617ca	aco: restrict scheduling depending on max_waves Previously, we allowed all shaders to reduce the number of max_waves to as low as 5. Restricting this for shaders with low register demand increases the total number of waves while the VMEM def-use distances hardly change. This patch also changes the max number of move operations per MEM instruction. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-10-30 16:12:10 +00:00
Samuel Pitoiset	5a9d777f5a	radv: fix perftest options RADV_PERFTEST=outooforder has been removed a while ago. This fixes dumping the options into hang reports. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:30 +01:00
Samuel Pitoiset	c895e08281	radv: move nomemorycache debug option to the right place Fixes: `6571000071` ("radv: add debug option to turn off in memory cache") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 14:49:28 +01:00
Samuel Pitoiset	d4e0bef1bb	radv: fix dumping SPIR-V into hang reports Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 13:02:08 +00:00
Bas Nieuwenhuizen	396195e8f1	radv: Enable VK_KHR_timeline_semaphore. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	4aa75bb3bd	radv: Add wait-before-submit support for timelines. This is actually a non-threaded implementation. I'd summarize this as event-based submission. When submit happens we walk a tree of submissions that depend on the syncobj signal operations to be submitted, and if those submissions have no other dependencies we start to execute them immediately. Or, well, I still use a list to avoid issues with long chains and the stack size when using recursion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	88d41367b8	radv: Add timelines with a VK_KHR_timeline_semaphore impl. This does not fully do wait-before-submit, to be done in a follow up patch. For kernels without support for timeline syncobjs, this adds an implementation of non-shareable timelines using legacy syncobjs. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2117c53b72	radv: Add temporary datastructure for submissions. So we can defer them. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	c3eae659e7	radv: Split semaphore into two parts as enum+union. This is in preparation to adding more types. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	84d9551b23	radv: Always enable syncobj when supported for all fences/semaphores. This simplifies code for timeline semaphores by needing to support fewer configurations. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	45f4a639a8	radv: Improve fence signalling in QueueSubmit. Only signalling it once. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	a9c8424e08	radv: Do sparse binding in queue submission. So we have one place to do queue things if we end up deferring submissions. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	915e9178fa	radv: Split out commandbuffer submission. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	43ba44357c	radv: Clean up unused variable. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen	2e3a635ee6	radv: Add an early exit in the secure compile if we already have the cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:38:50 +01:00
Bas Nieuwenhuizen	d78809632f	radv: Compute hashes in secure process for secure compilation. To prevent poisoning arbitrary cache entries. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-10-30 11:37:41 +01:00
Timothy Arceri	cf25664686	radv: make use of radv_sc_read() Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	28fff3efbc	radv: add radv_sc_read() helper This is a function with timeout support for reading from the pipe between processes used for secure compile. Initially we hardcode the timeout to 5 seconds. We can adjust the timeout limit in future if needed. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Timothy Arceri	23a6827e4d	radv: allow select() calls in secure compile This will be used in the following patch to support timeouts for reading the pipe between processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-30 04:49:58 +00:00
Marek Olšák	9edcce2a32	ac: get tcc_harvested from the kernel Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 21:38:01 -04:00
Timur Kristóf	c52ebbcea4	aco: Introduce vgpr_limit to keep track of available VGPRs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Timur Kristóf	d59f702e26	aco: Implement subgroup shuffle in GFX10 wave64 mode. Previously subgroup shuffle was implemented using the bpermute instruction, which only works across half-waves, so by itself it's not suitable for implementing subgroup shuffle when the shader is running in wave64 mode. This commit adds a trick using shared VGPRs that makes it possible to implement subgroup shuffle relatively efficiently in this mode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	c2eebfe3ea	aco: Remove dead code in reduction lowering. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Rhys Perry	3865448012	aco: Fix reductions on GFX10. Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan with all reduction operations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-28 23:52:50 +00:00
Timothy Arceri	7f106a2b5d	util: rename list_empty() to list_is_empty() This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	c578600489	util: remove LIST_DEL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	255de06c59	util: remove LIST_ADDTAIL macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Timothy Arceri	7ae1be1028	util: remove LIST_INITHEAD macro Just use the inlined function directly. The macro was replaced with the function in `ebe304fa54`. Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-10-28 11:24:38 +00:00
Samuel Pitoiset	5912792501	radv: fix OpQuantizeToF16 for NaN on GFX6-7 Do not flush NaN to 0. Fixes dEQP-VK.spirv_assembly.instruction.compute.opquantize.propagated_nans Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 09:31:52 +01:00
Samuel Pitoiset	d82dfca872	radv: enable fast depth/stencil clears with separate aspects on GFX8 It's similar to GFX9+. Shadow of Mordor (Vulkan beta) hits that path and it works fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-28 07:54:11 +00:00
Eric Engestrom	c2430f3edc	radv: fix empty-body instruction Fixes: `8d43e2b2de` ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-10-27 22:10:31 +00:00
Timothy Arceri	cff53da374	radv: enable secure compile support Can be enabled via the environment variable which tells the driver how many compilation threads are expected to be called, and therefore how many forked processes the driver should create. For example we would expect to call fossilize replay with something like this: RADV_SECURE_COMPILE_THREADS=8 ./fossilize-replay --num-threads 8 \ --shader-cache-size 0 --ignore-derived-pipelines pipeline_cache.foz Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	57c95d2ce2	radv: add support for a secure compile fork at device creation This adds support for the fork, the installation of the seccomp filter, and the main loop that the actual compilation is called from, i.e. run_secure_compile_device(). Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	3f2283b3e2	radv: add radv_secure_compile() This function will be called by the parent process when doing a secure compile. It first selects a free process to work with then passes it all the information it needs to compile the pipeline. Once the pipeline information has been passed to the secure process, it then waits around to read/write any disk cache entries required before exiting. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	07692f703f	radv: for secure compile exit early from radv_shader_variant_create() We don't have permission to be creating shared memory etc. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	5cd437b1ed	radv: allow the secure process to read and write from disk cache This allows the secure process to read and write to the disk cache via the parent process. This commit just adds the functionality needed for the secure process, the following commit will add the functionality for the parent process. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	5d25aee005	radv: add radv_device_use_secure_compile() helper Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	d33f2165c9	radv: add some new members to radv device and instance for secure compile These will be used by the following commits to hold information about the forked secure compile processes. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	e8cb13d499	radv: add radv_secure_compile_type enum This will be used to identify information being passed between the parent and secure process during a secure compile. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	2d2b113e86	radv: add radv_create_shaders() to radv_shader.h In a following commit we want to be able to call this for secure compiles from radv_device.c Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	6571000071	radv: add debug option to turn off in memory cache This can be useful for debugging the on disk cache, but is also useful in the following patch for secure compiles which will be used to compile huge pipeline collections. These pipeline collections can be multiple GBs and the in memory cache grows to multiple GBs very quickly when they are compiled so we want to be able to turn off the in memory cache. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timothy Arceri	637776629d	radv: get topology from pipeline key rather than VkGraphicsPipelineCreateInfo This is cleaner and avoids having to read/write an additional copy of topology for use with secure compile. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-10-26 13:04:12 +11:00
Timur Kristóf	c580f134ae	aco: Refactor hazard mitigations, separate pass for GFX10. GFX10 hazards require a different approach compared to previous generations, for example it doesn't need s_nop, and most hazards can't be solved by adding NOPs at all. Also, they are not resolved by branch instructions. This commit reorganizes aco_insert_NOPs so that there is now a separate pass for GFX10. The new GFX10 pass also respects the control flow of the shader. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	b01847bd94	aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard. This commit refines the VMEMtoScalarWriteHazard mitigation, based upon a closer look at what LLVM does. Also changes the code to match the structure of the other hazard mitigations. * The hazard is not only triggered by VMEM, FLAT and GLOBAL but also SCRATCH and DS instructions. * The SMEM/SALU instructions only cause a hazard when they write a register that the VMEM/etc. are reading. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	c037ba1bb7	aco/gfx10: Mitigate LdsBranchVmemWARHazard. There is a hazard that occurs when there is a branch between a VMEM/GLOBAL/SCRATCH instruction and a DS instruction. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	09d676d81a	aco/gfx10: Mitigate SMEMtoVectorWriteHazard. There is a hazard that happens when an SMEM instruction reads an SGPR and then a VALU instruction writes that same SGPR. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	d6dfce02d0	aco/gfx10: Mitigate VcmpxExecWARHazard. There is a hazard when a non-VALU instruction reads the EXEC mask and then a VALU instruction writes the EXEC mask. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	e5a8616973	aco/gfx10: Mitigate VcmpxPermlaneHazard. Any permlane instruction that follows any VOPC instruction can cause a hazard, this commit implements a workaround that avoids this causing a problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	99aed688d3	aco/gfx10: Add notes about some GFX10 hazards. ACO currently mitigates VMEMtoScalarWriteHazard and Offset3fBug (names from LLVM). There are some bugs that ACO needn't care about. Just to be on the safe side, add an assertion that makes sure that we aren't hit by FlatSegmentOffsetBug. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:41 +02:00
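
Commit 28fff3efbc in the log above describes radv_sc_read() as a read with timeout support on the pipe between the parent and the secure compile process, and 23a6827e4d allows select() calls for that purpose. The pattern can be sketched in C roughly as follows, assuming a plain POSIX pipe. The name `sc_read_with_timeout` and its signature are hypothetical and are not taken from the Mesa source; the actual helper may differ in error handling and in how the timeout is applied.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the real radv_sc_read()): read exactly `size`
 * bytes from `fd`. Returns false if no data becomes readable within
 * `timeout_sec` seconds, on a read error, or if the writer closes the pipe. */
static bool
sc_read_with_timeout(int fd, void *buf, size_t size, int timeout_sec)
{
   char *dst = buf;
   size_t done = 0;

   while (done < size) {
      fd_set fds;
      /* Re-initialise on every iteration: select() may modify both. */
      struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

      FD_ZERO(&fds);
      FD_SET(fd, &fds);

      /* select() returns 0 on timeout and -1 on error. */
      int ready = select(fd + 1, &fds, NULL, NULL, &tv);
      if (ready < 0 && errno == EINTR)
         continue;
      if (ready <= 0)
         return false;

      ssize_t n = read(fd, dst + done, size - done);
      if (n < 0 && errno == EINTR)
         continue;
      if (n <= 0)
         return false; /* read error, or EOF because the writer went away */

      done += (size_t)n;
   }
   return true;
}
```

With this shape, a caller that passes 5 as `timeout_sec` would get the 5 second limit mentioned in the commit message. Note that the timeout in this sketch applies to each wait for more data, not to the transfer as a whole.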

4814 commits