fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-21 21:58:10 +02:00

Author	SHA1	Message	Date
Rhys Perry	be4a34966c	aco: fix neighboring register check in get_reg_simple() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>	2020-04-28 23:16:55 +00:00
Rhys Perry	fb59ed6bb9	aco: check alignment of non-subdword registers in get_reg_specified() When splitting a v6b vector into v1 and v2b components, we should ensure the v1 definition doesn't start at the upper half. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>	2020-04-28 23:16:55 +00:00
Rhys Perry	916cc3e231	aco: make RegisterFile::block() take a regclass Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4772>	2020-04-28 23:16:55 +00:00
Samuel Pitoiset	0549fba3cc	radv: advertise VK_AMD_memory_overallocation_behavior Doom Eternal explicitly allows overallocation via this extension but that shouldn't change anything because it's the default RADV behavior. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4785>	2020-04-28 21:03:26 +00:00
Samuel Pitoiset	5832f2b8a3	radv: track memory heaps usage if overallocation is explicitly disallowed By default, RADV supports overallocation in the sense that it doesn't reject an allocation if the target heap is full. With VK_AMD_memory_overallocation_behavior, apps can disable overallocation and the driver should account for all allocations explicitly made by the application, and reject if the heap is full. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4785>	2020-04-28 21:03:26 +00:00
Samuel Pitoiset	32035cca3f	radv: remove unused radv_device_memory::map_size field Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4785>	2020-04-28 21:03:25 +00:00
Samuel Pitoiset	523e9603d3	radv: enable FMASK for color attachments only The reason behind this is that FMASK requires CMASK and also that FMASK for non color attachments looks unnecessary. It's currently much easier to add this simple check because the driver tries to always enable DCC first and if we enable FMASK only if CMASK, we might lose some FMASK compressions. This helps fixing some new robustness2 tests which fail because only FMASK is enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4783>	2020-04-28 17:23:05 +02:00
Bas Nieuwenhuizen	7262c743dc	radv: Determine memory type for import based on fd. This would be necessary for an application to figure out if the memory was allocated using a memory type with VK_MEMORY_PROPERTY_PROTECTED_BIT. It also allows one to determine VRAM vs. GTT etc. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>	2020-04-28 15:45:03 +02:00
Bas Nieuwenhuizen	f30983be3a	radv/winsys: Add function to get domains/flags from fd. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>	2020-04-28 15:45:00 +02:00
Bas Nieuwenhuizen	bec9285027	radv: Stop using memory type indices. Lots of extra coding was involved in managing them. And for protected memory I was thinking of making a function that goes from domain+flags to memory types, which can reuse this array. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>	2020-04-28 15:44:56 +02:00
Bas Nieuwenhuizen	4a8d172d3f	radv: Use actual memory type count for setting app-visible bitset. Otherwise we might make a bitset that is too large. Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4751>	2020-04-28 15:44:27 +02:00
Samuel Pitoiset	7a0a6a7180	radv: do not expose GTT as device local memory mostly for APUs On APUs, the memory is unified (all heaps are equally fast) and apps should count all memory heaps together. But some games like Id Tech games (Youngblood and such) don't manage memory correctly on APUs and they spill everything when one VRAM heap is full. Instead of spilling buffers, they should just allocate new buffers in the second heap but it seems like these games are confused if two memory heaps have the DEVICE_LOCAL_BIT set. This is probably a first step towards better memory management on APUs but there is still some work to do if we want to run most apps with a small dedicated VRAM (256MB or so). This gives a huge boost for Id Tech games on APUs, and doesn't seem to reduce Feral games performance. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4771>	2020-04-27 22:41:41 +00:00
Bas Nieuwenhuizen	cbeda7f78e	radv: Add WSI buffers to BO list only if they can be used. Also reverse the BO list removal loop. This way typical WSI usage should find the entry in O(active swapchains) iterations, which should not be a performance issue. Tested with Doom (2016) which found the entry in 1 iteration every time. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4306>	2020-04-27 18:01:24 +00:00
Samuel Pitoiset	42b1696ef6	ac,radeonsi: fix compilation issues with LLVM 11 Latest LLVM replaced LLVMVectorTypeKind. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755>	2020-04-27 17:13:36 +00:00
Bas Nieuwenhuizen	531728d6cb	drm-uapi,radv,radeonsi: Add amdgpu_drm.h header. Use it instead of the libdrm provided amdgpu_drm.h header. I used the kernel revision from the README to get the header so the header versions should be consistent. Tested by removing /usr/include/libdrm/amdgpu_drm.h from my dev-machine. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4749>	2020-04-27 12:27:02 +00:00
Marek Olšák	cf2f3c2753	ac: reassociate FP expressions for inexact instructions for radeonsi Totals: SGPRS: 2591784 -> 2590696 (-0.04 %) VGPRS: 1666888 -> 1666736 (-0.01 %) Spilled SGPRs: 4131 -> 4107 (-0.58 %) Spilled VGPRs: 38 -> 38 (0.00 %) Private memory VGPRs: 2176 -> 2176 (0.00 %) Scratch size: 2228 -> 2228 (0.00 %) dwords per thread Code Size: 52715468 -> 52693584 (-0.04 %) bytes LDS: 92 -> 92 (0.00 %) blocks Max Waves: 479897 -> 479892 (-0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>	2020-04-27 11:20:16 +00:00
Marek Olšák	4b9370cb0f	ac: generate FMA for inexact instructions for radeonsi NIR mostly does this already. Totals: SGPRS: 2588520 -> 2591784 (0.13 %) VGPRS: 1666984 -> 1666888 (-0.01 %) Spilled SGPRs: 4074 -> 4131 (1.40 %) Spilled VGPRs: 38 -> 38 (0.00 %) Private memory VGPRs: 2176 -> 2176 (0.00 %) Scratch size: 2228 -> 2228 (0.00 %) dwords per thread Code Size: 52726872 -> 52715468 (-0.02 %) bytes LDS: 92 -> 92 (0.00 %) blocks Max Waves: 479872 -> 479897 (0.01 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>	2020-04-27 11:20:16 +00:00
Marek Olšák	f2c2a28073	ac: update and document fast math flags used by radeonsi This should have no effect, because we never use FP division, but it's safer for the future. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>	2020-04-27 11:20:16 +00:00
Marek Olšák	3bb65c0670	ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>	2020-04-27 11:20:16 +00:00
Samuel Pitoiset	574196d5f6	radv: fix robust_buffer_access if enabled via VkPhysicalDeviceFeatures2 It can be enabled via pEnabledFeatures or via VkPhysicalDeviceFeatures2. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4706>	2020-04-27 09:33:44 +02:00
Joshua Ashton	0b44582394	radv: Pass logical device to si_emit_graphics We'll need this in order to retrieve the va of a bo for a future ext. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4728>	2020-04-25 00:32:20 +00:00
Rhys Perry	5c5c2dd48f	radv/aco: enable 8/16-bit storage and int8/int16 on GFX8+ With this, Doom Eternal should now run with ACO on GFX8+. The generated 8/16-bit storage code is okay but the generated int8/int16 code is currently pretty bad but it works and apparently Doom Eternal doesn't actually use it (even though it requires it). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>	2020-04-24 20:04:39 +01:00
Rhys Perry	eeccb1a941	aco: lower 8/16-bit integer arithmetic dEQP-VK.spirv_assembly.type.* passes with the features and extensions enabled. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>	2020-04-24 20:03:59 +01:00
Rhys Perry	bcd9467d5c	aco: improve sub-dword emit_split_vector() with sgprs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	a3dc1441f0	aco: clobber scc in s_bfe_u32 in get_alu_src() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	78389f4cbc	aco: handle undef p_create_vector operands in the optimizer Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	deea4b7c5a	aco: vectorize global loads/stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	7db7206631	aco: allow 8/16-bit shared loads These should work now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	48b7beb7b0	aco: add and use get_buffer_store_op() helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	936b70c8cf	aco: refactor visit_store_scratch() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	18817041f7	aco: refactor visit_store_global() to use new helpers Should support 8/16-bit stores now Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	c7bd69b3ae	aco: refactor visit_store_ssbo() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	f75c830433	aco: refactor store_vmem_mubuf() to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	98b4cc7110	aco: refactor store_lds() to use new helpers It should also work correctly for 8/16-bit stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	562353e1f1	aco: add helpers for splitting stores split_store_data() splits a vector and p_as_uniforms it if needed. scan_write_mask()/advance_write_mask() are similar to u_bit_scan_consecutive_range(), but makes it easier to only clear part of the range and will also give ranges for zero'd bits. split_buffer_store() is a helper for splitting VMEM/SMEM stores. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	211a9f2057	aco: use emit_load helper for VMEM/SMEM loads Also implements 8/16-bit loads for scratch/global. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	57e6886f98	aco: refactor load_lds to use new helpers Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	542733dbbf	aco: add emit_load helper This helper is used for recombining split loads, passing the result to p_as_uniform, aligning the offset down and shifting it right if needed and handling large constant offsets. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	b77d638e1b	aco: add and use RegClass::get() helper Eventually, we'll probably want to replace the current RegClass(type, size) constructor with this. This has a functional change in that get_reg_class() now creates v1/v2 instead of v4b/v8b. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	69b92db131	aco: be more careful about using SMEM for load_global Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	03568249f9	radv: allocate larger shader memory slabs if needed Fixes dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 hang with ACO (features needed for the test are implemented in a later commit) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Rhys Perry	51363bd475	radv: align buffer descriptor sizes to dword This is needed to prevent bounds checking issues when loading 8/16-bit values with dword loads. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>	2020-04-24 18:52:54 +00:00
Timur Kristóf	62ff2ff808	aco: Move s_setprio to correct place after the gs_alloc_req. Previously the setprio was inside the branch, so it would only reset the priority on the first wave, but not the others. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	277f37d036	aco: Use 24-bit multiplication for NGG wave id and thread id. Both of them should always fit 24 bits anyway. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	eafc1e7365	aco: Use 24-bit multiplication in TCS I/O The TCS inputs and outputs must always fit into the LDS, which implies that their addresses also always fit 24 bits. On AMD GPUs, 24-bit multiplication is much faster than 32-bit multiplication, so we can take the opportunity to use that for TCS I/O instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	64332a0937	aco: Const correctness for aco_print_ir. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	0c0691d43e	aco: Const correctness for get_barrier_interaction. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f321dc33c8	aco: Abort when RA can't find a register. Previously, it was just unreachable, which means it would generate invalid shaders when it encountered a situation where it couldn't allocate registers, e.g. for a large load. This commit makes it slightly easier to notice such problems without triggering a GPU hang. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	f2e7aee244	aco: Increase barrier_count to 7 to include barrier_barrier. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
Timur Kristóf	25775d346c	aco: Only store TCS outputs to VMEM when they are read by TES. Totals from affected shaders (GFX10): Code Size: 10832 -> 10736 (-0.89 %) bytes Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>	2020-04-24 17:58:57 +00:00
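
The heap accounting described in commit 5832f2b8a3 above can be sketched roughly as follows. This is an illustrative sketch only, not the RADV code: the `heap_tracker` struct and the `heap_try_alloc()` and `heap_free()` helpers are hypothetical names, and the sketch assumes a single heap with no locking. It shows the behaviour the message describes: usage is tracked and over-budget allocations are rejected only when the application has explicitly disallowed overallocation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-heap bookkeeping; not a RADV structure. */
struct heap_tracker {
    uint64_t size; /* total heap size in bytes */
    uint64_t used; /* bytes currently accounted to the application */
};

/* Returns true if the allocation may proceed.
 * When overallocation is allowed (the default described above), nothing is
 * tracked and nothing is rejected. When it is disallowed, the allocation is
 * rejected if it does not fit in the remaining heap space. */
static bool heap_try_alloc(struct heap_tracker *heap, uint64_t alloc_size,
                           bool overallocation_disallowed)
{
    if (!overallocation_disallowed)
        return true;

    /* Compare against the remaining space to avoid overflowing used + size. */
    if (alloc_size > heap->size - heap->used)
        return false;

    heap->used += alloc_size;
    return true;
}

/* Returns the accounted bytes when a tracked allocation is freed. */
static void heap_free(struct heap_tracker *heap, uint64_t alloc_size,
                      bool overallocation_disallowed)
{
    if (overallocation_disallowed)
        heap->used -= alloc_size;
}
```

With a 100-byte heap and overallocation disallowed, a 60-byte request succeeds and a following 50-byte request is rejected; with overallocation allowed, both succeed and `used` is left untouched.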

5076 commits