fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-17 20:28:05 +02:00

Author	SHA1	Message	Date
Bas Nieuwenhuizen	ea08a296fe	radv: Handle VK_ATTACHMENT_UNUSED in color attachments. This just sets them to INVALID COLOR, instead of shifting the attachments together. This also fixes a number of cases where we use it first and only then check if it is VK_ATTACHMENT_UNUSED. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-07-24 01:50:52 +02:00
Dave Airlie	22bca8ef19	radv: reset non-syncobj semaphore context after wait. When I ported from libdrm, I forgot to add the line to reset the sem, we just need to reset the context. This fixes a regression in DOOM. Fixes: `9ac1432a57` ("radv: port to new libdrm API.") Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-22 00:03:26 +01:00
Dylan Baker	59a141c95a	radv: rebase radv_entrypoints_gen.py on anv_entrypoints_gen.py The two generators forked from each other, and they remain basically the same. This rebases the radv version on the anv version, but with the radv changes ported over. The result is that we get rid of the "cat |" madness and gain mako, correct "generated by" attributions, and write files out directly. The only differences between the output is whitespace and comments. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-21 14:27:02 -07:00
Alex Smith	af9d6a8a99	radv: Generate storage image descriptors unconditionally We can also use storage images internally for resolves, which don't require TRANSFER_DST usage on the image, so currently we may not create the needed descriptors. Just create these descriptors unconditionally. Fixes: `0e1886efb9` ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT") Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-22 06:40:29 +10:00
Dave Airlie	eaa56eab6d	radv: initial support for shared semaphores (v2) This adds support for sharing semaphores using kernel syncobjects. Syncobj backed semaphores are used for any semaphore which is created with external flags, and when a semaphore is imported, otherwise we use the current non-kernel semaphores. Temporary imports from syncobj fd are also available, these just override the current user until the next wait, when the temp syncobj is dropped. v2: allocate more chunks upfront, fix off by one after previous refactor of syncobj setup, remove unnecessary null check. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-21 21:31:54 +01:00
Dave Airlie	b5670beb31	radv/winsys: add syncobj hooks This just adds syncobj create/destroy/export/import paths into the winsys interface. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-21 21:31:54 +01:00
Dave Airlie	80562f2b77	ac/gpu: add code to detect if kernel supports sync objects. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-21 21:31:54 +01:00
Bas Nieuwenhuizen	21d777a122	radv: Add support for VK_KHR_variable_pointers. Just a trivial enable. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-20 09:13:01 +02:00
Bas Nieuwenhuizen	31469c0265	radv: Add VK_KHR_storage_buffer_storage_class support. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-20 09:13:01 +02:00
Dave Airlie	9ac1432a57	radv: port to new libdrm API. This bumps the libdrm requirement for amdgpu to the 2.4.82. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-20 01:56:04 +01:00
Dave Airlie	aee382510e	radv: introduce some wrapper in cs code to make porting off libdrm_amdgpu easier. This just introduces a central semaphore info struct, and passes it around, and introduces some wrappers that will make porting off libdrm_amdgpu easier. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-20 01:55:36 +01:00
Alex Smith	f25c7f9f3e	radv: Set the RADEON_SURF_OPTIMIZE_FOR_SPACE flag for images This looks like a regression from `df30123794` ("radv: use ac_compute_surface"). Before that, the opt4Space addrlib flag was set to true unless the image has FMASK (ac_compute_surface will similarly only set that flag for images without FMASK). This saves multiple gigabytes of VRAM on one of our games, and brings its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-18 16:18:35 +10:00
Dave Airlie	687d241559	radv: don't shadow meta_va. Coverity warned about dead code below, as meta_va was being shadowed. Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-18 16:17:28 +10:00
Connor Abbott	91dd2ca99f	ac/nir: rewrite shared variable handling (v2) Translate the NIR variables directly to LLVM instead of lowering to a TGSI-style giant array of vec4's and then back to a variable. This should fix indirect dereferences, make shared variables more tightly packed, and make LLVM's alias analysis more precise. This should fix an upcoming Feral title, which has a compute shader that was failing to compile because the extra padding made us run out of LDS space. v2: Combine the previous two patches into one, only use this for shared variables for now until LLVM becomes smarter. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Alex Smith <asmith@feralinteractive.com>	2017-07-17 14:16:03 -07:00
Marek Olšák	3d1a576fa6	ac/gpu_info: if clock crystal frequency is 0, print an error and set 1 During bring-up, this is often 0. Prevent automatic disablement of ARB_timer_query and demotion of the OpenGL version to 3.2 by setting a non-zero frequency. Print an error message instead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:56:59 -04:00
Marek Olšák	ddbd2f4c54	ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILE This should lead to better MSAA performance on GFX9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:56:46 -04:00
Marek Olšák	4560f2b90a	radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:50:39 -04:00
Emil Velikov	4168c162c5	radv: advertise v6 of the wayland surface extension Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV did so since day one (back in 2015) Cc: mesa-stable@lists.freedesktop.org Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-07-17 15:24:48 +01:00
Dave Airlie	9ee67467c9	radv: predicate cmask eliminate when using DCC. When using DCC some clear values don't require a cmask eliminate step. This patch adds support for black and black with alpha 1, there are other values, but I don't have access to a comprehensive list. This works by setting the cmask eliminate predicate when doing the fast clear, and later when doing the cmask elimination making sure the draws are predicated. This increases the fps on Sascha Willems deferred. Tonga: 580fps->670fps on a Tonga PRO card. Polaris 730->850fps Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:44:43 +01:00
Dave Airlie	8eed291c2c	radv/clear: add r32g32b32a32 fast clear support (v2) We can only fast clear 128-bit images if the r/g/b channels are the same, and we are using DCC. For DCC we'll bail out on translate if this isn't true, and we catch cmask clears explicitly. v2: remove 64-bit block (Bas), add uint32 as well. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:44:25 +01:00
Dave Airlie	acf1e132af	amd/addrlib: fix typo in api name. This fixes the misspelling of ALIGNMENTS in addrlib. Reviewed-by: Eduardo Lima Mitev <elima@igalia.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:44:14 +01:00
Dave Airlie	f8d5b377c8	radv: set cb base tile swizzles for MRT speedups (v4) This patch uses addrlib to workout the tile swizzles according to the surface index. It seems to produce the same values as amdgpu-pro for the deferred test. v2: don't apply swizzle to CMASK. the eg docs don't mention it, and we clearly don't align cmask for that. v3: disable surf index for dedicated images, as these will most likely be shared, and I don't think the metadata has space for this info in it yet. v4: update for shareable images, rename combined_swizzle to tile_swizzle This gets the deferred demo from 730->950fps on my rx480. (dcc cmask elim predication patches get it further) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:43:41 +01:00
Dave Airlie	b86f86f55c	radv: allow clear merging for depth/stencil with no care stencil Some of the Sascha Willems demos pick a D32/S8 format for the depth buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means we don't get to merge the undefined->depth and clear htile transitions. This add the stencil aspect to the pending clears if there is a depth clear pending and the stencil aspect is don't care. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-17 01:16:59 +01:00
Bas Nieuwenhuizen	373f707fbb	radv: Remove NV dedicated alloc extension. To not confuse apps in thinking it might be faster. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com>	2017-07-15 20:10:43 +02:00
Bas Nieuwenhuizen	515da29360	radv: Use the KHR dedicated alloc for the WSI. NV isn't valid for external images anymore. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `6ddc64b93e` "radv: Add support for VK_KHR_dedicated_allocation." Reviewed-by: Andres Rodriguez <andresx7@gmail.com>	2017-07-15 20:10:25 +02:00
Jason Ekstrand	b70829708a	radv: Implement VK_KHR_external_memory This effectively reverts commit 43a171878bb4b5aedb36a. Technically, VK_KHR_get_memory_requirements2 and VK_KHR_dedicated_allocation are required for the KHR version but this at least restores the removed functionality. This patch builds but has received zero testing. Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-15 08:59:38 -07:00
Bas Nieuwenhuizen	6ddc64b93e	radv: Add support for VK_KHR_dedicated_allocation. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-15 08:59:38 -07:00
Bas Nieuwenhuizen	97931f0297	radv: Add support for VK_KHR_get_memory_requirements2. Fished the SparseImage call out of the headers as the spec missed the definition. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-15 08:59:38 -07:00
Jason Ekstrand	3b95e03b2c	radv: Drop support for VK_KHX_external_semaphore_* These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead. Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-15 08:58:55 -07:00
Alex Smith	0e1886efb9	radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image view's descriptor was set to a 2D array (and a few other fields adjusted accordingly). This is correct when the image view is actually bound as a storage image, but not when bound as a sampled image. In that case the type should be set as a cube. Fix by generating 2 sets of descriptors at view creation time for both storage and non-storage usage, and then choose between them based on descriptor type when writing descriptor sets. v2: Generate storage descriptors for images with TRANSFER_DST, since those may be used as storage images internally. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-13 00:21:20 +02:00
Alex Smith	4d5c0c189d	radv: Fix possible invalid free of dynamic descriptors This free was left in after dynamic descriptors were changed to not be allocated separately from the descriptor set, and can cause a crash. Fixes: `39644fa40a` ("radv: Don't allocate dynamic descriptors separately") Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-13 00:21:20 +02:00
Dave Airlie	7b5f2e0070	radv/ac: drop setting xnack Since radv uses compute rings and we can't know when we are setting up the shaders what ring they are to be used on, we should just use the default xnack setting. This may be suboptimal in some places, but if we hit a problem, we likely should try and address this between llvm and mesa. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-09 22:21:43 +01:00
Dave Airlie	edf2acbeb1	radv: add support for using addrlib max alignment. Rather than using 64k, use what addrlib returns as the base alignment for vulkan allocations. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-09 22:17:59 +01:00
Bas Nieuwenhuizen	1aba0e7f58	radv: Add compute htile clear for combined depth+stencil surfaces. Figured out the clear value when we have a combined depth stencil surface. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-07-08 16:11:29 +02:00
Alex Smith	c2a5cb6427	ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics The NIR parameters are ordered "compare, data", matching GLSL, but both the image and buffer LLVM intrinsics take them the other way around. This is already handled correctly for SSBO atomics. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver"	2017-07-07 00:57:25 +02:00
Dave Airlie	8950fac6ab	radv: don't overallocate depth/stencil formats For depth/stencil formats the surface layer allocates the stencil separately, so we don't need to include it in the bpe. This reduces the size of d32s8 allocations to something closer to pro. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:23:22 +01:00
Dave Airlie	09d7c7be4f	radv: enable sisched toggle in perftest flags. RADV_PERFTEST=sisched to enable it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:07:49 +01:00
Dave Airlie	d97275e42c	ac/llvm: set xnack like radeonsi does. Use family, but only set xnack+ for gfx9. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:07:45 +01:00
Dave Airlie	01e958d631	ac/llvm: create features list using snprintf. Just more moving code around before adding things to it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:06:04 +01:00
Dave Airlie	9d9f051390	ac/radv: change api to create target machine This just modifies the API to make it easier to add other flags to target machine creation. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 23:05:59 +01:00
Dave Airlie	a6c2001ace	radv: add support for cmd predication. This doesn't get used yet, it just adds support to various PKT3 emissions to enable it later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 02:06:49 +01:00
Bas Nieuwenhuizen	860a8e6b99	ac/nir: Move VS position exports before param exports. According to Nicolai the SX can already start work when all the position exports are done, so do those first. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-07-05 20:23:00 +02:00
Bas Nieuwenhuizen	3d527ba19b	radv: Always set depthbuffer using image format instead of iview format. We have some cases where changing between depth and stencil only aspect was causing hangs. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-05 20:23:00 +02:00
Bas Nieuwenhuizen	7c7196e35c	radv: Disable depth & stencil tests when the depthbuffer doesn't support it. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Acked-by: Dave Airlie <airlied@redhat.com>	2017-07-05 20:23:00 +02:00
Dave Airlie	1bc40ae952	radv: enable Int64 capability (v2) I'm not 100% sure this is all wired up but it looks like it is. v2: actually enable extension. Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-03 11:58:59 -07:00
Connor Abbott	2ec77f7a3c	ac/nir: fix 64-bit shifts NIR always makes the shift amount 32 bits, but LLVM asserts if the two sources aren't the same type. Zero-extend the shift amount to make LLVM happy. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-03 11:58:59 -07:00
Connor Abbott	7168425dd7	ac/nir: implement 64-bit packing and unpacking We implement the split opcodes, and tell NIR to lower the original ones. The lowering to LLVM is a little more complicated, but NIR can optimize the split ones a little better, and some NIR lowering passes that we might want to use (particularly for doubles) emit the split ones. This should fix pack/unpackDouble2x32, which seems like a bug since when we enabled the Float64 capability. It will also fix pack/unpackInt2x32 when we enable the Int64 capability. Fixes: `798ae37c` ("radv: Enable Float64 support.") Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-07-03 11:58:58 -07:00
Bas Nieuwenhuizen	87d3349393	radv: Use v4i32 variant of llvm.SI.load.const. We apparently still used v16i8 .... As radeonsi doesn't use it with LLVM version checks I don't think we need them either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-30 23:30:55 +02:00
Dave Airlie	ff422500cc	ac/nir: remove last remnants of v16i8 llvm doesn't need this workaround anymore. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-28 20:22:30 +01:00
Alex Smith	909184ac9c	ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers The buffer intrinsics should be used instead of the image ones. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-06-28 21:05:04 +02:00

993 commits