fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-07 09:18:04 +02:00

Author	SHA1	Message	Date
Jason Ekstrand	66603bff6f	spirv: Claim support for the simple memory model It's rather surprising that we've never actually hit this before. Apparently, Ian's SPIR-V generator currently claims the Simple memory model when you don't do anything complex. We really shouldn't assert-fail on it. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `8ab9820d34`)	2017-10-27 18:55:46 +01:00
Marek Olšák	b0082632eb	radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI See my LLVM patch which fixes the root cause. Users have to apply this patch and then they have 2 choices: - Downgrade to LLVM 5.0 - Update to LLVM git after my LLVM patch is pushed. It won't be possible to use current and earlier development version of LLVM 6.0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 17.3 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `3f8e3c2bd8`)	2017-10-27 18:55:43 +01:00
Leo Liu	3da6dd8003	radeon/video: add gfx9 offsets when rejoin the video surface For CPU access. Signed-off-by: Leo Liu <leo.liu@amd.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> (cherry picked from commit `ea3dc75d72`)	2017-10-27 18:55:41 +01:00
Jason Ekstrand	2e33d68046	anv/pipeline: Call nir_lower_system_values after brw_preprocess_nir We currently have a bug where nir_lower_system_values gets called before nir_lower_var_copies so it will miss any system value uses which come from a copy_var intrinsic. Moving it to after brw_preprocess_nir fixes this problem. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `279f8fb69c`)	2017-10-27 18:55:38 +01:00
Jason Ekstrand	3b699fdd19	anv/pipeline: Drop nir_lower_clip_cull_distance_arrays We already handle it in brw_preprocess_nir Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit `afa0ddb81e`)	2017-10-27 18:55:24 +01:00
Jason Ekstrand	a2123968fa	intel/fs: Handle flag read/write aliasing in needs_src_copy In order to implement the ballot intrinsic, we do a MOV from flag register to some GRF. If that GRF is used in a SEL, cmod propagation helpfully changes it into a MOV from the flag register with a cmod. This is perfectly valid but when lower_simd_width comes along, it simply splits into two instructions which both have conditional modifiers. This is a problem since we're reading the flag register. This commit makes us check whether or not flags_written() overlaps with the flag values that we are reading via the instruction source and, if we have any interference, will force us to emit a copy of the source. Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org (cherry picked from commit `fa6e74e33e`)	2017-10-27 18:50:27 +01:00
Jan Vesely	1ce3fbeb91	clover: Fix compilation after clang r315871 v2: use a more generic compat function v3: rename and formatting cleanup Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103388 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Francisco Jerez <currojerez@riseup.net> CC: <mesa-stable@lists.freedesktop.org> (cherry picked from commit `a6d38f476b`)	2017-10-27 18:50:24 +01:00
Jason Ekstrand	8f2bc19856	nir/intrinsics: Set the correct num_indices for load_output Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (cherry picked from commit `c1b84256cc`)	2017-10-27 18:50:21 +01:00
Matthew Nicholls	b6f0c16a89	ac/nir: generate correct instruction for atomic min/max on unsigned images v2: fix silly typo Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit `27a0b24bf2`)	2017-10-27 18:50:19 +01:00
Dave Airlie	5c8eb88553	radv: use device name in cache creation like radeonsi. Not sure how useful this is, but it makes it more consistent. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: "17.3" <mesa-stable@lists.freedesktop.org> Signed-off-by: Dave Airlie <airlied@redhat.com> (cherry picked from commit `d8cefaa197`)	2017-10-27 18:50:12 +01:00
Alex Smith	afdb9da492	radv: Update code pointer correctly if a variant is already created This was the actual cause of GPU hangs fixed by `0fdd531457` ("radv: Fix pipeline cache locking issues"), since multiple threads would end up trying to create the variants for a single entry. Now that we're locking around the whole of this function, this isn't really necessary (we either create all or none of the variants), but fix this anyway in case things change later. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> CC: 17.3 <mesa-stable@lists.freedesktop.org> (cherry picked from commit `fee9d05e21`)	2017-10-27 18:50:09 +01:00
Kenneth Graunke	b8f10fdf34	i965: Revert absolute mode for constant buffer pointers. The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2 registers at context initialization time. Instead, they're inherited from whatever happened to be running on the GPU prior to first run of a new context. So, when we started setting these, other contexts in the system started inheriting our values. Since this controls whether 3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong setting is fatal for almost any process which isn't expecting this. Unfortunately, VA-API and Beignet don't initialize this (nor does older Mesa), so they will die horribly if we start doing this. UXA and SNA don't use any push constants, so they are unaffected. Until we have some kind of solution to this problem, I'm going to revert this patch and abandon using the feature for now. It will lead to fewer pushed UBO ranges on Broadwell+, which may lead to lower performance, though I don't have any data on the impact. Cc: "17.3 17.2" <mesa-stable@lists.freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774 (cherry picked from commit `013d331220`)	2017-10-27 18:50:07 +01:00
Nicolai Hähnle	ea132f9265	amd/common/gfx9: workaround DCC corruption more conservatively Fixes KHR-GL45.texture_swizzle.smoke and others on Vega. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102809 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> (cherry picked from commit `f9ccfda9bc`)	2017-10-27 18:50:04 +01:00
Ilia Mirkin	08b41e70dd	glsl: fix derived cs variables There are two issues with the current implementation. First, it relies on the layout(local_size_*) happening in the same shader as the main function, and secondly it doesn't work for variable group sizes. In both cases, the simplest fix is to move the setup of these derived values to a later time, similar to how the gl_VertexID workarounds are done. There already exist system values defined for both of the derived values, so we use them unconditionally, and lower them after linking is performed. While we're at it, we move to using gl_LocalGroupSizeARB instead of gl_WorkGroupSize for variable group sizes. Also the dead code elimination avoidance can be removed, since there can be situations where gl_LocalGroupSizeARB is needed but has not been inserted for the shader with main function. As a result, the lowering code has to insert its own copies of the system values if needed. Reported-by: Stephane Chevigny <stephane.chevigny@polymtl.ca> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103393 Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (cherry picked from commit `4d24a7cb97`)	2017-10-27 18:50:02 +01:00
Emil Velikov	ae720e2873	Update version to 17.3.0-rc1 Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 13:30:56 +01:00
Juan A. Suarez Romero	2665d012a8	radv: automake: include radv_extensions.py in the tarball Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-10-23 12:37:01 +02:00
Bas Nieuwenhuizen	a548b727a1	ac/nir: Only clamp shadow reference on radeonsi. Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a usecase with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: `0f9e32519b` 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 09:13:38 +02:00
Bas Nieuwenhuizen	c07d719e8b	radv: Disallow indirect outputs for GS on GFX9 as well. Since it also uses the output vector before writing to memory. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-23 00:27:44 +02:00
Bas Nieuwenhuizen	2c5b43c87f	ac/nir: Fix nir_texop_lod on GFX for 1D arrays. Fixes: `1bcb953e16` 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 00:27:44 +02:00
Dave Airlie	da9c3cd3ee	radv/ac/nir: only emit tess factors to storage if tes reads them Otherwise we just need to write them to the tf ring. This seems to improve the tessellation demo on Bonaire ~2190->~2230 fps Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen	6ce550453f	radv: Don't use vgpr indexing for outputs on GFX9. Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.* tests. Fixes: `e38685cc62` 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-22 02:36:37 +02:00
Bas Nieuwenhuizen	ad727b96b6	ac/nir: Account for compact array index in GS input load from LDS. Mirrors the vram path. Fixes: `d4ecc3c929` 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen	67648c0faa	radv: Don't compile shaders when they are cached already. When the gs_copy_shader is NULL (due to an incomplete cache), but the main shaders are found, we still do the nir, but we shouldn't compile the shaders again. For merged shaders we should also account for the missing shaders. Fixes: `ce03c119ce` 'radv: Add code to compile merged shaders.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:34 +02:00
Bas Nieuwenhuizen	3bf954b28e	radv: Don't check for max GL GS invocations. We specify 127 instead of 32 as the limit in vulkan. Fixes: `6bc42855f9` 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 22:29:09 +02:00
Bas Nieuwenhuizen	050f7e2df2	radv: Don't explicitly reference vertex shader for draw_id. With merged shaders the vertex shader may not exist. This got in because the offending patch was written before merged shaders were upstream, but committed after. Fixes: `75dfab24a2` 'radv: refactor indirect draws with radv_draw_info' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 20:00:22 +02:00
Bas Nieuwenhuizen	20fb15bfe4	radv: Don't reset cmd_buffer->state.dirty. Otherwise for non-indexed draws we set and immediately unset RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should clear their own bit, this is unnecessary. Fixes: `341529dbee` 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 20:00:16 +02:00
Bas Nieuwenhuizen	fb55477990	radv: Correctly detect changed shaders for vertex descriptors. As they were emitted after the new pipeline, the changed pipeline detection was not working anymore. Fixes: `341529dbee` 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-10-21 19:59:44 +02:00
Bas Nieuwenhuizen	24fe4e6143	ac/nir: Set larger workgroup size for GS on GFX9. They don't take a single wave anymore and we need the barriers. Fixes: `6bc42855f9` 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen	9e82f2b3ea	ac/nir: Take the max workgroup size of all provided shaders. Fixes: `ffaf4d608a` 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-21 12:46:28 +02:00
Alex Smith	0fdd531457	radv: Fix pipeline cache locking issues Need to lock around the whole process of retrieving cached shaders, and around GetPipelineCacheData. This fixes GPU hangs observed when creating multiple pipelines in parallel, which appeared to be due to invalid shader code being pulled from the cache. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 03:52:43 +02:00
Lionel Landwerlin	c71d44c7f8	anv: don't assert on device init on Cannonlake v2: Warn that support is still in alpha (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Lionel Landwerlin	0c95adaf9e	anv: disable stencil pma fix on Gen > 9 This workaround isn't listed on Gen10. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Lionel Landwerlin	0c92651a3b	blorp: enable R32G32B32X32 blorp ccs copies Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-10-21 02:37:33 +01:00
Eric Anholt	48615d1ead	meson: Fix vc5 deps on the XML-generated headers. I typoed and was depending on v3d_xml.h (the gzipped xml), not on the v3d_packet_v33_pack.h that the compiler and QPU packing actually use.	2017-10-20 17:16:00 -07:00
Eric Anholt	07bfdb478b	broadcom/vc5: Propagate vc4 aliasing fix to vc5. See `e5fea0d621`	2017-10-20 17:09:47 -07:00
Stefan Schake	e5fea0d621	broadcom/vc4: Fix aliasing issue This was causing Android clang version 3.8.256229 to miscompile, presumably due to strict aliasing. Fixes: `14dc281c13` ("vc4: Enforce one-uniform-per-instruction after optimization.")	2017-10-20 17:09:35 -07:00
Dylan Baker	035ec7a2bb	meson: Add support for EGL glvnd Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Lyude Paul <lyude@redhat.com>	2017-10-20 16:46:48 -07:00
Dylan Baker	108d257a16	meson: build libEGL This is based heavily on Daniel Stone's work for the same, rebased on master and with a number of TODO's fixed. This does not implement glvnd (which is coming in a later patch) Meson builds egl slightly differently than autotools, namely it doesn't build an intermediate shared library. It doesn't do this because meson doesn't have problems with the name of the library being dynamically generated, so the glvnd and non-glvnd code can follow the same path. v2: - Don't reuse variable (Eric E.) Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2017-10-20 16:46:48 -07:00
Dylan Baker	ddf06a05ad	meson: move wayland_drm_protocol generation to wayland-drm These files are needed by both vulkan wayland-wsi and by egl wayland-wsi. Since the XML file is in src/egl/wayland/wayland-drm and we can include this directory in such a way that it will be loaded before egl and vulkan, this allows us to avoid multiple calls to the same generator. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>	2017-10-20 16:46:48 -07:00
Dylan Baker	8d3b1210cb	meson: Don't allow glx to be built without platform_x11 Previously this failed to change with_glx to disabled from auto if platform_x11 was unset or if no opengl apis were being built. v2: - swap conditional positions Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-and-Tested-by: Eric Engestrom <eric@engestrom.ch>	2017-10-20 16:46:48 -07:00
Dylan Baker	8792a9e01b	meson: bump libdrm_amdgpu requirement to 2.4.85 fixes: `b603725703` ("configure.ac: Bump libdrm_amdgpu version to 2.4.85.") Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-20 16:45:39 -07:00
Eric Anholt	5a0d3e1129	nir: Print the components referenced for split or packed shader in/outs. Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative, much better to see: decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0) v2: Handle arrays and structs better (by Timothy) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-20 16:26:46 -07:00
Eric Anholt	d9ce4ac990	nir: Add a safety check that we don't remove dead I/O vars after lowering. The pass only looks at var load/store intrinsics, not input load/store intrinsics, so assert that we don't see the other type. v2: Adjust comment indentation. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-10-20 16:26:07 -07:00
Andres Rodriguez	a2c6fbb3ee	radv: disable implicit sync for radv allocated bos v3 Implicit sync kicks in when a buffer is used by two different amdgpu contexts simultaneously. Jobs that use explicit synchronization mechanisms end up needlessly waiting to be scheduled for long periods of time in order to achieve serialized execution. This patch disables implicit synchronization for all radv allocations except for wsi bos. The only systems that require implicit synchronization are DRI2/3 and PRIME. v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC v3: Add drm version check (Bas) Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:15:54 +02:00
Andres Rodriguez	eff2bdbd82	radv: factor out radv_alloc_memory This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:15:49 +02:00
Andres Rodriguez	92724338ba	radv: Expose VK_EXT_global_priority Expose the extension string as supported Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	9f7edf4d1f	radv: don't skip PS/VS partial flush This patch helps lower high priority compute latency. Found by bisecting a perf regression on computeparticles with high priority compute queues enabled. Reverting this micro-optimization doesn't seem to have any negative effect on performance on Dota2 or ssao. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	fd04f3eb86	radv: Implement VK_EXT_global_priority This extension allows the caller to change a queue's system wide priority. This is useful for applications with specific latency constraints. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	557de3b9ae	radeonsi: hardcode shader WAVE_LIMIT to the maximum value This is part of a cooperative scheduling approach used by radv. All drivers in the stack must opt-in to resource arbitration, otherwise GL based apps will be able to ignore system priorities. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Andres Rodriguez	986c4b0bd4	radv: hardcode shader WAVE_LIMIT to the maximum value When WAVE_LIMIT is set, a submission will opt-in for SPI based resource scheduling. Because this mechanism is cooperative, we must ensure that all submissions have this field set, otherwise they will bypass resource arbitration. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00

96957 commits