fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-23 04:40:09 +01:00

Author	SHA1	Message	Date
Marc-André Lureau	cf54bd5e83	drisw: use shared memory when possible If drisw_loader_funcs implements put_image_shm, allocates display target data with shared memory and display with put_image_shm(). Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:54 +10:00
Marc-André Lureau	63c427fa71	drisw: use putImageShm if available If the DRIswrastLoaderExtension implements putImageShm, bind it to drisw_loader_funcs. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Adam Jackson <ajax@redhat.com>	2018-05-30 09:11:53 +10:00
Dave Airlie	b7ac0779e0	gallium/winsys: rename DRM_API_HANDLE_* to WINSYS_HANDLE_* This just renames this as we want to add an shm handle which isn't really drm related. Originally by: Marc-André Lureau <marcandre.lureau@gmail.com> (airlied: I used this sed script instead) This was generated with: git grep -l 'DRM_API_' | xargs sed -i 's/DRM_API_/WINSYS_/g' Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Marc-André Lureau	d2eaff33d0	gallium: move winsys handle to its own file. This will be used in the drisw interface later, which isn't drm specific. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-30 09:11:53 +10:00
Francisco Jerez	4bd2047dee	intel/fs: Add explicit last_rt flag to fb writes orthogonal to eot. When using multiple RT write messages to the same RT such as for dual-source blending or all RT writes in SIMD32, we have to set the "Last Render Target Select" bit on all write messages that target the last RT but only set EOT on the last RT write in the shader. Special-casing for dual-source blend works today because that is the only case which requires multiple RT write messages per RT. When we start doing SIMD32, this will become much more common so we add a dedicated bit for it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	d3cd6b7215	intel/fs: Replace the CINTERP opcode with a simple MOV The only reason it was its own opcode was so that we could detect it and adjust the source register based on the payload setup. Now that we're using the ATTR file for FS inputs, there's no point in having a magic opcode for this. v2 (Jason Ekstrand): - Break the bit which removes the CINTERP opcode into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	39de901a96	intel/fs: Use the ATTR file for FS inputs This replaces the special magic opcodes which implicitly read inputs with explicit use of the ATTR file. v2 (Jason Ekstrand): - Break into multiple patches - Change the units of the FS ATTR to be in logical scalars Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	4bfa2ac2ea	intel/fs: Rename a local variable so it doesn't shadow component() v2 (Jason Ekstrand): - Break the refactor into its own patch Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Francisco Jerez	11c71f0e75	intel/eu: Remove brw_codegen::compressed_stack. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	71a86d1fc6	intel/fs: Use groups for SIMD16 LINTERP on gen11+ This is better than compression control because it naturally extends to SIMD32. v2: - Push/pop instruction state around adjusted codegen (Ken) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jason Ekstrand	a1a850cd34	intel/fs: Assert that the gen4-6 plane restrictions are followed The fall-back does not work correctly in SIMD16 mode and the register allocator should ensure that we never hit this case anyway. Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-05-29 15:44:50 -07:00
Jan Vesely	41b878e1bd	clover: Cleanup compat code for llvm < 3.9 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com>	2018-05-29 17:36:16 -04:00
Jan Vesely	d424be0fed	clover: Fix build after llvm r332881. v2: fix whitespace and indentation r332881 added an extra parameter to the emit function. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106619 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-By: Aaron Watry <awatry@gmail.com> Tested-By: Aaron Watry <awatry@gmail.com> Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>	2018-05-29 17:36:16 -04:00
Chris Wilson	3ac5fbadfd	i965: Only emit VF cache invalidations when the high bits change Commit `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") tried to only emit the VF invalidate if the high bits changed, but it accidentally always set need_invalidate to true, causing it to unconditionally emit the pipe control before every primitive. Fixes: `92f01fc5f9` ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-29 12:16:26 -07:00
Eric Engestrom	e4fe2fd3bb	vulkan: don't free uninitialised memory The modifiers array hasn't been initialised by then, much less with data that would need freeing. Move the label after the loop to fix this. Fixes: `c80c08e226` ("vulkan/wsi/x11: Add support for DRI3 v1.2") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	51a17e7fee	dri: replace two-way switch case with a table lookup Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> --- v2: rebased on top of `432df741e0` "dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format."	2018-05-29 17:44:13 +01:00
Eric Engestrom	d3ca7bd452	dri: fix error value returned by driGLFormatToImageFormat() 0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum. It is, however, the value of MESA_FORMAT_NONE, which two of the callers (i915 & i965) checked for. The other callers (that check for errors, ie. st/dri) already check for __DRI_IMAGE_FORMAT_NONE. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-05-29 17:44:13 +01:00
Eric Engestrom	1945231b48	egl/x11: fix build with DRI3 disabled Fixes: `473af0b541` "egl/x11: deduplicate depth-to-format logic" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Tested-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Yogesh Marathe <yogesh.marathe@intel.com>	2018-05-29 17:01:21 +01:00
Thierry Reding	9e539012df	tegra: Treat resources with modifiers as scanout Resources created with modifiers are treated as scanout because there is no way for applications to specify the usage (though that capability may be useful to have in the future). Currently all the resources created by applications with modifiers are for scanout, so make sure they have bind flags set accordingly. This is necessary in order to properly export buffers for such resources so that they can be shared with scanout hardware. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:37 +02:00
Thierry Reding	9603d81df0	tegra: Fix scanout resources without modifiers Resources created for scanout but without modifiers need to be treated as pitch-linear. This is because applications that don't use modifiers to create resources must be assumed to not understand modifiers and in turn won't be able to create a DRM framebuffer and pass along which modifiers were picked by the implementation. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:48:34 +02:00
Thierry Reding	bd3e97e5aa	tegra: Remove usage of non-stable UAPI This code path is no longer required with framebuffer modifier support. Tested-by: Daniel Kolesa <daniel@octaforge.org> Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Thierry Reding <treding@nvidia.com>	2018-05-29 16:47:45 +02:00
Karol Herbst	56792a0876	nir/print: fix printing of 8/16 bit constant variables v2 (Jose Maria Casanova Crespo <jmcasanova@igalia.com>): add float16 support Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>	2018-05-29 13:43:49 +02:00
Pierre Moreau	f0e80e123c	nv50/ir: Extend ImmediateValue::applyLog2 to 64-bit integers Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Pierre Moreau	03f592a164	util/u_math: Implement a logbase2 function for unsigned long v2 (Karol Herbst <kherbst@redhat.com>): * removed unneeded ll * ll -> ull Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-29 13:37:45 +02:00
Samuel Pitoiset	88d1ed0f81	radv: emit shader descriptor pointers consecutively This reduces the number of SET_SH_REG packets which are emitted for applications that use more than one descriptor set per stage. We should be able to emit more SET_SH_REG packets consecutively (like push constants and vertex buffers for the vertex stage), but this will be improved later. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:18 +02:00
Samuel Pitoiset	21baf33a94	radv: allow radv_emit_shader_pointer_head() to emit more pointers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:16 +02:00
Samuel Pitoiset	288fe7ec71	radv: split radv_emit_shader_pointer() This will allow to emit consecutive shader pointers for reducing the number of emitted SET_SH_REG packets, which is recommended. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-29 10:07:13 +02:00
Rhys Perry	57e721a456	gm107/ir: prevent WaW hazards in instruction scheduling Previously, findFirstUse() only considered reads "uses". This fixes that by making it check both an instruction's sources and definitions. It also shortens both findFirstUse() and findFirstDef() along the way. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2018-05-28 13:59:56 -04:00
Bas Nieuwenhuizen	a29bc043ae	radv: Implement VK_KHR_draw_indirect_count. Literally the same as the AMD ext. Passes indirect_draw_count CTS tests. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:08:26 +02:00
Bas Nieuwenhuizen	b0002e4e05	vulkan: Update header+vk.xml to 1.1.76 Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 12:08:20 +02:00
Bas Nieuwenhuizen	6914d5a2c0	radv: Implement alternate GFX9 scissor workaround. This improves dota2 performance for me by 11% when I force the GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my threadripper), which should be a large part of the radv-amdvlk gap. (For me with that was radv 60.3 -> 66.6, while AMDVLK does about 68 fps) It looks like dota2 rendered the GUI with a bunch of draws with a SetScissors before almost each draw, causing a lot of pipeline stalls. I'm not really happy with the duplication of code, but overriding radeon_set_context_reg would also be messy since we have the pre-recorded pipelines and a bunch of si_cmd_buffer code, as well as some memory->context reg loads for which things would be more complicated. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-28 12:04:25 +02:00
Eric Anholt	3b6dfcf7ae	Revert "st/nir: use NIR for asm programs" This reverts commit `5c33e8c772`. It broke fixed function vertex programs on vc4 and v3d, and apparently caused trouble for radeonsi's NIR paths as well. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> https://bugs.freedesktop.org/show_bug.cgi?id=106673	2018-05-28 14:41:03 +10:00
Scott D Phillips	4714784dae	anv: move canonical_address calculation into a separate function A later patch will make use of this in other places. Also, remove dependency on undefined behavior of left-shifting a signed value. v2: - move function into a separate header (Chris) v3: (by Ken) Add new header to the various build systems. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-27 19:24:33 -07:00
Gert Wollny	1aec4a07d4	r600: Fix SSG when not all components are written Make sure only those components are written to that are specified in the write mask. Fixes: dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_float_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.lowp_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.mediump_vec3_fragment dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_vertex dEQP-GLES2.functional.shaders.operator.common_functions.sign.highp_vec3_fragment Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Gert Wollny	42cd2810aa	r600: Correct IDIV if DST and SRC use the same temporary In cases like IDIV TEMP[0].xy TEMP[0].xx TEMP[1].yy the result will be written to the same register that is also a source register. Since the components are evaluated one by one, this may result in overwriting the source value for a later operation. Work around this by adding another temporary to store the result if the destination temporary index is equal to one of the source temporary indices. Fixes: dEQP-GLES2.functional.shaders.operator.binary_operator.div.* Signed-off-by: Gert Wollny <gw.fossdev@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-05-28 02:57:46 +01:00
Kenneth Graunke	58fb613a51	i965: Revert recent tiled memcpy changes. This reverts commit `79fe00efb4`. This reverts commit `f5e8b13f78`. This reverts commit `d21c086d81`. They broke the Android build and I'd rather not leave it broken for the long holiday weekend.	2018-05-26 16:25:50 -07:00
Scott D Phillips	79fe00efb4	i965/miptree: Use cpu tiling/detiling when mapping Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit `b499b85b0f`. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Chris Wilson	f5e8b13f78	i915: Fix streaming loads for intel_tiled_memcpy We stream from a tiled and aligned source into an unaligned user buffer, so we need to use _mm_storeu_si128. Fixes: `d21c086d81` (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 21:35:50 -07:00
Marek Olšák	18c50498db	radeonsi: remove unused variable addr_vec trivial	2018-05-25 18:37:57 -04:00
Jason Ekstrand	ae514ca695	intel/blorp: Support blits and clears on surfaces with offsets For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 14:01:44 -07:00
Marek Olšák	2f65c67043	radeonsi: fix passing gl_ClipVertex for GS and tess Also add the fprintf call. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a7d61c0753	radeonsi: fix color inputs/outputs for GS and tess GS is tested, tessellation is untested. Have outputs_written_before_ps for HW VS and outputs_written for other stages. The reason is that COLOR and BCOLOR alias for HW VS, which drives elimination of VS outputs based on PS inputs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	92ea9329e5	radeonsi: fix incorrect parentheses around VS-PS varying elimination I don't know if it caused issues. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a4ba7cd6a2	st/mesa: simplify lastLevel determination in st_finalize_texture This fixes shader images where we always bind stObj->pt and not individual gl_texture_images. Roughly based on i965 commit `845ad2667a` which does a similar thing but for a different reason. This fixes GL CTS assertion failures introduced by Ilia. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:31:36 -04:00
Scott D Phillips	d21c086d81	i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-05-25 11:05:46 -07:00
Alok Hota	fb20ae0374	swr/rast: Adjusted avx512 primitive assembly for msvc codegen Optimize AVX-512 PA Assemble (PA_STATE_OPT). Reduced generated code by about 4x, MSVC compiler was going crazy making temporaries and split-loading inputs onto the stack unless explicit AVX-512 load ops were added Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:57:02 -05:00
Alok Hota	b3360f5c8b	swr/rast: Moved memory init out of core swr init Added two new files for a wrapper function for initialization v2: added missing include for single architecture builds Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:55 -05:00
Alok Hota	b6b114c1ae	swr/rast: Removed superfluous JitManager argument from passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:49 -05:00
Alok Hota	98d0201577	swr/rast: Renamed MetaData calls Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:43 -05:00
Alok Hota	14b5cac0be	swr/rast: Use metadata to communicate between passes Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2018-05-25 10:56:37 -05:00
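
Commit 4714784dae in the log above mentions moving the canonical_address calculation into its own function and removing a dependency on the undefined behavior of left-shifting a signed value. The following is a minimal sketch of one well-defined way to do such a sign extension, not the code from that commit. It assumes 48-bit virtual addresses (bit 47 is the sign bit), and the function name `canonical_address_sketch` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the Mesa source.
 * Assumption: addresses are 48 bits wide, so bit 47 must be
 * replicated into bits 48..63 to form a canonical address. */
static inline uint64_t canonical_address_sketch(uint64_t addr)
{
    const uint64_t sign_bit = UINT64_C(1) << 47;
    const uint64_t low_mask = (UINT64_C(1) << 48) - 1;

    /* Keep only the low 48 bits, discarding whatever was above them. */
    uint64_t low = addr & low_mask;

    /* XOR-then-subtract sign extension. Everything is unsigned, so the
     * subtraction wraps modulo 2^64 and no signed shift is involved. */
    return (low ^ sign_bit) - sign_bit;
}
```

Because every operation is on `uint64_t`, the result does not rely on how a compiler treats shifts of negative values. An address with bit 47 set comes back with bits 48 to 63 set, and one with bit 47 clear comes back with those bits cleared.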


95186 commits