fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-04-30 21:21:39 +02:00

Author	SHA1	Message	Date
Lionel Landwerlin	903e9952fb	i965: add performance query support on CNL v2: Add brw_oa_cnl.xml to EXTRA_DIST (Emil) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	e7f6d1e5f8	i965: perf: add support for new equation operators Some equations of the CNL metrics started to use operators we haven't defined yet, just add those. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	57a11550bc	i965: perf: query topology With the introduction of asymmetric slices in CNL, we cannot rely on the previous SUBSLICE_MASK getparam to tell userspace what subslices are available. We introduce a new uAPI in the kernel driver to report exactly what part of the GPU are fused and require this to be available on Gen10+. Prior generations can continue to rely on GETPARAM on older kernels. This patch is quite a lot of code because we have to support lots of different kernel versions, ranging from not providing any information (for Haswell on 4.13 through 4.17), to being able to query through GETPARAM (for gen8/9 on 4.13 through 4.17), to finally requiring 4.17 for Gen10+. This change stores topology information in a unified way on brw_context.topology from the various kernel APIs. And then generates the appropriate values for the equations from that unified topology. v2: Move slice/subslice masks fields to gen_device_info (Rafael) v3: Add a gen_device_info_subslice_available() helper (Lionel) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c1900f5b0f	intel: devinfo: add helper functions to fill fusing masks values There are a couple of ways we can get the fusing information from the kernel : - Through DRM_I915_GETPARAM with the SLICE_MASK/SUBSLICE_MASK parameters - Through the new DRM_IOCTL_I915_QUERY by requesting the DRM_I915_QUERY_TOPOLOGY_INFO The second method is more accurate and also gives us the EUs fusing masks. It's also a requirement for CNL as this platform has asymetric subslices and the first method SUBSLICE_MASK value is assumed uniform across slices. v2: Change gen_device_info_update_from_masks() to generate topology and call into gen_device_info_update_from_topology (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	2d26c99933	intel: devinfo: meson: include drm uapi Already available with the autotools build. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	c471716574	intel: devinfo: store slice/subslice/eu masks We want to store values coming from the kernel but as a first step, we can generate mask values out the numbers already stored in the gen_device_info masks. v2: Add a helper to set EU masks (Lionel/Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	7e2c6147da	intel: devinfo: store number of EUs per subslice This will be reused to store values reported by the kernel. The main use case will be for use as the input values of the metric sets equations for the INTEL_performance_queries extension. By storing this information in the gen_device_info we make this non GL specific so this can be reused by Vulkan if we ever have an equivalent extension. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 20:14:22 +00:00
Lionel Landwerlin	1603ce1921	i965/perf: fix config registration when uploading to kernel When registring configurations to the kernel for the first time, we run into an issue where the id number is not properly set (we're using the wrong variable). As a result when trying to use that id later on, we get an error. This issue manifest itself the first time you use frameretrace after reboot, subsequent runs are fine. Fixes: `27ee83eaf7` ("i965: perf: add support for userspace configurations") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 18:21:57 +00:00
Lepton Wu	a8b846bccd	gallium/winsys/kms: Add support for multi-planes Add a new struct kms_sw_plane which delegate a plane and use it in place of sw_displaytarget. Multiple planes share same underlying kms_sw_displaytarget. v2: - add more check for plane size (Tomasz) v3: - split from larger patch (Emil) v4: - no change from v3 v5: - remove mapped field (Tomasz) v6: - remove change-id in commit message (Tomasz) v7: - add revision history in commit message (Emil) Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:44 +00:00
Lepton Wu	d891f28df9	gallium/winsys/kms: Fix possible leak in map/unmap. If user calls map twice for kms_sw_displaytarget, the first mapped buffer could get leaked. Instead of calling mmap every time, just reuse previous mapping. Since user could map same displaytarget with different flags, we have to keep two different pointers, one for rw mapping and one for ro mapping. Also introduce reference count for mapped buffer so we can unmap them at right time. v2: - avoid duplicated mapping and leaked mapping (Tomasz) v3: - split from larger patch (Emil) v4: - remove munmap from dt_destory (Emil) v5: - introduce reference count for mapping (Tomasz) - add back munmap in dt_destory v6: - remove change-id in commit message (Tomasz) v7: - remove munmap from dt_destory again (Emil) - add revision history in commit message (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Lepton Wu <lepton@chromium.org>	2018-03-22 18:10:42 +00:00
Juan A. Suarez Romero	4db269f30c	broadcom/vc4: add path to nir_builder.h As the other VC4 files do. Otherwise, it won't find nir_builder.h v2: add path in source code rather changing autotools (Emil) Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	d39e828c82	autotools: add tegra header files Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	40ecee89b7	swr/rast: autotools: add events_private.proto in dist tarball. Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	0bf1274883	radv: autotools: add radv_extensions.h in the generated VULKAN list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	13459c637a	anv/radv: autotools: include vulkan_*.h headers Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Juan A. Suarez Romero	f8b749b7c0	nir: autotools, meson: add GLSL.ext.AMD.h in the files list Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2018-03-22 18:25:39 +01:00
Matt Turner	724586a266	intel/compiler: Readd ICL to test_eu_validate.cpp Now that the PCI IDs are upstream, this can be readded.	2018-03-22 09:56:09 -07:00
Matt Turner	65b060d9cb	intel/compiler: Skip 64-bit type tests when types not available Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1065acfb69	intel: Disable fast color clear on icl Disabling fast color clear makes fbo-clearmipmap test render correct texture in base miplevel. Fast color clear is anyways disabled for non-base miplevels. Acked-by: Matt Turner <mattst88@gmail.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Jason Ekstrand	d2eecf0b0b	intel/compiler/icl: Clear "null render target" bit in extended message descriptor Otherwise all our render target writes go no where. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	1484876ef7	intel/compiler/icl: Update the assert in brw_stage_has_packed_dispatch() Rafael ran piglit with the test code enabled and saw no additional GPU hangs. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	f05e0d9c2a	intel/common/icl: Disable hiz surface sampling On gen11+ AUX_HIZ is not a supported value for surfaces being sampled by the 3D sampler. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Anuj Phogat	370af9dcc0	intel/common/icl: Add L3 config ICL uses the same L3 configs as CNL, just leaving the SLM configs out. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Matt Turner	f56693af4b	intel/tools/aubinator: Drop platform list from print_help() We all know the platform names, and I don't want to update this list continually. Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2018-03-22 09:56:09 -07:00
Derek Foreman	aa18a63512	egl/wayland: Make swrast display_sync the correct queue commit `03dd9a88b0` introduced per surface queues, but the display_sync for swrast_commit_backbuffer remained on the old queue. This is likely to break when dispatching the correct queue at the top of function (which can't dispatch the sync callback we're waiting for). The easiest known reproduction case is running weston-subsurfaces under weston --use-pixman Signed-off-by: Derek Foreman <derekf@osg.samsung.com> Reviewed-by: Daniel Stone <daniels@collabora.com>	2018-03-22 15:27:35 +00:00
Samuel Pitoiset	52fba3f45d	radv: remove unused radv_pipeline::needs_data_cache variable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-22 14:30:37 +01:00
Mathias Fröhlich	862c872c48	vbo: Remove now duplicate _DrawVAO notification. The DriverFlags.NewArray bit is already set to NewDriverState in _mesa_set_draw_vao since we have actually just above changed the VAOs content. So this can be removed. The _vbo_update_inputs is called by the vbo...recalculate_inputs being set through the same mechanism as described above. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	006b5e798a	vbo: Remove now duplicate _vbo_update_inputs from dlist draw. At the current state, _vbo_update_inputs is called from the draw callback if vbo...recalculate_inputs is set. But that is now set of the _DrawVAO or its content or the vertex program mode is changed. So remove _vbo_update_inputs from the direct dlist draw path. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:53 +01:00
Mathias Fröhlich	2887c98140	vbo: Remove redundant set of DriverFlags.NewArray in vbo_bind_arrays. Now that setting vbo...recalculate_inputs also sets the DriverFlags.NewArray bits into the NewDriverState setting that from vbo_bind_arrays is redundant. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	9f5b6ef2ef	vbo: Remove vbo...recalculate_inputs from vbo_exec_invalidate_state. This flag is now set when the actual Array._DrawVAO changes. So setting this flag is redundant here. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	bf328359a7	mesa: A change of gl_vertex_processing_mode needs an array update. Since arrays also handle the mapping of current values into the disabled array slots, we need to tell the array update code that this mapping has changed. Also mark only dirty if it has changed. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	5b91786225	mesa: Set DriverFlags.NewArray together with vbo...recalculate_inputs. Both mean something very similar and are set at the same time now. For that vbo module to be set from core mesa, implement a public vbo module method to set that flag. In the longer term the flag should vanish in favor of a driver flag of the appropriate driver. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	d3c604e12e	mesa: Update VAO internal state when setting the _DrawVAO. Update the VAO internal state on Array._DrawVAO instead of Array.VAO. Also the VAO internal state update gets triggered now by a change of Array._DrawVAO instead of the _NEW_ARRAY state flag. Also no driver looks at any VAO's NewArrays value from within the Driver.UpdateState callback. So it should be safe to move this update into the _mesa_set_draw_vao method. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	c4c56ff303	vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback. Factor out that common call into the almost single place. Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code paths as the function is now called through vbo_bind_arrays. Prepare updating the list of struct gl_vertex_array entries via calling _vbo_update_inputs for being pushed into those drivers that finally work on that long list of gl_vertex_array pointers. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Mathias Fröhlich	6307d1be0a	mesa: Move vbo draw functions into dd_function_table. Move vbo draw functions into struct dd_function_table. For now just wrap the underlying vbo functions. Reviewed-by: Brian Paul <brianp@vmware.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>	2018-03-22 04:58:52 +01:00
Aaron Watry	23100acc8f	clover/llvm: Fix build against LLVM/Clang 4.0 The opencl 1.0 langstandard was renamed in 5.0+ v2: Move preprocessor check into compat.hpp Reviewed-by: Francisco Jerez <currojerez@riseup.net>	2018-03-21 21:03:23 -05:00
Timothy Arceri	c135316555	ac/nir_to_llvm: add frexp support Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-03-22 12:42:34 +11:00
Timothy Arceri	cca2141745	nir: add frexp_exp and frexp_sig opcodes Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-22 12:42:34 +11:00
Caio Marcelo de Oliveira Filho	12c22b897a	anv/pipeline: don't pass constant view index in multiview If view mask has only one bit set, view index is effectively a constant, so doesn't need to be passed to the next stages, just always set it. Part of this was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Caio Marcelo de Oliveira Filho	5e7c1d05d4	anv/pipeline: use less instructions for multiview The view_index is encoded in the remainder of dividing instance id by the number of views in the view mask (n). In the general case (handled by the else clause), there is a need to map from 0..n-1 into the number of the view being masked. For that a map is encoded. In the case only the first n bits in the mask are set, the mapping is trivial, 0..n-1 already represent what view is being referred to. That case was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 14:49:50 -07:00
Eric Anholt	baeb6a4b4a	broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI. Unfortunately TGSI doesn't record the type of the FS output like GLSL does, but VC5's TLB writes depend on the output's base type. Just record the type in the key at variant compile time when we've got a TGSI input and then fix it up. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a GPU hang that breaks most tests that come after it.	2018-03-21 14:02:34 -07:00
Neil Roberts	61603f0e42	spirv: Add a 64-bit implementation of Frexp The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2018-03-21 20:18:44 +01:00
Rafael Antognolli	5297a17571	aubinator_error_decode: Compare only the class_name of the ring. ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to first compare the class name only, then get the instance id. Without this, INSTDONE is not being decoded. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-03-21 11:35:15 -07:00
Thomas Helland	8d5cd91ca0	nir: Migrate nir_dce to instr worklist Shader-db runtime change avarage of five runs: Before 125,77 seconds (+/- 0,09%) After 124,48 seconds (+/- 0,07%) Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:40 +01:00
Thomas Helland	edb18564c7	nir: Initial implementation of a nir_instr_worklist Make a simple worklist by basically just wrapping u_vector. This is intended used in nir_opt_dce to reduce the number of calls to ralloc, as we are currenlty spamming ralloc quite bad. It should also give better cache locality and much lower memory usage. Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>	2018-03-21 19:26:27 +01:00
Scott D Phillips	cab8df1e3e	intel/tools: aubinator: Catch gen11 "enhanced execlist" submission Different registers are used for execlist submission in gen11, so also watch those. This code only watches element zero of the submit queue, which is all aubdump currently writes. Tested-by: Rafael Antognolli <rafael.antognolli@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2018-03-21 11:07:15 -07:00
Marek Olšák	a8d55374dc	radeonsi: fix a snprintf warning on gcc 7.3.0	2018-03-21 13:43:09 -04:00
Marek Olšák	cf0a95afac	radeonsi/gfx9: print the swizzle mode for testdma Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Marek Olšák	f7ffa504a0	ac/surface: compute tile swizzle for GFX9 Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-03-21 13:40:06 -04:00
Eric Anholt	9f0c9c6d18	broadcom/vc5: Don't skip job submit just because everything is scissored. The coordinate shaders may now have side effects in the form of transform feedback. Part of fixing GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc	2018-03-21 10:04:21 -07:00

1 2 3 4 5 ...

93166 commits