fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-06-11 20:38:32 +02:00

Author	SHA1	Message	Date
Robert Bragg	a678b79ef4	i965: Add more Haswell OA metrics sets This extends the brw_oa_hsw.xml to expose these additional queries: - Compute Metrics Basic Gen7.5 - Compute Metrics Extended Gen7.5 - Memory Reads Distribution Gen7.5 - Memory Writes Distribution Gen7.5 - Metric set Sampler Balance Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:51 +00:00
Robert Bragg	458468c136	i965: Expose OA counters via INTEL_performance_query This adds support for exposing basic Observation Architecture performance counters on Haswell. This support is based on the i915 perf kernel interface which is used to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT commands around queries to collect counter snapshots. To take into account the small chance that some of the 32bit counters could wrap around for long queries (~50 milliseconds for a GT3 Haswell @ 1.1GHz) the implementation also collects periodic metrics. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:50 +00:00
Robert Bragg	a98ffe2477	exec_list: Add a foreach_list_typed_from macro This allows iterating list nodes from a given start point instead of necessarily the list head. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:50 +00:00
Robert Bragg	e56550565e	i965: Add script to gen code for OA counter queries Avoiding lots of error prone boilerplate and easing our ability to add + maintain support for multiple OA performance counter queries for each generation: This adds a python script to generate code for building up performance_queries from the metric sets and counters described in brw_oa_hsw.xml as well as functions to normalize each counter based on the RPN expressions given. Although the XML file currently only includes a single metric set, the code generated assumes there could be many sets. The metrics as described in XML get translated into C structures which are registered in a brw->perfquery.oa_metrics_table hash table keyed by the GUID of the metric set in XML. v2: numerous python style improvements (Dylan) v3: Makefile.am fixups (Emil) v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert) Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 13:45:44 +00:00
Robert Bragg	f46e58e018	i965: extend query/counter structs for OA queries In preparation for generating code from brw_oa_hsw.xml for describing OA performance counter queries this adds some OA specific members to brw_perf_query that our generated code will initialize: - The oa_metric_set_id is the ID we will pass to DRM_IOCTL_I915_PERF_OPEN, and is an ID obtained via sysfs under: /sys/class/drm/<card>/metrics/<guid>/id - The oa_format is the OA report layout we will request from the kernel - The accumulator offsets determine where the different groups of A, B and C counters are located within an intermediate 64bit 'accumulator' buffer. Additionally brw_perf_query_counter now has 64bit or float _read() callback members for OA counters. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Robert Bragg	eaab41c9db	i965: brw_context.h additions for OA unit query codegen In preparation for generating code from the XML performance counter meta data, this makes some additions to brw_context.h for this code to be able to reference. It adds a brw->perfquery.oa_metrics_table hash table for indexing built up query descriptions by the GUID that is expected to be advertised by the kernel (via sysfs) to be able to use that query. It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built up by generated code. It adds a brw->perfquery.sys_vars structure to have a consistent place to represent the different system variables like $EuCoresTotalCount and $EuSlicesTotalCount that are referenced by OA counter normalization equations. Although extending + referencing gen_device_info for these variables was considered, these are some of the (mostly minor) reasons for going with a dedicated structure: - Currently we only need this info for the performance_query backend and it might be a bit tedious to go back and initialize the state for pre-Haswell devinfo structures. - Considering the $SubsliceMask then the requirement for how multiple per-slice masks are packed only comes from how the variables are referenced by availability tests in XML, and might not be a good general representation for tracking subslice masks if another use case arises. - If we used gen_device_info then we'd likely want to avoid making assumptions about the C types during codegen and adding explicit casts, while that's not necessary with a dedicated struct with all members being uint64_t. - This structure and the code for initializing it is currently shared (just through copy & paste) with a few other projects dealing with OA counters, and that's been convenient so far. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Robert Bragg	b79268174b	i965: XML description of Haswell OA metric set In preparation for exposing Gen Observation Architecture performance counters via INTEL_performance_query this adds an XML description for an initial 'Render Metrics Basic Gen7.5' query and corresponding counters. The intention is to auto generate code for building a query from these counters as well as the code for normalizing the individual counters. Note that the upstream for this XML data is currently GPU Top: https://github.com/rib/gputop The files are maintained under gputop-data/ and they are themselves derived from files in an internal 'MDAPI XML' schema. There are scripts under gputop-scripts/ and make rules in gputop-data/Makefile.xml for maintaining these files. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 12:53:07 +00:00
Pierre Moreau	655c395f65	nv50/ir: check for origin insn in findOriginForTestWithZero Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check. Signed-off-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-03-09 12:42:46 +01:00
Samuel Pitoiset	d54b498694	mesa/main: make use of lookup_samplerobj_locked() There is no need to check sampler == 0 twice. This removes now unused _mesa_lookup_samplerobj_locked(). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-09 11:01:37 +01:00
Samuel Pitoiset	58b4ae0411	mesa/main: inline {begin,end}_samplerobj_lookups() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2017-03-09 11:01:31 +01:00
Grazvydas Ignotas	8cd83a6c81	glsl/blob: clear padding bytes Since blob is intended for serializing data, it's not a good idea to leave padding holes with uninitialized data, which may leak heap contents and hurt compression if the blob is later compressed, like done by shader cache. Clear it. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:41:02 +11:00
Grazvydas Ignotas	61bbb25a08	util/disk_cache: fix size subtraction on 32bit Negating size_t on 32bit produces a 32bit result. This was effectively adding values close to UINT_MAX to the cache size (the files are usually small) instead of intended subtraction. Fixes 'make check' disk_cache failures on 32bit. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:26:30 +11:00
Grazvydas Ignotas	926bcacfd3	util/disk_cache: fix compressed size calculation It incorrectly doubles the size on each iteration. Fixes: `85a9b1b5` "util/disk_cache: compress individual cache entries" Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2017-03-09 20:26:23 +11:00
Lionel Landwerlin	f81ede4699	glsl: builtin: always return clones of the builtins Builtins are created once and allocated using their own private ralloc context. When reparenting IR that includes builtins, we might steal bits of builtins. This is problematic because these builtins might now be freed when the shader that includes them last is disposed. This might also lead to inconsistent ralloc trees/lists if shaders are created on multiple threads. Rather than including builtins directly into a shader's IR, we should include clones of them in the ralloc context of the shader that requires them. This fixes double free issues we've been seeing when running shader-db on a big multicore (72 threads) server. v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better reflect how this function is used. (Ken) v3: Rename ctx to mem_ctx (Ken) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-03-09 08:30:36 +00:00
Kenneth Graunke	071d80bde2	i965: Delete render ring prelude. This was a hook I came up with when trying to do the initial performance counter work years ago. Nothing's used it for a long time, and the upcoming performance counter support doesn't want it either. So, goodbye render ring prelude. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-03-08 23:01:21 -08:00
Vinson Lee	d64ded7b50	swr: s/uint/enum pipe_render_cond_flag/ Fix build error. swr_context.cpp: In function ‘void swr_blit(pipe_context, const pipe_blit_info)’: swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive] ctx->render_cond_mode); ~~~~~^~~~~~~~~~~~~~~~ Fixes: `b0d3938430` ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133 Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2017-03-08 21:43:07 -08:00
Bas Nieuwenhuizen	7d6e1a341a	radv: Don't flush the CB before doing a fast clear eliminate. The only way we write CMASK/DCC compressed textures through shaders is fast clears and CMASK/DCC inits, which have their own flushes. Hence the CB cache is always up to date. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:28 +01:00
Bas Nieuwenhuizen	8700329785	radv: Don't emit cache flushes on subpass switch. I think we should only flush right before an action (draw/dispatch etc.), as otherwise it is too easy to issue redundant flushes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:23 +01:00
Bas Nieuwenhuizen	9251f8b35e	radv: Only flush for the needed stages, and before the flushes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:19 +01:00
Bas Nieuwenhuizen	f92a118434	radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB. Without stores, the only writes are fast clears, transfers and metadata initialization, each of which has the appropriate invalidations already. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:14 +01:00
Bas Nieuwenhuizen	0567ab0407	radv: Flush more caches after writes. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:10 +01:00
Bas Nieuwenhuizen	7a600bbc81	radv: Don't flush for fixed-function reading. The data should always be in memory after a src flush. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:05 +01:00
Bas Nieuwenhuizen	dd094e4ff9	radv: Invalidate the correct caches for CB/DB dst barriers. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:35:01 +01:00
Bas Nieuwenhuizen	b075eb7d47	radv: Determine cache flushes per object. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-03-09 02:34:42 +01:00
Samuel Pitoiset	2568d9d0cd	mesa/main: remove unused _mesa_new_texture_image() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-03-09 01:57:20 +01:00
Dave Airlie	e6902be900	radv/ac: fixup texture coord to have right number of channels. Jason has patches to add validation to this area, this should fix radv shaders. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-03-09 09:17:11 +10:00
Timothy Arceri	0e34966340	st/nine: pass NULL to ureg_get_tokens() The number of tokens is never used and the pointer is NULL checked so just pass NULL. Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-03-09 09:29:07 +11:00
Matt Turner	a45cd8107d	docs: ARB_shader_atomic_counter_ops is enabled on i965/gen7+. This extension was enabled in commit `40dd45d0c6` ("i965: Enable ARB_shader_atomic_counter_ops") but the commit failed to update the release notes or features.txt. The release notes ship has sailed, since the commit was in 13.0.	2017-03-08 13:58:52 -08:00
Eric Anholt	19f571ba6d	vc4: Fix math with a condition flag set. Math results land in r4, regardless of the condition. To implement them, we just need to ensure that the results are moved out of r4 (as often happens anyway, the value is live across another math instruction), so that we can attach the condition to the MOV. Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a couple others, that were assertion failing that their conditions hadn't been handled during the QIR->QPU stage.	2017-03-08 13:44:17 -08:00
Eric Anholt	615f6653b0	vc4: Fix register pressure cost estimates when a src appears twice. This ended up confusing the scheduler for things like fabs (implemented as fmaxabs x, x) or squaring a number, and it would try to avoid scheduling them because it appeared more expensive than other instructions. Fixes failure to register allocate in dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db effects (+.35% max temps)	2017-03-08 13:44:17 -08:00
Eric Anholt	0fca01d027	vc4: Report to shader-db how many threads a fragment shader has. Doing instruction count analysis when we emit the thread switches that will save us from tons of stalls is kind of missing the point.	2017-03-08 13:44:17 -08:00
Eric Anholt	61359324c1	Revert "vc4: Lazily emit our FS/VS input loads." This reverts commit `292c24ddac`. It broke a lot of GLES2 deqp, and I see at least one problem that will require some serious rework to fix.	2017-03-08 13:44:17 -08:00
Marek Olšák	ab12a126fd	radeonsi: fix elimination of literal VS outputs broken when switched to the new intrinsics. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-08 19:56:36 +01:00
Fabio Estevam	78c5772633	loader: Move non-error message to debug level Currently when running mesa on imx6 the following loader warnings are seen: # kmscube -D /dev/dri/card1 MESA-LOADER: device is not located on the PCI bus MESA-LOADER: device is not located on the PCI bus MESA-LOADER: device is not located on the PCI bus Using display 0x1920948 with EGL version 1.4 As this is not an error message, change it to debug level in order to have a cleaner log output. Signed-off-by: Fabio Estevam <festevam@gmail.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:35:00 +00:00
Mauro Rossi	61c38d14b7	android: r600: fix libmesa_amd_common dependency Adding libmesa_amd_common dependency and exporting its headers, avoids the following building error: external/mesa/src/gallium/drivers/r600/evergreen_compute.c:29:10: fatal error: 'ac_binary.h' file not found ^ 1 error generated. Fixes: `3bbbb63` "automake: r600: radeonsi: correctly manage libamd_common.la linking" Fixes: `503fb13` "radeon/ac: switch to ac_shader_binary_config_start()" v2 [Emil Velikov: drop unneeded LOCAL_EXPORT_C_INCLUDE_DIRS] Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:27:23 +00:00
Emil Velikov	1fe4d638a1	gallium/targets: rework the empty targets removal Earlier commit added extra tracking and we've attempted to remove the vdpau/other folder if empty. V2 of said commit dropped the pipe to /dev/null and the explicit "true" override. Sadly both of those are needed since there's no guarantee that the folder will be empty before we [mesa] make install. Since we're bringing those two back, there's no need to track if we've installed anything, and simply do "rm -d foo/ &>/dev/null || true" Tested-by: Andy Furniss <adf.lists@gmail.com> Reported-by: Andy Furniss <adf.lists@gmail.com> Fixes: `1cd4fde053` ("gallium/targets: don't leave an empty target directory(ies)") Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-03-08 16:23:07 +00:00
Brian Paul	2f3f5728f7	util/indices: minor clean-ups Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:21 -07:00
Brian Paul	a0927da006	radeonsi: s/uint/enum pipe_shader_type/ This can probably be done in more places in the driver. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	b0d3938430	gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	2b9ab605aa	gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	73bafb5ee3	gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	1564a768ae	virgl: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	6614b060fb	swr: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	f676c700cc	softpipe: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	0fc5110a6e	llvmpipe: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	4aec68176d	freedreno: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	7532ed106f	etnaviv: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	b4191b712b	draw: s/unsigned/enum pipe_shader_type/ and some s/uint/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	ed66c9d7b8	cso: s/unsigned/enum pipe_shader_type/ Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
Brian Paul	637e5719b5	gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param() Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2017-03-08 08:50:20 -07:00
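
The util/disk_cache fix in 61bbb25a08 comes down to the width at which an unsigned value is negated. The following is a minimal Python sketch of that effect, not the Mesa code: it emulates C unsigned arithmetic with bit masks, all function names are hypothetical, and the 64-bit running total is an assumption (the commit message only states that values close to UINT_MAX were being added to the cache size).

```python
# Emulate C unsigned arithmetic with masks (Python ints are unbounded).
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def negate_unsigned(value, mask):
    """Two's-complement negation truncated to the width given by mask."""
    return (-value) & mask


def add_u64(total, delta):
    """Add delta to an (assumed) 64-bit running total, wrapping modulo 2**64."""
    return (total + delta) & MASK64


def buggy_subtract(total, size):
    # The negation happens at 32 bits; the result is then zero-extended,
    # so a value close to UINT_MAX is added instead of size being subtracted.
    return add_u64(total, negate_unsigned(size, MASK32))


def fixed_subtract(total, size):
    # Widening to 64 bits before negating makes the wrapped addition
    # behave like a real subtraction.
    return add_u64(total, negate_unsigned(size, MASK64))
```

With a total of 10000 bytes and a 100-byte entry, `fixed_subtract` yields 9900, while `buggy_subtract` yields 10000 + 2**32 - 100, which is the kind of inflated cache size the commit message describes.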

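Commit 458468c136 mentions collecting periodic metrics so that 32-bit counters wrapping during a long query do not corrupt the result, and f46e58e018 mentions an intermediate 64-bit accumulator. The sketch below shows the general technique only, under the assumption that a counter wraps at most once between two consecutive snapshots; the function name and the snapshot list are hypothetical and do not reflect the i965 data structures.

```python
MASK32 = 0xFFFFFFFF


def accumulate_counter(snapshots):
    """Sum 32-bit counter deltas between consecutive snapshots.

    Each delta is taken modulo 2**32, so a single wrap between two
    snapshots still produces the correct increment. The running total
    is an ordinary Python int, standing in for a 64-bit accumulator.
    """
    total = 0
    for previous, current in zip(snapshots, snapshots[1:]):
        total += (current - previous) & MASK32
    return total
```

For example, the snapshots `[0xFFFFFFF0, 0x10, 0x30]` wrap once between the first and second reading; the two deltas are 0x20 each, so the accumulated value is 0x40. If snapshots are too far apart for the single-wrap assumption to hold, increments are silently lost, which is why the sampling period has to be shorter than the wrap time.
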
92185 commits