Commit graph

92185 commits

Author SHA1 Message Date
Robert Bragg
a678b79ef4 i965: Add more Haswell OA metrics sets
This extends the brw_oa_hsw.xml to expose these additional queries:

- Compute Metrics Basic Gen7.5
- Compute Metrics Extended Gen7.5
- Memory Reads Distribution Gen7.5
- Memory Writes Distribution Gen7.5
- Metric set Sampler Balance

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:51 +00:00
Robert Bragg
458468c136 i965: Expose OA counters via INTEL_performance_query
This adds support for exposing basic Observation Architecture
performance counters on Haswell.

This support is based on the i915 perf kernel interface which is used
to configure the OA unit, allowing Mesa to emit MI_REPORT_PERF_COUNT
commands around queries to collect counter snapshots.

To take into account the small chance that some of the 32bit counters
could wrap around for long queries (~50 milliseconds for a GT3 Haswell @
1.1GHz) the implementation also collects periodic metrics.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:50 +00:00
Robert Bragg
a98ffe2477 exec_list: Add a foreach_list_typed_from macro
This allows iterating list nodes from a given start point instead of
necessarily the list head.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:50 +00:00
Robert Bragg
e56550565e i965: Add script to gen code for OA counter queries
Avoiding lots of error prone boilerplate and easing our ability to add +
maintain support for multiple OA performance counter queries for each
generation:

This adds a python script to generate code for building up
performance_queries from the metric sets and counters described in
brw_oa_hsw.xml as well as functions to normalize each counter based on
the RPN expressions given.

Although the XML file currently only includes a single metric set, the
code generated assumes there could be many sets.

The metrics as described in XML get translated into C structures
which are registered in a brw->perfquery.oa_metrics_table hash table
keyed by the GUID of the metric set in XML.

v2: numerous python style improvements (Dylan)
v3: Makefile.am fixups (Emil)
v4: Pattern rule for codegen + orthogonal .c and .h rules (Robert)

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 13:45:44 +00:00
Robert Bragg
f46e58e018 i965: extend query/counter structs for OA queries
In preparation for generating code from brw_oa_hsw.xml for describing OA
performance counter queries this adds some OA specific members to
brw_perf_query that our generated code will initialize:

- The oa_metric_set_id is the ID we will pass to
  DRM_IOCTL_I915_PERF_OPEN, and is an ID got via sysfs under:
  /sys/class/drm/<card>/metrics/<guid/id

- The oa_format is the OA report layout we will request from the kernel

- The accumulator offsets determine where the different groups of A, B
  and C counters are located within an intermediate 64bit 'accumulator'
  buffer.

Additionally brw_perf_query_counter now has 64bit or float _read()
callback members for OA counters.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Robert Bragg
eaab41c9db i965: brw_context.h additions for OA unit query codegen
In preparation for generating code from the XML performance counter meta
data, this makes some additions to brw_context.h for this code to be
able to reference.

It adds a brw->perfquery.oa_metrics_table hash table for indexing built
up query descriptions by the GUID that is expected to be advertised by
the kernel (via sysfs) to be able to use that query.

It adds an 'OA_COUNTERS' brw_query_kind to be assigned to queries built
up by generated code.

It adds a brw->perfquery.sys_vars structure to have a consistent place
to represent the different system variables like $EuCoresTotalCount and
$EuSlicesTotalCount that are referenced by OA counter normalization
equations.

  Although extending + referencing gen_device_info for these variables
  was considered, these are some of the (mostly minor) reasons for
  going with a dedicated structure:

  - Currently we only need this info for the performance_query backend
    and it might be a bit tedious to go back and initialize the state
    for pre-Haswell devinfo structures.
  - Considering the $SubsliceMask then the requirement for how multiple
    per-slice masks are packed only comes from how the variables are
    references by availability tests in XML, and might not be a good
    general representation for tracking subslice masks if another use
    case arises.
  - If we used gen_device_info then we'd likely want to avoid making
    assumptions about the C types during codegen and adding explicit
    casts, while that's not necessary with a dedicated struct with all
    members being uint64_t.
  - This structure and the code for initializing it is currently shared
    (just through copy & paste) with a few other projects dealing with
    OA counters, and that's been convenient so far.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Robert Bragg
b79268174b i965: XML description of Haswell OA metric set
In preparation for exposing Gen Observation Architecture performance
counters via INTEL_performance_query this adds an XML description for an
initial 'Render Metrics Basic Gen7.5' query and corresponding counters.

The intention is to auto generate code for building a query from these
counters as well as the code for normalizing the individual counters.

Note that the upstream for this XML data is currently GPU Top:

  https://github.com/rib/gputop

The files are maintained under gputop-data/ and they are themselves
derived from files in an internal 'MDAPI XML' schema. There are scripts
under gputop-scripts/ and make rules in gputop-data/Makefile.xml for
maintaining these files.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 12:53:07 +00:00
Pierre Moreau
655c395f65 nv50/ir: check for origin insn in findOriginForTestWithZero
Function arguments do not have an "origin" instruction, causing a
NULL-pointer dereference without this check.

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-03-09 12:42:46 +01:00
Samuel Pitoiset
d54b498694 mesa/main: make use of lookup_samplerobj_locked()
There is no need to check sampler == 0 twice. This removes now
unused _mesa_lookup_samplerobj_locked().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 11:01:37 +01:00
Samuel Pitoiset
58b4ae0411 mesa/main: inline {begin,end}_samplerobj_lookups()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-09 11:01:31 +01:00
Grazvydas Ignotas
8cd83a6c81 glsl/blob: clear padding bytes
Since blob is intended for serializing data, it's not a good idea to
leave padding holes with uninitialized data, which may leak heap
contents and hurt compression if the blob is later compressed, like
done by shader cache. Clear it.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:41:02 +11:00
Grazvydas Ignotas
61bbb25a08 util/disk_cache: fix size subtraction on 32bit
Negating size_t on 32bit produces a 32bit result. This was effectively
adding values close to UINT_MAX to the cache size (the files are usually
small) instead of intended subtraction.
Fixes 'make check' disk_cache failures on 32bit.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:26:30 +11:00
Grazvydas Ignotas
926bcacfd3 util/disk_cache: fix compressed size calculation
It incorrectly doubles the size on each iteration.

Fixes: 85a9b1b5 "util/disk_cache: compress individual cache entries"

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-09 20:26:23 +11:00
Lionel Landwerlin
f81ede4699 glsl: builtin: always return clones of the builtins
Builtins are created once and allocated using their own private ralloc
context. When reparenting IR that includes builtins, we might be steal
bits of builtins. This is problematic because these builtins might now
be freed when the shader that includes then last is disposed. This
might also lead to inconsistent ralloc trees/lists if shaders are
created on multiple threads.

Rather than including builtins directly into a shader's IR, we should
include clones of them in the ralloc context of the shader that
requires them. This fixes double free issues we've been seeing when
running shader-db on a big multicore (72 threads) server.

v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better
    reflect how this function is used. (Ken)

v3: Rename ctx to mem_ctx (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-09 08:30:36 +00:00
Kenneth Graunke
071d80bde2 i965: Delete render ring prelude.
This was a hook I came up when trying to do the initial performance
counter work years ago.  Nothing's used it for a long time, and the
upcoming performance counter support doesn't want it either.

So, goodbye render ring prelude.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-03-08 23:01:21 -08:00
Vinson Lee
d64ded7b50 swr: s/uint/enum pipe_render_cond_flag/
Fix build error.

swr_context.cpp: In function ‘void swr_blit(pipe_context*, const pipe_blit_info*)’:
swr_context.cpp:336:44: error: invalid conversion from ‘uint {aka unsigned int}’ to ‘pipe_render_cond_flag’ [-fpermissive]
                                       ctx->render_cond_mode);
                                       ~~~~~^~~~~~~~~~~~~~~~

Fixes: b0d3938430 ("gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100133
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-03-08 21:43:07 -08:00
Bas Nieuwenhuizen
7d6e1a341a radv: Don't flush the CB before doing a fast clear eliminate.
The only way we write CMASK/DCC compressed textures through shaders
is fast clears and CMASK/DCC inits, which have their own flushes.
Hence the CB cache is always up to date.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:28 +01:00
Bas Nieuwenhuizen
8700329785 radv: Don't emit cache flushes on subpass switch.
I think we should only flush right before an action (draw/dispatch etc.),
as otherwise it is too easy to issue redundant flushes.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:23 +01:00
Bas Nieuwenhuizen
9251f8b35e radv: Only flush for the needed stages, and before the flushes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:19 +01:00
Bas Nieuwenhuizen
f92a118434 radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB.
Without stores, the only writes are fast clears, transfers and metadata
initialization, each of which have the appropiate invalidations already.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:14 +01:00
Bas Nieuwenhuizen
0567ab0407 radv: Flush more caches after writes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:10 +01:00
Bas Nieuwenhuizen
7a600bbc81 radv: Don't flush for fixed-function reading.
The data should always be in memory after a src flush.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:05 +01:00
Bas Nieuwenhuizen
dd094e4ff9 radv: Invalidate the correct caches for CB/DB dst barriers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:01 +01:00
Bas Nieuwenhuizen
b075eb7d47 radv: Determine cache flushes per object.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:34:42 +01:00
Samuel Pitoiset
2568d9d0cd mesa/main: remove unused _mesa_new_texture_image()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-03-09 01:57:20 +01:00
Dave Airlie
e6902be900 radv/ac: fixup texture coord to have right number of channels.
Jason has patches to add validation to this area, this should fix
radv shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-09 09:17:11 +10:00
Timothy Arceri
0e34966340 st/nine: pass NULL to ureg_get_tokens()
The number of tokens in never used and the pointer is NULL checked
so just pass NULL.

Reviewed-by: Axel Davy <axel.davy@ens.fr>
2017-03-09 09:29:07 +11:00
Matt Turner
a45cd8107d docs: ARB_shader_atomic_counter_ops is enabled on i965/gen7+.
This extension was enabled in commit 40dd45d0c6 ("i965: Enable
ARB_shader_atomic_counter_ops") but the commit failed to update the
release notes or features.txt. The release notes ship has sailed, since
the commit was in 13.0.
2017-03-08 13:58:52 -08:00
Eric Anholt
19f571ba6d vc4: Fix math with a condition flag set.
Math results land in r4, regardless of the condition.  To implement them,
we just need to ensure that the results are moved out of r4 (as often
happens anyway, the values is live across another math instruction), so
that we can attach the condition to the MOV.

Fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.93 and a
couple others, that were assertion failing that their conditions hadn't
been handled during the QIR->QPU stage.
2017-03-08 13:44:17 -08:00
Eric Anholt
615f6653b0 vc4: Fix register pressure cost estimates when a src appears twice.
This ended up confusing the scheduler for things like fabs (implemented as
fmaxabs x, x) or squaring a number, and it would try to avoid scheduling
them because it appeared more expensive than other instructions.

Fixes failure to register allocate in
dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db
effects (+.35% max temps)
2017-03-08 13:44:17 -08:00
Eric Anholt
0fca01d027 vc4: Report to shader-db how many threads a fragment shader has.
Doing instruction count analysis when we emit the thread switches that
will save us from tons of stalls is kind of missing the point.
2017-03-08 13:44:17 -08:00
Eric Anholt
61359324c1 Revert "vc4: Lazily emit our FS/VS input loads."
This reverts commit 292c24ddac.  It broke a
lot of GLES2 deqp, and I see at least one problem that will require some
serious rework to fix.
2017-03-08 13:44:17 -08:00
Marek Olšák
ab12a126fd radeonsi: fix elimination of literal VS outputs
broken when switched to the new intrinsics.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-03-08 19:56:36 +01:00
Fabio Estevam
78c5772633 loader: Move non-error message to debug level
Currently when running mesa on imx6 the following loader warnings
are seen:

# kmscube -D /dev/dri/card1
MESA-LOADER: device is not located on the PCI bus
MESA-LOADER: device is not located on the PCI bus
MESA-LOADER: device is not located on the PCI bus
Using display 0x1920948 with EGL version 1.4

As this is not an error message, change it to debug level in
order to have a cleaner log output.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:35:00 +00:00
Mauro Rossi
61c38d14b7 android: r600: fix libmesa_amd_common dependency
Adding libmesa_amd_common dependency and exporting its headers,
avoids the following building error:

external/mesa/src/gallium/drivers/r600/evergreen_compute.c:29:10: fatal error: 'ac_binary.h' file not found
         ^
1 error generated.

Fixes: 3bbbb63 "automake: r600: radeonsi: correctly manage libamd_common.la linking"
Fixes: 503fb13 "radeon/ac: switch to ac_shader_binary_config_start()"

v2 [Emil Velikov: drop unneeded LOCAL_EXPORT_C_INCLUDE_DIRS]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:27:23 +00:00
Emil Velikov
1fe4d638a1 gallium/targets: rework the empty targets removal
Earlier commit added extra tracking and we've attempted to remove the
vdpau/other folder if empty. V2 of said commit dropped the pipe
to /dev/null and the explicit "true" override.

Sadly both of those are needed since there's no guarantee that the
folder will be empty before we [mesa] make install.

Since we're bringing those two back, there's no need to track if we've
installed anything, and simply do "rm -d foo/ &>/dev/null || true"

Tested-by: Andy Furniss <adf.lists@gmail.com>
Reported-by: Andy Furniss <adf.lists@gmail.com>
Fixes: 1cd4fde053 ("gallium/targets: don't leave an empty target directory(ies)")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-08 16:23:07 +00:00
Brian Paul
2f3f5728f7 util/indices: minor clean-ups
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:21 -07:00
Brian Paul
a0927da006 radeonsi: s/uint/enum pipe_shader_type/
This can probably be done in more places in the driver.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
b0d3938430 gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
2b9ab605aa gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
73bafb5ee3 gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
1564a768ae virgl: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
6614b060fb swr: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
f676c700cc softpipe: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
0fc5110a6e llvmpipe: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
4aec68176d freedreno: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
7532ed106f etnaviv: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
b4191b712b draw: s/unsigned/enum pipe_shader_type/
and some s/uint/enum pipe_shader_type/

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
ed66c9d7b8 cso: s/unsigned/enum pipe_shader_type/
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00
Brian Paul
637e5719b5 gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param()
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-08 08:50:20 -07:00