Commit graph

92185 commits

Author SHA1 Message Date
Jason Ekstrand
c773ae88df anv/query: Break GPU query calculation into a helper
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
7de73f0c94 genxml: Add pipeline statistics registers on gen7+
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
0557dfdb4a anv/query: Add a helper for writing a query pool result
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:50 -07:00
Jason Ekstrand
bce4a935c6 anv/query: Use a variable-length slot size
Not all queries are the same.  Even the two queries we support today
require a different amount of data per slot.  Once we introduce pipeline
statistics queries, the size will vary wildly.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:49 -07:00
Jason Ekstrand
1c797af2c6 anv/query: Move the available bits to the front
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:12:47 -07:00
Jason Ekstrand
9d43afa3dc anv/query: Let 32-bit values wrap
From the Vulkan 1.0.39 Specification:

   "If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
   32-bit value, the value may either wrap or saturate."

So we can either clamp or wrap.  Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults and we should be
consistent.  We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.
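
The two behaviours the spec allows can be sketched as follows. This is a minimal illustration, not the driver's code; both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers: two ways to narrow a 64-bit query result to
 * 32 bits when VK_QUERY_RESULT_64_BIT is not set. */

/* Wrap: keep only the low 32 bits, as a plain 32-bit store would. */
static uint32_t query_result_wrap32(uint64_t value)
{
    return (uint32_t)value;
}

/* Saturate: clamp to UINT32_MAX when the value does not fit. */
static uint32_t query_result_saturate32(uint64_t value)
{
    return value > UINT32_MAX ? UINT32_MAX : (uint32_t)value;
}
```

Wrapping needs no comparison, which is why it is the cheaper choice when the narrowing has to be done by GPU commands.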

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 12:11:35 -07:00
Alex Deucher
c2a97fb7ae radeonsi: add new polaris12 pci id
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-17 14:13:17 -04:00
Marek Olšák
4b064d16e5 gallium/radeon: formalize that create_batch_query doesn't need pipe_context
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
be6173e7d6 gallium/radeon: formalize that create_query doesn't need pipe_context
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
04e6977e5d gallium/radeon: reference pipe_resource in pipe_transfer
for threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
03127bb6d5 radeonsi: compile all TGSI compute shaders asynchronously
required by threaded gallium

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
e9c6953ddb radeonsi: require that compiler threads are enabled
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-17 18:30:21 +01:00
Marek Olšák
080f322f06 trace: remove leftover assertions after pipe_resource wrapping removal
2017-03-17 18:30:21 +01:00
Marek Olšák
6c0a28084d gallium/u_upload: make the first persistent mapping unsynchronized
This is simpler for drivers.
2017-03-17 18:30:21 +01:00
Robert Bragg
a27b62e794 anv/device: init timestampPeriod from devinfo
Now that there's a timebase_scale in gen_device_info which is
effectively the 'period' this switches anv_GetPhysicalDeviceProperties
to using this common device info to initialize the timestampPeriod
device limit.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-17 16:10:22 +00:00
Robert Bragg
344d1a4015 i965: Allow a per gen timebase scale factor
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
with the convenient property of being able to scale by an integer (80)
to nanosecond units.

For Skylake the frequency is 12MHz or a scale factor of 83.333333

This updates gen_device_info to track a floating point timebase_scale
factor and makes corresponding _queryobj.c changes to no longer assume a
scale factor of 80 works across all gens.

Although the gen6_ code could have been left alone, the changes
keep the code more comparable, and it now shares a few utility functions
for scaling raw timestamps and calculating deltas. The utility for
calculating deltas takes into account 32 or 36bit overflow depending on
the current kernel version.
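
The arithmetic described above might look roughly like this. It is a sketch under assumptions: the function names are made up for illustration and are not the utilities added by the patch, and the counter width (32 or 36 bits) is passed in by the caller rather than derived from the kernel version.

```c
#include <stdint.h>

/* Nanoseconds per tick for a timestamp clock of the given frequency.
 * 12.5 MHz gives exactly 80; 12 MHz gives roughly 83.333. */
static double timebase_scale_ns(double frequency_hz)
{
    return 1e9 / frequency_hz;
}

/* Convert a raw tick count to nanoseconds using a floating point scale. */
static uint64_t scale_raw_timestamp(uint64_t raw_ticks, double timebase_scale)
{
    return (uint64_t)((double)raw_ticks * timebase_scale);
}

/* Difference between two raw timestamps from a counter that is `bits`
 * wide, tolerating a single wraparound between start and end. */
static uint64_t raw_timestamp_delta(uint64_t start, uint64_t end, unsigned bits)
{
    uint64_t mask = bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
    return (end - start) & mask;
}
```

Masking the unsigned difference handles the overflow case without a branch: if the counter wrapped, the subtraction underflows and the mask brings the result back into range.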

Note: this leaves the timestamp handling of ARB_query_buffer_object
untouched, which continues to use an incorrect scale of 80 on Skylake
for now. This is more awkward to solve since the scaling is currently
done using a very limited uint64 ALU available to the command parser
that doesn't support multiply or divide where it's already taking a
large number of instructions just to effectively multiply by 80.

This fixes piglit arb_timer_query-timestamp-get on Skylake

v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-17 15:45:19 +00:00
Jason Ekstrand
28b134c75c anv/device: Remove a use of a compound literal
Older versions of GCC don't like compound literals in static const
variable declarations because they don't think it's an actual constant
value.
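
A small illustration of the pattern, using a made-up struct rather than the one touched by the patch:

```c
#include <stdint.h>

/* Hypothetical type, for illustration only. */
struct extent {
    uint32_t width;
    uint32_t height;
};

/* A compound literal initializer such as
 *
 *    static const struct extent max_extent = (struct extent){ 4096, 4096 };
 *
 * may be rejected by older GCC versions because the literal is not
 * treated as a constant expression. A plain brace initializer is
 * portable and means the same thing here. */
static const struct extent max_extent = { 4096, 4096 };

static uint32_t max_extent_area(void)
{
    return max_extent.width * max_extent.height;
}
```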

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 08:40:30 -07:00
Robert Bragg
76dc49f3fb i965: bounds checks while concatenating sysfs paths
This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also switches a use of strncpy + strncat
for snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.
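
The snprintf check can be sketched as below. The helper name and the path layout are assumptions for illustration, not the code in brw_performance_query.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "dir/name" into dest.
 * Returns false if formatting failed or the result was truncated.
 * With a non-zero dest_size, snprintf always NUL-terminates dest,
 * which strncpy does not guarantee when the source is too long. */
static bool join_sysfs_path(char *dest, size_t dest_size,
                            const char *dir, const char *name)
{
    int len = snprintf(dest, dest_size, "%s/%s", dir, name);

    /* snprintf returns the length the full string would have had,
     * so a value >= dest_size means the output was cut short. */
    return len >= 0 && (size_t)len < dest_size;
}
```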

This issue with strncpy was picked up by Coverity.

CID: 1402201
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-17 13:40:29 +00:00
Emil Velikov
f8b1b9404e mesa: automake: add all headers to the tarball.
Fixes: d8d81fbc31 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Emil Velikov
d9a41ce8aa mapi: automake: add all python scripts to EXTRA_DIST
Otherwise it'll be missing in the tarball and make distcheck will fail.

Fixes: 05dd4a1104 ("glapi: Generate GL API marshalling code from the XML.")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 13:10:09 +00:00
Jonathan Gray
9e8d6ba1d6 glapi: avoid using $< in non-suffix make rules
Using $< in non-suffix make rules is a GNU extension.  Explicitly use
the name of the python script to fix the build on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabore.com>
2017-03-17 13:06:26 +00:00
Alex Smith
ce4058dafd radv/ac: Fix shared memory offset calculation
The index passed to get_shared_memory_ptr is an attribute slot index,
i.e. the index of a vec4 within LDS. Therefore this must be scaled by
sizeof(vec4) to give the LDS byte offset.
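
In plain C terms the scaling is just the following. The real fix builds the equivalent computation in LLVM IR; the names here are illustrative only.

```c
#include <stdint.h>

/* A vec4 is four 32-bit floats, i.e. 16 bytes. */
#define VEC4_SIZE_BYTES (4u * (uint32_t)sizeof(float))

/* Hypothetical helper: convert an attribute slot index (an index of a
 * vec4 within LDS) into a byte offset into LDS. */
static uint32_t lds_byte_offset(uint32_t attr_slot_index)
{
    return attr_slot_index * VEC4_SIZE_BYTES;
}
```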

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: <mesa-stable@lists.freedesktop.org>
2017-03-17 09:35:48 +01:00
James Legg
e88cac1df0 radv: Fix using more than 4 bound descriptor sets
Avoid a buffer overflow in ac_nir_to_llvm.c's create_function when
using more than 4 descriptor sets. radv claims support for 8.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 09:12:43 +01:00
Tapani Pälli
70d25cae8b util/build-id: check dlpi_name before strstr call
According to dl_iterate_phdr man page first object visited is the
main program where dlpi_name is an empty string. This fixes segfault
on Android when using build-id as identifier.
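
A sketch of the kind of guard involved, written as a standalone predicate instead of a dl_iterate_phdr callback. The function name is hypothetical, and treating both NULL and the empty string as "no name" is an assumption made here for safety.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: does the object name reported for a loaded
 * object contain the file name we are looking for? The main program
 * may be reported without a usable name, so check before strstr. */
static bool object_name_matches(const char *dlpi_name, const char *filename)
{
    if (dlpi_name == NULL || dlpi_name[0] == '\0')
        return false;

    return strstr(dlpi_name, filename) != NULL;
}
```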

Fixes: d4fa083e11 ("util: Add utility build-id code.")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-17 07:34:26 +02:00
Tapani Pälli
4d4558411d android: fix segfault within swap_buffers
Function droid_swap_buffers may get called without dri2_surf->buffer set,
in these cases we don't have a back buffer set either. Patch fixes segfault
seen with 3DMark that uses android.opengl.GLSurfaceView for rendering its UI.

backtrace:
   #00 pc 00013f88  /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104)
   #01 pc 000117b2  /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50)
   #02 pc 000058b2  /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386)
   #03 pc 00011329  /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553)
   #04 pc 000118e7  /system/lib/libEGL.so (eglSwapBuffers+55)
   #05 pc 000754dc  /system/lib/libandroid_runtime.so

v2: do like other backends, call get_back_bo (Emil Velikov)

Fixes: 2acc69d ("EGL/Android: Add EGL_EXT_buffer_age extension")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-17 07:30:34 +02:00
Timothy Arceri
72ab7bb765 radv: make sure gs copy shader is retrieved from the cache with the variant
Apps can limit the size of the cache via VkAllocationCallbacks so we
can't be sure that both are always in the cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
2845a108a9 radv: fallback to an in-memory cache when no pipeline cache is provided
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
315e8a9321 radv: always create a fallback pipeline cache
This will be used as an in-memory cache when a pipeline cache is
not provided by the app.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
4ffdab78b9 radv: move cache check inside insert and search functions
This will allow us to use fallback in-memory and on-disk caches
should the app not provide a pipeline cache.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-17 16:17:10 +11:00
Timothy Arceri
124ec417f9 st/mesa: call glthread_destroy() before _vbo_DestroyContext()
Otherwise we have a race condition between vbo calls in the
glthread and the _vbo_DestroyContext() call.

This fixes a bunch of piglit crashes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-17 09:47:02 +11:00
Jason Ekstrand
08df015b9d anv/GetQueryPoolResults: Actually implement the spec
The Vulkan spec is fairly clear about when we should and should not
write query pool results.  We're also supposed to return VK_NOT_READY if
VK_QUERY_RESULT_PARTIAL_BIT is not set and we come across any queries
which are not yet finished.  This fixes rendering corruptions on The
Talos Principle where geometry flickers in and out due to bogus query
results being returned by the driver.  These issues are most noticeable
on Sky Lake GT4 when running on "ultra" settings.
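
The rule described above can be sketched without the Vulkan headers. Everything here is a stand-in: the enum values, flag, struct and function name are hypothetical, and the wait and availability flags are left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for VK_SUCCESS / VK_NOT_READY and
 * VK_QUERY_RESULT_PARTIAL_BIT; not the real Vulkan definitions. */
enum sketch_result { SKETCH_SUCCESS, SKETCH_NOT_READY };
#define SKETCH_RESULT_PARTIAL_BIT 0x1u

struct query_slot {
    bool available;   /* has the query finished? */
    uint64_t value;   /* final or in-progress result */
};

/* Copy results for `count` queries into `out`. A query that is not yet
 * available is written only when partial results were requested;
 * otherwise it is skipped and the call reports "not ready". */
static enum sketch_result
copy_query_results(const struct query_slot *slots, uint32_t count,
                   uint32_t flags, uint64_t *out)
{
    enum sketch_result status = SKETCH_SUCCESS;

    for (uint32_t i = 0; i < count; i++) {
        if (slots[i].available || (flags & SKETCH_RESULT_PARTIAL_BIT))
            out[i] = slots[i].value;
        else
            status = SKETCH_NOT_READY;
    }
    return status;
}
```

Leaving the destination untouched for unfinished queries is the important part: an application that sees the "not ready" status can keep using its previous value instead of a bogus one.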

Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100182
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:18 -07:00
Jason Ekstrand
81840130c0 anv/query: Invalidate the correct range
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
4bbb4b95b8 anv/query: Fix the location of timestamp availability
Reviewed-By: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0 13.0" <mesa-dev@lists.freedesktop.org>
2017-03-16 15:08:17 -07:00
Jason Ekstrand
9e60f59e62 genxml: Add XML version tags
There's not much point to having them or not having them but this
reduces some pointless diff from the version we can auto-generate

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 15:08:17 -07:00
Kenneth Graunke
f51a320b12 aubinator: Use fprintf for output.
This will make it easier to choose an output file.  For now, it remains
stdout.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:44 -07:00
Kenneth Graunke
65a9d5eabb aubinator: Reuse decode_structure code for handling commands
The code for decoding structures and commands was almost identical.
The only differences are: we print dword headers for commands, and
we skip the first one (with the command opcode and lengths).

So, generalize decode_structure to add a starting DWord, and a flag
for printing the DWord headers, and reuse it.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:41 -07:00
Kenneth Graunke
f0aa8fd4e4 aubinator: Delete redundant NULL check.
handle_struct_decode() is just a wrapper around decode_structure()
with a NULL check.  But the only caller already does that NULL check.

So, just use decode_structure() directly.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:37 -07:00
Kenneth Graunke
65138ce019 aubinator: Fix indentation.
Three space, not four.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-16 10:48:32 -07:00
Topi Pohjolainen
bd25d9670b i965/gen8+: Do full stall when switching pipeline
just as earlier gens do.

CC: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 18:44:15 +02:00
Jonathan Gray
46707bc27b i965: remove unneeded asm/unistd.h include
Fix the build on OpenBSD by removing an unneeded include for asm/unistd.h.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:56:40 +00:00
Emil Velikov
e6bef50f4c i965: automake: remove spurious white space
Unintentionally introduced by yours truly with the i965 compiler move.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:55:42 +00:00
Jonathan Gray
d2bb0c8590 i965: avoid using a GNU make pattern rule
% pattern rules are a GNU extension.  As there is only one file here
avoid patterns and globbing entirely to fix the build on non-GNU make.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
v2 [Emil Velikov: brw_oa.py dependency]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:55:23 +00:00
Emil Velikov
ccb89e72aa docs/releasing: document how to squash/announce queued patches
In the odd case where a patch needs to be fixed, squash the appropriate
fix and document how. Add a note in the pre-release notes, such that
devs can quickly spot it.

v2: Grammar/typo fixes (Eric). Use upstream commit [SHA] as reference.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:22:40 +00:00
Emil Velikov
0f988add50 docs/releasing: release.sh is located in xorg/util-modular
Correct the silly typo s/macros/modular/ and add a reference to the
repository.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:18:13 +00:00
Emil Velikov
79562033b5 docs/releasing: remove "git clean" step
release.sh from master does not require the tree to be clean.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-16 13:18:11 +00:00
Emil Velikov
c81c563fbb mapi: remove Xlib/xcb include in gl_marshal.py
The only use of the header is to provide the _X_INLINE macro. We already
require (and provide where needed) 'inline', plus it's used in the file
already.

So replace the macro and drop the include. This fixes the build on
platforms which lack the header - from X-less Linuxes to Androids.

Fixes: 05dd4a1104 ("glapi: Generate GL API marshalling code from the XML.")
Reported-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100223
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-16 13:12:26 +00:00
Eric Engestrom
8a82f551cd docs/specs: update Khronos registries URLs
The registries were migrated to git and are now hosted on GitHub.
The old svn is now read-only, and will not be updated anymore.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-16 11:50:40 +00:00
Iago Toral Quiroga
ca34a3125f anv: improve error reporting when creating pipelines
Specifically, report 'out of memory' errors that might have happened while
emitting the pipeline's batch.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
1d7468311d anv: handle errors in emit_binding_table() and emit_samplers()
These can fail to allocate device memory, however, the driver can recover
from this error by allocating a new binding table block and trying again.

v2:
  - Instead of tracking the errors in these functions and making callers
    reset the batch's status before attempting to allocate a new block
    for the binding table, simply make callers responsible for setting
    the error status if they fail to allocate memory during the second
    attempt (Jason).
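
The retry scheme from v2 could be sketched like this. The state struct, block size and function names are invented for illustration and do not correspond to the driver's data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_BLOCK_ENTRIES 64u   /* assumed block capacity */

/* Toy model of a command buffer's binding table allocator. */
struct bt_state {
    uint32_t free_entries;  /* entries left in the current block */
    uint32_t blocks_left;   /* blocks the device can still provide */
    bool error;             /* sticky error status of the batch */
};

/* Try to carve `entries` out of the current block. */
static bool emit_table(struct bt_state *s, uint32_t entries)
{
    if (entries > s->free_entries)
        return false;
    s->free_entries -= entries;
    return true;
}

/* Switch to a fresh block if one can be allocated. */
static bool new_block(struct bt_state *s)
{
    if (s->blocks_left == 0)
        return false;
    s->blocks_left--;
    s->free_entries = SKETCH_BLOCK_ENTRIES;
    return true;
}

/* Caller side: the first failure is recoverable, so only a failure of
 * the second attempt sets the error status. */
static bool emit_table_with_retry(struct bt_state *s, uint32_t entries)
{
    if (emit_table(s, entries))
        return true;

    if (!new_block(s) || !emit_table(s, entries)) {
        s->error = true;
        return false;
    }
    return true;
}
```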

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00
Iago Toral Quiroga
dd8348c8be anv: handle errors while allocating new binding table blocks
Also, we had a couple of instances in flush_descriptor_sets() where
we were returning a VkResult directly upon error, but the return
value of this function is not a VkResult but a uint32_t dirty mask,
so simply return 0 in these cases which reduces the amount of
work the driver will do after the error has been raised.
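
To see why returning the error code was harmful, consider what a negative error value looks like once it is interpreted as an unsigned dirty mask. The constant below mirrors the value of VK_ERROR_OUT_OF_DEVICE_MEMORY (-2); the function names are hypothetical.

```c
#include <stdint.h>

/* Same numeric value as VK_ERROR_OUT_OF_DEVICE_MEMORY. */
#define SKETCH_ERROR_OUT_OF_DEVICE_MEMORY (-2)

/* Before: the error code leaks out through a uint32_t return type and
 * becomes a mask with almost every bit set. */
static uint32_t dirty_mask_on_error_before(void)
{
    return (uint32_t)SKETCH_ERROR_OUT_OF_DEVICE_MEMORY;
}

/* After: report "nothing dirty" so the caller does no further work. */
static uint32_t dirty_mask_on_error_after(void)
{
    return 0;
}
```

With the old behaviour the caller would treat nearly every stage as dirty and re-emit state for all of them after the allocation had already failed.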

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-16 11:40:05 +01:00