Commit graph

89863 commits

Author SHA1 Message Date
Emil Velikov
eec0cd71cd glx: don't expose systemTimeExtension for DRI2/DRI3/DRISW
Used/applicable to only dri1 drivers.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2017-03-15 11:48:50 +00:00
Emil Velikov
b1fb6e8d8c anv: do not open random render node(s)
drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless if it's the one we're interested or not.

v2: Rebase, explicitly require/check for libdrm
v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)
v4: Rebase

Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:05 +00:00
Emil Velikov
743315f269 radv: do not open random render node(s)
drmGetDevices2() provides us with enough flexibility to build heuristics
upon. Opening a random node on the other hand will wake up the device,
regardless if it's the one we're interested or not.

v2: Rebase.
v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:02 +00:00
Emil Velikov
8ff2937dfa radv/winsys: use drmGetDevice2 API
Analogous to previous commit

v2: Add explicit require_libdrm check.

Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:38:00 +00:00
Emil Velikov
858170e8a4 winsys/amdgpu: use drmGetDevice2 API
Analogous to previous commit

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:58 +00:00
Emil Velikov
a50c4eb2a0 loader: use drmGetDevice[s]2 API
By this allows us to fetch the device list/info w/o the revision field.
At the moment retrieving the latter wakes up the device.

Note: kernel patch to resolve that should be in 4.10.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:55 +00:00
Emil Velikov
2c72e78ff5 autoconf/scons: bump libdrm to 2.4.75
We'll be using the drmGetDevice[s]2 API in src/loader with next patch.

v2: Rebase.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Mike Lothian <mike@fireburn.co.uk>
2017-03-15 11:37:39 +00:00
Emil Velikov
0fd61fb639 util/sha1: drop _mesa_sha1_{update, format} return type
Unused/unchecked by any of the callers.

v2: Fix the glsl cases that have crept in since v1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:45 +00:00
Emil Velikov
a9a4028fd7 util/sha1: rework _mesa_sha1_{init,final}
Rather than having an extra memory allocation [that we currently do not
and act accordingly] just make the API take an pointer to a stack
allocated instance.

This and follow-up steps will effectively make the _mesa_sha1_foo simple
define/inlines around their SHA1 counterparts.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:43 +00:00
Emil Velikov
c96127e873 util/sha1: add non-typedef name for the SHA1_CTX struct
Using typedef(s) is not always the answer and makes it harder for people
to do clever (or one might call nasty) things with the code.

Add a struct name which we will use with follow-up commit.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:53 +00:00
Bas Nieuwenhuizen
ef43eeb09f radv: Remove unused descriptor set field.
Trivial.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
2017-03-15 09:06:52 +01:00
Dave Airlie
686d060458 r600: refactor binding code for attach buffer to CB.
This refactors out the code and fixes it up to be used
for images later. It uses the code in the current RAT binding
for compute.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:26 +10:00
Dave Airlie
222e42e45f r600: refactor out CB setup.
This moves the code to create CB info out into
a separate function so it can be reused in images
code to create RATs.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:23 +10:00
Dave Airlie
0cf717821e r600: refactor texture resource words setup code.
This refactors out the code to setup a texture resource
so we can reuse it later from the images code.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:33:06 +10:00
Dave Airlie
95a976b651 r600: factor out the code to initialise a buffer resource.
This takes the code required to initialise a buffer resource
out of the texture buffer code, into it's own function.

This is going to be used for the image support later.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:32:48 +10:00
Dave Airlie
cf2af021b9 r600g: make framebuffer atom rely on dual src blend state.
In order to make ARB_shader_image_load_store, we have to share
the CB space with RATs, so we should only steal the dual src
space if we have dual src enabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 14:32:44 +10:00
Jason Ekstrand
d142c7436c intel/debug: Add a common INTEL_DEBUG=nohiz option
The GL driver had a driconf option (which doesn't make much sense) and
the Vulkan driver had a hand-rolled environment variable.  Instead,
let's tie both into the INTEL_DEBUG mechanism and unify things.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Jason Ekstrand
c09bb956ca anv/image: Move handling of INTEL_VK_HIZ
This makes it so that you don't get an "Implement gen7 HiZ" perf warning
when you manually disable HiZ on gen8.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 21:00:09 -07:00
Timothy Arceri
304b35b0e9 radv: trivial tidy ups
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-15 11:45:04 +11:00
Alan Swanson
b7e03d87e4 util/disk_cache: scale cache according to filesystem size
Select higher of current 1G default or 10% of filesystem where
cache is located.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Alan Swanson
f1e9671442 util/disk_cache: actually enforce cache size
Currently only a one in one out eviction so if at max_size and
cache files were to constantly increase in size then so would the
cache. Restrict to limit of 8 evictions per new cache entry.

V2: (Timothy Arceri) fix make check tests

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Alan Swanson
af09b86732 util/disk_cache: use LRU eviction rather than random eviction
Still using fast random selection of two-character subdirectory in
which to check cache files rather than scanning entire cache.

v2: Factor out double strlen call
v3: C99 declaration of variables where used

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
c2793e2c89 util/disk_cache: don't fallback to an empty cache dir on evict
If we fail to randomly select a two letter cache dir, don't select
an empty dir on fallback.

In real world use we should never hit the fallback path but it can
be hit by tests when the cache is set to a very small max value.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
50989f87e6 util/disk_cache: use a thread queue to write to shader cache
This should help reduce any overhead added by the shader cache
when programs are not found in the cache.

To avoid creating any special function just for the sake of the
tests we add a one second delay whenever we call dick_cache_put()
to give it time to finish.

V2: poll for file when waiting for thread in test
V3: fix poll delay to really be 100ms, and simplify the wait function

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
fc5ec64ba3 util/disk_cache: add helpers for creating/destroying disk cache put jobs
V2: Make a copy of the data so we don't have to worry about it being
freed before we are done compressing/writing.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
e2c4435b07 util/disk_cache: add thread queue to disk cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:10 +11:00
Dave Airlie
7372e3cf5f radv/ac: workaround regression in llvm 4.0 release
LLVM 4.0 released with a pretty messy regression, that hopefully
get fixed in the future.

This work around was proposed by Tom, and it fixes the CTS regressions
here at least, I'm not sure if this will cause any major side effects,
but correctness over speed and all that.

radeonsi should possibly consider the same workaround until an llvm
fix can be found.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 09:51:53 +10:00
Dave Airlie
3ece76f03d radv/ac: gather4 cube workaround integer
This fix is extracted from amdgpu-pro shader traces.

It appears the gather4 workaround for integer types doesn't
work for cubes, so instead if forces a float scaled sample,
then converts to integer.

It modifies the descriptor before calling the gather.

This also produces some ugly asm code for reasons specified
in the patch, llvm could probably do better than dumping
sgprs to vgprs.

This fixes:
dEQP-VK.glsl.texture_gather.basic.cube.rgba8*

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-15 09:51:53 +10:00
Bas Nieuwenhuizen
407fa77669 radv: Set driver version to mesa version;
I couldn't really find an encoding in the spec. I'm not sure it
prescribes VK_MAKE_VERSION format, but vulkan.gpuinfo.org interprets
it that way by default. vulkaninfo gives the raw number, so we could
alternatively do something like 17001000, but that doesn't show
up right on vulkan.gpuinfo.org again. Looking at that site, the -pro
driver also uses VK_MAKE_VERSION, so keeping consistency is probably
best.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-15 00:37:56 +01:00
Bas Nieuwenhuizen
ed28ae71f5 radv: Increase api version to 1.0.42.
I've skimmed to changes from 1.0.5 to 1.0.42 and I think we have all
changes. We're still not conformant ofcourse, but this should not
regress stuff,

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2017-03-15 00:37:56 +01:00
Jason Ekstrand
2e98db68e4 util/vk: Add helpers for finding an extension struct
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-15 08:22:02 +10:00
Alex Smith
e0cc32b85b radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer
Need to flush before updating the buffer to ensure that the copy is
ordered after previous accesses (assuming the app has performed the
appropriate barriers).

This fixes potential issues due to draws prior to an update reading
the new buffer content, despite having the necessary barriers between
them.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-14 22:17:03 +01:00
Bas Nieuwenhuizen
cce43f6d8c radv: Emit cache flushes before CP DMA.
The flushes could be due to TRANSFER barriers.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-14 22:16:34 +01:00
Jan Beich
fe56c745b8 Convert sed(1) syntax to be compatible with FreeBSD and OpenBSD
BSD regex library doesn't support extended RE escapes (e.g. \+) and
shorthand character classes (e.g. \s, \S) and SVR4-style word
delimiters[1] (on DragonFly and NetBSD). Both GNU and BSD sed support
-E and -r to enable extended RE but OS X still lacks -r.

[1] https://www.illumos.org/issues/516

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com> (GNU sed)
2017-03-14 17:07:04 +00:00
Jason Ekstrand
aed2714145 anv: Properly enumerate physical devices when none are present 2017-03-14 09:08:07 -07:00
Jason Ekstrand
9d559ba39d nir/constant_expressions: Refactor helper functions
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
7107b32155 i965/fs: Re-arrange conversion operations
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
bab4610e9c i965/vec4: Get rid of the type parameter from to/from_double
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
702d1af8ba glsl/nir: Use nir_type_conversion_op
Using the helper is way better than hand-coding the universe.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
6eb051e36f nir: Rewrite nir_type_conversion_op
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs.  Let's just write
the obvious code and then we know it's correct.  This fixes a bunch of
missing cases particularly with int64.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
9084b1db30 nir: Add a get_nir_type_for_glsl_base_type helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
a136884139 nir/validate: Rework ALU bit-size rule validation
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes.  An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information.  The new validation code is much more straightforward and
should be more correct.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
370d68babc nir/validate: Validate that bit sizes and components always match
We've always required bit sizes to match but the rules for number of
components have been a bit loose.  You've never been allowed to source
from something with less components than you consume, but more has
always been fine.  This changes the validator to require that they match
exactly.  The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
e9a45a3d5d nir: Make image_size a variable-width intrinsic
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
b377be9213 i965/fs: Use num_components from the SSA def in image intrinsics
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
0bf0365393 nir/lower_tex: Use tex_instr_dest_size for txs destinations
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
fffa4111df nir/spirv: Restrict the number of channels in texture coordinates
Some SPIR-V texturing instructions pack more than the texture coordinate
into the coordinate source.  We need to mask off the unused channels.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
3c312be7b3 nir/copy_prop: Respect the source's number of components
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced.  To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.

Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated.  Fortunately, copy
propagation is the only pass that cares about the number of components
are read by any given source so it's fairly contained.

Shader-db results on Sky Lake:

   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
   instructions in affected programs: 260633 -> 261951 (0.51%)
   helped: 324
   HURT: 1027

Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions.  From a
spot-check of the shaders, the story is always the same:  They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate.  Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input.  This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town.  Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
60d1aac28a nir/intrinsics: Make load_barycentric_input take a 2-component coor
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00