Commit graph

1670 commits

Author SHA1 Message Date
Matt Turner
abc8a702d0 nir: Return progress from nir_lower_io().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
a934b00222 nir: Return progress from nir_lower_regs_to_ssa().
And from nir_lower_regs_to_ssa_impl() as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
b0e72defc2 nir: Return progress from nir_lower_samplers().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
01548f9f01 nir: Return progress from nir_lower_atomics().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
0bd615d961 nir: Return progress from nir_lower_clamp_color_outputs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
9dbf91f5c0 nir: Return progress from nir_lower_clip_fs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
4e4927cd95 nir: Return progress from nir_lower_clip_vs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner
6077cc75aa nir: Return progress from nir_move_vec_src_uses_to_dest().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
a539e05d00 nir: Return progress from nir_lower_to_source_mods().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
5a7e4ae23d nir: Return progress from nir_lower_clip_cull_distance_arrays().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
19345fc160 nir: Return progress from nir_lower_var_copies().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
b831b8d2e1 nir: Return progress from nir_lower_load_const_to_scalar().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
adb157ddfd nir: Return progress from nir_lower_64bit_pack().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
0012a6144a nir: Return progress from nir_lower_doubles().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
c597f87739 nir: Return progress from nir_lower_vars_to_ssa().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
7d41bf8d7b nir: Fix syntax.
et is not an abbreviation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
70c0455974 nir: Fix misspellings.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner
d6e2bdfed3 nir: Stop using apostrophes to pluralize.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Vinson Lee
bb32ea4fc6 glsl: Link glsl_compiler with CLOCK_LIB.
Fix linking error on CentOS 6.

  CXXLD  glsl_compiler
glsl/.libs/libstandalone.a(lt16-libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 14:58:18 -07:00
Samuel Pitoiset
737c734cd4 glsl: lower sqrt(abs()) and inversesqrt(abs()) if requested
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-03-22 22:02:12 +01:00
Rob Herring
5410c60112 Android: clean-up trailing '\' in make variables
Fixed with the following command:

perl -pe 'BEGIN{undef $/;} s/ \\\n\n/\n\n/smg' $(find . -name 'Android.*')

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-03-22 17:52:06 +00:00
Emil Velikov
b27a883205 spirv: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
e3de145fa2 nir: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b08aee305e glsl: consistently use ifndef guards over pragma once
Through the glsl headers we had an odd mix of guards be that
"ifndef", "pragma once" neither or both.

Simplify things by using the more common ones (ifndef) and annotating
all the sources, barring the generated builting header -
builtin_int64.h.

The final header - udivmod64.h - is [seemingly] unused and on its way
out (patch purge it is on the mailing list).

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Emil Velikov
b0bfb5f89c compiler: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:21 +00:00
Vinson Lee
1fa432741c nir: Add positional argument specifiers.
Fix build with Python < 2.7.

  File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
    from nir_opcodes import opcodes
  File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
    unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
ValueError: zero length field name in format

Fixes: 762a6333f2 ("nir: Rework conversion opcodes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-21 13:38:00 -07:00
Grazvydas Ignotas
af73acca2b tests/cache_test: use the blob key's actual first byte
There is no need to hardcode it, we can just use blob_key[0].
This is needed because the next patches are going to change how cache
keys are computed.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Grazvydas Ignotas
529a767041 util/disk_cache: use a helper to compute cache keys
This will allow to hash additional data into the cache keys or even
change the hashing algorithm easily, should we decide to do so.

v2: don't try to compute key (and crash) if cache is disabled

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-03-21 11:15:52 +11:00
Jason Ekstrand
fcca6a83cd spirv: Implement IsInf using an integer comparison
Since we already do fabs on the one source, we're guaranteed to get
positive infinity if we get any infinity at all.  Since +inf only has
one IEEE 754 representation, we can use an integer comparison and avoid
all of the ordered/unordered issues.

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-20 14:08:19 +10:00
Timothy Arceri
bf8bc6190e glsl: use set for copy propagation kills
Previously each time we saw a variable we just created a duplicate
entry in the list. This is particularly bad for loops were we add
everything twice, and then throw nested loops into the mix and the
list was growing expoentially.

This stops the glsl-vs-unroll-explosion test which has 16 nested
loops from reaching the tests mem usage limit in this pass. The
test now hits the mem limit in opt_copy_propagation_elements()
instead.

I suspect this was also part of the reason this pass can be so
slow with some shaders.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-03-18 14:21:09 +11:00
Timothy Arceri
40bc1afc94 glsl: don't leak memory when trying to count loop iterations
Suggested-by: Damian Dixon <damian.dixon@gmail.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
2017-03-18 14:12:40 +11:00
Emil Velikov
0fd61fb639 util/sha1: drop _mesa_sha1_{update, format} return type
Unused/unchecked by any of the callers.

v2: Fix the glsl cases that have crept in since v1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:18:45 +00:00
Alan Swanson
f1e9671442 util/disk_cache: actually enforce cache size
Currently only a one in one out eviction so if at max_size and
cache files were to constantly increase in size then so would the
cache. Restrict to limit of 8 evictions per new cache entry.

V2: (Timothy Arceri) fix make check tests

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Timothy Arceri
50989f87e6 util/disk_cache: use a thread queue to write to shader cache
This should help reduce any overhead added by the shader cache
when programs are not found in the cache.

To avoid creating any special function just for the sake of the
tests we add a one second delay whenever we call dick_cache_put()
to give it time to finish.

V2: poll for file when waiting for thread in test
V3: fix poll delay to really be 100ms, and simplify the wait function

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-03-15 11:15:11 +11:00
Jason Ekstrand
9d559ba39d nir/constant_expressions: Refactor helper functions
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
702d1af8ba glsl/nir: Use nir_type_conversion_op
Using the helper is way better than hand-coding the universe.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
6eb051e36f nir: Rewrite nir_type_conversion_op
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs.  Let's just write
the obvious code and then we know it's correct.  This fixes a bunch of
missing cases particularly with int64.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
9084b1db30 nir: Add a get_nir_type_for_glsl_base_type helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
a136884139 nir/validate: Rework ALU bit-size rule validation
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes.  An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information.  The new validation code is much more straightforward and
should be more correct.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
370d68babc nir/validate: Validate that bit sizes and components always match
We've always required bit sizes to match but the rules for number of
components have been a bit loose.  You've never been allowed to source
from something with less components than you consume, but more has
always been fine.  This changes the validator to require that they match
exactly.  The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
e9a45a3d5d nir: Make image_size a variable-width intrinsic
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand
0bf0365393 nir/lower_tex: Use tex_instr_dest_size for txs destinations
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
fffa4111df nir/spirv: Restrict the number of channels in texture coordinates
Some SPIR-V texturing instructions pack more than the texture coordinate
into the coordinate source.  We need to mask off the unused channels.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
3c312be7b3 nir/copy_prop: Respect the source's number of components
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced.  To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.

Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated.  Fortunately, copy
propagation is the only pass that cares about the number of components
are read by any given source so it's fairly contained.

Shader-db results on Sky Lake:

   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
   instructions in affected programs: 260633 -> 261951 (0.51%)
   helped: 324
   HURT: 1027

Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions.  From a
spot-check of the shaders, the story is always the same:  They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate.  Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input.  This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town.  Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand
60d1aac28a nir/intrinsics: Make load_barycentric_input take a 2-component coor
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Timothy Arceri
df1d5fc442 glsl: don't use ralloc for blob creation
There is no need to use ralloc here.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-03-13 09:50:19 +11:00
Timothy Arceri
b607aad8e1 glsl: don't recompile a shader on fallback unless needed
Because we optimistically skip compiling shaders if we have seen them
before we may need to compile them later at link time if they haven't
yet been use in a specific combination to create a program.

Rather than always recompiling we take advantage of the
gl_compile_status enum introduced in the previous patch to only
compile when we have previously skipped compilation.

This helps with regressions in app start-up times on cold cache
runs, compared with no cache.

Deus Ex: Mankind Divided start-up times:

cache disabled:               ~3m15s
cold cache master:            ~4m23s
cold cache with this patch:   ~3m33s

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:26:08 +11:00
Timothy Arceri
bfa95997c4 mesa/glsl: introduce new gl_compile_status enum
This will allow us to tell if a shader really has been compiled or
if the shader cache has just seen it before.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-03-12 17:24:40 +11:00
Emil Velikov
a3782f2b7a glsl/tests: remove any bashisms
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:48 +00:00