Commit graph

2123 commits

Author SHA1 Message Date
Eric Anholt
d9ce4ac990 nir: Add a safety check that we don't remove dead I/O vars after lowering.
The pass only looks at var load/store intrinsics, not input load/store
intrinsics, so assert that we don't see the other type.

v2: Adjust comment indentation.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-20 16:26:07 -07:00
Jason Ekstrand
59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Samuel Iglesias Gonsálvez
e382890e25 nir: set default lod to texture opcodes that needed it but don't provide it
v2:
- Use helper to add a new source to the texture instruction.

v3:
- Use nir_tex_instr_src_index() to simplify the patch (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-20 08:29:09 +02:00
Iago Toral Quiroga
2d87caa279 glsl/linker: produce error when invalid explicit locations are used
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.

v2: compute location slots properly depending on shader stage and
    variable type / direction

Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-19 11:27:12 +02:00
Jason Ekstrand
41c75b5354 nir: Add a helper for adding texture instruction sources
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-17 07:36:00 -07:00
Timothy Arceri
f1eb5e6399 nir: add component level support to remove_unused_io_vars()
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
9f7127f5d2 glsl: mark xfb inputs as always_active_io
We won't split varyings marked as always active because there
is no point in doing so. This means we need to mark both
sides of the interface as always active otherwise we will have
a mismatch and start removing things we shouldn't.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
6af5e0bec9 nir: add variant of lower_io_to_scalar to be called earlier
This is intended to be called before nir_lower_io() so that we
can do some linking optimisations with the results. It can also
be used with drivers that don't use nir_lower_io() at all such
as RADV.

v2: pass mode mask rather than first and last stage integer.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
3b59f5ca17 nir: add glsl_channel_type() helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-16 09:06:53 +11:00
Timothy Arceri
421c1b9bd6 nir: add glsl_type_is_64bit() to nir_types
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-16 09:06:53 +11:00
Jason Ekstrand
1cec500c69 blob: Use intptr_t instead of ssize_t
ssize_t is a GNU extension and is not available on Windows or MacOS.
Instead, we use intptr_t which should be effectively equivalent and is
part of the C standard.  This should fix the Windows and Mac OS builds.

Fixes: 3af1c82989
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103253
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
2017-10-13 15:02:34 -07:00
Dylan Baker
142dc8b9de meson: fix blob test includes
Since blob.h moved up to src/compiler the test should include that
instead of src/compiler/glsl

fixes: 0e3bd56c6e ("compiler: Move blob up a level")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-13 10:40:23 -07:00
Jason Ekstrand
3442c9fc3e nir: Get rid of the variable on vote intrinsics
This looks like a copy+paste error.  They don't actually write into that
variable as would be implied by putting the return there.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Jason Ekstrand
a0947921eb nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0.  Switching it to bit >= 0 solves this problem.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Jason Ekstrand
6a41a52e62 compiler/blob: Make some parameters void instead of uint8_t
There are certain advantages to using uint8_t internally such as
well-defined arithmetic on all platforms.  However, interfaces that
work in terms of raw data should use a void* type.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
4d56ff0a71 compiler/blob: Constify the reader
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
3af1c82989 compiler/blob: Add (reserve|overwrite)_(uint32|intptr) helpers
These helpers not only call blob_reserve_bytes but also make sure that
the blob is properly aligned as if blob_write_* were called.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Connor Abbott
6935440967 compiler/blob: make blob_reserve_bytes() more useful
Despite the name, it could only be used if you immediately wrote to the
pointer. Noboby was using it outside of one test, so clearly this
behavior wasn't that useful. Instead, make it return an offset into the
data buffer so that the result isn't invalidated if you later write to
the blob. In conjunction with blob_overwrite_bytes(), this will be
useful for leaving a placeholder and then filling it in later, which
we'll need to do for handling phi nodes when serializing NIR.

v2 (Jason Ekstrand):
 - Detect overflow in the offset + to_write computation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
8ae03af4ed compiler/blob: Allow for fixed-size blobs with a NULL data pointer
These can be used to easily count up the number of bytes that will be
required by "writing" it into the NULL blob.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
26f6d4e5c7 compiler/blob: Add a concept of a fixed-allocation blob
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
49bb9f785a compiler/blob: Switch to init/finish instead of create/destroy
There's no reason why that tiny bit of memory needs to be on the heap.
We always put blob_reader on the stack, so why not do the same with the
writable blob.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
0e3bd56c6e compiler: Move blob up a level
We're going to want to use the blob for Vulkan pipeline caching so it
makes sense to have it in libcompiler not libglsl.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-12 21:47:06 -07:00
Jason Ekstrand
8f42a43d08 meson: Add inc_compiler to the libglsl includes 2017-10-12 21:47:06 -07:00
Jason Ekstrand
e03717efbd glsl/blob: Return false from grow_to_fit if we've ever failed
Otherwise we could have a failure followed by a smaller write that
succeeds and get a corrupted blob.  If we ever OOM, we should stop.

v2 (Jason Ekstrand):
 - Initialize the new boolean member in create_blob

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 21:47:06 -07:00
Jason Ekstrand
7118851374 glsl/blob: Return false from ensure_can_read on overrun
Otherwise, if you have a large read fail and then try to do a small
read, the small read may succeed even though it's at the wrong offset.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 21:47:06 -07:00
Kenneth Graunke
a576c148cd nir: Make nir_shader_gather_info() track texelFetch texture accesses.
For TGSI-based drivers, st_glsl_to_tgsi records this information.
For NIR-based drivers, nir_shader_gather_info() will do so.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-10-12 17:22:42 -07:00
Kenneth Graunke
fbf4c2916c compiler: Move gl_program::TexelFetchSamplers to shader_info.
I'd like to put this sort of metadata in the shader_info structure,
rather than adding more things to gl_program.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-10-12 17:22:39 -07:00
Dave Airlie
2d36efdb7f nir: bump loop unroll limit to 96.
With the ssao demo from Vulkan demos:
radv/rx480: 440->440fps
anv/haswell: 24->34 fps

The demo does a 0->32 loop across a ubo with 32 members.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-11 10:11:36 +10:00
Eric Anholt
c34295b1a3 nir: Move vc4's alpha test lowering to core NIR.
I've been doing this inside of vc4, but vc5 wants it as well and it may be
useful for other drivers (Intel has a related path for pre-gen6 with MRT,
and freedreno had a TGSI path for it at one point).

This required defining a common enum for the standard comparison
functions, but other lowering passes are likely to also want that enum.

v2: Add to meson.build as well.

Acked-by: Rob Clark <robdclark@gmail.com>
2017-10-10 11:42:04 -07:00
Nicolai Hähnle
a2c8812f91 glsl/linker: add check for compute shared memory size
Unlike uniforms, the limit on shared memory size is not called out
explicitly in the list of things that cause linker errors, but presumably
that's just an oversight in the spec.

Fixes dEQP-GLES31.functional.debug.negative_coverage.{callbacks,get_error,log}.compute.exceed_shared_memory_size_limit

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-10 13:58:43 +02:00
Józef Kucia
e0acb630a5 spirv: Fix SpvOpAtomicISub
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-10-09 16:28:11 -07:00
Timothy Arceri
7a7fb90af7 glsl: tidy up IR after loop unrolling
c7affbf687 enabled GLSLOptimizeConservatively on some
drivers. The idea was to speed up compile times by running
the GLSL IR passes only once each time do_common_optimization()
is called. However loop unrolling can create a big mess and
with large loops can actually case compile times to increase
significantly due to a bunch of redundant if statements being
propagated to other IRs.

Here we make sure to clean things up before moving on.

There was no measureable difference in shader-db compile times,
but it makes compile times of some piglit tests go from a couple
of seconds to basically instant.

The shader-db results seemed positive also:

Totals:
SGPRS: 2829456 -> 2828376 (-0.04 %)
VGPRS: 1720793 -> 1721457 (0.04 %)
Spilled SGPRs: 7707 -> 7707 (0.00 %)
Spilled VGPRs: 33 -> 33 (0.00 %)
Private memory VGPRs: 3140 -> 2060 (-34.39 %)
Scratch size: 3308 -> 2180 (-34.10 %) dwords per thread
Code Size: 79441464 -> 79214616 (-0.29 %) bytes
LDS: 436 -> 436 (0.00 %) blocks
Max Waves: 558670 -> 558571 (-0.02 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
646621c66d glsl: make loop unrolling more like the nir unrolling path
The old code assumed that loop terminators will always be at
the start of the loop, resulting in otherwise unrollable
loops not being unrolled at all. For example the current
code would unroll:

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     j++;
  } while (j < 4);

But would fail to unroll the following as no iteration limit was
calculated because it failed to find the terminator:

  int j = 0;
  do {
     ... do stuff ...

     j++;
  } while (j < 4);

Also we would fail to unroll the following as we ended up
calculating the iteration limit as 6 rather than 4. The unroll
code then assumed we had 3 terminators rather the 2 as it
wasn't able to determine that "if (j > 5)" was redundant.

  int j = 0;
  do {
     if (j > 5)
        break;

     ... do stuff ...

     if (bool(i))
        break;

     j++;
  } while (j < 4);

This patch changes this pass to be more like the NIR unrolling pass.
With this change we handle loop terminators correctly and also
handle cases where the terminators have instructions in their
branches other than a break.

V2:
- fixed regression where loops with a break in else were never
  unrolled in v1.
- fixed confusing/wrong naming of bools in complex unrolling.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
d24e16fe1f glsl: check if induction var incremented before use in terminator
do-while loops can increment the starting value before the
condition is checked. e.g.

  do {
    ndx++;
  } while (ndx < 3);

This commit changes the code to detect this and reduces the
iteration count by 1 if found.

V2: fix terminator spelling

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Timothy Arceri
ab23b759f2 glsl: don't drop instructions from unreachable terminators continue branch
These instructions will be executed on every iteration of the loop
we cannot drop them.

V2:
- move removal of unreachable terminators from the terminator list
  to the same place they are removed from the IR as suggested by
  Nicolai.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-10-10 10:05:37 +11:00
Dylan Baker
3218056e0e meson: Build i965 and dri stack
This gets pretty much the entire classic tree building, as well as
i965, including the various glapis. There are some workarounds for bugs
that are fixed in meson 0.43.0, which is due out on October 8th.

I have tested this with piglit using glx.

v2: - fix typo "vaule" -> "value"
    - use gtest dep instead of linking to libgtest (rebase error)
    - use gtest dep instead of linking against libgtest (rebase error)
    - copy the megadriver, then create hard links from that, then delete
      the megadriver. This matches the behavior of the autotools build.
      (Eric A)
    - Use host_machine instead of target_machine (Eric A)
    - Put a comment in the right place (Eric A)
    - Don't have two variables for the same information (Eric A)
    - Put pre_args at top of file in this patch (Eric A)
    - Fix glx generators in this patch instead of next (Eric A)
    - Remove -DMESON hack (Eric A)
    - add sha1_h to mesa in this patch (Eric A)
    - Put generators in loops when possible to reduce code in
      mapi/glapi/gen (Eric A)
v3: - put HAVE_X11_PLATFORM in this patch

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:44 -07:00
Dylan Baker
001b65a899 meson: add nir_linking_helpers.c to libnir
This was missed in a rebase, and doesn't affect radv or anv, only i965.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-09 13:42:43 -07:00
Jason Ekstrand
49a6fb8474 spirv: Don't warn on the ImageCubeArray capability
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-10-07 14:52:03 -07:00
Dylan Baker
7a5a986ddd meson: convert gtest to an internal dependency
In truth gtest is an external dependency that upstream expects you to
"vendor" into your own tree. As such, it makes sense to treat it more
like a dependency than an internal library, and collect it's
requirements together in a dependency object.

v2: - include with -isystem instead of setting compiler args (Eric)

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-10-03 10:02:08 -07:00
Ian Romanick
765e1fa372 glsl: Remove spurious assertions
It's inside an if-statement that already checks that the variables are
not NULL.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:11 -07:00
Ian Romanick
ff5254bf08 glsl: Move 'foo = foo;' optimization to opt_dead_code_local
The optimization as done in opt_copy_propagation would have to be
removed in the next patch.  If we just eliminate that optimization
altogether, shader-db results, even on platforms that use NIR, are hurt
quite substantially.  I have not investigated why NIR isn't picking up
the slack here.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-10-02 14:46:11 -07:00
Ian Romanick
623002f0b2 glsl/ast: Use logical-or instead of conditional assignment to set fallthru_var
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:11 -07:00
Ian Romanick
d5361d9f01 glsl/ast: Generate a more compact expression to disable execution of default case
Instead of generating a sequence like:

    run_default = true;
    if (i == 3) // some label that appears after default
        run_default = false;
    if (i == 4) // some label that appears after default
        run_default = false;
    ...
    if (run_default) {
        ...
    }

generate something like:

    run_default = !((i == 3) || (i == 4) || ...);
    if (run_default) {
        ...
    }

This eliminates one use of conditional assignment, and it enables the
elimination of another.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
3e5cd2aba9 glsl/ast: Explicitly track the set of case labels that occur after default
Previously the instruction stream was walked looking for comparisons
with case-label values.  This should generate nearly identical code.
For at least fs-default-notlast-fallthrough.shader_test, the code is
identical.

This change will make later changes possible.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
f307de2838 glsl/ast: Convert ast_case_label::hir to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
4a8086c5a5 glsl/ast: Use ir_binop_equal instead of ir_binop_all_equal
The values being compared are scalars, so these are the same.  While
I'm here, simplify the run_default condition to just deref the flag
(instead of comparing a scalar bool with true).

There is a bit of extra change in this patch.  When constructing an
ir_binop_equal ir_expression, there is an assertion that the types are
the same.  There is no such assertion for ir_binop_all_equal, so
passing glsl_type::uint_type with glsl_type::int_type was previously
fine.  A bunch of the code motion is to deal with that.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:10 -07:00
Ian Romanick
ed80746c1c glsl/ast: Stop processing a switch-statement after an error in the init-expression
This happens to work now because ir_binop_all_equal is used.  This
causes vector typed init-expressions to produce scalar Boolean values
after comparison.

The next commit changes ir_binop_all_equal to ir_binop_equal.  Vector
typed init-expressions will then produce vector Boolean values, and, in
debug builds, the ir_assignment constructor will fail an assertion.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
6d1765c63a glsl: Don't pass NULL to ir_assignment constructor when not necessary
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
3cc997c7c8 glsl: Convert lower_variable_index_to_cond_assign to ir_builder
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-10-02 14:46:02 -07:00
Ian Romanick
eb58668525 glsl: Fix coding standards issues in lower_variable_index_to_cond_assign
Mostly tabs-before-spaces, but there was some other trivium too.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
2017-10-02 14:46:02 -07:00