Commit graph

2950 commits

Author SHA1 Message Date
Jason Ekstrand
a45b6fb452 spirv: Pass SSA values through functions
Previously, we would create temporary variables and fill them out.
Instead, we create as many function parameters as we need and pass them
through as SSA defs.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-10-30 11:22:44 -05:00
Michał Janiszewski
0654450911 glsl: Add missing include guards
Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>

Reviewed-by: Brian Paul <brianp@vmware.com>
2018-10-30 06:19:09 -06:00
Vadym Shovkoplias
7d66eddbbd glsl/linker: Fix out variables linking during single stage
Since out variables are copied from shader objects instruction
streams to linked shader instruction steam it should be cloned
at first to keep source instruction steam unaltered.

Fixes: 966a797e43 ("glsl/linker: Link all out vars from a shader
objects on a single stage")

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
2018-10-30 10:19:17 +11:00
Brian Paul
9007c0ed26 nir: fix yet another MSVC build break
Trivial.
2018-10-29 11:15:12 -06:00
Jason Ekstrand
19064b8c3a nir: Add a pass for gathering transform feedback info
This is different from the GL_ARB_spirv pass because it generates a much
simpler data structure that isn't tied to OpenGL and mtypes.h.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-29 17:09:08 +01:00
Brian Paul
7e64e39f8b nir: Fix array initializer
Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
2018-10-26 12:35:48 -06:00
Jason Ekstrand
18fb2c5d92 spirv: Initialize subgroup destinations with the destination type
Instead of initializing them manually, just use the type that we already
have sitting there.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
8fa70cfcfd spirv: Use the right bit-size for spec constant ops
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size.  We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
2fe3031440 glsl/nir: Use i2b instead of ine for fixing UBO/SSBO Booleans
They do the same thing in the end but i2b is a bit simpler.  Also, let's
clean up the mess of code for SSBO handling with one line of builder.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
5bfce5fcc2 nir/system_values: Use the bit size from the load_deref
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
a3b4cb3458 nir/opt_if: Rework condition propagation
Instead of doing our own constant folding, we just emit instructions and
let constant folding happen.  This is substantially simpler and lets us
use the nir_imm_bool helper instead of dealing with the const_value's
ourselves.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
4cd8a58595 nir/search: Use the nir_imm_* helpers from nir_builder
This requires that we rework the interface a bit to use nir_builder but
that's a nice little modernization anyway.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
6e32115bd6 nir/builder: Handle 16-bit floats in nir_imm_floatN_t
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
ff45649bc2 nir/builder: Add a nir_imm_true/false helpers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
249e32ab17 nir/constant_folding: Use nir_src_as_bool for discard_if
Missed one while converting to the nir_src_as_* helpers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
6de1869e86 nir/constant_folding: Add an unreachable to a switch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Jason Ekstrand
28bb6abd1d nir/validate: Print when the validation failed
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-10-26 11:45:29 -05:00
Eric Engestrom
e27902a261 util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Eric Engestrom
bb84fa146f util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-25 12:43:18 +01:00
Jose Fonseca
d9a04196d9 nir: Fix array initializer.
Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
2018-10-24 11:37:09 +01:00
Juan A. Suarez Romero
3112da346b nir: fix nir_copy_propagation test
Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.

v2: Use nir_src_comp_as_uint() [Jason]

Fixes: 16870de8a0 ("nir: Use nir_src_is_const and nir_src_as_* in core
                     code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-24 09:13:24 +02:00
Samuel Pitoiset
7c694cbfa4 nir: add linking helper nir_link_xfb_varyings()
The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-24 08:21:29 +11:00
Jason Ekstrand
ecb7775e1c nir/algebraic: Fix a typo in the bit size validation code
The conon_bit_class and canon_var_class variables got switched.

Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-10-23 12:22:29 -05:00
Jason Ekstrand
bf441d22a7 nir/algebraic: Provide descriptive asserts for bit size checks
This will hopefully make debugging opt_algebraic bit-size compile
failures easier.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
932c650e0b nir/algebraic: Loosen a restriction on variables
Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size.  However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
ea9e651423 nir/algebraic: A bit of validation refactoring'
We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
641f4be8e8 nir/algebraic: Make internal classes str-able
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
6068be543b nir/algebraic: Generalize an optimization
There's nothing boolean about (a | ~a) ~> -1

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
69618a8678 nir/algebraic: Use bool internally instead of bool32
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-10-22 16:00:18 -05:00
Jason Ekstrand
16870de8a0 nir: Use nir_src_is_const and nir_src_as_* in core code
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
ce36f412c9 nir/search_helpers: Use nir_src_is_const and friends
This not only makes them safe for more bit sizes but it also fixes a bug
in is_zero_to_one where it would return true for constant NaN.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
7bae7828aa nir/search: Use nir_src_is_const and friends
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
bca5c2c688 nir: Add some new helpers for working with const sources
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 14:24:15 -05:00
Jason Ekstrand
891886da2f spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1
This extension adds two new decorations which carry meaning only for
HLSL shaders.  They are expected to be handled by higher level layers
and can be ignored by implementations.  However, it does save the client
a bit of work if the implementation safely ignores them instead of the
client having to strip them out of the SPIR-V in order for it to be
valid.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Jason Ekstrand
5f0322d5c3 spirv: Add support for SPV_GOOGLE_decorate_string
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-22 10:49:53 -05:00
Vadym Shovkoplias
ad558408ff glsl: Check the subroutine associated functions names
Adding compile time check for subroutine functions with
the same names. Similar check for intrastage linking was already
landed in commit 5f0567a4f6.

From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification

    "A program will fail to compile or link if any shader
     or stage contains two or more functions with the same
     name if the name is associated with a subroutine type."

Fixes:
    * no-overloads.vert

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-16 08:15:21 +03:00
Vadym Shovkoplias
d2ea3d4a76 glsl/linker: Change the format of spec quotation
Also there is no "OpenGL ES Shading Language 4.00" spec,
so change it to GLSL 4.00 spec.

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-10-16 08:15:21 +03:00
Dave Airlie
ff281e6204 nir: fix clip cull lowering to not assert if GLSL already lowered.
If GLSL has already done the lowering, we'd rather not crash in this pass.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-10-15 18:53:48 -07:00
Caio Marcelo de Oliveira Filho
b3c6146925 nir: Copy propagation between blocks
Extend the pass to propagate the copies information along the control
flow graph.  It performs two walks, first it collects the vars
that were written inside each node. Then it walks applying the copy
propagation using a list of copies previously available.  At each node
the list is invalidated according to results from the first walk.

This approach is simpler than a full data-flow analysis, but covers
various cases.  If derefs are used for operating on more memory
resources (e.g. SSBOs), the difference from a regular pass is expected
to be more visible -- as the SSA copy propagation pass won't apply to
those.

A full data-flow analysis would handle more scenarios: conditional
breaks in the control flow and merge equivalent effects from multiple
branches (e.g. using a phi node to merge the source for writes to the
same deref).  However, as previous commentary in the code stated, its
complexity 'rapidly get out of hand'.  The current patch is a good
intermediate step towards more complex analysis.

The 'copies' linked list was modified to use util_dynarray to make it
more convenient to clone it (to handle ifs/loops).

Annotated shader-db results for Skylake:

    total instructions in shared programs: 15105796 -> 15105451 (<.01%)
    instructions in affected programs: 152293 -> 151948 (-0.23%)
    helped: 96
    HURT: 17

        All the HURTs and many HELPs are one instruction.  Looking
        at pass by pass outputs, the copy prop kicks in removing a
        bunch of loads correctly, which ends up altering what other
        other optimizations kick.  In those cases the copies would be
        propagated after lowering to SSA.

        In few HELPs we are actually helping doing more than was
        possible previously, e.g. consolidating load_uniforms from
        different blocks.  Most of those are from
        shaders/dolphin/ubershaders/.

    total cycles in shared programs: 566048861 -> 565954876 (-0.02%)
    cycles in affected programs: 151461830 -> 151367845 (-0.06%)
    helped: 2933
    HURT: 2950

        A lot of noise on both sides.

    total loops in shared programs: 4603 -> 4603 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 11085 -> 11073 (-0.11%)
    spills in affected programs: 23 -> 11 (-52.17%)
    helped: 1
    HURT: 0

        The shaders/dolphin/ubershaders/12.shader_test was able to
        pull a couple of loads from inside if statements and reuse
        them.

    total fills in shared programs: 23143 -> 23089 (-0.23%)
    fills in affected programs: 2718 -> 2664 (-1.99%)
    helped: 27
    HURT: 0

        All from shaders/dolphin/ubershaders/.

    LOST:   0
    GAINED: 0

The other generations follow the same overall shape.  The spills and
fills HURTs are all from the same game.

shader-db results for Broadwell.

    total instructions in shared programs: 15402037 -> 15401841 (<.01%)
    instructions in affected programs: 144386 -> 144190 (-0.14%)
    helped: 86
    HURT: 9

    total cycles in shared programs: 600912755 -> 600902486 (<.01%)
    cycles in affected programs: 185662820 -> 185652551 (<.01%)
    helped: 2598
    HURT: 3053

    total loops in shared programs: 4579 -> 4579 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 80929 -> 80924 (<.01%)
    spills in affected programs: 720 -> 715 (-0.69%)
    helped: 1
    HURT: 5

    total fills in shared programs: 93057 -> 93013 (-0.05%)
    fills in affected programs: 3398 -> 3354 (-1.29%)
    helped: 27
    HURT: 5

    LOST:   0
    GAINED: 2

shader-db results for Haswell:

    total instructions in shared programs: 9231975 -> 9230357 (-0.02%)
    instructions in affected programs: 44992 -> 43374 (-3.60%)
    helped: 27
    HURT: 69

    total cycles in shared programs: 87760587 -> 87727502 (-0.04%)
    cycles in affected programs: 7720673 -> 7687588 (-0.43%)
    helped: 1609
    HURT: 1416

    total loops in shared programs: 1830 -> 1830 (0.00%)
    loops in affected programs: 0 -> 0
    helped: 0
    HURT: 0

    total spills in shared programs: 1988 -> 1692 (-14.89%)
    spills in affected programs: 296 -> 0
    helped: 1
    HURT: 0

    total fills in shared programs: 2103 -> 1668 (-20.68%)
    fills in affected programs: 438 -> 3 (-99.32%)
    helped: 4
    HURT: 0

    LOST:   0
    GAINED: 1

v2: Remove the DISABLE prefix from tests we now pass.

v3: Add comments about missing write_mask handling. (Caio)
    Add unreachable when switching on cf_node type. (Jason)
    Properly merge the component information in written map
    instead of replacing. (Jason)
    Explain how removal from written arrays works. (Jason)
    Use mode directly from deref instead of getting the var. (Jason)

v4: Register the local written mode for calls. (Jason)
    Prefer cf_node instead of node. (Jason)
    Clarify that remove inside iteration only works in backward
    iterations. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
dc349f07b5 nir: Take call instruction into account in copy_prop_vars
Calls are not used yet (functions are inlined), but since new code is
already taking them into account, do it here too.  The convention here
and in other places is that no writable memory is assumed to remain
unchanged, as well as global variables.

Also, explicitly state the modes affected (instead of using the
reverse logic) in one of the apply_for_barrier_modes calls.

Suggested by Jason.

v2: Consider local vars used by a call to be conservative, SPIR-V has
    such cases. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
797f01c220 nir: Add tests for copy propagation of derefs
Also tests for removal of redundant loads, that we currently handle as
part of the copy propagation.

Note some tests involve multiple blocks and are currently DISABLED
because they (expectedly) fail.

v2: Add missing DISABLED prefix to "multi block" tests. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
4dfa7adc10 nir: Remove handling of dead writes from copy_prop_vars
These are covered by another pass now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
cb126cf67a nir: Separate dead write removal into its own pass
Instead of doing this as part of the existing copy_prop_vars pass.

Separation makes easier to expand the scope of both passes to be more
than per-block.  For copy propagation, the information about valid
copies comes from previous instructions; while the dead write removal
depends on information from later instructions ("have any instruction
used this deref before overwrite it?").

Also change the tests to use this pass (instead of copy prop vars).
Note that the disabled tests continue to fail, since the standalone
pass is still per-block.

v2: Remove entries from dynarray instead of marking items as
    deleted.  Use foreach_reverse. (Caio)

    (all from Jason)
    Do not cache nir_deref_path.  Not worthy for this patch.
    Clear unused writes when hitting a call instruction.
    Clean up enumeration of modes for barriers.
    Move metadata calls to the inner function.

v3: For copies, use the vector length to calculate the mask.

    (all from Jason)
    Use nir_component_mask_t when applicable.
    Rename functions for clarity.
    Consider local vars used by a call to be conservative (SPIR-V has
    such cases).
    Comment and assert the assumption that stores and copies are
    always to a deref that ends with a vector or scalar.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
a02fd7000d nir: Add tests for dead write elimination
Note at the moment the pass called is nir_opt_copy_prop_vars, because
dead write elimination is implemented there.

Also added tests that involve identifying dead writes in multiple
blocks (e.g. the overwrite happens in another block).  Those currently
fail as expected, so are marked to be skipped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
bbda2a17f7 nir: Add test file for vars related passes
Add basic helpers for doing tests on the vars related optimization
passes.  The main goal is to lower the barrier to create tests during
development and debugging of the passes.  Full coverage is not a
requirement.

v2: Make find_next_intrinsic() skip blocks before 'after'. (Jason)
    Move nir_imm_ivec2() to nir_builder.h. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Caio Marcelo de Oliveira Filho
c869646b7d nir: Add nir_imm_ivec2 helper
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-10-15 17:29:46 -07:00
Eric Anholt
7d77fe1bcc nir: Expose nir_remove_unused_io_vars().
For gallium drivers where you want to do some linking at variant compile
time, you don't have the other producer/consumer shader on hand to modify.
By exposing the inner function, the driver can have the used varyings in
the compiled shader cache key and still do linking.

This is also useful for V3D, where the binning shader wants to only output
position and TF varyings.  We've been removing those after nir_lower_io,
but this will be less driver-specific code and let more of the shader get
DCEed early in NIR.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-15 17:16:44 -07:00
Eric Anholt
b788ab6d5c nir: Be sure to fix deref modes after demoting shader i/o vars to global.
Fixes assertion failures when calling nir_remove_unused_varyings() or
nir_remove_unused_io_vars().

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-10-15 17:16:44 -07:00
Kenneth Graunke
ed169c9ad2 nir: Create sampler2D variables in nir_lower_{bitmap,drawpixels}.
This is needed for nir_gather_info to actually count the new textures,
since it operates solely on variables.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-14 23:35:35 -07:00
Jason Ekstrand
b7397b09d5 spirv: Update SPIR-V json and headers to Khronos master
This corresponds to commit 801cca8104245c07e8cc532 on GitHub.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-10-13 09:56:18 -05:00