Commit graph

2858 commits

Author SHA1 Message Date
Jason Ekstrand
7cdf8f9339 nir/format_convert: Fix a bitmask in unpack_11f11f10f
Fixes: 4e337b42f9 "nir/format_convert: Add pack/unpack for R11F_G11F_B10F"

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
1f7be4968f nir/format_convert: Rename pack_r11g11b10f to pack_11f11f10f
This matches the unpack function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
7bd0363d6f nir/format_convert: Add [us]norm conversion helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
152fdeddbb nir/format_convert: Rename nir_format_bitcast_uint_vec
We have a name for that, it's called a uvec.  This just makes the
function name a bit shorter.  While we're here, we also add an assert
for one of the assumptions this function makes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
7c5df52bdc nir/format_convert: Add vec mask and sign-extend helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
ea4f200864 nir/format_convert: Add support for unpacking signed integers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
80c424148b nir/opcodes: Make unpack_half_2x16_split_* variable-width
There is nothing inherent about these opcodes that requires them to only
take scalars.  It's very convenient if we let them take vectors as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
d448fa3ae3 nir/algebraic: Add some max/min optimizations
Found by inspection.  This doesn't help much now but we'll see this
pattern with images if you load UNORM and then store UNORM.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166916 -> 15166910 (<.01%)
    instructions in affected programs: 761 -> 755 (-0.79%)
    helped: 6
    HURT: 0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
4dd5263663 nir/algebraic: Add more extract_[iu](8|16) optimizations
This adds the "(a << N) >> M" family of mask or sign-extensions.  Not a
huge win right now but this pattern will soon be generated by NIR format
lowering code.

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15166918 -> 15166916 (<.01%)
    instructions in affected programs: 36 -> 34 (-5.56%)
    helped: 2
    HURT: 0

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Jason Ekstrand
116b47fe3c nir/algebraic: Be more careful converting ushr to extract_u8/16
If it's not the right bit-size, it may not actually be the correct
extraction.  For now, we'll only worry about 32-bit versions.

Fixes: 905ff86198 "nir: Recognize open-coded extract_u16"
Fixes: 76289fbfa8 "nir: Recognize open-coded extract_u8"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
vadym.shovkoplias
966a797e43 glsl/linker: Link all out vars from a shader objects on a single stage
During intra stage linking some out variables can be dropped because
it is not used in a shader with the main function. But these out vars
can be referenced on later stages which can lead to further linking
errors.

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105731
2018-08-29 20:03:56 +10:00
Timothy Arceri
5db981952a nir: add loop unroll support for wrapper loops
This adds support for unrolling the classic

    do {
        // ...
    } while (false)

that is used to wrap multi-line macros. GLSL IR also wraps switch
statements in a loop like this.

shader-db results IVB:

total loops in shared programs: 2515 -> 2512 (-0.12%)
loops in affected programs: 33 -> 30 (-9.09%)
helped: 3
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
0f450b57a1 nir/opt_loop_unroll: Remove unneeded phis if we make progress
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis.  In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This avoids validation issues with some CTS tests with the following
patch, but its possible this we could also see the same problem with
the existing unrolling passes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
5a6b04d94b nir: add complex_loop bool to loop info
In order to be sure loop_terminator_list is an accurate
representation of all the jumps in the loop we need to be sure we
didn't encounter any other complex behaviour such as continues,
nested breaks, etc during analysis.

This will be used in the following patch.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Timothy Arceri
fef6325e58 nir: always attempt to find loop terminators
This will help later patches with unrolling loops that end with a
break i.e. loops the always exit on their first interation.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-29 16:02:05 +10:00
Caio Marcelo de Oliveira Filho
f172a77dd8 nir: Remove outdated comment
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-28 08:11:03 -07:00
Kevin Rogovin
119435c877 mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering
This extension provides new GLSL built-in function
beginFragmentShaderOrderingIntel() that guarantees
(taking wording of GL_INTEL_fragment_shader_ordering
extension) that any memory transactions issued by
shader invocations from previous primitives mapped to
same xy window coordinates (and same sample when
per-sample shading is active), complete and are visible
to the shader invocation that called
beginFragmentShaderOrderingINTEL().

One advantage of INTEL_fragment_shader_ordering over
ARB_fragment_shader_interlock is that it provides a
function that operates as a memory barrie (instead
of a defining a critcial section) that can be called
under arbitary control flow from any function (in
contrast the begin/end of ARB_fragment_shader_interlock
may only be called once, from main(), under no control
flow.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-08-28 17:15:10 +03:00
vadym.shovkoplias
4a8444d5bc glsl/linker: Allow unused in blocks which are not declated on previous stage
>From Section 4.3.4 (Inputs) of the GLSL 1.50 spec:

    "Only the input variables that are actually read need to be written
     by the previous stage; it is allowed to have superfluous
     declarations of input variables."

Fixes:
    * interstage-multiple-shader-objects.shader_test

v2:
  Update comment in ir.h since the usage of "used" field
  has been extended.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101247
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 12:13:53 +02:00
Jason Ekstrand
07a227f543 nir: Pull block_ends_in_jump into nir.h
We had two different implementations in different files.  May as well
have one and put it in nir.h.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-27 02:15:38 -05:00
Emil Velikov
cff80b6c15 Revert "configure: allow building with python3"
This reverts commit ae7898dfdb.

Turns out the python scripts are _not_ fully python 3 compatible.
As Ilia reported using get_xmlpool.py with LANG=C produces some weird
output - see the link for details.

Even though the issue was spotted with the autoconf build, it exposes a
genuine problem with the script (and lack of lang handling of the meson
build.)

https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
2018-08-24 11:14:15 +01:00
Marek Olšák
b3c17330e6 mesa: expose AMD_gpu_shader_int64
because the closed driver exposes it.

It's equivalent to ARB_gpu_shader_int64.
In this patch, I did everything the same as we do for ARB_gpu_shader_int64.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-08-24 00:36:18 -04:00
Marek Olšák
1cf3631b9c mesa: expose ARB_post_depth_coverage in the Compatibility profile
It only contains GLSL changes.

v2: allow the layout qualifier on GLSL <= 1.30
2018-08-24 00:36:18 -04:00
Jason Ekstrand
53072582dc nir: Add an array copy optimization
This peephole optimization looks for a series of load/store_deref or
copy_deref instructions that copy an array from one variable to another
and turns it into a copy_deref that copies the entire array.  The
pattern it looks for is extremely specific but it's good enough to pick
up on the input array copies in DXVK and should also be able to pick up
the sequence generated by spirv_to_nir for a OpLoad of a large composite
followed by OpStore.  It can always be improved later if needed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:47:47 -05:00
Jason Ekstrand
be8d009908 nir: Add a array-of-vector variable shrinking pass
This pass looks for variables with vector or array-of-vector types and
narrows the type to only the components used.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:46:56 -05:00
Jason Ekstrand
fa6417495c nir: Add an array splitting pass
This pass looks for array variables where at least one level of the
array is never indirected and splits it into multiple smaller variables.

This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through arrays of arrays and can detect indirects on just
one level or even see that arr[i][0][5] does not alias arr[i][1][j].
This pass exists to help other passes more easily see through arrays of
arrays.  If a back-end does implement arrays using scratch or indirects
on registers, having more smaller arrays is likely to have better memory
efficiency.

v2 (Jason Ekstrand):
 - Better comments and naming (some from Caio)
 - Rework to use one hash map instead of two

v2.1 (Jason Ekstrand):
 - Fix a couple of bugs that were added in the rework including one
   which basically prevented it from running

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand
26eb077ec4 nir: Add a structure splitting pass
This pass doesn't really do much now because nir_lower_vars_to_ssa can
already see through structures and considers them to be "split".  This
pass exists to help other passes more easily see through structure
variables.  If a back-end does implement arrays using scratch or
indirects on registers, having more smaller arrays is likely to have
better memory efficiency.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-23 21:44:14 -05:00
Jason Ekstrand
b489998e63 nir/types: Add array_or_matrix helpers
Reviewed-by: Thomas Helland<thomashelland90@gmail.com>
2018-08-23 21:44:14 -05:00
Marek Olšák
3867af39f9 glsl: fix error checking against MAX_UNIFORM_LOCATIONS
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Marek Olšák
a8b71f2db8 mesa: add ctx->Const.MaxGeometryShaderInvocations
radeonsi wants to report a different value

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-08-23 16:56:17 -04:00
Emil Velikov
ae7898dfdb configure: allow building with python3
Pretty much all of the scripts are python2+3 compatible.
Check and allow using python3, while adjusting the PYTHON2 refs.

Note:
 - python3.4 is used as it's the earliest supported version
 - python3 chosen prior to python2

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 17:00:13 +01:00
Emil Velikov
48820ed8da glsl: remove execute bit and shebang from python tests
Just like the rest of the tree - these should be run either as part of
the build system check target, or at the very least with an explicitly
versioned python executable.

Fixes: db8cd8e367 ("glcpp/tests: Convert shell scripts to a python script")
Fixes: 97c28cb082 ("glsl/tests: Convert optimization-test.sh to pure python")
Fixes: 3b52d29227 ("glsl/tests: reimplement warnings-test in python")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
2018-08-23 12:02:45 +01:00
Ian Romanick
0842655ac6 nir: Add floating point atomic min, max, and compare-swap instrinsics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
69ce7baa9e nir: Add floating point atomic add instrinsics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
a390158d10 glsl: Add support for lowering shared-variable float atomics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
39bf3100ac glsl: Add support for lowering SSBO float atomics
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
280ab4afa8 glsl: Add built-in functions for INTEL_shader_atomic_float_minmax
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
c9d52c83a4 mesa: Extension boilerplate for INTEL_shader_atomic_float_minmax
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
88b6c7bc14 glsl: Add built-in functions for NV_shader_atomic_float
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Ian Romanick
9527bb4e70 mesa: Extension boilerplate for NV_shader_atomic_float
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 20:31:32 -07:00
Caio Marcelo de Oliveira Filho
410de0e3f1 nir: Give end_block its own index
Since there's no particular reason for the index to be 0, choose an
index that is not used by other block.  This is convenient when we
store "per-block" data in an array AND look for the successors
data (e.g. any kind of backwards data-flow analysis).

v2: Add a note about end_block's index. (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho
8364ec3fce nir: Skip common instructions when comparing deref paths
Deref paths may share the same deref instructions in their chains,
e.g.

    ssa_100 = deref_var A
    ssa_101 = deref_struct "array_field" of ssa_100
    ssa_102 = deref_array "[1]" of ssa_101
    ssa_103 = deref_struct "field_a" of ssa_102
    ssa_104 = deref_struct "field_a" of ssa_103

when comparing the two last deref instructions, their paths will share
a common sequence ssa_100, ssa_101, ssa_102.  This patch skips to next
iteration if the deref instructions are the same.  Path[0] (the var)
is still handled specially, so in the case above, only ssa_101 and
ssa_102 will be skipped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Caio Marcelo de Oliveira Filho
5196041e93 nir: Export deref comparison functions
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-22 14:41:26 -07:00
Jason Ekstrand
68ae66542a nir/vars_to_ssa: Don't build deref nodes for non-local variables
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2018-08-22 14:17:38 -05:00
Mathieu Bridon
e15686567c meson: Run the test with Python 3
This is a patch from me and a patch from Mathieu Bridon squashed
together.

Signed-off-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
2018-08-22 08:41:01 -07:00
Mathieu Bridon
ff0ce31e2a python: Disable universal newlines
We are testing the behaviour of a tool, for different input files, each
one using a different newline sequence. ('\n' on UNIX, '\r\n' on
Windows, …)

Unfortunately, when opening a file in text mode, Python 3 will by
default enable the "universal newlines" mode, which means it replaces
all the known newline sequences by '\n'.

This (usually useful) behaviour breaks the tests, which are specifically
trying to handle files with newline sequences different from '\n'.

Disabling the universal newlines mode fixes the tests.

However, to keep the script compatible with both Python 2 and 3, we must
use the io.open() function instead of the open() builtin, as the latter
only knows about the `newline` argument on Python 3.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 08:41:01 -07:00
Mathieu Bridon
fc708069f7 python: difflib prefers unicode strings
Python 3 does not automatically convert from bytes to unicode strings
like Python 2 used to do.

This commit makes sure we pass unicode strings to difflib.unified_diff,
so that the script works on both Python 2 and 3.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-08-22 08:41:01 -07:00
Dylan Baker
477d4b9960 compiler/glsl/tests: Make tests python3 safe
v2: - explicitly decode the output of subprocesses
    - handle bytes and string types consistently rather than relying on
      python 2's coercion for bytes and ignoring them in python 3
v3: - explicitly set encode as well as decode
    - python 2.7 and 3.x `bytes` instead of defining an alias

Reviewed-by: Mathieu Bridon <bochecha@daitauha.fr>
2018-08-22 08:41:01 -07:00
Kevin Rogovin
7ec308d978 Add NV_fragment_shader_interlock support.
The main purpose for having NV_fragment_shader_interlock
extension is because that extension is also for GLES31 while
the ARB extension is for GL only.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2018-08-20 13:32:43 -07:00
Kai Wasserbäch
6f0647c0b2 nir: mark *prev_block as MAYBE_UNUSED in opt_peel_loop_initial_if
Only used, when asserts are enabled.

Fixes an unused-variable warning with gcc-8:
 ../../../src/compiler/nir/nir_opt_if.c: In function 'opt_peel_loop_initial_if':
 ../../../src/compiler/nir/nir_opt_if.c:109:15: warning: unused variable 'prev_block' [-Wunused-variable]
     nir_block *prev_block =
                ^~~~~~~~~~

Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-08-18 10:34:15 +10:00
Timothy Arceri
d0803dea11 nir: allow more nested loops to be unrolled
The innermost check was added to stop us from unrolling multiple
loops in a single pass, and to stop outer loops from unrolling.

When we successfully unroll a loop we need to run the analysis
pass again before deciding if we want to go ahead an unroll a
second loop.

However the logic was flawed because it never tried to unroll any
nested loops other than the first innermost loop it found.
If this innermost loop is not unrolled we end up skipping all
other nested loops.

This unrolls a loop in a Deus Ex: MD shader on ultra settings and
also unrolls a loop in a shader from the game Prey when running
on DXVK.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-08-18 09:03:13 +10:00