Commit graph

4061 commits

Author SHA1 Message Date
Caio Marcelo de Oliveira Filho
0661480029 compiler/glsl: Fix warning about unused function
The helper check_node_type() is only used when DEBUG is set (in the
function below), but ASSERTED macro uses NDEBUG.  So just guard the
helper with #ifdef.  If we see more such cases we might consider a
ASSERTED-like macro for the DEBUG case.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-23 13:25:27 -07:00
Alyssa Rosenzweig
a8f86fcb51 nir: Remove nir_const_load_to_arr
There are no remaining users in-tree.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-22 12:24:13 -07:00
Jason Ekstrand
951cf94521 nir: Add explicit signs to image min/max intrinsics
This better matches all the other atomic intrinsics such as those for
SSBOs and shared variables where the sign is part of the intrinsic
opcode.  Both generators (GLSL and SPIR-V) know the sign from the type
of the image variable or handle.  In SPIR-V, signed min/max are separate
opcodes from unsigned.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-21 17:19:55 +00:00
Danylo Piliaiev
e71fc7f238 nir/loop_analyze: Treat do{}while(false) loops as 0 iterations
Loops like:

block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
    vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
    vec1 1 ssa_8 = ige ssa_2, ssa_7
    if ssa_8 {
        break
    } else {
    }
    vec1 32 ssa_9 = iadd ssa_7, ssa_1
}

Were treated as having more than 1 iteration and after unrolling
produced wrong results, however such loop will exit during
the first iteration if not unrolled.

So we check if loop will actually loop.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 11:01:15 +00:00
Danylo Piliaiev
84b3ef6a96 nir/loop_unroll: Prepare loop for unrolling in wrapper_unroll
Without loop_prepare_for_unroll loops are losing phis.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111411
Fixes: 5db98195 "nir: add loop unroll support for wrapper loops"
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Danylo Piliaiev
8869f44e9a nir/loop_unroll: Update the comments for loop_prepare_for_unroll
The comments say that we should remove continue if it is the last
intruction in a loop however we remove any kind of jump.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-08-21 10:43:27 +00:00
Daniel Schürmann
7fa1740035 nir/algebraic: some subtraction optimizations
Changes with RADV/ACO:
Totals from affected shaders:
SGPRS: 444087 -> 455543 (2.58 %)
VGPRS: 436468 -> 436768 (0.07 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13448928 -> 13353520 (-0.71 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 68060 -> 67979 (-0.12 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-21 08:51:49 +00:00
Lionel Landwerlin
e4da8b9c33 mesa/compiler: rework tear down of builtin/types
The issue we're running into when running CTS is that glsl types are
deleted while builtins depending on them are not.

This happens because on one hand we have glsl types ref counted, but
builtins are not. Instead builtins are destroyed when unloading libGL
or explicitly calling glReleaseShaderCompiler().

This change removes almost entirely any dealing with glsl types
ref/unref by letting the builtins deal with it instead. In turn we
introduce a builtin ref count mechanism. Each GL context takes a
reference on the builtins when compiling a shader for the first time.
It releases the reference when the context is destroyed. It can also
explicitly release those when glReleaseShaderCompiler() is called.

Finally we also take a reference on the glsl types when loading libGL
to avoid recreating glsl types too often.

v2: Ensure we take a reference if we don't have one in link step (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
9f37bc419c compiler: ensure glsl types are not created without a reference
We want to detect invalid refcounting so assert we have at least one
use before creating types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
8b913bd1ce nir/tests: take reference on glsl types
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Lionel Landwerlin
3ade8f0040 glsl/tests: take refs on glsl types
Much like each driver, tests as standalone entities must take
references on the glsl types.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-21 09:44:10 +02:00
Daniel Schürmann
df86c5ffb3 nir: add divergence analysis pass.
This pass expects the shader to be in LCSSA form.
The algorithm is based on 'The Simple Divergence Analysis' from
Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira.
Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-20 17:40:13 +02:00
Rhys Perry
7b07034931 nir/subgroups: Lower clustered reductions with cluster_size >= subgroup_size into reductions
The behavior for reductions with cluster_size >= subgroup_size is implementation defined.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:10 +02:00
Rhys Perry
911a1dfad2 nir/lcssa: allow to create LCSSA phis for loop-invariant booleans
ACO depends on LCSSA phis for divergent booleans to work correctly.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:05 +02:00
Daniel Schürmann
9c40ad49d5 nir/lcssa: Skip loop invariant variables when converting to LCSSA.
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:40:01 +02:00
Rhys Perry
8a6cfaa15a nir: make nir_to_lcssa() a general NIR pass.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-20 17:39:54 +02:00
Daniel Schürmann
204846ad06 nir/lcssa: handle deref instructions properly
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 414148cdc1 "nir: Support deref instructions in loop_analyze"
2019-08-20 17:39:52 +02:00
Jason Ekstrand
5167e94f23 nir: Add more source types to nir_tex_instr_src_type
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 17:03:34 +00:00
Roman Stratiienko
fdd6151612 nir: Add missing dependency in Android.nir.gen.mk
Fixes incremental build with Android

Signed-off-by: Roman Stratiienko <roman.stratiienko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-08-19 09:53:18 +03:00
Vasily Khoruzhick
0e394cda0d glsl/standalone: init shader stage in init_gl_program()
Otherwise lima standalone compiler fails when trying to compile fragment
shader with:

lima_compiler: ../src/compiler/nir/nir.c:55: nir_shader_create: Assertion `si->stage == stage' failed

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
2019-08-17 11:14:40 -07:00
Rhys Perry
0a790c3019 nir/algebraic: add a few masking-before-unpack optimizations
Helps some Dawn of War 3 and F1 2017 shaders with ACO:
Totals from affected shaders:
SGPRS: 2136 -> 2128 (-0.37 %)
VGPRS: 1624 -> 1628 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 168068 -> 164332 (-2.22 %) bytes
LDS: 44 -> 44 (0.00 %) blocks
Max Waves: 222 -> 221 (-0.45 %)
Wait states: 0 -> 0 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-16 12:13:01 +01:00
Erik Faye-Lund
544b088616 win32: unify strcasecmp definitions
There was two incompatible definitions of strcasecmp, which lead to a
compiler warning. Let's clean this up by only leaving one of them, and
using that one all the time.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:44 +02:00
Erik Faye-Lund
c646cd4bac nir: avoid warning when casting bogus pointer
This intentionally-bogus pointer generates a warning on some 64-bit
systems, so let's cast to a properly-sized integer first.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:35 +02:00
Erik Faye-Lund
b355eef920 glsl: fixup u64-warning
Similarly to the unsigned-version, we need to first cast the result to a
suiting integer before negating the number, otherwise we'll trigger a
warning.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-15 20:23:13 +02:00
Eric Engestrom
a3d6024199 meson: add nir tests to the compiler/nir test suite
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 22:17:06 +01:00
Ian Romanick
0e6581b87d nir/algebraic: Reassociate shift-by-constant of shift-by-constant
v2: After some review discussion with Alyssa, the replacements now
correct account for cases where (b+c) >= bitsize.

v3: Use a temporary to simplify the Python code quite a bit.  Suggested
by Jason.

Haswell and all Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16251155 -> 16249576 (<.01%)
instructions in affected programs: 232627 -> 231048 (-0.68%)
helped: 547
HURT: 1
helped stats (abs) min: 1 max: 15 x̄: 2.89 x̃: 3
helped stats (rel) min: 0.04% max: 7.84% x̄: 1.14% x̃: 1.06%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
95% mean confidence interval for instructions value: -3.12 -2.65
95% mean confidence interval for instructions %-change: -1.20% -1.06%
Instructions are helped.

total cycles in shared programs: 365924392 -> 365372103 (-0.15%)
cycles in affected programs: 59207053 -> 58654764 (-0.93%)
helped: 497
HURT: 34
helped stats (abs) min: 1 max: 29300 x̄: 1118.16 x̃: 16
helped stats (rel) min: <.01% max: 10.59% x̄: 1.82% x̃: 1.82%
HURT stats (abs)   min: 2 max: 424 x̄: 101.03 x̃: 63
HURT stats (rel)   min: 0.07% max: 46.17% x̄: 4.72% x̃: 2.06%
95% mean confidence interval for cycles value: -1426.41 -653.77
95% mean confidence interval for cycles %-change: -1.66% -1.15%
Cycles are helped.

total spills in shared programs: 8870 -> 8871 (0.01%)
spills in affected programs: 104 -> 105 (0.96%)
helped: 0
HURT: 1

Ivy Bridge and all pre-Gen7 platforms had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11956236 -> 11955635 (<.01%)
instructions in affected programs: 94110 -> 93509 (-0.64%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 5.67 x̃: 4
helped stats (rel) min: 0.12% max: 4.71% x̄: 1.96% x̃: 0.76%
95% mean confidence interval for instructions value: -6.62 -4.72
95% mean confidence interval for instructions %-change: -2.27% -1.64%
Instructions are helped.

total cycles in shared programs: 179296340 -> 178788044 (-0.28%)
cycles in affected programs: 51009603 -> 50501307 (-1.00%)
helped: 82
HURT: 7
helped stats (abs) min: 5 max: 27820 x̄: 6199.00 x̃: 16
helped stats (rel) min: 0.30% max: 8.16% x̄: 2.58% x̃: 3.11%
HURT stats (abs)   min: 2 max: 8 x̄: 3.14 x̃: 2
HURT stats (rel)   min: 0.02% max: 1.40% x̄: 0.34% x̃: 0.10%
95% mean confidence interval for cycles value: -7649.38 -3773.00
95% mean confidence interval for cycles %-change: -2.71% -1.99%
Cycles are helped.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v2]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:37 -07:00
Ian Romanick
73aaeac0a3 nir/algebraic: Reassociate add-and-shift to be shift-and-add
A common thing in many shaders:

    uniform vs { vec4 bones[...]; };

    ...

    x = some_calculation(bones[i + 0]);
    y = some_calculation(bones[i + 1]);
    z = some_calculation(bones[i + 2]);

This turns into stuff like

    vec1 32 ssa_12 = iadd ssa_11, ssa_0
    vec1 32 ssa_13 = ishl ssa_12, ssa_3
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_15 = iadd ssa_11, ssa_1
    vec1 32 ssa_16 = ishl ssa_15, ssa_3
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_18 = iadd ssa_11, ssa_2
    vec1 32 ssa_19 = ishl ssa_18, ssa_3
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

By reassociating the shift and the add, we can reduce this to

    vec1 32 ssa_12 = ishl ssa_11, ssa_3
    vec1 32 ssa_13 = iadd ssa_12, ssa_0
    vec1 32 ssa_14 = intrinsic load_ssbo (ssa_7, ssa_13) (16, 4, 0)
    vec1 32 ssa_16 = iadd ssa_12, ssa_1
    vec1 32 ssa_17 = intrinsic load_ssbo (ssa_7, ssa_16) (16, 4, 0)
    vec1 32 ssa_19 = iadd ssa_12, ssa_2
    vec1 32 ssa_20 = intrinsic load_ssbo (ssa_7, ssa_19) (16, 4, 0)

v2: Add some commentary from Rhys Perry's nearly identical patch.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16277758 -> 16250704 (-0.17%)
instructions in affected programs: 1440284 -> 1413230 (-1.88%)
helped: 4920
HURT: 6
helped stats (abs) min: 1 max: 69 x̄: 5.50 x̃: 4
helped stats (rel) min: 0.10% max: 18.33% x̄: 2.21% x̃: 1.79%
HURT stats (abs)   min: 1 max: 12 x̄: 4.50 x̃: 3
HURT stats (rel)   min: 0.18% max: 3.23% x̄: 1.91% x̃: 2.55%
95% mean confidence interval for instructions value: -5.67 -5.31
95% mean confidence interval for instructions %-change: -2.26% -2.16%
Instructions are helped.

total cycles in shared programs: 367118526 -> 365895358 (-0.33%)
cycles in affected programs: 93504145 -> 92280977 (-1.31%)
helped: 2754
HURT: 1269
helped stats (abs) min: 1 max: 47039 x̄: 460.66 x̃: 16
helped stats (rel) min: <.01% max: 34.93% x̄: 3.77% x̃: 1.12%
HURT stats (abs)   min: 1 max: 1500 x̄: 35.85 x̃: 9
HURT stats (rel)   min: 0.01% max: 17.35% x̄: 2.18% x̃: 0.75%
95% mean confidence interval for cycles value: -387.31 -220.78
95% mean confidence interval for cycles %-change: -2.11% -1.68%
Cycles are helped.

LOST:   1
GAINED: 1

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-14 11:15:32 -07:00
Andrii Simiklit
ff2225cf88 nir/find_array_copies: Reject copies with mismatched lengths
copy_deref for wildcard dereferences requires the same
arrays lengths otherwise it leads to a crash in optimizations
like 'nir_opt_copy_prop_vars' because these optimizations expect
'copy_deref' just for arrays with the same lengths.

v2: check was moved to 'try_match_deref' to fix aoa cases
                 (Jason Ekstrand <jason@jlekstrand.net>)
v3: -fixed comment
    -the condition merged with other one
                 (Jason Ekstrand <jason@jlekstrand.net>)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111286
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
2019-08-14 18:11:31 +00:00
Ian Romanick
f2965fde9b nir/range-analysis: Fail gracefully on non-SSA sources
Tested-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-14 09:02:38 -07:00
Iago Toral Quiroga
48f5c34301 nir: add a pass to clamp gl_PointSize to a range
The OpenGL and OpenGL ES specs require that implementations clamp the
value of gl_PointSize to an implementation-depedent range. This pass
is useful for any GPU hardware that doesn't do this automatically
for either one or both sides of the range, such as V3D.

v2:
 - Turn into a generic NIR pass (Eric).
 - Make the pass work before lower I/O so we can use the deref variable
   to inspect if we are writing to gl_PointSize (Eric).
 - Make the pass take the range to clamp as parameter and allow it
   to clamp to both sides of the range or just one side.
 - Make the pass report progress.

v3:
 - Fix copyright header (Eric)
 - use fmin/fmax instead of bcsel to clamp (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-13 09:44:12 +02:00
Rhys Perry
7740149852 nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo
v2: add to series
v3: update Makefile.sources
v4: don't remove a comment and break statement
v4: use nir_can_move_instr

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Rhys Perry
da8ed68aca nir: replace nir_move_load_const() with nir_opt_sink()
This is mostly the same as nir_move_load_const() but can also move
undef instructions, comparisons and some intrinsics (being careful with
loops).

v2: actually delete nir_move_load_const.c
v3: fix nir_opt_sink() usage in freedreno
v3: update Makefile.sources
v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
v4: handle if uses
v4: fix handling of nested loops
v5: re-write adjust_block_for_loops
v5: re-write setting of use_block for if uses

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-08-12 22:01:30 +00:00
Marek Olšák
9c7746ceae compiler: add SYSTEM_VALUE_TESS_LEVEL_OUTER/INNER_DEFAULT
TCS system values for internal passthru TCS, needed by radeonsi NIR support

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Marek Olšák
1b881852bc compiler: add SYSTEM_VALUE_USER_DATA_AMD
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
f0ccc5457a compiler: add shader_info.cs.user_data_components_amd 2019-08-12 14:52:17 -04:00
Marek Olšák
028dbd35ba compiler: add shader_info.vs.blit_sgprs_amd
for internal radeonsi shaders
2019-08-12 14:52:17 -04:00
Marek Olšák
9fb2fd0b43 compiler: add ACCESS_STREAM_CACHE_POLICY
radeonsi will use this.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-08-12 14:52:17 -04:00
Kenneth Graunke
5180a222c0 glsl: Optimize the SoftFP64 shader when first creating it.
By optimizing the shader before inlining, we avoid having to redo this
work for each inlined copy of a function.  It should also reduce the
memory consumption a bit.

This cuts the KHR-GL46.arrays_of_arrays_gl.SubroutineFunctionCalls2
runtime by 25% on my Icelake.  That test compiles many shaders, which
contain large types (dmat4) and division (expensive operations).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-08-12 10:42:32 -07:00
Caio Marcelo de Oliveira Filho
5ed4e31c08 spirv: Drop lower_workgroup_access_to_offsets
Intel drivers are not using this anymore, and turnip still don't have
Compute Shaders, so won't make a difference to stop using this option.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-08-10 22:15:35 -07:00
Rhys Perry
fd73ed1bd7 nir: add nir_lower_to_explicit()
v2: use glsl_type_size_align_func
v2: move get_explicit_type() to glsl_types.cpp/nir_types.cpp
v2: use align() instead of util_align_npot()
v2: pack arrays a bit tighter
v2: rename mem_* to field_*
v2: don't attempt to handle when struct offsets are already set
v2: use column_type() instead of recreating it
v2: use a branch instead of |= in nir_lower_to_explicit_impl()
v2: assign locations to variables and update shared_size and num_shared
v2: allow the pass to be used with nir_var_{shader_temp,function_temp}
v4: rebase
v5: add TODO
v5: small formatting changes
v5: remove incorrect assert in get_explicit_type()
v5: rename to nir_lower_vars_to_explicit_types
v5: correctly update progress when only variables are updated
v5: rename get_explicit_type() to get_explicit_shared_type()
v5: add comment explaining how get_explicit_shared_type() is different
v5: update cast strides
v6: update progress when lowering nir_var_function_temp variables
v6: formatting changes
v6: add more detailed documentation comment for get_explicit_shared_type
v6: rename get_explicit_shared_type to get_explicit_type_for_size_align
v7: fix comment in nir_lower_vars_to_explicit_types_impl()

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Rhys Perry
8bd2e138f5 nir/lower_explicit_io: add nir_var_mem_shared support
v2: require nir_address_format_32bit_offset instead
v3: don't call nir_intrinsic_set_access() for shared atomics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-08-08 12:10:39 -05:00
Erik Faye-Lund
75097114d9 spirv: fixup signature
This avoids a warning on some compiler, complaining about implicitly
casting the function-pointer.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: d482a8f "spirv: Update the OpenCL.std.h header"
Acked-by: Eric Engestrom <eric@engestrom.ch>
2019-08-08 18:20:29 +02:00
Connor Abbott
e7fd90e8ef nir/builder: Add nir_b2i
Same as nir_b2f but for integers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 18:03:10 -04:00
Pierre-Eric Pelloux-Prayer
a9ec718652 nir: add atomic_inc_wrap/atomic_dec_wrap image intrinsics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:02 -04:00
Pierre-Eric Pelloux-Prayer
fc0a2e5d01 glsl: add EXT_shader_image_load_store new image functions
This extension has 2 functions that are missing from the ARB versions:
- imageAtomicIncWrap
- imageAtomicDecWrap

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:41:00 -04:00
Pierre-Eric Pelloux-Prayer
70a47fb032 glsl: add EXT_shader_image_load_store keywords to lexer
All of them already existed for ARB_shader_image_load_store.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:58 -04:00
Pierre-Eric Pelloux-Prayer
cfba168b6c glsl: add size qualifiers from EXT_shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:56 -04:00
Pierre-Eric Pelloux-Prayer
cd45d09226 glsl: handle differences between ARB/EXT versions of shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-06 17:40:55 -04:00
Antia Puentes
954224b714 nir/spirv: Fix gl_BaseVertex for non-indexed draws for OpenGL
Lowers BaseVertex to the correct system value for OpenGL.

v2: use options->environment rather than adding a new flag to
    spirv_to_nir_options

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-06 09:11:27 -07:00
Jonathan Marek
b514f41183 glcpp: use pre-expansion line number for __LINE__
Fixes the following deqp tests:
dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_*

It don't see the spec requiring this, but it seems to be better, as the
clang preprocessor for example has this behavior.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-08-06 11:27:04 +00:00