5457 commits

Author SHA1 Message Date
Timothy Arceri
dfe2f19855 st/nir: fix atomic lowering for gallium drivers
i965 and gallium handle the atomic buffer index differently. It was
just by luck that the single piglit test for this was passing.

For gallium we use the atomic binding so that we match the handling
in st_bind_atomics().

On radeonsi this fixes the CTS test:
KHR-GL43.shader_storage_buffer_object.advanced-write-fragment

It also fixes tressfx hair rendering in Tomb Raider.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:29:53 +11:00
Timothy Arceri
ffa4bbe466 st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker
This will only ever be used by gallium drivers so it probably doesn't
belong in the nir toolkit. Also we want to pass it some non NIR
things in the following patch.

To avoid regressions we wrap the lowering calls that have been moved
to st_glsl_to_nir with a quick hack so that they are only called for
radeonsi. We will replace the hack with a check for uniform packing
in a following patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:34 +11:00
Timothy Arceri
edded12376 mesa: rework ParameterList to allow packing
Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.
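
As a rough illustration of the difference between padding every parameter to 4 components and packing them, the following Python sketch computes component offsets both ways. The function name, its arguments, and the size list are hypothetical and are not taken from Mesa.

```python
def parameter_offsets(sizes, pad_to_vec4):
    """Return (offsets, total) for a list of parameter sizes.

    sizes: number of components (1..4) of each parameter (made-up input).
    pad_to_vec4: when True, every parameter occupies a full 4-component slot.
    """
    offsets = []
    cursor = 0
    for size in sizes:
        offsets.append(cursor)
        # Padded layout always advances by 4; packed layout by the real size.
        cursor += 4 if pad_to_vec4 else size
    return offsets, cursor


sizes = [1, 3, 2, 4]
padded_offsets, padded_total = parameter_offsets(sizes, pad_to_vec4=True)
packed_offsets, packed_total = parameter_offsets(sizes, pad_to_vec4=False)
```

With these made-up sizes the packed layout needs 10 components where the padded layout needs 16.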

V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-03-20 14:17:33 +11:00
Ian Romanick
6aeaa7d363 nir: Don't compare b2f or b2i with zero
All of the shaders that had loops changed were in Tomb Raider.  The one
shader that lost SIMD16 is one of those.
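
The simplification named in the commit title rests on a simple identity: converting a Boolean to a number and then comparing it with zero yields the Boolean itself (or its negation). The Python sketch below only checks that identity; the helper names mirror the NIR opcode names, but the exact patterns added by the commit are not shown in this log and are not reproduced here.

```python
def b2f(value: bool) -> float:
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if value else 0.0


def b2i(value: bool) -> int:
    """Boolean to integer: True -> 1, False -> 0."""
    return 1 if value else 0


def zero_compare_is_redundant() -> bool:
    """Check that comparing b2f/b2i with zero is just the Boolean or its negation."""
    for a in (False, True):
        if (b2f(a) != 0.0) != a or (b2f(a) == 0.0) != (not a):
            return False
        if (b2i(a) != 0) != a or (b2i(a) == 0) != (not a):
            return False
    return True
```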

Skylake
total instructions in shared programs: 14391653 -> 14390468 (<.01%)
instructions in affected programs: 111891 -> 110706 (-1.06%)
helped: 501
HURT: 0
helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1
helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01%
95% mean confidence interval for instructions value: -3.23 -1.50
95% mean confidence interval for instructions %-change: -1.77% -1.45%
Instructions are helped.

total cycles in shared programs: 532793024 -> 532776598 (<.01%)
cycles in affected programs: 987682 -> 971256 (-1.66%)
helped: 348
HURT: 41
helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18
helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68%
HURT stats (abs)   min: 1 max: 422 x̄: 65.39 x̃: 24
HURT stats (rel)   min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02%
95% mean confidence interval for cycles value: -64.08 -20.38
95% mean confidence interval for cycles %-change: -2.78% -1.23%
Cycles are helped.

total loops in shared programs: 4854 -> 4829 (-0.52%)
loops in affected programs: 27 -> 2 (-92.59%)
helped: 18
HURT: 0

LOST:   1
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 13:52:35 -07:00
Jordan Justen
b5baaee0d6 glsl/serialize: Save shader program metadata sha1
When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to lookup the serialized GL shader
program.

If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this sha1 is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
9b473f9e3c glsl: Remove api_enabled tracking for transform feedback
We used this to prevent usage of the disk shader cache when transform
feedback was enabled via the GL API. This is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Jordan Justen
6d830940f7 glsl/shader_cache: Allow shader cache usage with transform feedback
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Suggested-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-19 09:57:09 -07:00
Samuel Pitoiset
af355aaa07 nir: add nir_opt_move_load_ubo() optimization pass
This pass moves load UBO operations just before their first use,
loosely based on nir_opt_move_comparisons.
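
A minimal sketch of the idea, assuming a single basic block represented as a list of (dest, op, srcs) tuples. This representation and the function name are hypothetical and much simpler than NIR; loads with no user in the block are simply appended at the end, which is a choice of this sketch and not necessarily what the pass does.

```python
def move_loads_before_first_use(block, load_op="load_ubo"):
    """Re-emit each load immediately before the first instruction that reads it."""
    # Loads waiting to be placed, keyed by destination, in original order.
    pending = {instr[0]: instr for instr in block if instr[1] == load_op}
    result = []

    def emit(instr):
        # Place any still-pending load that this instruction depends on first.
        for src in instr[2]:
            if src in pending:
                emit(pending.pop(src))
        result.append(instr)

    for instr in block:
        if instr[1] != load_op:
            emit(instr)

    # Loads that nothing in this block reads keep their relative order.
    while pending:
        dest = next(iter(pending))
        emit(pending.pop(dest))
    return result


block = [
    ("u0", "load_ubo", ("c0",)),
    ("u1", "load_ubo", ("c1",)),
    ("t0", "fadd", ("x", "y")),
    ("t1", "fmul", ("t0", "u1")),
    ("t2", "fadd", ("t1", "u0")),
]
moved = move_loads_before_first_use(block)
```

In the example, both loads start at the top of the block and end up directly in front of the fmul and fadd that consume them.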

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-03-16 09:50:31 +01:00
Alejandro Piñeiro
50767214a7 spirv/radv: add AMD_gcn_shader capability, remove current extensions
So now, during spirv_to_nir, it uses the capability instead of the
extension. Note that what we are really doing here is treating
SPV_AMD_gcn_shader like other supported extensions. SPV_AMD_gcn_shader
is not the first SPV extension supported. For example, the capability
draw_parameters infers if the extension SPV_KHR_shader_draw_parameters
is supported or not.

This could be seen as counter-intuitive, and it might seem easier
to define which extensions are supported and base our checks on
that, but we need to take into account that some capabilities are
optional from core, and others come from new extensions.

Also this commit would make the implementation of ARB_spirv_extensions
easier.

v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann)

Reviewed-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-15 12:08:25 +01:00
Samuel Iglesias Gonsálvez
adf58e59d3 spirv: update arguments for vtn_nir_alu_op_for_spirv_opcode()
We no longer need the source and destination data types, just
their bit sizes.

v2:
- Use glsl_get_bit_size() instead (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:56:15 +01:00
Samuel Iglesias Gonsálvez
ce2fd87056 spirv: fix the translation of SPIR-V conversion opcodes to NIR
Some SPIR-V opcodes (like UConvert and SConvert) have expectations
about the output that do not depend on the operands' data type.
Generalize the solution for all of them.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-15 08:51:01 +01:00
Thomas Helland
5f129c05e6 glsl: Use hash table cloning in copy propagation
Walking the whole hash table, inserting entries by hashing them first
is just a really bad idea. We can simply memcpy the whole thing.
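
The contrast between the two cloning strategies can be sketched with a toy open-addressing table in Python. The class and method names are hypothetical, the table never resizes, and copying the slot list stands in for the memcpy mentioned above; none of this is Mesa's hash table code.

```python
class OpenAddressingTable:
    """Tiny linear-probing table; capacity must exceed the number of entries."""

    def __init__(self, capacity=8):
        self.slots = [None] * capacity  # each slot is None or (key, value)
        self.count = 0

    def insert(self, key, value):
        index = hash(key) % len(self.slots)
        while self.slots[index] is not None and self.slots[index][0] != key:
            index = (index + 1) % len(self.slots)
        if self.slots[index] is None:
            self.count += 1
        self.slots[index] = (key, value)

    def lookup(self, key):
        index = hash(key) % len(self.slots)
        while self.slots[index] is not None:
            if self.slots[index][0] == key:
                return self.slots[index][1]
            index = (index + 1) % len(self.slots)
        return None

    def clone_by_reinsert(self):
        # Slow path: hash and probe again for every live entry.
        copy = OpenAddressingTable(len(self.slots))
        for slot in self.slots:
            if slot is not None:
                copy.insert(*slot)
        return copy

    def clone_by_copy(self):
        # Fast path: duplicate the backing array wholesale, no hashing at all.
        copy = OpenAddressingTable(len(self.slots))
        copy.slots = list(self.slots)
        copy.count = self.count
        return copy


table = OpenAddressingTable(8)
for key in (1, 9, 17, 4):  # 1, 9 and 17 collide modulo 8
    table.insert(key, key * 10)
copied = table.clone_by_copy()
rehashed = table.clone_by_reinsert()
```

Both clones answer lookups identically; the copy simply skips the per-entry hashing and probing.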

V2: Remove leftover creation of acp in two places

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-03-14 19:52:02 +01:00
Karol Herbst
b617bfcccf compiler: int8/uint8 support
OpenCL kernels also have int8/uint8.

v2: remove changes in nir_search as Jason posted a patch for that

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-03-14 10:08:42 -04:00
Neil Roberts
25a966a23d spirv: Handle doubles when multiplying a mat by a scalar
The code to handle mat multiplication by a scalar tries to pick either
imul or fmul depending on whether the matrix is float or integer.
However it was doing this by checking whether the base type is float.
This was making it choose the int path for doubles (and presumably
float16s).
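
The bug class described above can be sketched as follows. The type names and both functions are hypothetical stand-ins written in Python for illustration; they are not the spirv_to_nir code.

```python
# Assumed set of floating-point base types for this sketch.
FLOAT_BASE_TYPES = {"float16", "float", "double"}


def scalar_multiply_op_buggy(base_type):
    """Mirrors the described bug: only 32-bit float takes the fmul path."""
    return "fmul" if base_type == "float" else "imul"


def scalar_multiply_op_fixed(base_type):
    """Treats every floating-point base type as float."""
    return "fmul" if base_type in FLOAT_BASE_TYPES else "imul"
```

With the exact-match check, a double matrix falls through to the integer multiply; checking membership in the set of floating-point types avoids that.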

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-14 08:43:33 +01:00
Rob Clark
4e4428482e nir: lower_load_const_to_scalar fix for 8/16b types
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-03-13 20:17:04 -04:00
Jason Ekstrand
3d1d7e8561 nir/subgroups: Add lowering for vote_ieq/vote_feq to a ballot
This is based heavily on 97f10934ed, "ac/nir: Add vote_ieq/vote_feq
lowering pass." from Bas Nieuwenhuizen.  This version is a bit more
general since it's in common code.  It also properly handles NaN due to
not flipping the comparison for floats.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-13 13:25:15 -07:00
Eric Anholt
191bc7ce61 spirv: Silence compiler warning about undefined srcs[0]
v2: Use assume() at the srcs[] definition instead.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-03-13 10:32:55 -07:00
Ian Romanick
6878c9aabc nir: Don't i2b a value that is already Boolean
A bunch of shaders have sequences like:

    i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0))))

Other optimizations (and NIR's typeless nature) reduce this to

    i2b(x == y)

which is silly.
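
Why the remaining i2b is redundant can be checked with a small Python sketch. It assumes the 0 / -1 Boolean encoding that the expression above uses; the helper names follow the opcode names but are otherwise illustrative.

```python
NIR_TRUE = -1   # assumption: all bits set, as in "x == y ? -1 : 0"
NIR_FALSE = 0


def ieq(x, y):
    """Integer comparison producing a 0 / -1 Boolean."""
    return NIR_TRUE if x == y else NIR_FALSE


def i2b(value):
    """Integer to Boolean: any non-zero value becomes true."""
    return NIR_TRUE if value != 0 else NIR_FALSE


def i2b_of_comparison_is_identity():
    """i2b applied to a value that is already 0 or -1 changes nothing."""
    values = range(-3, 4)
    return all(i2b(ieq(x, y)) == ieq(x, y) for x in values for y in values)
```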

Skylake
total instructions in shared programs: 14498698 -> 14497948 (<.01%)
instructions in affected programs: 74480 -> 73730 (-1.01%)
helped: 277
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68%
95% mean confidence interval for instructions value: -3.35 -2.06
95% mean confidence interval for instructions %-change: -1.74% -1.16%
Instructions are helped.

total cycles in shared programs: 532015500 -> 531999238 (<.01%)
cycles in affected programs: 5943878 -> 5927616 (-0.27%)
helped: 251
HURT: 74
helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14
helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53%
HURT stats (abs)   min: 1 max: 4550 x̄: 214.04 x̃: 15
HURT stats (rel)   min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33%
95% mean confidence interval for cycles value: -158.51 58.43
95% mean confidence interval for cycles %-change: -1.07% -0.04%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4753 -> 4735 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

Haswell and Broadwell had similar results. (Broadwell shown)
total instructions in shared programs: 14791877 -> 14791127 (<.01%)
instructions in affected programs: 77326 -> 76576 (-0.97%)
helped: 278
HURT: 1
helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
95% mean confidence interval for instructions value: -3.33 -2.05
95% mean confidence interval for instructions %-change: -1.70% -1.13%
Instructions are helped.

total cycles in shared programs: 558250067 -> 558252872 (<.01%)
cycles in affected programs: 5806328 -> 5809133 (0.05%)
helped: 235
HURT: 83
helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16
helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51%
HURT stats (abs)   min: 1 max: 10590 x̄: 265.19 x̃: 20
HURT stats (rel)   min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54%
95% mean confidence interval for cycles value: -89.87 107.51
95% mean confidence interval for cycles %-change: -1.06% -0.32%
Inconclusive result (value mean confidence interval includes 0).

total loops in shared programs: 4735 -> 4717 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

total fills in shared programs: 83111 -> 83110 (<.01%)
fills in affected programs: 28 -> 27 (-3.57%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11774173 -> 11773436 (<.01%)
instructions in affected programs: 70819 -> 70082 (-1.04%)
helped: 267
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63%
95% mean confidence interval for instructions value: -3.51 -2.01
95% mean confidence interval for instructions %-change: -1.94% -1.21%
Instructions are helped.

total cycles in shared programs: 257153833 -> 257148932 (<.01%)
cycles in affected programs: 585341 -> 580440 (-0.84%)
helped: 167
HURT: 100
helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16
helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88%
HURT stats (abs)   min: 1 max: 200 x̄: 25.95 x̃: 16
HURT stats (rel)   min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65%
95% mean confidence interval for cycles value: -33.25 -3.46
95% mean confidence interval for cycles %-change: -1.47% -0.54%
Cycles are helped.

total loops in shared programs: 3416 -> 3398 (-0.53%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   2
GAINED: 0

Sandy Bridge
total instructions in shared programs: 10499306 -> 10499094 (<.01%)
instructions in affected programs: 6051 -> 5839 (-3.50%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2
helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45%
95% mean confidence interval for instructions value: -7.66 -2.20
95% mean confidence interval for instructions %-change: -5.47% -3.12%
Instructions are helped.

total cycles in shared programs: 145862568 -> 145861370 (<.01%)
cycles in affected programs: 61733 -> 60535 (-1.94%)
helped: 36
HURT: 2
helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35
helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81%
HURT stats (abs)   min: 18 max: 102 x̄: 60.00 x̃: 60
HURT stats (rel)   min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48%
95% mean confidence interval for cycles value: -41.28 -21.77
95% mean confidence interval for cycles %-change: -6.16% -3.00%
Cycles are helped.

total loops in shared programs: 1803 -> 1785 (-1.00%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.

LOST:   4
GAINED: 0

No changes on Iron Lake or GM45.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-08 15:26:26 -08:00
Ian Romanick
54e8d2268d nir: Narrow some dot product operations
On vector platforms, this helps elide some constant loads.
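
The kind of narrowing involved can be illustrated in Python: when the trailing component of one operand is a constant zero, a 4-component dot product gives the same result as a 3-component one, so the zero never needs to be loaded. The specific patterns the commit adds are not listed in this log, so the example is an assumption about the general idea; it also ignores the corner case where the other operand holds an infinity or NaN in the dropped component.

```python
def fdot(a, b):
    """Dot product of two equally sized component tuples."""
    return sum(x * y for x, y in zip(a, b))


a = (2.0, 3.0, 5.0, 7.0)
b_with_zero_w = (11.0, 13.0, 17.0, 0.0)

wide = fdot(a, b_with_zero_w)             # 4-component dot product
narrow = fdot(a[:3], b_with_zero_w[:3])   # same value from 3 components
```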

v2: Reorder the transformations.

No changes on Broadwell or Skylake.

Haswell
total instructions in shared programs: 13093793 -> 13060163 (-0.26%)
instructions in affected programs: 1277532 -> 1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.

total cycles in shared programs: 409580819 -> 409268463 (-0.08%)
cycles in affected programs: 71730652 -> 71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs)   min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel)   min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.

total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0

Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs: 11811851 -> 11780605 (-0.26%)
instructions in affected programs: 1155007 -> 1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs)   min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel)   min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.

total cycles in shared programs: 257618409 -> 257316805 (-0.12%)
cycles in affected programs: 71999580 -> 71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs)   min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel)   min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.

GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs: 7886750 -> 7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs)   min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel)   min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.

total cycles in shared programs: 178114636 -> 178095452 (-0.01%)
cycles in affected programs: 7227666 -> 7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs)   min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel)   min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
2018-03-08 15:26:26 -08:00
Timothy Arceri
f4b877631e spirv: fix autotools builds
Fixes: 68a6a3b51a "spirv: handle AMD_gcn_shader extended instructions"

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-08 10:45:56 +11:00
Daniel Schürmann
68a6a3b51a spirv: handle AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
a1a2a8dfda nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Daniel Schürmann
39437025de spirv: import AMD extensions header from glslang
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Jason Ekstrand
57bff0a546 spirv: Add support for subgroup arithmetic
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
789221dcfa nir: Add a helper for getting binop identities
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
82d493a939 nir: Add subgroup arithmetic reduction intrinsics
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
b3a5b0f3fc spirv: Add subgroup quad support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
493a165544 nir: Add quad operations and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
8256ee3fa3 spirv: Add subgroup shuffle support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
149b92ccf2 nir: Add subgroup shuffle intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0e893356fe nir/lower_subgroups: Add scalarizing for vote_eq
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
d792f3d4cd spirv: Add subgroup vote support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
44681e4795 nir: Generalize nir_intrinsic_vote_eq
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type.  This has two implications:

 1) We need to be able to handle vectors so we switch the vote_eq
    intrinsics to be vectorized intrinsics.

 2) We need to handle floats which have different behavior with respect
    to +-0, NaN, etc. than the integer variant so we need two variants.
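
The second point can be made concrete with a short Python sketch that compares lanes once by bit pattern and once by floating-point value. The function names borrow the vote_ieq/vote_feq naming, and modelling a subgroup as a plain list of lane values is an assumption of the sketch.

```python
import struct


def float_bits(value):
    """Return the 32-bit IEEE 754 bit pattern of a float as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def vote_ieq(lane_bits):
    """All lanes hold the same integer (bit pattern)."""
    return all(bits == lane_bits[0] for bits in lane_bits)


def vote_feq(lane_values):
    """All lanes compare equal as floating-point values."""
    return all(value == lane_values[0] for value in lane_values)


zeros = [0.0, -0.0]
nan = float("nan")
nans = [nan, nan]

zeros_feq = vote_feq(zeros)                           # +0 == -0 as floats
zeros_ieq = vote_ieq([float_bits(v) for v in zeros])  # sign bit differs
nans_feq = vote_feq(nans)                             # NaN never equals NaN
nans_ieq = vote_ieq([float_bits(v) for v in nans])    # identical bit patterns
```

The two variants disagree on both inputs, which is why a single integer comparison cannot serve for floats.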

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
9812fce60b spirv: Add subgroup ballot support
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
adc077797a spirv: Add initial subgroup support
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
5162a1d884 nir: Add new SPIR-V ballot intrinsics and lowering
Someone can make the lowering optional later if they want something
different for their hardware.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
752e969703 compiler: Add two new system values for subgroups
This will be required for SPIR-V subgroup support

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
34c60ea02b nir: Add new SPIR-V ballot ALU intrinsics and lowering
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
cc587ee9a7 spirv: Handle the new OpModuleProcessed instruction
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
ff9db1a4cc nir/spirv: Add support for device groups
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
0128187335 spirv: Update the SPIR-V headers and json to 1.3.1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
68af9f04a4 spirv: Rework barriers
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier.  This commit completely reworks the way we emit barriers to
make things both more precise and more correct.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Jason Ekstrand
de518f38e5 spirv: Add a vtn_constant_value helper
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 12:13:47 -08:00
Simon Hausmann
fb5825e7ce glsl: Fix memory leak with known glsl_type instances
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.

v2: remove lambda usage (Tapani)
    (+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-03-07 14:33:34 +02:00
Caio Marcelo de Oliveira Filho
c17808562e spirv: Add SpvCapabilityShaderViewportIndexLayerEXT
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tessellation shaders.

v2: Make conditional to the capability, add gl_Layer, add tessellation
    shaders. (Iago)

v3: Don't export to tessellation control shader.

v4: Add Reviewed-by tag.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-03-07 07:04:20 +01:00
Timothy Arceri
1fdb21541e Revert "nir: bump loop unroll limit to 96."
This reverts commit 2d36efdb7f.

This raised limit turns out to be harmful for more complex shaders;
it causes excessive spilling in some Bioshock Infinite shaders.

The fps for the ssao demo on radv remains unchanged when reverting
this.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-07 15:10:05 +11:00
Ian Romanick
e3ea166a2c nir: Simplify some comparisons like a+b < a
All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14514555 -> 14514547 (<.01%)
instructions in affected programs: 1972 -> 1964 (-0.41%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.39% max: 0.42% x̄: 0.41% x̃: 0.41%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.41% -0.40%
Instructions are helped.

total cycles in shared programs: 533141444 -> 533136780 (<.01%)
cycles in affected programs: 164728 -> 160064 (-2.83%)
helped: 181
HURT: 3
helped stats (abs) min: 2 max: 94 x̄: 26.17 x̃: 30
helped stats (rel) min: 0.12% max: 5.33% x̄: 3.42% x̃: 3.80%
HURT stats (abs)   min: 4 max: 54 x̄: 24.00 x̃: 14
HURT stats (rel)   min: 0.20% max: 2.39% x̄: 1.09% x̃: 0.68%
95% mean confidence interval for cycles value: -27.12 -23.58
95% mean confidence interval for cycles %-change: -3.54% -3.16%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533667 -> 10533539 (<.01%)
instructions in affected programs: 10148 -> 10020 (-1.26%)
helped: 124
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.03 x̃: 1
helped stats (rel) min: 0.39% max: 4.35% x̄: 2.20% x̃: 2.04%
95% mean confidence interval for instructions value: -1.06 -1.00
95% mean confidence interval for instructions %-change: -2.46% -1.95%
Instructions are helped.

total cycles in shared programs: 146136887 -> 146132122 (<.01%)
cycles in affected programs: 206382 -> 201617 (-2.31%)
helped: 171
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 27.87 x̃: 30
helped stats (rel) min: 0.08% max: 5.73% x̄: 2.98% x̃: 2.67%
95% mean confidence interval for cycles value: -29.19 -26.54
95% mean confidence interval for cycles %-change: -3.20% -2.76%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886515 -> 7886507 (<.01%)
instructions in affected programs: 3016 -> 3008 (-0.27%)
helped: 8
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.28% x̄: 0.27% x̃: 0.27%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.27% -0.26%
Instructions are helped.

total cycles in shared programs: 178100396 -> 178100388 (<.01%)
cycles in affected programs: 156128 -> 156120 (<.01%)
helped: 4
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -3.68 1.68
95% mean confidence interval for cycles %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857872 -> 4857868 (<.01%)
instructions in affected programs: 1544 -> 1540 (-0.26%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.25% max: 0.27% x̄: 0.26% x̃: 0.26%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.28% -0.24%
Instructions are helped.

total cycles in shared programs: 122167654 -> 122167662 (<.01%)
cycles in affected programs: 96248 -> 96256 (<.01%)
helped: 0
HURT: 4
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: 2.00 2.00
95% mean confidence interval for cycles %-change: <.01% 0.02%
Cycles are HURT.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:30 -08:00
Ian Romanick
d1ed4ffe0b nir: Use De Morgan's Law on logic compounded comparisons
The replacement of the comparison operators must happen during this
step.  If it does not, the next pass of nir_opt_algebraic will reapply
De Morgan's Law in the "opposite direction" before performing dead code
elimination.  The resulting infinite loop will eventually get OOM
killed.
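
For reference, the identity being applied can be checked exhaustively over a small range of integers in Python. This is only an illustration of De Morgan's Law combined with flipping the comparison operators, which the paragraph above says must happen in the same step; it is not the rule as written in nir_opt_algebraic, and for floating-point operands the flip is not exact when a NaN is involved.

```python
from itertools import product


def de_morgan_with_flipped_comparisons_holds(values):
    """Check not(a < b or c < d) == (a >= b and c >= d) for all combinations."""
    for a, b, c, d in product(values, repeat=4):
        original = not ((a < b) or (c < d))
        # Negation pushed inward AND folded into the comparisons at once.
        rewritten = (a >= b) and (c >= d)
        if original != rewritten:
            return False
    return True


holds = de_morgan_with_flipped_comparisons_holds(range(-2, 3))
```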

Haswell, Broadwell, and Skylake had similar results. (Broadwell shown)
total instructions in shared programs: 14808185 -> 14808036 (<.01%)
instructions in affected programs: 13758 -> 13609 (-1.08%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 3.82 x̃: 3
helped stats (rel) min: 0.44% max: 1.55% x̄: 0.98% x̃: 1.01%
95% mean confidence interval for instructions value: -4.67 -2.97
95% mean confidence interval for instructions %-change: -1.09% -0.88%
Instructions are helped.

total cycles in shared programs: 559438333 -> 559435832 (<.01%)
cycles in affected programs: 199160 -> 196659 (-1.26%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 184 x̄: 61.50 x̃: 51
helped stats (rel) min: 0.02% max: 6.94% x̄: 1.41% x̃: 1.40%
HURT stats (abs)   min: 2 max: 40 x̄: 27.33 x̃: 40
HURT stats (rel)   min: 0.05% max: 0.74% x̄: 0.51% x̃: 0.74%
95% mean confidence interval for cycles value: -71.47 -39.69
95% mean confidence interval for cycles %-change: -1.64% -0.93%
Cycles are helped.

Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11811776 -> 11811553 (<.01%)
instructions in affected programs: 15201 -> 14978 (-1.47%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 5.72 x̃: 6
helped stats (rel) min: 0.44% max: 2.53% x̄: 1.30% x̃: 1.26%
95% mean confidence interval for instructions value: -7.21 -4.23
95% mean confidence interval for instructions %-change: -1.48% -1.12%
Instructions are helped.

total cycles in shared programs: 257617270 -> 257614589 (<.01%)
cycles in affected programs: 212107 -> 209426 (-1.26%)
helped: 45
HURT: 0
helped stats (abs) min: 2 max: 180 x̄: 59.58 x̃: 54
helped stats (rel) min: 0.02% max: 6.02% x̄: 1.30% x̃: 1.32%
95% mean confidence interval for cycles value: -74.02 -45.14
95% mean confidence interval for cycles %-change: -1.59% -1.01%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886648 -> 7886515 (<.01%)
instructions in affected programs: 14106 -> 13973 (-0.94%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.59 x̃: 4
helped stats (rel) min: 0.35% max: 1.83% x̄: 0.90% x̃: 0.81%
95% mean confidence interval for instructions value: -5.65 -3.52
95% mean confidence interval for instructions %-change: -1.03% -0.76%
Instructions are helped.

total cycles in shared programs: 178100812 -> 178100396 (<.01%)
cycles in affected programs: 67970 -> 67554 (-0.61%)
helped: 29
HURT: 0
helped stats (abs) min: 2 max: 40 x̄: 14.34 x̃: 12
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.58% x̃: 0.54%
95% mean confidence interval for cycles value: -18.30 -10.39
95% mean confidence interval for cycles %-change: -0.71% -0.45%
Cycles are helped.

GM45
total instructions in shared programs: 4857939 -> 4857872 (<.01%)
instructions in affected programs: 7426 -> 7359 (-0.90%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 4.47 x̃: 4
helped stats (rel) min: 0.33% max: 1.80% x̄: 0.87% x̃: 0.77%
95% mean confidence interval for instructions value: -6.06 -2.87
95% mean confidence interval for instructions %-change: -1.06% -0.67%
Instructions are helped.

total cycles in shared programs: 122167930 -> 122167654 (<.01%)
cycles in affected programs: 43118 -> 42842 (-0.64%)
helped: 15
HURT: 0
helped stats (abs) min: 4 max: 40 x̄: 18.40 x̃: 16
helped stats (rel) min: 0.15% max: 1.69% x̄: 0.62% x̃: 0.54%
95% mean confidence interval for cycles value: -25.03 -11.77
95% mean confidence interval for cycles %-change: -0.82% -0.41%
Cycles are helped.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
52607658ff nir: Replace fmin(b2f(a), b) with a bcsel
All of the affected shaders are HDR mappers from Serious Sam 3.
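
As a rough check of the transformation named in the subject line, the
sketch below compares fmin(b2f(a), b) against a bcsel form.  The
replacement shape bcsel(a, fmin(b, 1.0), fmin(b, 0.0)) is an assumption
inferred from the subject, not quoted from the patch.  The Python
helpers are stand-ins for the NIR opcodes, and NaN behaviour is
deliberately ignored.

```python
def b2f(a):
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if a else 0.0


def fmin(x, y):
    """Floating-point minimum (NaN handling not modelled)."""
    return min(x, y)


def bcsel(cond, if_true, if_false):
    """Select between two values based on a boolean condition."""
    return if_true if cond else if_false


def original(a, b):
    return fmin(b2f(a), b)


def rewritten(a, b):
    # Assumed replacement: b2f(a) can only be 0.0 or 1.0, so pick the
    # fmin against the matching constant instead of converting first.
    return bcsel(a, fmin(b, 1.0), fmin(b, 0.0))


SAMPLES = [-2.5, 0.0, 0.25, 1.0, 3.0]


def forms_agree():
    """True if both forms match for every sampled (a, b) pair."""
    return all(original(a, b) == rewritten(a, b)
               for a in (False, True)
               for b in SAMPLES)
```

forms_agree() only covers the sampled finite values.  It is a sanity
check, not a proof.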

All Gen7+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516285 -> 14516273 (<.01%)
instructions in affected programs: 348 -> 336 (-3.45%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.08% max: 6.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -5.55% -3.06%
Instructions are helped.

total cycles in shared programs: 533163876 -> 533163808 (<.01%)
cycles in affected programs: 1144 -> 1076 (-5.94%)
helped: 4
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 5.80% max: 6.08% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for cycles value: -18.84 -15.16
95% mean confidence interval for cycles %-change: -6.20% -5.68%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10533321 -> 10533309 (<.01%)
instructions in affected programs: 372 -> 360 (-3.23%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 2.00% max: 5.88% x̄: 3.91% x̃: 3.85%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -4.96% -2.86%
Instructions are helped.

total cycles in shared programs: 146136632 -> 146136428 (<.01%)
cycles in affected programs: 11668 -> 11464 (-1.75%)
helped: 12
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 0.99% max: 3.44% x̄: 2.20% x̃: 2.29%
95% mean confidence interval for cycles value: -17.66 -16.34
95% mean confidence interval for cycles %-change: -2.82% -1.58%
Cycles are helped.

Iron Lake
total instructions in shared programs: 7886301 -> 7886277 (<.01%)
instructions in affected programs: 576 -> 552 (-4.17%)
helped: 12
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 6.06% x̄: 4.51% x̃: 4.65%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.30% -3.72%
Instructions are helped.

total cycles in shared programs: 178113176 -> 178113176 (0.00%)
cycles in affected programs: 2116 -> 2116 (0.00%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 1.14% max: 1.14% x̄: 1.14% x̃: 1.14%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.50% max: 0.65% x̄: 0.58% x̃: 0.58%
95% mean confidence interval for cycles value: -3.25 3.25
95% mean confidence interval for cycles %-change: -0.93% 0.94%
Inconclusive result (value mean confidence interval includes 0).

GM45
total instructions in shared programs: 4857756 -> 4857744 (<.01%)
instructions in affected programs: 294 -> 282 (-4.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.94% max: 5.71% x̄: 4.40% x̃: 4.55%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -5.71% -3.09%
Instructions are helped.

total cycles in shared programs: 122178730 -> 122178722 (<.01%)
cycles in affected programs: 700 -> 692 (-1.14%)
helped: 2
HURT: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00
Ian Romanick
b974dfee11 nir: Pull b2f out of bcsel
All platforms had similar results. (Skylake shown)
total instructions in shared programs: 14516592 -> 14516586 (<.01%)
instructions in affected programs: 500 -> 494 (-1.20%)
helped: 2
HURT: 0

total cycles in shared programs: 533167044 -> 533166998 (<.01%)
cycles in affected programs: 6988 -> 6942 (-0.66%)
helped: 2
HURT: 0
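
A small sketch of the identity the subject line suggests, assuming the
matched pattern is bcsel(a, b2f(b), b2f(c)) and the replacement is
b2f(bcsel(a, b, c)).  That shape is inferred from the subject, not
copied from the patch.  The helpers are Python stand-ins for the NIR
opcodes.

```python
from itertools import product


def b2f(x):
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if x else 0.0


def bcsel(cond, if_true, if_false):
    """Select between two values based on a boolean condition."""
    return if_true if cond else if_false


def original(a, b, c):
    # Two conversions, one on each arm of the select.
    return bcsel(a, b2f(b), b2f(c))


def rewritten(a, b, c):
    # Select on the booleans first, then convert once.
    return b2f(bcsel(a, b, c))


def forms_agree():
    """Exhaustively compare both forms over all 8 boolean inputs."""
    return all(original(*bits) == rewritten(*bits)
               for bits in product((False, True), repeat=3))
```

Because all three inputs are booleans, the eight cases checked by
forms_agree() are exhaustive.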

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2018-03-06 11:17:29 -08:00