Commit graph

1561 commits

Author SHA1 Message Date
Timothy Arceri
49f3439089 glsl: make uniform values helper available for use elsewhere
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
bb16cf805d glsl: cache some more image metadata
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:43 +11:00
Timothy Arceri
a3ff840d05 glsl: add support for caching atomic buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3d15d814c0 glsl: add shader cache support for buffer blocks
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
6761259958 glsl: store subroutine remap table in shader cache
V2: use new helpers to store/restore table entries.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
787535fb11 glsl: add support for caching subroutines
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
0057de58f9 glsl: add support for caching shaders with xfb qualifiers
For now this disables the shader cache when transform feedback is
enabled via the GL API as we don't currently allow for it when
generating the sha for the shader.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3bbfee3cd3 glsl: add shader cache support for samplers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
c4cff5f402 glsl: add basic support for resource list to shader cache
This initially adds support for simple uniforms and varyings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
3c45d8f464 glsl: fix uniform remap table cache when explicit locations used
V2: don't store pointers use an enum instead to flag what should be
restored. Also do the work in a helper that we will later use for
the subroutine remap table.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Carl Worth
a01973a784 glsl: Serialize three additional hash tables with program metadata
The three additional tables are AttributeBindings, FragDataBindings,
and FragDataIndexBindings.

The first table (AttributeBindings) was identified as missing by
trying to test the shader cache with a program that called
glGetAttribLocation.

Many thanks to Tapani Pälli <tapani.palli@intel.com>, as it was review
of related work that he had done previously that pointed me to the
necessity to also save and restore FragDataBindings and
FragDataIndexBindings.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
e5bb4a0b0f glsl: use correct shader source in case of cache fallback
The scenario is:

glShaderSource
glCompileShader <-- deferred due to cache hit of shader

glShaderSource <-- with new source code

glAttachShader
glLinkProgram <-- no cache hit for program

At this point we need to compile the original source when we
fallback.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
8771940682 glsl: make use of on disk shader cache
The hash key for glsl metadata is a hash of the hashes of each GLSL
source string.

This commit uses the put_key/get_key support in the cache put the SHA-1
hash of the source string for each successfully compiled shader into the
cache. This allows for early, optimistic returns from glCompileShader
(if the identical source string had been successfully compiled in the past),
in the hope that the final, linked shader will be found in the cache.

This is based on the intial patch by Carl.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Timothy Arceri
34ca0fce22 glsl: add initial implementation of shader cache
This uses disk_cache.c to write out a serialization of various
state that's required in order to successfully load and use a
binary written out by a drivers backend, this state is referred to as
"metadata" throughout the implementation.

This initial version is intended to work with all stages beside
compute.

This patch is based on the initial work done by Carl.

V2: extend the file's doxygen comment to cover some of the
design decisions.

V3:
- skip cache for fixed function shaders
- add int64 support
- fix glsl IR program parameter caching/restore and cache the
  parameter values which are used by gallium backends.
- use new link status enum

V4:
- add compute program support

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 11:18:42 +11:00
Dave Airlie
03f4982c68 nir: handle some 64-bit integer conversions
These are enough for the spir-v generator to handle UConvert
and SConvert operations, and fix the 4 tests in CTS.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:13:21 +10:00
Dave Airlie
adb9555794 nir: handle 64-bit integer types in glsl->nir type conversion.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:13:14 +10:00
Dave Airlie
14167080e2 spirv: handle SpvOpUConvert in proper place.
This was falling into the quantizetof16 path.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:59 +10:00
Dave Airlie
2d0b145902 spirv: add support for Int64 capability
This just adds the support at the spirv->nir level for the Int64
cap.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:13 +10:00
Dave Airlie
48ebdbecc5 spirv/nir: add support for int64
This adds the spirv->nir conversion for int64 types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:11:05 +10:00
Dave Airlie
7593f2ac1b nir/types: add C accessors for 64-bit integer types.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-16 14:10:45 +10:00
Bas Nieuwenhuizen
501a4c0d73 spirv: Add support for SpvCapabilityStorageImageReadWithoutFormat.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-15 21:18:18 +01:00
Kenneth Graunke
a3e4fa5495 glsl: Handle packed_type == ivec4[] in lower_packed_varyings().
For GS input arrays, we may turn a packed_type of ivec4 into an
array of ivec4s.  We still want flat qualification.

Found by inspection.  Not known to help anything.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2017-02-14 14:47:40 -08:00
Alex Smith
94d48b7f9f spirv: Add support for SpvCapabilityStorageImageWriteWithoutFormat
Allow that capability if the driver indicates that it is supported, and
flag whether images are read-only/write-only in the nir_variable (based
on the NonReadable and NonWritable decorations), which drivers may need
to implement this.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-14 08:16:52 -08:00
Iago Toral Quiroga
5c6eaa1421 nir/spirv: do not require a format with images that are not sampled
As soon as we support shaderStorageImageWriteWithoutFormat we can see
write-only images (sampled == 2) that don't have a format specified.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-02-14 08:16:52 -08:00
Anuj Phogat
5e2909e732 mesa: Add EXT_frag_depth bits and enable it on all drivers
Passes the newly added piglit test for this extension on i965.

V2: Fix comments by Ilia.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-02-13 16:08:40 -08:00
Kenneth Graunke
57dc6d80a0 glsl: Drop resize-to-MaxPatchVertices hack.
TCS and TES inputs without an array size are implicitly sized to
gl_MaxPatchVertices.  But TCS outputs are apparently not:

   "If no size is specified, it will be taken from the output patch size
    (gl_VerticesOut) declared in the shader."

Fixes dEQP-GLES31.functional.program_interface_query.program_output.
array_size.separable_tess_ctrl.var.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:25 -08:00
Kenneth Graunke
e99df398f1 glsl: Update a comment about link errors for TCS && !TES.
OpenGL ES actually has spec text to prohibit this.  It's just OpenGL
that's confusing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-02-12 21:09:21 -08:00
Jose Maria Casanova Crespo
5bc222ebaf glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":

"If an array is declared as the last member of a shader storage block
 and the size is not specified at compile-time, it is sized at run-time.
 In all other cases, arrays are sized only at compile-time."

In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.

With this patch Mesa reports a compilation error as glslang does with
the following shader:

buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}

Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-10 23:14:12 -08:00
Matt Turner
d7a0486a9e glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...
Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed
an OpenGL 3.3 compatibility profile context (with various unimplemented
features and bugs), but still refused to compile shaders with

   #version 330 compatibility

This patch simply adds a small bit of plumbing to let that through.

Of course the same caveats apply: compatibility profile is still not
supported (and will not be supported), so there are no guarantees that
anything will work.

Tested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-02-09 15:14:43 +00:00
Samuel Iglesias Gonsálvez
824e1bb078 nir: add opcode to perform int64 to bool conversions
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Timothy Arceri
0bf21519b7 glsl: add param to force shader recompile
This will be used to skip checking the cache and force a recompile.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 12:22:56 +11:00
Timothy Arceri
a3fd8bb8c5 st/mesa/i965: create link status enum
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.

V2:
 - don't return the new enum directly to the application when queried,
   instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
   corruptions when using cache.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-09 12:22:56 +11:00
Jason Ekstrand
1de3cd8a34 spirv: Add more asserts in vtn_vector_construct
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99465
2017-02-07 08:08:06 -08:00
Marc Di Luzio
21efe2528c glsl: correct compute shader checks for memoryBarrier functions
As per the spec -
"The functions memoryBarrierShared() and groupMemoryBarrier() are
available only in compute shaders; the other functions are available
in all shader types."

Conform to this by adding another delegate to check for compute
shader support instead of only whether the current stage is compute

This allows some fragment shaders in Dirt Rally to compile

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-06 21:12:33 -08:00
Lionel Landwerlin
875b15eec4 spirv: add SPV_KHR_shader_draw_parameters support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 15:08:33 +00:00
Lionel Landwerlin
bd46040162 compiler: add missing enums for debug
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2017-02-01 15:08:30 +00:00
Francisco Jerez
11e9ebbf15 nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:33:33 -08:00
Francisco Jerez
013d40d1ce glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:33:33 -08:00
Francisco Jerez
7215375c44 nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
See "glsl: Rewrite atan2 implementation to fix accuracy and handling
of zero/infinity." for the rationale, but note that the instruction
count benefit discussed there is somewhat less important for the SPIRV
implementation, because the current code already emitted no control
flow instructions -- Still this saves us one hardware instruction per
scalar component on Intel SKL hardware.

Fixes the following Vulkan CTS tests on Intel hardware:

    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3
    dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2
    dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4

Note that most of the test-cases above expect IEEE-compliant handling
of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so
except for the last two the test-cases above weren't expected to pass
yet.  The reason they do is that the i965 back-end implementation of
the NIR fmin and fmax instructions is not quite GLSL-compliant (it
complies with IEEE 754 recommendations though), because fmin/fmax of a
NaN and a non-NaN argument currently always return the non-NaN
argument, which causes atan() to flush NaN to one and return the
expected value.  The front-end should probably not be relying on this
behavior for correctness though because other back-ends are likely to
behave differently -- A follow-up patch will handle the atan2(±∞, ±∞)
corner cases explicitly.

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-31 10:33:27 -08:00
Francisco Jerez
e9ffd12827 glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
This addresses several issues of the current atan2 implementation:

 - Negative zero (and negative denorms which end up getting flushed to
   zero) isn't handled correctly by the current implementation.  The
   reason is that it does 'y >= 0' and 'x < 0' comparisons to decide
   on which side of the branch cut the argument is, which causes us to
   return incorrect results (off by up to 2π) for very small negative
   values.

 - There is a serious precision problem for x values of large enough
   magnitude introduced by the floating point division operation being
   implemented as a mul+rcp sequence.  This can lead to the quotient
   getting flushed to zero in some cases introducing an error of over
   8e6 ULP in the result -- Or in the most catastrophic case will
   cause us to return NaN instead of the correct value ±π/2 for y=±∞
   and x very large.  We can fix this easily by scaling down both
   arguments when the absolute value of the denominator goes above
   certain threshold.  The error of this atan2 implementation remains
   below 25 ULP in most of its domain except for a neighborhood of y=0
   where it reaches a maximum error of about 180 ULP.

 - It emits a bunch of instructions including no less than three
   if-else branches per scalar component that don't seem to get
   optimized out later on.  This implementation uses about 13% less
   instructions on Intel SKL hardware and doesn't emit any control
   flow instructions.

v2: Fix up argument scaling to take into account the range and
    precision of exotic FP24 hardware.  Flip coordinate system for
    arguments along the vertical line as if they were on the left
    half-plane in order to avoid division by zero which may give
    unspecified results on non-GLSL 4.1-capable hardware.  Sprinkle in
    some more comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2017-01-31 10:32:45 -08:00
Francisco Jerez
7ec3af3f8f glsl/ir_builder: Add rcp builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:32:43 -08:00
Francisco Jerez
6643a97de3 glsl: Fix constant evaluation of the rcp op.
Will avoid a regression in a future commit that introduces some
additional rcp operations.  According to the GLSL 4.10 specification:

"Dividing by 0 results in the appropriately signed IEEE Inf."

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
2017-01-31 10:32:43 -08:00
Bartosz Tomczyk
fc27181f9e glsl: fix heap-buffer-overflow
The `end+1` skips the ']', whereas the `strlen+1` includes the final
'\0' in the move to terminate the string.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-01-31 15:58:52 +01:00
Carl Worth
b8cb1a05cd glsl: add new uniform fields to be used to restore state from cache
Signed-off-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:30 +11:00
Carl Worth
0f60c6616e glsl: Switch to disable-by-default for the GLSL shader cache
The shader cache is expected to be developed incrementally over a
fairly long series of commits. For that period of instability, we
require users to opt into the shader cache by setting:

	MESA_GLSL_CACHE_ENABLE=1

In the future, when the shader cache is complete, we can revert this
commit so that the cache will be on by default.

The user can always disable the cache with
MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this
commit, (nor will it be affected by the future revert).

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-01-31 09:51:30 +11:00
Emil Velikov
74a174e12f glsl: remove explicit __STDC_FORMAT_MACROS define
Correctly handled by all the build systems.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-01-27 17:56:57 +00:00
Emil Velikov
cf00cc72e9 nir: add extra const notation in compare_blocks()
MSVC warns about different const qualifiers. Add the extra const to
silence it.

nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers
nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-01-27 17:56:56 +00:00
Emil Velikov
a2dea3b654 nir: silence implicit conversion to 64bit
MSVC warns about implicit conversion as below. Annotate the literal
appropriately to silence the warning.

nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-01-27 17:56:56 +00:00
Lionel Landwerlin
bbe8705c57 spirv: handle undefined components for OpVectorShuffle
Fixes:
   dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
   dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:31:21 +00:00
Lionel Landwerlin
df7063cba3 spirv: handle OpUndef as part of the variable parsing pass
Looking at the following bit of SPIRV shader :

...
%zero        = OpConstant %i32 0
%ivec3_0     = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef  = OpUndef %ivec3
%sc_0        = OpSpecConstant %i32 0
%sc_1        = OpSpecConstant %i32 0
%sc_2        = OpSpecConstant %i32 0
...

Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-01-26 17:29:29 +00:00