Commit graph

74187 commits

Author SHA1 Message Date
Tapani Pälli
f2fe607261 glsl: set matrix_stride for non matrices with atomic counter buffers
Patch sets matrix_stride to 0 for non-matrix uniforms that are in an
atomic counter buffer. Matrix stride calculation for actual matrix
uniforms is done during link_assign_uniform_locations.

From ARB_program_interface_query specification:

GL_MATRIX_STRIDE:

   "For active variables not declared as a matrix or array of matrices,
   zero is written to <params>.  For active variables not backed by a
   buffer object, -1 is written to <params>, regardless of the variable
   type."

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
2015-11-12 14:15:29 +02:00
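The quoted GL_MATRIX_STRIDE rule can be sketched as a small decision function. This is an illustration only, not Mesa's code: query_matrix_stride and its parameters are hypothetical names, and the stride for real matrices is assumed to have been computed elsewhere (the commit says this happens in link_assign_uniform_locations).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the value a GL_MATRIX_STRIDE query would report
 * under the rule quoted from ARB_program_interface_query. */
static inline int
query_matrix_stride(bool buffer_backed, bool is_matrix, int computed_stride)
{
   if (!buffer_backed)
      return -1;            /* not backed by a buffer, whatever the type */
   if (!is_matrix)
      return 0;             /* e.g. an atomic_uint in a counter buffer */
   assert(computed_stride > 0);
   return computed_stride;  /* assumed to be filled in by the linker */
}
```

The buffer check comes first because the spec text says -1 applies "regardless of the variable type".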
Tapani Pälli
7e6dac1186 mesa: validate precision of varyings during ValidateProgramPipeline
Fixes the following failing ES3.1 CTS tests:

   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingFloat
   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingInt
   ES31-CTS.sepshaderobjs.InterfacePrecisionMatchingUInt

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-12 09:50:14 +02:00
Tapani Pälli
5bd122cad9 glsl: do not lose precision information when packing varyings
This information will be used by cross stage validation of varyings
for pipeline objects.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-12 09:50:14 +02:00
Iago Toral Quiroga
f84bc57d7d glsl: Add precision information to ir_variable
We will need this later on when we implement proper support for
precision qualifiers in the drivers and also to do link time checks for
uniforms as indicated by the spec.

This patch also adds compile-time checks for variables without precision
information (currently, Mesa only checks that a default precision is set
for floats in fragment shaders).

As indicated by Ian, the addition of the precision information to
ir_variable has been done using a bitfield and pahole to identify an
available hole so that memory requirements for ir_variable stay the
same.

v2 (Ian):
  - Avoid if-ladders by defining arrays of supported sampler names and
    indexing into them with type->sampler_array + 2 * type->sampler_shadow
  - Make the code that selects the precision qualifier to use a utility
    function
  - Fix a typo

v3 (Tapani):
  - rebased
  - squashed in "Precision qualifiers are not allowed on structs"
  - fixed select_gles_precision for sampler arrays
  - fixed precision_qualifier_allowed for arrays of structs

v4 (Tapani):
  - add atomic_uint handling
  - do not allow precision qualifier on images
  (issues reported by Marta)

v5 (Tapani):
  - support precision qualifier on image types

v6 (Tapani):
  - set precision qualifier on interface block members

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-11-12 09:50:13 +02:00
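The v2 note above about replacing if-ladders with an indexed name table can be sketched as follows. The struct and function names are hypothetical stand-ins, and only the 2D sampler family is shown; the real patch may organise its tables differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two glsl_type flags named in the commit. */
struct sampler_flags {
   bool sampler_array;
   bool sampler_shadow;
};

static inline const char *
sampler_2d_name(const struct sampler_flags *type)
{
   /* Entry order must match: index = sampler_array + 2 * sampler_shadow. */
   static const char *const names[4] = {
      "sampler2D",            /* array = 0, shadow = 0 */
      "sampler2DArray",       /* array = 1, shadow = 0 */
      "sampler2DShadow",      /* array = 0, shadow = 1 */
      "sampler2DArrayShadow", /* array = 1, shadow = 1 */
   };

   assert(type != NULL);
   return names[type->sampler_array + 2 * type->sampler_shadow];
}
```

Since both flags are 0 or 1, the index always falls in 0..3 and no branch is needed.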
Iago Toral Quiroga
9a00e1a69d glsl: Move the definition of precision_qualifier_allowed
We will need this to build later patches

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-11-12 09:50:13 +02:00
Iago Toral Quiroga
e6629d814f glsl: Add user-defined default precision qualifiers to the symbol table
Notice that the spec requires a default precision to be set for every type
used by a shader that can take a precision qualifier and has no predefined
precision. At the moment, however, Mesa only checks this for floats in the
fragment shader. This is probably because the GLSL ES 1.0 spec mentions
this case specifically, but GLSL ES 3.0 clarifies that the same applies to
other types:

"The fragment language has no default precision qualifier for floating point
 types. Hence for float, floating point vector and matrix variable
 declarations, either the declaration must include a precision qualifier or
 the default float precision must have been previously declared. Similarly,
 there is no default precision qualifier for the following sampler types in
 either the vertex or fragment language:

 sampler3D;
 samplerCubeShadow;
 sampler2DShadow;
 sampler2DArray;
 sampler2DArrayShadow;
 isampler2D;
 isampler3D;
 isamplerCube;
 isampler2DArray;
 usampler2D;
 usampler3D;
 usamplerCube;
 usampler2DArray;"

We will fix this in a later patch.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-11-12 09:50:13 +02:00
Iago Toral Quiroga
e3082fb273 glsl: Add default precision qualifiers to the symbol table
The GLSL ES spec specifies default precision qualifiers for certain types,
so populate the symbol table with these.

Notice that the desktop GLSL spec also indicates defaults for some types
but this is not really useful since precision qualifiers are completely
ignored in desktop GLSL.

v2: simplify and add samplerExternalOES, specified by
    OES_EGL_image_external (Tapani)

v3: add atomic_uint (reported missing by Marta)

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-11-12 09:50:13 +02:00
Iago Toral Quiroga
d6a6167354 glsl: Add API to put default precision qualifiers in the symbol table
These have scoping rules that match the ones defined for other things such
as variables, so we want them in the symbol table.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-11-12 09:50:13 +02:00
Samuel Iglesias Gonsálvez
d4fdb84f80 i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE
FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message.

This patch adjusts the number of registers written by the opcode
following what the PRM spec says about the number of registers written
by the SIMD8 and SIMD16's writeback messages for sampler messages.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-12 08:39:14 +01:00
Ben Widawsky
55314c5be4 i965/skl/gt4: Fix URB programming restriction.
The comment in the code details the restriction. Thanks to Ken for having a very
helpful conversation with me, and spotting the blurb in the link I sent him :P.

There are still stability problems for me on GT4, but this definitely helps with
some of the failures.

v2: Comment fixes

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-11 18:13:19 -08:00
Ilia Mirkin
c4182bb9b0 nv50,nvc0: add ARB_clear_texture support
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-11-11 19:20:41 -05:00
Ilia Mirkin
ae39b0fda8 st/mesa: implement ARB_clear_texture
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-11 19:20:41 -05:00
Ilia Mirkin
3695b253f9 gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototype
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-11 19:20:41 -05:00
Timothy Arceri
725fcdfbb1 glsl: add helper to check for enhanced layouts support
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-12 10:18:14 +11:00
Timothy Arceri
82e4f22d1e mesa: add ARB_enhanced_layouts
Reviewed-by: Emil Velikov <emil.velikov@collabora.co.uk>
2015-11-12 10:18:08 +11:00
Dave Airlie
df8af7d751 r600: initialised PGM_RESOURCES_2 for ES/GS
This fixes the corruption on rendering that we are seeing in
certain geometry shaders.

Fixes:  https://bugs.freedesktop.org/show_bug.cgi?id=91780
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested / Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: "10.6" "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-12 09:03:13 +10:00
Kenneth Graunke
918bda23dd i965: Split nir_emit_intrinsic by stage with a general fallback.
Many intrinsics only apply to a particular stage (such as discard).
In other cases, we may want to interpret them differently based on
the stage (such as load_primitive_id or load_input).

The current method isn't that pretty - we handle all intrinsics in
one giant function.  Sometimes we assert on stage, sometimes we forget.
Different behaviors are handled via if-ladders based on stage.

This commit introduces new nir_emit_<stage>_intrinsic() functions,
and makes nir_emit_instr() call those.  In turn, those fall back to
the generic nir_emit_intrinsic() function for cases they don't want
to handle specially.

This makes it clear which intrinsics only exist in one stage, and makes
it easy to handle inputs/outputs differently for various stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-11-11 11:57:37 -08:00
Ilia Mirkin
912babba7b mesa/copyimage: allow width/height to not be multiples of block
For compressed textures, the image size is not necessarily a multiple of
the block size (e.g. the last mip levels). Section 18.3.2 (Copying
Between Images) of the OpenGL 4.5 Core Profile spec says:

    An INVALID_VALUE error is generated if the dimensions of either
    subregion exceeds the boundaries of the corresponding image
    object, or if the image format is compressed and the dimensions of
    the subregion fail to meet the alignment constraints of the
    format.

and Section 8.7 (Compressed Texture Images) says:

    An INVALID_OPERATION error is generated if any of the following
    conditions occurs:

      * width is not a multiple of four, and width + xoffset is not
        equal to the value of TEXTURE_WIDTH.
      * height is not a multiple of four, and height + yoffset is not
        equal to the value of TEXTURE_HEIGHT.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92860
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: mesa-stable@lists.freedesktop.org
2015-11-11 14:37:55 -05:00
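The two spec excerpts quoted in the commit above can be combined into a per-axis check. This is a sketch under assumptions, not the code from the patch: compressed_extent_ok is a hypothetical name, the block size is passed in rather than fixed at four, and any alignment rule for the offset itself is left out because the quoted text does not cover it.

```c
#include <assert.h>
#include <stdbool.h>

/* One axis (width or height) of a copy region in a compressed image.
 * Hypothetical helper: a size that is not a block multiple is accepted
 * only when the region ends exactly at the image edge. */
static inline bool
compressed_extent_ok(int offset, int size, int block, int image_size)
{
   assert(block > 0);

   /* Region must stay inside the image (Section 18.3.2 excerpt). */
   if (offset < 0 || size < 0 || offset + size > image_size)
      return false;

   /* Block multiple, or reaches the edge (Section 8.7 excerpt). */
   return size % block == 0 || offset + size == image_size;
}
```

For a 10-texel-wide mip level with 4-texel blocks, a 2-texel copy starting at offset 8 passes because it ends at the edge, while the same copy starting at offset 0 does not.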
Jason Ekstrand
80890eb0d3 i965/brw_reg: Add a brw_VxH_indirect helper
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-11 10:52:30 -08:00
Brian Paul
68993f77cd mesa: remove old comments in arrayobj.c
2015-11-11 09:38:22 -07:00
Brian Paul
9870a5c6c9 st/wgl: clarify code in stw_framebuffer_from_hwnd_locked()
Just a minor code change to make it obvious that NULL is returned when
we don't find the given HWND.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-11 09:38:22 -07:00
Brian Paul
004ed6f4a9 st/wgl: improve some function comments
In particular, explain when stw_framebuffer objects are
locked/unlocked/etc.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-11 09:38:22 -07:00
Brian Paul
b93cb6c1dc st/wgl: whitespace/formatting fixes
2015-11-11 09:38:22 -07:00
Brian Paul
eb812921ac st/wgl: fix locking issue in stw_st_framebuffer_present_locked()
When stw_st_framebuffer_present_locked() is called, the
stw_framebuffer's mutex will already be locked.  Normally, the
stw_framebuffer_present_locked() function calls
stw_framebuffer_release() to unlock the mutex when it's done.  But if
for some reason the 'resource' pointer in
stw_st_framebuffer_present_locked() is null, we'd return without
unlocking the stw_framebuffer.  This fixes that to avoid potential
deadlocks.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2015-11-11 09:38:22 -07:00
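The pattern behind the locking fix above is that a function entered with a lock held must release it on every return path, including early exits. The sketch below uses a toy boolean lock instead of a real mutex so it stays deterministic; every name in it is hypothetical and none of it is the st/wgl code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a framebuffer guarded by a mutex. */
struct toy_framebuffer {
   bool locked;
   int presents;
};

static inline void
toy_lock(struct toy_framebuffer *fb)
{
   assert(!fb->locked);
   fb->locked = true;
}

static inline void
toy_release(struct toy_framebuffer *fb)
{
   assert(fb->locked);
   fb->locked = false;
}

/* Called with the lock held; releases it on both paths. */
static inline bool
toy_present_locked(struct toy_framebuffer *fb, const void *resource)
{
   if (resource == NULL) {
      toy_release(fb);   /* the early return that used to skip the unlock */
      return false;
   }
   fb->presents++;
   toy_release(fb);
   return true;
}

/* Demo helper: lock, present, then report whether the lock leaked. */
static inline bool
toy_lock_leaks(const void *resource)
{
   struct toy_framebuffer fb = { false, 0 };
   toy_lock(&fb);
   toy_present_locked(&fb, resource);
   return fb.locked;
}
```

Removing the toy_release() call in the NULL branch makes toy_lock_leaks(NULL) return true, which corresponds to the deadlock risk the commit describes.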
Kenneth Graunke
e42a29531a i965: Print force_writemask_all in dump_instructions().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-11 08:35:15 -08:00
Kenneth Graunke
ecb5e0a986 i965: Combine BRW_NEW_*_BINDING_TABLE dirty bits.
A while back, we moved to directly emitting the Gen7+ state when
constructing the binding tables.  These flags are only used on
Gen4-6, which emit all the binding table pointers at once.

We gain nothing by having separate flags, so combine them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-11-11 08:33:58 -08:00
Kenneth Graunke
a2987ff57f i965: Map GL_PATCHES to 3DPRIM_PATCHLIST_n.
Inspired by a patch by Fabian Bieler.

Fabian defined a _3DPRIM_PATCHLIST_0 macro (which isn't actually a valid
topology type); I instead chose to make a macro that takes an argument.
He also took the number of patch vertices from _mesa_prim (which was set
to ctx->TessCtrlProgram.patch_vertices) - I chose to use it directly to
avoid the need for the VBO patch.

v2: Change macro to 0x20 + (n - 1) instead of 0x1F + n to better match
    the documentation (suggested by Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-11-11 08:33:48 -08:00
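The v2 note in the commit above describes a macro that maps a patch vertex count n to a topology value as 0x20 + (n - 1). A minimal sketch follows, with a hypothetical macro name; the checked wrapper only enforces n >= 1, since the commit message does not state the upper limit.

```c
#include <assert.h>

/* Hypothetical name; the formula is the one given in the v2 note. */
#define PRIM_PATCHLIST(n) (0x20 + ((n) - 1))

/* Checked variant: there is no patch list with zero vertices, which is
 * why a "PATCHLIST_0" value is not a valid topology type. */
static inline unsigned
prim_patchlist_checked(unsigned n)
{
   assert(n >= 1);
   return PRIM_PATCHLIST(n);
}
```

The value is identical to 0x1F + n; the 0x20 form simply makes the value for a one-vertex patch visible as the base.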
Emil Velikov
cbb7d90e57 docs: add news item and link release notes for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-11 11:18:32 +00:00
Emil Velikov
6435d8ac5a docs: add sha256 checksums for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 66c949d0a1)
2015-11-11 11:16:43 +00:00
Emil Velikov
07948b03fb docs: add release notes for 11.0.5
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit ee57c22141)
2015-11-11 11:16:42 +00:00
Glenn Kennard
3f45d29fe4 r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 09:06:25 +10:00
Dave Airlie
b3e793f2db Revert "r600g: Pass conservative depth parameters to hw"
This reverts commit a1fc78911e.

I pushed the wrong patch.
2015-11-11 09:05:50 +10:00
Glenn Kennard
c878d61124 r600g: Implement ARB_texture_view
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 08:36:08 +10:00
Glenn Kennard
a1fc78911e r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-11 08:32:35 +10:00
Eduardo Lima Mitev
de51676b41 i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
When both fadd and fmul instructions have at least one operand that is a
constant and it is only used once, the total number of instructions can
be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd), because
the constants will be propagated as immediate operands of fmul and fadd.

This patch detects these situations and prevents fusing fmul+fadd into ffma.

Shader-db results on i965 Haswell:

total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
instructions in affected programs:     1124094 -> 1114154 (-0.88%)
total loops in shared programs:        1979 -> 1979 (0.00%)
helped:                                7612
HURT:                                  843
GAINED:                                4
LOST:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
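The instruction accounting in the commit above can be restated as a toy cost model. This is only a restatement of the arithmetic in the message with hypothetical function names; it is not the pass itself and ignores everything else the real optimization has to check.

```c
#include <assert.h>
#include <stdbool.h>

/* Fused form: one ffma plus one load_const per constant operand. */
static inline int
fused_instruction_count(int num_constants)
{
   assert(num_constants >= 0);
   return 1 + num_constants;
}

/* Unfused form: fmul + fadd, with constants folded in as immediates. */
static inline int
unfused_instruction_count(void)
{
   return 2;
}

/* Bypass fusion when both fmul and fadd have a single-use constant. */
static inline bool
should_fuse(bool fmul_has_single_use_const, bool fadd_has_single_use_const)
{
   return !(fmul_has_single_use_const && fadd_has_single_use_const);
}
```

With two constants the fused form costs 3 instructions against 2 for the unfused form, which matches the "3 to 2" reduction quoted in the message.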
Eduardo Lima Mitev
fb3b5669ce util: Add list_is_singular() helper function
Returns whether the list has exactly one element.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-10 21:13:35 +01:00
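A helper like list_is_singular() is cheap on a circular doubly linked list: the head's successor must be a real node whose own successor is the head again. The sketch below uses a minimal list type written for this example; it is an assumption about the shape of the helper, not a copy of Mesa's util list code.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, written for this example. */
struct list_head {
   struct list_head *prev, *next;
};

static inline void
list_inithead(struct list_head *head)
{
   head->prev = head;
   head->next = head;
}

static inline void
list_addtail(struct list_head *item, struct list_head *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* True when the list holds exactly one element. */
static inline bool
list_is_singular(const struct list_head *head)
{
   return head->next != head && head->next->next == head;
}

/* Demo helper: build a list of n nodes (n <= 8) and test it. */
static inline bool
singular_with_n_nodes(int n)
{
   struct list_head head, nodes[8];

   assert(n >= 0 && n <= 8);
   list_inithead(&head);
   for (int i = 0; i < n; i++)
      list_addtail(&nodes[i], &head);
   return list_is_singular(&head);
}
```

The check touches at most two pointers, so it avoids walking the list to count elements.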
Eduardo Lima Mitev
94ff35204d nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver
Because the next patch will add an optimization that is specific to i965,
we want to move this lowering pass to that driver altogether.

This is safe because i965 is the only consumer.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
Kristian Høgsberg Kristensen
96b22fb080 glsl: Use array deref for access to vector components
We've assumed that we could lower per-component vector access from

  vec[i] = scalar

to

  vec = ir_triop_vector_insert(vec, scalar, i)

but with SSBOs (and compute shader SLM and tessellation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread reads the entire vector, changes one
component, then writes out the entire vector. This is racy.

Instead of generating an ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a separate lowering pass that
lowers the ir_dereference_array to ir_binop_vector_extract for rvalues
and to ir_triop_vector_insert for lvalues.

The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
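The race described in the commit above can be shown without real threads by interleaving two writers by hand. The sketch is plain C with hypothetical names, not GLSL IR: with_x and with_y play the role of a whole-vector insert, and racy_interleaving replays the order "A reads, B reads, A writes, B writes".

```c
#include <assert.h>  /* for callers that want to assert on the results */

struct vec4 {
   float x, y, z, w;
};

/* Whole-vector "insert": returns a copy with one component replaced. */
static inline struct vec4
with_x(struct vec4 v, float value)
{
   v.x = value;
   return v;
}

static inline struct vec4
with_y(struct vec4 v, float value)
{
   v.y = value;
   return v;
}

/* Both writers snapshot the shared vector before either writes back. */
static inline struct vec4
racy_interleaving(struct vec4 shared)
{
   struct vec4 a = shared;    /* thread A reads the entire vector */
   struct vec4 b = shared;    /* thread B reads the entire vector */
   shared = with_x(a, 1.0f);  /* A writes back with x changed */
   shared = with_y(b, 2.0f);  /* B writes back its stale copy: x is lost */
   return shared;
}

/* Per-component stores touch only their own component. */
static inline struct vec4
per_component_stores(struct vec4 shared)
{
   shared.x = 1.0f;
   shared.y = 2.0f;
   return shared;
}
```

Starting from a zero vector, the interleaved version ends with x still 0 while the per-component version keeps both writes, which is why the access has to be lowered to I/O before it is turned into a full-vector write.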
Kristian Høgsberg Kristensen
60dd5287ff glsl: Lower UBO and SSBO access in glsl linker
All GLSL IR consumers run this lowering pass so we can move it to the
linker. This moves the pass up quite a bit, but that's the point: it
needs to run before we throw away information about per-component vector
access.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen
f0e95c2500 glsl: Drop exec_list argument to lower_ubo_reference
We always pass in shader->ir and we already pass in the shader, so just
drop the exec_list. Most passes either take just an exec_list or a
shader, so this seems more consistent.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Connor Abbott
213f86416f nir/glsl: switch to using the builder
v2: use nir_builder_cf_insert (Ken)

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:43 -05:00
Connor Abbott
fbbfb7c025 nir/glsl: make emit() take nir_ssa_def * sources
Again, this matches what the builder will have to do.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:35 -05:00
Connor Abbott
a60e990dd2 nir/glsl: convert nir_visitor::result to a nir_ssa_def *
Its only user now returns a nir_ssa_def *, and we'll need this since the
builder returns a nir_ssa_def *.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:54 -05:00
Connor Abbott
30fe8eaa8e nir/glsl: make evaluate_rvalue() return a nir_ssa_def *
A long time ago, before NIR was even merged to master, glsl_to_nir used
registers and these sources were actually register sources. But nowadays
everything in glsl_to_nir is an SSA value, so stop pretending that by
evaluating an rvalue we can get an arbitrary nir_src. Most importantly,
we need this since the builder takes nir_ssa_def * sources directly.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:14 -05:00
Jose Fonseca
6f42162329 st/mesa: Destroy buffer object's mutex.
Ideally we should have a _mesa_cleanup_buffer_object function in
src/mesa/bufferobj.c so that the destruction logic resided in a single
place.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-11-10 11:04:28 +00:00
Kenneth Graunke
db54673b54 nir: Store PatchInputsRead and PatchOutputsWritten in nir_shader_info.
These tessellation shader related fields need plumbing through NIR.

v2: Use uint32_t instead of uint64_t to match the source type of
    GLbitfield (caught by Iago Toral).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-10 01:03:43 -08:00
Eric Anholt
437d7b6119 vc4: Avoid loading undefined (newly-allocated) FBO contents.
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear.  For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
2015-11-09 19:17:36 -08:00
Eric Anholt
5980389bbf vc4: Return NULL when we can't make our shadow for a sampler view.
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt
eb8fb0064d vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt
84608e07e7 vc4: Add CL dumping for GL_ARRAY_PRIMITIVE.
2015-11-09 19:17:36 -08:00