Commit graph

57990 commits

Author SHA1 Message Date
Paul Berry
336351e971 glsl/ast: Check that geometry shader interface block inputs are arrays.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-13 20:02:54 -07:00
Paul Berry
3b837e637e i965/gen7+: Fix build error introduced by renaming upload_3dstate_so_decl_list.
Commit 9f9ccf707c renamed
upload_3dstate_so_decl_list to gen7_upload_3dstate_so_decl_list but
forgot to update the caller.
2013-08-13 19:36:27 -07:00
Jon Severinsson
9298f537a7 radeon/llvm: Add missing "%s" format string to fprintf.
This fixes a compilation warning with -Wformat-security.

CC: "9.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-08-13 19:18:14 -07:00
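The class of warning addressed by 9298f537a7 above can be sketched in a few lines of C. This is a minimal illustration, not the radeon/llvm code: the helper name format_diagnostic is hypothetical, and snprintf is used instead of fprintf only so the result is easy to inspect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the Mesa tree: copies a message
 * that may contain '%' characters into a caller-provided buffer. */
int format_diagnostic(char *out, size_t out_size, const char *msg)
{
    assert(out != NULL && msg != NULL);

    /* Unsafe form, the kind -Wformat-security warns about:
     *     snprintf(out, out_size, msg);
     * Any '%' inside msg would be parsed as a conversion specifier.
     * Passing msg as an argument to "%s" copies it verbatim instead. */
    return snprintf(out, out_size, "%s", msg);
}
```

With the explicit "%s", a message such as "100%d done %s" is emitted literally rather than being interpreted as a format.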
Chad Versace
11b8f8e7e4 i965: Move arrays brw_multisample_positions* to new header
Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:04:20 -07:00
Chad Versace
7eecda29c8 i965: Refactor names of sample_positions_8/4x arrays
Place each array in the brw namespace by renaming it:
    sample_positions_4x -> brw_multisample_positions_4x
    sample_positions_8x -> brw_multisample_positions_8x

This prepares for moving the arrays to a header shared by gen6 and gen8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:03:59 -07:00
Kenneth Graunke
9f9ccf707c i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)
We will reuse this for Broadwell.

v2: Prefix function name with 'gen7'. (chadv)

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:03:57 -07:00
Kenneth Graunke
f4e5c235de i965: Mark a few brw_draw_upload.c functions as non-static
We will reuse these for Broadwell.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-08-13 18:02:13 -07:00
Ian Romanick
1b35e33af4 glsl: Require function return type arrays be explicitly sized
Fixes piglit array-function-return-unsized.vert.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
42624b1c81 glsl: Move and refine test for unsized arrays in GLSL ES
GLSL ES does not allow unsized arrays, and GLSL ES 1.00 does not allow
array initializers.  However, GLSL ES 3.00 allows array initializers,
and the initializer can explicitly size the array.  The specification
even includes some examples of this:

    float x[] = float[2] (1.0, 2.0);     // declares an array of size 2
    float y[] = float[] (1.0, 2.0, 3.0); // declares an array of size 3

    float a[5];
    float b[] = a;

Move the unsized array check to after the initializer has been
processed.  If the array is still unsized, generate the error.  This
should have no effect in GLSL ES 1.00 because, as previously mentioned,
array initializers are not allowed.

Fixes piglit "glsl-es-3.00 compiler array-sized-by-initializer.vert".

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "9.1 9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
d5aee174b8 glx: Generate GLXBadDrawable when drawable is zero
Fixes piglit glx-query-drawable-GLXBadDrawable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
ef83bd2b95 mesa: Use _mesa_detach_renderbuffer when deleting a texture
The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.

The remaining changes make the texture delete path look more similar to
the renderbuffer delete path.  This includes adding relevant spec
quotations to justify the behavior.

Fixes piglit fbo-incomplete "delete texture of bound FBO" test.

v2: Move 'fb->Attachment[i].Texture == att' check from previous patch to
this patch... where it was intended to be in the first place.  Noticed
by Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
438cc6bc49 mesa: Make detach_renderbuffer available outside fbobject.c
Also add a return value indicating whether any work was done.

This will be used by the next patch.

v2: Move 'fb->Attachment[i].Texture == att' check to the next
patch... where it was intended to be in the first place.  Noticed by
Chad.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Ian Romanick
341fb93c16 meta: Don't call _mesa_Ortho with width or height of 0
Fixes failures in oglconform fbo mipmap.manual.color,
mipmap.manual.colorAndDepth, mipmap.automatic, and
mipmap.manualIterateTexTargets subtests.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-13 17:53:33 -07:00
Vadim Girlin
17bb96b03d r600g/sb: use MULADD workaround on R7xx for MULADD_IEEE
Looks like the same issue that was seen with MULADD in trans slot on
R7xx also affects MULADD_IEEE (maybe all OP3 instructions, and MULADD is
just the most frequently used?). So the workaround is to not allow affected
instructions to be placed into the trans slot.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-14 01:03:18 +04:00
Roland Scheidegger
6991f86945 gallivm: implement new float comparison instructions returning integer masks
FSEQ/FSGE/FSLT/FSNE work just the same as SEQ/SGE/SLT/SNE except skip the
select.
And just for consistency use the same appropriate ordered/unordered comparisons
for the old opcodes as well.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
0930082ffd tgsi: implement new float comparison instructions returning integer masks
Also while here add a bunch of other forgotten (integer) instructions to
tgsi_util_get_inst_usage_mask() (which isn't used for much except optimizing
away unused input components), though it may still be incomplete.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
Roland Scheidegger
e7a5bf7a34 gallium: add new float comparison instructions returning integer masks
Newer graphics languages don't want messy float mask results but instead true
"boolean" mask results for float comparisons. Otherwise we just need to convert
the floats back to integers. We need to keep the old opcodes, however, both
because the legacy APIs (gl and d3d9) need them and because older hw can't
really deal with integers. These new FSEQ/FSGE/FSLT/FSNE opcodes are part of
the integer API and hence must be supported if a driver claims to support
glsl 1.30 (or PIPE_SHADER_CAP_INTEGERS).

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-13 19:09:17 +02:00
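The difference between the old float-result comparisons and the integer-mask comparisons introduced in e7a5bf7a34 above can be illustrated with a scalar C sketch. The function names are hypothetical; the sketch assumes the usual convention that the legacy opcodes produce 1.0f or 0.0f and the new ones produce all ones or all zeros.

```c
#include <assert.h>
#include <stdint.h>

/* Legacy-style comparison: the result is itself a float, 1.0f or 0.0f. */
float seq_float(float a, float b)
{
    return (a == b) ? 1.0f : 0.0f;
}

/* Integer-mask comparison: the result is all ones or all zeros. */
uint32_t fseq_mask(float a, float b)
{
    return (a == b) ? UINT32_MAX : 0u;
}

/* A mask can drive a bitwise select directly, with no float-to-int step. */
uint32_t select_bits(uint32_t mask, uint32_t if_true, uint32_t if_false)
{
    assert(mask == 0u || mask == UINT32_MAX);
    return (if_true & mask) | (if_false & ~mask);
}
```

A consumer of seq_float would first have to convert 1.0f/0.0f back into a mask before doing the select, which is the extra step the commit message refers to.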
Chia-I Wu
3b6cee1634 ilo: enable dumping of WM PCB
It was disabled because it wasn't supported.
2013-08-13 16:28:24 +08:00
Chia-I Wu
0f8a86682f ilo: no binding table change when constants are pushed
When constants can be pushed, and nothing else requires new SURFACE_STATEs,
there is no need to emit BINDING_TABLE_STATE.
2013-08-13 16:26:03 +08:00
Chia-I Wu
c6e1e0157b ilo: support push constant model in shaders
Source constants from URB constant data when the constant data can fit in the
PCB.
2013-08-13 16:04:35 +08:00
Chia-I Wu
5e30ffbda6 ilo: support copying constant buffer 0 to PCB
Add ILO_KERNEL_PCB_CBUF0_SIZE so that a kernel can specify how many bytes of
constant buffer 0 need to be copied to PCB.
2013-08-13 15:52:41 +08:00
Chia-I Wu
5df62dce34 ilo: make constant buffer 0 upload optional
Add ILO_KERNEL_SKIP_CBUF0_UPLOAD so that we can skip constant buffer 0 upload
when the kernel does not need it.
2013-08-13 15:52:37 +08:00
Chia-I Wu
8b5b5fe394 Revert "ilo: initialize constant buffer SURFACE_STATE early"
This reverts commit a9b800aa81.  With push
constant support, the constructed SURFACE_STATE is unused and wasted.  The
change only slows things down.
2013-08-13 15:24:58 +08:00
Armin K
f423eba46e gbm: Link to libwayland-drm if Wayland EGL platform is enabled
We were relying on libEGL to pull in libwayland-client symbols, but
commit 2c2e64edab cleaned up the symbol leak.

https://bugs.freedesktop.org/show_bug.cgi?id=67962
2013-08-12 15:16:22 -07:00
Roland Scheidegger
cd2f26090a gallivm: fix exec_mask interaction with geometry shader after end of main
Because we must maintain an exec_mask even if there's currently nothing
on the mask stack, we can still have an exec_mask at the end of the program.
Effectively, this mask should be set back to default when returning from main.
Without relying on the END/RET opcode (I think it's valid to have neither) it
is actually difficult to do this, as there doesn't seem to be any reasonable
place to do it, so instead let's just say the exec_mask is invalid outside main
(which it really is effectively).
The problem is that the geometry shader called end_primitive outside the shader
(in the epilogue), and as a result used a bogus mask, leading to bugs if we
had to set the (somewhat misnamed) ret_in_main bit anywhere. So just avoid
the mask combining function when called from outside the shader.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
dfa7b72563 draw: simplify prim mask construction
The code was quite weird: the second comparison was in fact a complete no-op,
and we can also do the comparison with the vector directly instead of a scalar,
which should not only be faster but also makes it far more obvious what that
mask is actually going to look like.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
7147094ff2 gallivm: simplify geometry shader mask handling a bit
Instead of reducing masks to 0/1 simply use the mask directly as -1.
Also use some signed comparison instead of unsigned (as far as I understand
these values have to be (very) small and signed means llvm doesn't have to
apply additional logic to do the unsigned comparisons the cpu can't do).
Saves a couple of instructions in some test geometry shader here.

v2: that was a bit too much optimization, don't skip combining the masks...

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-12 23:33:00 +02:00
Roland Scheidegger
84fce45321 draw: (trivial) dump tgsi for geometry shaders with GALLIVM_DEBUG_TGSI
And dump the variant key too (same as vs does).
Just so I can stop wondering why I see the tgsi dump for fs and vs but not
gs...
2013-08-12 23:33:00 +02:00
Roland Scheidegger
8c5283dc17 gallivm: (trivial) fix typo in argument declaration of lp_build_size_query_soa
Was meant to match the name used elsewhere, spotted by Anthony.
2013-08-12 23:33:00 +02:00
Kenneth Graunke
4d95efd146 i965/fs: Add dump_instruction() support for ARF destinations.
CMP instructions use BRW_ARF_NULL as a destination.  Prior to this
patch, dump_instruction() decoded the destination as "???".

Now it decodes BRW_ARF_NULL as "(null)" and other ARFs numerically.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:06 -07:00
Kenneth Graunke
ee7bfab068 i965/fs: Remove extraneous newline in dump_instruction() for CMP.
This resulted in printouts like:

   246: cmp.cmod.f0.0
    ???, vgrf152, 0.000000f, (null),

With this patch, CMP is properly printed on one line.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:04 -07:00
Kenneth Graunke
80e1c2f35f i965/fs: Optimize IF/MOV/ELSE/MOV/ENDIF to SEL when possible.
Many GLSL shaders contain code of the form:

   x = condition ? foo : bar

The compiler emits an ir_if tree for this, since each subexpression
might be a complex tree that could have side-effects and short-circuit
logic operations.

However, the common case is to simply pick one of two constants or
variables' values---which is exactly what SEL is for.  Replacing IF/ELSE
with SEL also simplifies the control flow graph, making optimization
passes which work on basic blocks more effective.

The shader-db statistics:

   total instructions in shared programs: 1655247 -> 1503234 (-9.18%)
   instructions in affected programs:     949188 -> 797175 (-16.02%)

   2,970 shaders were helped, none hurt.  Gained 181 SIMD16 programs.

This helps Valve's Source Engine games (max -41.33%), The Cave
(max -33.33%), Serious Sam 3 (max -18.64%), Yo Frankie! (max -30.19%),
Zen Bound (max -22.22%), GStreamer (max -6.12%), and GLBenchmark 2.7
(max -1.94%).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:13:01 -07:00
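The shape of the rewrite described in 80e1c2f35f above can be sketched on a toy instruction list. Everything here is hypothetical (the toy_op enum, the toy_inst record, and toy_peephole_sel are not the i965 IR or the actual pass), and the condition flag is left implicit, so the sketch only shows the pattern match and replacement, not the extra safety checks a real compiler pass would need.

```c
#include <assert.h>
#include <stddef.h>

/* Toy opcodes and instruction record; hypothetical, not the i965 IR. */
enum toy_op { TOY_IF, TOY_ELSE, TOY_ENDIF, TOY_MOV, TOY_SEL, TOY_NOP };

struct toy_inst {
    enum toy_op op;
    int dst;   /* destination register, -1 if unused */
    int src0;  /* first source, -1 if unused */
    int src1;  /* second source, -1 if unused */
};

/* Rewrites each IF / MOV d,a / ELSE / MOV d,b / ENDIF run in place as
 * SEL d,a,b followed by four NOPs.  Returns the number of rewrites. */
int toy_peephole_sel(struct toy_inst *insts, size_t count)
{
    int rewrites = 0;
    assert(insts != NULL || count == 0);

    for (size_t i = 0; i + 5 <= count; i++) {
        struct toy_inst *p = &insts[i];

        if (p[0].op != TOY_IF || p[1].op != TOY_MOV ||
            p[2].op != TOY_ELSE || p[3].op != TOY_MOV ||
            p[4].op != TOY_ENDIF)
            continue;
        if (p[1].dst != p[3].dst)
            continue;  /* both arms must write the same destination */

        int dst = p[1].dst, then_src = p[1].src0, else_src = p[3].src0;
        p[0] = (struct toy_inst){ TOY_SEL, dst, then_src, else_src };
        for (int k = 1; k < 5; k++)
            p[k] = (struct toy_inst){ TOY_NOP, -1, -1, -1 };
        rewrites++;
    }
    return rewrites;
}
```

After the rewrite the five-instruction run contains no control flow, which is the property the commit message credits for making block-based optimization passes more effective.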
Kenneth Graunke
2c32c3985c i965/fs: Consider predicated SEL instructions as whole variable writes.
The instruction

   (+f0.0) SEL dst, src0, src1

will write either src0 or src1 to dst, depending on the predicate.
Unlike most predicated instructions, it always writes to dst.

fs_inst::is_partial_write() is supposed to return false only if the whole
register is guaranteed to be written.  The !inst->predicated check makes
sense for most instructions, which might not write the whole register,
but SEL is a special case.

This caused live interval analysis to ignore the destination of
predicated SEL instructions when computing "def" information.

Requires the previous commit to avoid regressions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:59 -07:00
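The special case described in 2c32c3985c above can be sketched as a small predicate. The toy_fs_inst record and toy_is_partial_write are hypothetical stand-ins, and the real function may take further conditions into account that are omitted here. Reading the name at face value, a true result means the destination might not be fully written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced instruction record; not the i965 fs_inst class. */
struct toy_fs_inst {
    bool predicated;  /* executes under a flag-register predicate */
    bool is_sel;      /* opcode is SEL */
};

/* True when the write may leave part of the destination untouched.
 * A predicated SEL always writes dst (src0 or src1), so it is excluded. */
bool toy_is_partial_write(const struct toy_fs_inst *inst)
{
    assert(inst != NULL);
    return inst->predicated && !inst->is_sel;
}
```

With this predicate, a predicated SEL counts as a full definition of its destination, so an analysis that skips partial writes no longer ignores it.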
Kenneth Graunke
d21f542aa1 i965/fs: Explicitly disallow CSE on predicated instructions.
The existing inst->is_partial_write() already disallows predicated
instructions, so this has no functional change.  However, it's worth
doing explicitly since the CSE pass does not consider the flag register.
This means it could blindly factor out operations that use the same
sources, but which have different condition codes set.

This prevents a regression in the next commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:57 -07:00
Kenneth Graunke
53d8cff63b i965/fs: Log a performance warning if skipping 16-wide due to pulls.
Usually, the driver creates both 8-wide and 16-wide variants of every
fragment shader.  When 16-wide compilation fails, it logs a performance
warning explaining why only an 8-wide program exists.

However, when there are pull parameters, the driver won't even bother
trying the 16-wide compile (since it would fail).  In this case, it
failed to emit a performance warning, leaving no explanation for the
missing 16-wide program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-12 13:12:47 -07:00
Chia-I Wu
a9b800aa81 ilo: initialize constant buffer SURFACE_STATE early
Fix ilo_gpe_init_view_surface_for_buffer to allow buffer to be NULL, and add
ilo_gpe_set_view_surface_bo to set it later.  This allows us to set up
SURFACE_STATE early for constant buffers backed by user buffers.
2013-08-12 11:49:51 +08:00
Chia-I Wu
b2f79a3823 ilo: 3DSTATE_INDEX_BUFFER may be wrongly skipped
In finalize_index_buffer(), when the current index buffer was destroyed due to
u_upload_data(), it may happen that the new index buffer is at the same
address as the old one.  Comparing the pointers to the two buffers could fail
to work, and 3DSTATE_INDEX_BUFFER would be incorrectly skipped.

Holding a reference to the current index buffer before calling u_upload_data()
should fix the problem.
2013-08-10 13:01:41 +08:00
Chris Forbes
637e6a0aa8 i965: add missing BRW_NEW_INTERPOLATION_MAP to state dump
Makes this flag appear in the output for INTEL_DEBUG=state

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-10 20:29:12 +12:00
Chris Forbes
e114b13dae i965: Add a new debug mode for the VUE map
INTEL_DEBUG=vue now emits a listing of each slot in the VUE map,
and the corresponding interpolation mode.

V2: Fix whitespace issues.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-08-10 20:28:45 +12:00
Ian Romanick
5894898148 glsl: Don't allow const on out or inout function parameters
Fixes piglit tests const-inout-parameter.frag and
const-out-parameter.frag.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-08-09 13:51:18 -07:00
Roland Scheidegger
894d4903e7 gallivm: set non-existing values really to zero in size queries for d3d10
My previous attempt at doing so double-failed miserably (minification of
zero still gives one, and even if it would not the value was never written
anyway).
While here also rename the confusingly named int_vec bld as we have int vecs
of different sizes, and rename need_nr_mips (as this also changes out-of-bounds
behavior) to is_sviewinfo too.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:19 +02:00
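The remark in 894d4903e7 above that "minification of zero still gives one" follows from the usual mip-size rule max(size >> level, 1). A minimal sketch, assuming that rule; toy_minify and toy_query_width are hypothetical names, not the gallivm code, which builds LLVM IR rather than computing values directly.

```c
#include <assert.h>
#include <stdint.h>

/* Usual mip-level size rule: halve per level, but never drop below 1. */
uint32_t toy_minify(uint32_t size, uint32_t level)
{
    assert(level < 32u);  /* shifting by >= 32 bits is undefined in C */
    uint32_t shifted = size >> level;
    return shifted > 1u ? shifted : 1u;
}

/* d3d10-style size query sketch: levels that do not exist report 0.
 * The zero has to be written explicitly, since minifying cannot yield it. */
uint32_t toy_query_width(uint32_t base_width, uint32_t num_levels,
                         uint32_t level)
{
    if (level >= num_levels)
        return 0u;
    return toy_minify(base_width, level);
}
```

Feeding a zero base size through toy_minify still yields 1, which is why the out-of-range case is handled by a separate explicit check in this sketch.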
Roland Scheidegger
b0f74250e1 gallivm: use texture target from shader instead of static state for size query
d3d10 has no notion of distinct array resources at either the resource or the
sampler view level. However, the shader dcl of resources certainly has, and
d3d10 expects resinfo to return the values according to that - in particular
a resource might have been a 1d texture with some array layers, then the
sampler view might have only used 1 layer so it can be accessed both as 1d
or 1d array texture (I think - the former definitely works). resinfo of a
resource declared as array needs to return the number of array layers but
non-array resource needs to return 0 (and not 1). Hence fix this by passing
the target from the shader decl to emit_size_query and use that (in case of
OpenGL the target will come from the instruction itself).
Could probably do the same for actual sampling, though it may not matter there
(as the bogus components will essentially get clamped away), possibly could
wreak havoc though if it REALLY doesn't match (which is of course an error
but still).

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:18 +02:00
Roland Scheidegger
38ad404f76 gallivm: honor d3d10's wishes of out-of-bounds behavior for texture size query
Specifically, must return 0 for non-existent mip levels (and non-existent
textures which is an unsolved problem) for everything but total mip count.

Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-08-09 20:49:18 +02:00
Paul Berry
417dc8081b glsl: Enable ARB_fragment_coord_conventions functionality in GLSL 1.50.
GLSL 1.50 incorporates the functionality of the
ARB_fragment_coord_conventions extension, so we need to make this
functionality available even if the extension isn't enabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-08-09 10:35:06 -07:00
Paul Berry
13fedf2883 main: Fix deprecation of glLineWidth()
From section E.1 (Profiles and Deprecated Features of OpenGL 3.0)
of the OpenGL 3.0 spec:

    "LineWidth is not deprecated, but values greater than 1.0
    will generate an INVALID VALUE error"

From context it is clear that values greater than 1.0 should only
generate an INVALID VALUE error in a forward-compatible context.

The code was correctly quoting this spec text, but it was disallowing
all line widths in forward-compatible contexts, instead of just widths
greater than 1.0.

This patch introduces the correct check, so that setting a line width
of 1.0 or less is permitted.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-08-09 10:34:05 -07:00
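The corrected rule from 13fedf2883 above can be written as a small validation function. This is a sketch under stated assumptions: the toy_gl_error enum and toy_check_line_width are hypothetical names, not the Mesa implementation, and the non-positive-width check reflects the general glLineWidth rule rather than anything in this commit.

```c
#include <stdbool.h>

/* Hypothetical error codes standing in for GL_NO_ERROR / GL_INVALID_VALUE. */
enum toy_gl_error { TOY_NO_ERROR, TOY_INVALID_VALUE };

enum toy_gl_error toy_check_line_width(float width, bool forward_compatible)
{
    /* Non-positive widths are rejected in every context. */
    if (width <= 0.0f)
        return TOY_INVALID_VALUE;

    /* Only wide lines are rejected in a forward-compatible context;
     * the bug was rejecting every width there. */
    if (forward_compatible && width > 1.0f)
        return TOY_INVALID_VALUE;

    return TOY_NO_ERROR;
}
```

Under this check a width of exactly 1.0 is accepted in a forward-compatible context, while 2.0 is accepted only outside one.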
Roland Scheidegger
836098f6b2 util: (trivial) fix asm input/output list for fxsave
Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak.
Hopefully this time it's finally right?
2013-08-09 17:30:13 +02:00
Alex Deucher
c88783047e r600g: disable GPUVM by default
Cayman and Trinity systems still seem to suffer from
stability problems with GPUVM.  This also fixes compute
on these ASICs.  It can still be enabled for testing
by setting env var RADEON_VA=true.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=65958

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" <mesa-stable@lists.freedesktop.org>
CC: "9.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
2013-08-09 10:51:25 -04:00
Zack Rusin
e8d8974f80 softpipe: fix the regressions
softpipe has a really weird handling of the draw attrs, so let's
just not inject outputs in its data.
Trivial.
2013-08-08 20:54:50 -04:00
Zack Rusin
662a4d4a12 draw: rewrite primitive assembler
We can't be injecting the primitive ids in the pipeline because
by that time the primitives have already been decomposed. To
properly number the primitives we need to handle the adjacency
primitives by hand. This patch moves the prim id injection into
the original primitive assembler and completely removes the
useless pipeline stage.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-08 20:54:25 -04:00
Zack Rusin
1d425c4c6d draw: reset the vertex id when injecting new primitive id
Without resetting the vertex id, with primitives where the same
vertex is used with different primitives (e.g. tri/line strips)
our vbuf module won't re-emit those vertices with the changed
primitive id. So let's reset the vertex id whenever injecting
a new primitive id to make sure that the vertex data is correctly
emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-08 20:54:03 -04:00