59368 commits

Author SHA1 Message Date
Emil Velikov
2b7ffde8bd st/xorg: add sanity checks after malloc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Emil Velikov
5c398e243c st/xorg: remove unnecessary headers
v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Rob Clark
2bc1fc2fb6 freedreno: emulate unsupported primitive types
Use u_primconvert to convert unsupported primitives into supported
primitive plus index buffer.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Rob Clark
b881917088 gallium/auxiliary/indices: add u_primconvert
A convenient front end to indices generate/translate code, for emulating
primitives which are not supported natively by the driver.

This handles saving/restoring index buffer state, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
Rob Clark
28f3f8d413 gallium/auxiliary/indices: add start param
Add 'start' parameter to generator/translator.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
Rob Clark
5127436a4a freedreno: update generated headers
pull in some fixes to draw-initiator/prim-type.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Eric Anholt
774b787d6b i965/fs: Drop our dead push constants before overflowing to pull constants.
The idea of the original order was that you'd dead code eliminate accesses
to push constants.  But I've never seen a case of that (nor has
shader-db), while we frequently see sparse accesses of large constant
arrays that would overflow into pull constants.

Cuts pull constant use on csgo, serious sam, planeshift, and the cave:

total instructions in shared programs: 1695103 -> 1688795 (-0.37%)
instructions in affected programs:     92024 -> 85716 (-6.85%)
GAINED:                                339
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-29 13:43:01 -07:00
Alexander von Gluck IV
9a9fb94ca9 haiku-softpipe: Minor cleanup and color space fixes
* Use more consistent data sources
* Fix improper color space assignments
* Remove unnecessary comments and code
* Drop unnecessary round_up function (this was leftover
  from moving winsys code out of renderer)

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:43 -05:00
Alexander von Gluck IV
439dd0e20a winsys: Correct Haiku winsys display target code
* Instead of assuming the displaytarget is the same
  stride / colorspace as the destination, let's
  actually check the source bitmap.
* Fixes random stride issues in rendering

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:40 -05:00
Francisco Jerez
b8f89fc5cb clover: Use context device list for error checking in clGetProgramBuildInfo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70891.

Reported-by: Bruno Jiménez <brunojimen@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
e515dcbf96 i965: Simplify the shader time code by using atomic counter helpers.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
d58bd75263 i965: Add brw_reg constructors taking a dynamically determined vector width.
The MRF variant is going to be used extensively by the atomic counter
intrinsics to assemble untyped atomic and surface read messages
easily.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5e621cb9fe i965/gen7: Implement code generation for untyped surface read instructions.
2013-10-29 12:40:56 -07:00
Francisco Jerez
cfaaa9bbb7 i965/gen7: Implement code generation for untyped atomic instructions.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5809512b17 i965: Implement ABO surface state emission.
The maximum number of atomic buffer objects is somewhat arbitrary, we
can change it in the future easily if it turns out it's not enough...

v2: Add comments with the relevant mesa dirty bits.  Fix usage of
    BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
v3: Update binding table layout diagrams.
v4: Resolve conflicts with the recent dynamic surface index assignment changes.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
c4e730e218 i965: Define vtbl method that initializes an untyped R/W surface.
And add Gen7 implementation.

v2: Fix off by one error in buffer size calculation.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
7a54db9ce5 glsl: Fix the function inlining pass to deal with general opaque arguments.
Almost a trivial change, it boils down to renaming a few identifiers
so their names still make sense for opaque types other than sampler.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
bbded5b5fe glsl: Add built-in functions and constants required for ARB_shader_atomic_counters.
v2: Represent atomics as GLSL intrinsics.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
9562922376 glsl: Basic support for built-in intrinsics.
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.

We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override.  As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

v2: Define local helper function to generate ir_call nodes in the
    builtin generator.
2013-10-29 12:40:55 -07:00
Francisco Jerez
cc744a0947 glsl: Add type predicate to check whether a type contains any opaque types.
And use it to forbid comparisons of opaque operands.  According to the
GL 4.2 specification:

> Except for array indexing, structure member selection, and
> parentheses, opaque variables are not allowed to be operands in
> expressions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
26db3b933f glsl: Add new atomic_uint built-in GLSL type.
v2: Fix GLSL version in which the type became available.  Add
    contains_atomic() convenience method.  Split off atomic counter
    comparison error checking to a separate patch that will handle all
    opaque types.  Include new ir_variable fields for atomic types.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
0bed1ab73b glsl: Add extension enables for ARB_shader_atomic_counters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
1c7dcfed7c mesa: Add support for ARB_shader_atomic_counters.
This patch implements the common support code required for the
ARB_shader_atomic_counters extension.  It defines the necessary data
structures for tracking atomic counter buffer objects (from now on
"ABOs") associated with some specific context or shader program, it
implements support for binding buffers to an ABO binding point and
querying the existing atomic counters and buffers declared by GLSL
shaders.

v2: Fix extension checks.  Drop unused MAX_ATOMIC_BUFFERS constant.

Acked-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
e3fd31dc41 glapi: Add support for ARB_shader_atomic_counters.
Add XML file for the dispatch code generator, update the
dispatch_sanity test and add stub definition for the new entry point.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
db47074ac0 i965: Handle deallocation of some private ralloc contexts explicitly.
These ralloc contexts belong to a specific object and are being
deallocated manually from the class destructor.  Now that we've hooked
up destructors to ralloc there's no reason for them to be children of
any other context, and doing so might lead to double frees under
some circumstances.  The class destructor has all the responsibility
of freeing class memory resources now.
2013-10-29 12:40:55 -07:00
Francisco Jerez
d18477deea ralloc: Hook up C++ destructors to ralloc when necessary.
This patch makes sure that class destructors are called as they should
be when a C++ object allocated by ralloc is released.

Based on a previous patch by Kenneth Graunke, but it doesn't exhibit
the ~0.8% performance regression in shader compilation times because
we now use the HAS_TRIVIAL_DESTRUCTOR() macro to detect the typical
case where the indirect function call can be avoided because the
object's destructor doesn't need to do anything.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
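
The idea in the ralloc commit above (register a destructor hook only when the object's destructor actually does something) can be sketched roughly as follows. This is not ralloc's API: `Arena`, `create()`, `destructor_hooks()` and `Tracked` are hypothetical names, and the standard `std::is_trivially_destructible` trait stands in for the HAS_TRIVIAL_DESTRUCTOR() macro. Error handling is kept minimal.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical arena, for illustration only; it is not ralloc's API.
class Arena {
    struct Entry {
        void* mem;
        void (*destroy)(void*);  // nullptr when the destructor is trivial
    };
    std::vector<Entry> entries_;

public:
    Arena() = default;
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Allocate and value-initialize a T owned by this arena.
    template <typename T>
    T* create() {
        void* mem = std::malloc(sizeof(T));
        if (mem == nullptr) throw std::bad_alloc();
        T* obj = new (mem) T();

        // Only pay for the indirect call when the destructor has work to do.
        void (*destroy)(void*) = nullptr;
        if (!std::is_trivially_destructible<T>::value) {
            destroy = [](void* p) { static_cast<T*>(p)->~T(); };
        }
        entries_.push_back(Entry{mem, destroy});
        return obj;
    }

    // Number of objects that needed a destructor hook.
    std::size_t destructor_hooks() const {
        std::size_t n = 0;
        for (const Entry& e : entries_) {
            if (e.destroy != nullptr) ++n;
        }
        return n;
    }

    // Release everything in reverse order of creation.
    ~Arena() {
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it) {
            if (it->destroy != nullptr) it->destroy(it->mem);
            std::free(it->mem);
        }
    }
};

// Hypothetical type with a non-trivial destructor, used to observe the hook.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;
```

Creating an `int` and a `Tracked` in one `Arena` registers a single hook; the `int` is simply freed, which is the fast path the commit message describes.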
Francisco Jerez
98ab905af0 mesa: Define introspection macro to determine whether a type is trivially destructible.
Only implemented on GCC and Clang for now.  Other compilers use a
dummy implementation that always returns false, which should be a safe
[but slightly inefficient] assumption in all cases.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Paul Berry
be63803b0c glsl: Generalize MSVC fix for strcasecmp().
This will let us use strcasecmp() from anywhere inside Mesa without
having to worry about the fact that it doesn't exist in MSVC.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 11:10:56 -07:00
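
A common way to provide strcasecmp() on MSVC is to map it to `_stricmp()`. The sketch below shows that approach under the assumption that a single shared header carries the mapping; it is not necessarily how the commit above does it, and `equal_ignoring_case()` is a hypothetical helper added only to exercise the call.

```cpp
#include <string.h>

#ifdef _MSC_VER
// MSVC has no strcasecmp(); _stricmp() from <string.h> is its counterpart.
#define strcasecmp _stricmp
#else
#include <strings.h>  // POSIX declares strcasecmp() here
#endif

// Hypothetical helper: true when the two C strings match, ignoring case.
inline bool equal_ignoring_case(const char* a, const char* b) {
    return strcasecmp(a, b) == 0;
}
```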
Roland Scheidegger
e4195acab5 llvmpipe: fix bogus layer clamping in setup
The layer coming from GS needs to be clamped (not sure if that's actually
the correct error behavior but we need something) as the number can be higher
than the amount of layers in the fb. However, this code was using the layer
calculation from the scene, and this was actually calculated in
lp_scene_begin_rasterization() hence too late (so setup was using the value
from the _previous_ scene or just zero if it was the first scene).
Since the value is used in both rasterization and setup, move calculation up
to lp_scene_begin_binning() though it's a bit more inconvenient to calculate
there. (Theoretically could move _all_ code which was in
lp_scene_begin_rasterization() to there, because ever since we got rid of
swizzled render/depth buffers our "map" functions preparing the fb data for
render don't actually change the data in there at all, but it feels like
it would be a hack.)

v2: improve comments

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-29 17:54:03 +01:00
Matthew McClure
be0b67a143 util,llvmpipe: correctly set the minimum representable depth value
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-10-29 15:53:48 +00:00
Brian Paul
d0eaf6752d st/mesa: move out of memory check in st_draw_vbo()
Before we were only checking the st->vertex_array_out_of_memory flag
after updating array state.  But if there's two consecutive glDrawArrays
calls and the first one is skipped because of OOM, the second one should
be skipped too.

Cc: 9.2 <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2013-10-29 08:09:34 -06:00
Brian Paul
ea9fe9ebdb svga: reindent drawing code
2013-10-29 08:09:34 -06:00
Eric Anholt
415d6dc5bd i965/vec4: Reduce working set size of live variables computation.
Orbital Explorer was generating a 4000 instruction geometry shader, which
was taking 275 trips through dead code elimination and register
coalescing, each of which updated live variables to get its work done, and
invalidated those live variables afterwards.

By using bitfields instead of bools (reducing the working set size by a
factor of 8) in live variables analysis, it drops from 88% of the profile
to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul
says 3+ minutes) to 10.5 seconds.

Compare to f179f419d1 on the FS side.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 00:27:35 -07:00
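
The factor-of-8 saving described in the commit above comes from storing one bit per flag instead of one bool (typically one byte). A minimal sketch of such a packed set follows; `BitSet` and its members are hypothetical and are not the i965 data structures, and the figure of 4000 flags used below is only an example size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical packed bit set, for illustration only.
class BitSet {
    std::vector<std::uint32_t> words_;

public:
    explicit BitSet(std::size_t num_bits)
        : words_((num_bits + 31) / 32, 0u) {}

    void set(std::size_t i) {
        words_[i / 32] |= std::uint32_t(1) << (i % 32);
    }

    bool test(std::size_t i) const {
        return ((words_[i / 32] >> (i % 32)) & 1u) != 0;
    }

    // Union `other` into this set, 32 flags per operation.
    // Returns true if any bit changed, i.e. the analysis has not converged.
    bool merge(const BitSet& other) {
        assert(words_.size() == other.words_.size());
        bool changed = false;
        for (std::size_t w = 0; w < words_.size(); ++w) {
            const std::uint32_t merged = words_[w] | other.words_[w];
            if (merged != words_[w]) changed = true;
            words_[w] = merged;
        }
        return changed;
    }

    // Bytes of flag storage (excluding vector bookkeeping).
    std::size_t storage_bytes() const {
        return words_.size() * sizeof(std::uint32_t);
    }
};
```

For 4000 flags this layout needs 500 bytes, against 4000 bytes for an array of one-byte bools, and a merge touches 125 words rather than 4000 elements.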
Vadim Girlin
8bd4476010 r600g/sb: fix value::is_fixed()
This prevents unnecessary (and wrong) register allocation in the
scheduler for preloaded values in fixed registers.

Fixes interpolation-mixed.shader_test on rv770
(and probably on all other pre-evergreen chips).

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-10-29 05:49:21 +04:00
Eric Anholt
08bf52712e glsl: Drop no-op shifts involving 0.
I noticed this in a shader in Unigine Heaven that was spilling.  While it
doesn't really reduce register pressure, it shaves a few instructions
anyway (7955 -> 7882).

v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik
    Faye-Lund).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
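
The rule in the commit above, including the v2 correction, can be sketched as a small decision function. `Operand`, `ShiftResult` and `simplify_shift()` are hypothetical names and do not correspond to Mesa's IR classes; the sketch only shows which operand each pattern reduces to.

```cpp
// Hypothetical operand description; not Mesa's IR classes.
struct Operand {
    bool is_constant;
    int value;  // meaningful only when is_constant is true
};

enum class ShiftResult {
    Unchanged,     // keep the shift as written
    ValueOperand,  // replace the shift with the value being shifted
    Zero           // replace the shift with the constant 0
};

// Decide how "value << count" or "value >> count" simplifies.
inline ShiftResult simplify_shift(const Operand& value, const Operand& count) {
    // x << 0 and x >> 0 are no-ops: the result is x.
    if (count.is_constant && count.value == 0) {
        return ShiftResult::ValueOperand;
    }
    // 0 << x and 0 >> x are always 0, never x (the case fixed in v2).
    if (value.is_constant && value.value == 0) {
        return ShiftResult::Zero;
    }
    return ShiftResult::Unchanged;
}
```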
Eric Anholt
3a0fdf2ab6 glsl: Use ir_builder more in opt_algebraic.
While ir_builder is slightly less efficient, we're only increasing the
work when there's actual optimization being done, and it's way more
readable code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
Eric Anholt
27bcb5063f glsl: Move common code out of opt_algebraic's handle_expression().
Matt and I had each screwed up these common required patterns recently, in
ways that wouldn't have been noticed for a long time if not for code
review.  Just enforce it in the caller so that we don't rely on code
review catching these bugs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-10-28 14:07:31 -07:00
Carl Worth
29996e2199 Remove error when calling glGenQueries/glDeleteQueries while a query is active
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.

Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case, replacing it with the
necessary state updates to end the query (clear the bindpt pointer and call
into the driver's EndQuery hook).

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
2013-10-28 12:56:49 -07:00
Kenneth Graunke
5563dfabc8 i965: Also emit HiZ and Stencil packets when disabling depth on Gen6.
The normal drawing path does this, and it's necessary on Ivybridge,
so let's try it on Sandybridge too.  It's not explicitly documented
as necessary, but might help with hangs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:36 -07:00
Kenneth Graunke
29e5d5db51 i965: Also emit HIER_DEPTH and STENCIL packets when disabling depth.
From the documentation:
"[DevIVB] 3DSTATE_DEPTH_BUFFER must always be programmed along with the
 other Depth/Stencil state commands(i.e. 3DSTATE_CLEAR_PARAMS,
 3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)."

We normally do this, but BLORP was failing to do so in the case where it
disables depth.

Not observed to fix anything yet.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:33 -07:00
Kenneth Graunke
65b1f642ac i965: Move post-sync non-zero flush for 3DSTATE_MULTISAMPLE.
For some reason, we put the flush in the caller, rather than just before
emitting the packet.  This is more than a cosmetic problem: BLORP calls
gen6_emit_3dstate_multisample() directly, and so it missed the flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:32 -07:00
Kenneth Graunke
10a918e52c i965: Also guard 3DSTATE_DRAWING_RECTANGLE with a flush in blorp.
Non-pipelined commands need this flush.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:31 -07:00
Kenneth Graunke
3aef1fefb4 i965: Emit post-sync non-zero flush before 3DSTATE_DRAWING_RECTANGLE.
This is another non-pipelined command that needs a flush on Sandybridge.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:29 -07:00
Kenneth Graunke
436e815a25 i965: Emit post-sync non-zero flush before 3DSTATE_GS_SVB_INDEX.
From the comments above intel_emit_post_sync_nonzero_flush:
"[DevSNB-C+{W/A}] Before any depth stall flush (including those
 produced by non-pipelined state commands), software needs to first
 send a PIPE_CONTROL with no bits set except Post-Sync Operation != 0."

This suggests that every non-pipelined (0x79xx) command needs a
post-sync non-zero flush before it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:27 -07:00
Daniel Vetter
32a3f5f6d7 i965: CS writes/reads should use I915_GEM_INSTRUCTION
Otherwise the gen6 w/a in the kernel won't kick in and the write will
land nowhere.

Inspired by a patch Ken pointed me at which had the same issue (but
isn't yet merged and also for a gen7+ feature). An audit of the entire
driver didn't reveal any other case than the one in the write_reg
helper used by the gen6 queryobj code.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Xinkai Chen <yeled.nova@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
2013-10-28 11:29:15 -07:00
Anuj Phogat
f278d49c4b i965: Do not set bilinear_filter flag in case of multisample blits
Setting bilinear_filter flag in case of multisample blits with
GL_LINEAR filter causes incorrect behavior in translate_dst_to_src()
function. This broke Modern Warfare (1, 2 and 3) on SNB, IVB and HSW.

Tested on SNB and IVB, no Piglit regressions. Trace file of the game
(taken with apitrace) works fine with this patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69078
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reported-by: Armin K <krejzi@email.com>
Tested-by: Armin K <krejzi@email.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 09:33:01 -07:00
Rico Schüller
14f02cdee8 mesa: Remove trailing whitespace in texparam.c
Signed-off-by: Rico Schüller <kgbricola@web.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
2013-10-28 08:43:40 -06:00
Brian Paul
0ce3bfbd40 mesa: use void in _mesa_VDPAUFiniNV() as in the header file
2013-10-28 08:37:39 -06:00
Timothy Arceri
b59c5926cb glsl: Add check for unsized arrays to glsl types
The main purpose of this patch is to increase readability of
the array code by introducing is_unsized_array() to glsl_types.
Some redundant is_array() checks are also removed, along with a small number
of other related cleanups.

The introduction of is_unsized_array() should also make the
ARB_arrays_of_arrays code simpler and more readable when it arrives.

V2: Also replace code that checks for unsized arrays directly with the
length variable

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

v3 (Paul Berry <stereotype441@gmail.com>): clean up formatting.
Separate whitespace cleanups to their own patch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 06:06:04 -07:00
Timothy Arceri
5cd7eb9f07 glsl: whitespace cleanups.
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>

v2 (Paul Berry <stereotype441@gmail.com>): Separate from "glsl: Add
check for unsized arrays to glsl types".

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-28 06:06:04 -07:00