Commit graph

51166 commits

Author SHA1 Message Date
José Fonseca
c14f516e58 tools/trace: Do a better job at comparing multi line strings.
For TGSI diffing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
9b7d21f8f5 tools/trace: Tool to compare json state dumps.
Copied verbatim from apitrace's scripts/jsondiff.py
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
cc4ad695ca tools/trace: Tool to dump gallium state at any draw call.
Based from the code from the good old python state tracker.

Extremely handy to diagnose regressions in state trackers.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:20 +01:00
José Fonseca
a7bccb33b9 tools/trace: Defer blob hex-decoding.
To speed up parsing.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
José Fonseca
a8f7e12d92 trace: Don't dump texture transfers.
Huge trace files with little value.
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-21 14:30:19 +01:00
Chia-I Wu
bbd2d575e6 ilo: replace a boolean by bool
bool is used internally.  This is just cosmetic.
2013-06-20 11:40:20 +08:00
Chia-I Wu
8b2cba8f97 ilo: rename cache_seqno to uploaded
It has been used as a bool since shader cache rework.
2013-06-20 11:36:54 +08:00
Roland Scheidegger
ffebefa114 util: (trivial) add has_popcnt field
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficent if there's no cpu popcnt
instruction).
2013-06-19 23:47:36 +02:00
Roland Scheidegger
5c9aee111e llvmpipe: use 64bit counter for occlusion queries
Some APIs require 64bit and at least for 64bit archs the overhead
should be minimal.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
dc5dc4fd94 llvmpipe: handle more queries
Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and
also fill out the ps_invocations and c_primitives from the
PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already
be handled). Note that ps_invocations isn't pixel exact, just 16 pixel
exact but I guess it's better than nothing.
Doesn't really seem to work correctly but there's probably bugs elsewhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
bf5096303f softpipe: handle all queries, and change for the new disjoint semantics
The driver can do render_condition but wasn't handling the occlusion
and so_overflow predicates (though the latter might not work yet due
to gs support).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger
cdf89d0b5c gallium: fix PIPE_QUERY_TIMESTAMP_DISJOINT
The semantics didn't really make sense, not really matching neither d3d9
(though the docs are all broken there) nor d3d10. So make it match d3d10
semantics, which actually gives meaning to the "disjoint" part.
Drivers are fixed up in a very primitive way, I have no idea what could
actually cause the counter to become unreliable so just always return
FALSE for the disjoint part.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:35 +02:00
José Fonseca
a0a40805dd trace: Dump pipe_rasterizer_state::clip_halfz.
Trivial.
2013-06-19 18:16:16 +01:00
Brian Paul
1e16e48f88 svga: add some comments about primitive conversion
And clean up the svga_translate_prim() function with better
variable names.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
8b3d4efed8 indices: add some comments
This is pretty complicated code with few/any comments.  Here's a first stab.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
2e8c51c98f svga: reindent svga_tgsi.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
0de01a47dd svga: whitespace, comment, formatting fixes in svga_tgsi_emit.h
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
1f57349e20 svga: move some svga/tgsi functions
Move some functions from the svga_tgsi_insn.h header into the
svga_tgsi_insn.c file since they're only used there.  Plus, add
comments and fix formatting.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:14 -06:00
Brian Paul
3abd9285be svga: formatting fixes in svga_tgsi_insn.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
9e6c29bf12 mesa: wrap comments, code to 78 columns in multisample.c
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Brian Paul
bdd5a0c12b mesa: remove unused BITSET64 macros
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 11:13:13 -06:00
Maarten Lankhorst
f1cccd6ca0 nvc0: kill assert in ppp code
It's no longer always true, and the video tilign aligment should
ensure the alignment is handled correctly regardless.
2013-06-19 13:08:51 +02:00
Chia-I Wu
cf41fae96b ilo: rework shader cache
The new code makes the shader cache manages all shaders and be able to upload
all of them to a caller-provided bo as a whole.

Previously, we uploaded only the bound shaders.  When a different set of
shaders is bound, we had to allocate a new kernel bo to upload if the current
one is busy.
2013-06-19 16:46:42 +08:00
Emil Velikov
7f7b05d6b3 nv50: avoid crash on updating RASTERIZE_ENABLE state
When doing blit using the 3D engine, the rasterizer cso may be NULL.

Ported from nvc0 commit 8aa8b0539.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-06-19 00:02:24 +02:00
Kristian Høgsberg
712269d674 wayland: Handle global_remove event as well
We need to set up a handler for the global_remove event that gets sent
out when a global gets removed.  Without the handler we end up calling
a NULL pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=65910

NOTE: This is a candidate for the stable branches.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2013-06-18 17:45:19 -04:00
Jordan Justen
adeda5afd4 gen7: fix GPU hang on WebGL texture-size test
When rendering to a texture with BaseLevel set, the miptree may be laid
out such that BaseLevel is in level 0 of the miptree (to avoid wasting
memory on unused levels between 0 and BaseLevel-1).  In that case, we
have to shift our render target's level down to the appropriate level of
the smaller miptree.

The WebGL test in combination with a meta code relating to
glGenerateMipmap also triggered a similar failure scenario.

This GPU hang regression was introduced by c754f7a8.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=65324
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-18 14:06:46 -07:00
Eric Anholt
248fddecd8 intel: Remove unused IS_POWER_OF_TWO() macro.
The is_power_of_two() inline function has been used instead.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-06-18 12:08:08 -07:00
Zack Rusin
9542131b27 Revert "draw: clear the draw buffers in draw"
This reverts commit 41966fdb3b.
While it's a lot cleaner it causes regressions because
the draw interface is always called from the draw functions
of the drivers (because the buffers need to be mapped) which
means that the stream output buffers endup being cleared on
every draw rather than on setting.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-17 21:43:10 -04:00
Roland Scheidegger
8975dc798d llvmpipe: fixes for conditional rendering
honor render_condition for clear_render_target and clear_depth_stencil.
Also add minimal support for occlusion predicate, though it can't be active
at the same time as an occlusion query yet.
While here also switchify some large if-else (actually just mutually
exclusive if-if-if...) constructs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Roland Scheidegger
793e8e3d7e gallium: add condition parameter to render_condition
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-18 18:01:24 +02:00
Chia-I Wu
443dc15cf7 ilo: construct depth/stencil command in create_surface()
Add ilo_gpe_init_zs_surface() to construct

 3DSTATE_DEPTH_BUFFER
 3DSTATE_STENCIL_BUFFER
 3DSTATE_HIER_DEPTH_BUFFER

at surface creation time.  This allows fast state emission in draw_vbo().
2013-06-18 16:23:13 +08:00
Eric Anholt
eb20215075 intel: Allow blorp CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
746b57ef0e intel: Allow blit CopyTexSubImage to nonzero destination slices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
b0e3c3b852 intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.
This gets us support for blitting to attachment types other than
textures.

v2: fix up comments from review by Kenneth.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
815dce9282 intel: Move XRGB->ARGB blit logic into intel_miptree_blit().
Now any caller (such as glCopyPixels()) can benefit from it, and it only
changes the correct subset of the destination instead of a whole teximage.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-06-17 15:43:23 -07:00
Eric Anholt
04a5e940c9 intel: Fix Y tiling support for glCopyTexSubImage's alpha override.
Apparently we don't have any piglit tests for this, because it would have
assertion failed in a debug build, or just rendered wrong in a non-debug
build if the destination wasn't covering whole tiles.

v2: Use the new macros.

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-06-17 15:43:23 -07:00
Eric Anholt
78c2fc5925 intel: Make batch macros for doing BCS_SWCTRL setup.
We're going to add more BCS_SWCTRL setup instances soon, and you have to
be careful to have the set and restore atomic with the rendering that's
done, so that our state doesn't leak out to other rendering processes.

v2: Rewrite the patch to have batch begin/advance macros so that magic
    numbers don't get sprinkled around (and so you don't mix up your
    do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in
    the next patch when first writing it)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-17 15:43:13 -07:00
Eric Anholt
b65b1c3148 mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().
Intel had brokenness here, and I'd like to continue moving Mesa toward
hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.

There's still an impedance mismatch in meta when falling back to read and
texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
(width, slice, 0) instead of (width, 0, slice).

v2: Fix offset of scanline reads from the source. (Thanks Brian!), replace
    dd.h comment with Paul's text and replace early exit with an assert.

Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-06-17 15:26:20 -07:00
Dave Airlie
9e8400f4c9 tgsi: text parser: fix parsing of array in declaration
I noticed this code didn't work as advertised while doing some passing around
of TGSI shaders and trying to reparse them, and things failing.

This seems to fix it here for at least the small test case I hacked into a
graw test.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2013-06-18 08:25:12 +10:00
Sven Joachim
0829b893a9 mesa: Fix ieee fp on Alpha
Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
Alpha machines instead of only on VMS as before.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Sven Joachim <svenjoac@gmx.de>
2013-06-17 10:02:56 -07:00
Richard Sandiford
c132c2978b st/xlib: Fix XImage stride calculation
Fixes window skew seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:15:13 -04:00
Richard Sandiford
876fefe2ff st/xlib Fix XIMage bytes-per-pixel calculation
Fixes a crash seen while running gnome on a 16-bit screen over vnc.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-17 12:14:32 -04:00
Jonathan Gray
ebd68dd029 gallium: replace bswap_32 calls with util_bswap32
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis.  Lets these files
build on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-06-17 17:22:28 +02:00
Zack Rusin
7807763dd8 draw: fix a regression in computing max elt
gl can use elts without setting indices, in which case
our eltMax was set to 0 and always invoking the overflow
condition. So by default set eltMax to maximum, it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let gl's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-17 11:06:39 -04:00
Zack Rusin
41966fdb3b draw: clear the draw buffers in draw
Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-06-17 11:06:39 -04:00
Chia-I Wu
98bc4c62a6 ilo: add pipe-based copy method to ilo_blitter
It enables accelerated resource_copy_region() when blt-based method fails.
2013-06-17 18:28:58 +08:00
Chia-I Wu
ebfd7a61c0 ilo: add BLT-based blitting methods to ilo_blitter
Port BLT code in ilo_blit.c to BLT-based blitting methods of ilo_blitter.  Add
BLT-based clears.  The latter is verifed with util_clear(), but it is not in
use yet.
2013-06-17 16:36:53 +08:00
Chia-I Wu
b4b3a5c6dc ilo: replace util_blitter by ilo_blitter
ilo_blitter is just a wrapper for util_blitter for now.  We will port BLT code
to ilo_blitter shortly.
2013-06-17 14:37:10 +08:00
Kenneth Graunke
6d7abafdc8 i965: Assume flexible hardware primitive restart exists in the future.
Primitive restart with an arbitrary cut index was first supported as of
Haswell.  It's very doubtful that they'd take that away in future
hardware, so we may as well alter the check now.
2013-06-14 22:58:18 -07:00
Chris Forbes
def84d8014 i965: Shrink Gen5 VUE map layout to be the same as Gen4.
The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.

V2: - Remove now-useless function for determining the SF URB read offset
    - Remove now-unused BRW_VARYING_SLOT_POS_DUPLICATE

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-16 01:05:41 +12:00