Commit graph

4923 commits

Author SHA1 Message Date
Eric Anholt
4db16a9480 intel: Add .aub file output support.
This will allow the driver to capture all of its execution state to a
file for later debugging.  intel_gpu_dump is limited in that it only
captures batchbuffers, and Mesa's captures, while more complete, still
capture only a portion of the state involved in execution.

This is a squash commit of a long series of hacking as we tried to get
the resulting traces to work in the internal simulator.  It contains
contributions by Yuanhan Liu and Kenneth Graunke.

v2: Drop the MI_FLUSH_ENABLE setup.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-03-09 16:34:14 -08:00
Kenneth Graunke
6e642db7f4 intel: Add support for overriding the PCI ID via an environment variable
For example:

    export INTEL_DEVID_OVERRIDE=0x162

If this variable is set, don't actually submit the batchbuffer to the
GPU, since it probably contains commands for the wrong generation of
hardware.

v2: Introduce a getter for the overridden devid, and avoid getenv per exec.

Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
2012-03-09 16:34:14 -08:00
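The v2 note in the commit above (a getter for the overridden devid, no getenv per exec) could be sketched roughly as follows. This is only an illustration: the function names `parse_devid_override`, `get_devid_override` and `should_submit_to_gpu` are hypothetical, and the 16-bit range check is an assumption, not something taken from the libdrm source.

```c
#include <stdlib.h>

/* Hypothetical helper: parse an override string such as "0x162".
 * Returns 0 when the string is missing or malformed, meaning "no override". */
unsigned int parse_devid_override(const char *text)
{
    char *end = NULL;
    unsigned long value;

    if (text == NULL || *text == '\0')
        return 0;

    value = strtoul(text, &end, 0);   /* base 0 accepts 0x-prefixed hex */
    if (*end != '\0' || value > 0xffff)
        return 0;                     /* assumed: PCI device IDs fit in 16 bits */

    return (unsigned int)value;
}

/* Hypothetical getter: reads the environment once and caches the result,
 * so the per-exec path never calls getenv() again. */
unsigned int get_devid_override(void)
{
    static int looked_up;
    static unsigned int cached;

    if (!looked_up) {
        cached = parse_devid_override(getenv("INTEL_DEVID_OVERRIDE"));
        looked_up = 1;
    }
    return cached;
}

/* Hypothetical submit decision: skip the GPU while an override is active,
 * because the batch was likely built for a different hardware generation. */
int should_submit_to_gpu(void)
{
    return get_devid_override() == 0;
}
```

Caching in a function-local static keeps the sketch short; a real buffer manager would more likely store the value in its own state at init time.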
David Herrmann
fd39e61d0e xf86drmMode.h: Add header protection
xf86drmMode.h is missing a header protection. xf86drm.h has one so just
copy it and adjust the name.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
2012-03-09 13:40:14 -05:00
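For readers unfamiliar with the term, "header protection" is the usual include guard. The sketch below pastes the same guarded text twice into one translation unit to mimic a double `#include`; the guard name `DEMO_MODE_H` and the struct are made up for the illustration and are not the names used in xf86drmMode.h.

```c
/* First "inclusion": the guard macro is undefined, so the body is compiled. */
#ifndef DEMO_MODE_H
#define DEMO_MODE_H

struct demo_mode {
    unsigned int hdisplay;
    unsigned int vdisplay;
};

#endif /* DEMO_MODE_H */

/* Second "inclusion" of the same text: the guard is now defined, so the
 * struct is skipped instead of being redefined, which would not compile. */
#ifndef DEMO_MODE_H
#define DEMO_MODE_H

struct demo_mode {
    unsigned int hdisplay;
    unsigned int vdisplay;
};

#endif /* DEMO_MODE_H */

/* Code that uses the declaration after both "inclusions". */
unsigned int demo_mode_pixels(const struct demo_mode *mode)
{
    return mode->hdisplay * mode->vdisplay;
}
```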
Alan Coopersmith
f82c778703 Make drm/drm_fourcc.h portable to non-linux platforms
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2012-03-05 19:07:02 -08:00
Matt Turner
be30d350b6 Don't require pciaccess if Intel is disabled
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
2012-03-02 14:34:17 -05:00
Eric Anholt
783db34f6d intel: Import a new batchbuffer for the gen7 test.
This one doesn't have the 3DSTATE_HIER_DEPTH_BUFFER bug that the
previous one did.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-22 12:27:34 -08:00
Eric Anholt
b395af0d2d intel: Add decode for gen7 HIER_DEPTH_BUFFER.
Note that the regression test complains here: The batch that was
captured included a bug in its packet output, which was later fixed in
Mesa.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-22 12:27:25 -08:00
Eric Anholt
e6beaf8ee4 intel: Add decode for gen7 3DSTATE_WM.
This requires pulling the gen6 3DSTATE_WM out to a function so it
doesn't override gen7's handler.

v2: Fix pasteo in interpreting ZW interpolation (thanks danvet!).

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-22 12:26:45 -08:00
Eric Anholt
259e7b6138 intel: Fix a typo in decode error message.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-22 12:25:19 -08:00
Chris Wilson
23eeb7e1e4 intel: Detect cache domain inconsistency with valgrind
Every access to either the GTT or CPU pointer is supposed to be
preceded by a set_domain ioctl so that GEM is able to manage the cache
domains correctly and for the following access to be coherent. Of
course, some people explicitly want incoherent, non-blocking access
which is going to trigger warnings by this patch but are probably better
served by explicit suppression.

v2: Also mark the pointers as inaccessible following the explicit unmap
and implicit unmap upon return to the cache.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-02-15 11:16:59 +00:00
Jerome Glisse
9b3ad51ae5 radeon: fix pitch alignment for scanout buffer
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-02-13 20:46:43 -05:00
Chris Wilson
ced219ebbd configure: Fix pkg-config test in absence of valgrind
The empty string used for the not case is replaced by the default
if-else clause and so causes the configure to fail in the absence of
valgrind, which is not quite what was intended.

Instead use the common idiom of setting a variable depending on whether
the true or false branch is taken and emit the conditional code as a
second step.

Reported-by: Tobias Jakobi <liquid.acid@gmx.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2012-02-13 00:24:14 +00:00
Chris Wilson
90b23cc24c intel: Mark up with valgrind intrinsics to reduce false positives
In particular, declare the hidden CPU mmaps to valgrind so that it knows
about those memory regions.

v2: Add an additional VG_CLEAR for the getparam

References: https://bugs.freedesktop.org/show_bug.cgi?id=35071
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
[anholt: Ideally valgrind should just learn about the ioctls, and
         removing the clear for the non-valgrindified code feels risky.]
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-02-11 11:45:39 +00:00
Michel Dänzer
2cfac57d36 radeon_cs_setup_bo: Fix accounting if caller specified write and read domains.
Only account for the write domain in that case.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=43893 .

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-02-08 10:50:55 +01:00
Jerome Glisse
230ec7d7bb configure: Bump version for 2.4.31
2012-02-06 15:22:58 -05:00
Jerome Glisse
356b87d8b3 radeon: add r600_pci_ids.h to header file
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-02-06 15:22:14 -05:00
Jerome Glisse
10c0837780 radeon: fix surface API for good before anyone starts relying on it
The mipmap level computation was wrong, we need to know the block
width, height, depth of compressed texture to properly compute this.
Change API to provide block width, height, depth instead of nblk_x,
nblk_y, nblk_z.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-02-03 14:42:47 -05:00
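A rough illustration of why the block dimensions are needed, assuming the common convention that each mip level halves the pixel size (with a floor of 1) and that blocks are counted by rounding up. The helper names are hypothetical and this is not the libdrm surface code.

```c
/* Largest of a and b. */
static unsigned int max_u(unsigned int a, unsigned int b)
{
    return a > b ? a : b;
}

/* Hypothetical helper: number of blocks covering one dimension of a mip
 * level, computed from the pixel size and the block size.
 * level is assumed to be smaller than the bit width of unsigned int. */
unsigned int blocks_at_level(unsigned int npix, unsigned int blk, unsigned int level)
{
    unsigned int level_pix = max_u(1, npix >> level);

    return (level_pix + blk - 1) / blk;   /* round up to whole blocks */
}

/* The shortcut that is only exact for 1x1 blocks: shift the level-0
 * block count instead of the pixel size. */
unsigned int blocks_at_level_naive(unsigned int nblk, unsigned int level)
{
    return max_u(1, nblk >> level);
}
```

With a 20-pixel-wide surface and 4-pixel blocks, level 0 has 5 blocks. Level 1 is 10 pixels wide and needs 3 blocks, but shifting the block count gives 2, which cannot be corrected without knowing the block width.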
Jerome Glisse
6a720cb866 radeon: surface fix macro -> micro tile fallback
We need to force 1D tiling only on old kernels; the fallback was
broken along the way.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-02-02 18:36:42 -05:00
Ville Syrjälä
76b4a69aab Using sizeof() on a function parameter with an array type does not
work. sizeof() treats such parameters as pointers.

Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
2012-02-02 14:53:43 -05:00
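The pitfall described in the commit above is standard C behaviour: a parameter declared with array syntax is adjusted to a pointer. A minimal sketch follows, with hypothetical function names; the explicit-count fix is one common option and not necessarily the one the commit used.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The array syntax is cosmetic: the parameter type is adjusted to
 * uint32_t *, so sizeof(handles) here would be the size of a pointer,
 * not 4 * sizeof(uint32_t). */
size_t bytes_seen_by_callee(uint32_t handles[4])
{
    uint32_t *as_pointer = handles;   /* same type, no conversion needed */

    return sizeof(as_pointer);
}

/* One way around it: pass the element count explicitly and size the
 * copy from the element type. */
void copy_handles(uint32_t *dst, const uint32_t *src, size_t count)
{
    memcpy(dst, src, count * sizeof(*src));
}
```

In the caller, where the array is declared, `sizeof` still reports the full array size, which is why the bug only shows up inside the callee.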
Ville Syrjälä
a14c3dd0f9 This function was missing.
Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
2012-02-02 14:53:41 -05:00
Ville Syrjälä
df497e9281 drmModeFreeResources() always leaked some memory.
drmModeGetPlaneResources() and drmModeGetPlane() leaked in one error
path.

Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
2012-02-02 14:53:39 -05:00
Jerome Glisse
c51f7f0e46 radeon: add surface allocator helper v10
The surface allocator is able to build a complete miptree when allocating
a surface for the r600/r700/evergreen/northern islands GPU families. It also
computes bo size and alignment for render buffers, depth buffers and
scanout buffers.

v2 fix r6xx/r7xx 2D tiling width align computation
v3 add tile split support and fix 1d texture alignment
v4 rework to more properly support compressed format, split surface pixel
   size and surface element size in separate fields
v5 support texture array (still issue on r6xx)
v6 split surface value computation and mipmap tree building, rework eg
   and newer computation
v7 add a check for tile split and 2d tiled
v8 initialize mode value before testing it in all cases, reenable
   2D macro tile mode on r6xx for cubemap and array. Fix cubemap
   to force array size to the number of faces.
v9 fix handling of stencil buffer on evergreen
v10 on evergreen the depth buffer needs to have enough room for a stencil
    buffer just after the depth one

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2012-02-01 17:11:29 -05:00
Eugeni Dodonov
151cdcfe68 intel: query for LLC support
This adds support for querying the kernel about the LLC support in the
hardware.

In case the ioctl fails, we assume that it is present on GEN6 and GEN7.

v2: fix the return code checking

Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
2012-02-01 15:54:02 -02:00
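The fallback rule stated in the commit above can be written as a small pure function. This is a sketch under assumptions: `resolve_has_llc` and its parameters are hypothetical, and treating any non-zero query return as failure is an assumption about the query call, not a quote from the source.

```c
#include <stdbool.h>

/* Hypothetical helper. query_ret is the return value of the kernel query
 * (0 on success), reported is the value the kernel wrote back, and gen is
 * the hardware generation. */
bool resolve_has_llc(int query_ret, int reported, int gen)
{
    if (query_ret != 0)
        return gen == 6 || gen == 7;   /* query failed: assume by generation */

    return reported > 0;               /* query worked: trust the kernel */
}
```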
Paul Berry
82c6938d23 intel: Fix build of Intel DRM on x86 systems
Commit efd6e81e inadvertently broke the build by looking for "i?86" or
"x86_64" in $host_os.  The correct variable to check is $host_cpu.

This was preventing libdrm_intel.so from being built.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-01-31 14:46:16 -08:00
Jeremy Huddleston
efd6e81e2b Don't build Intel DRM if $CHOST is not i?86-* or x86_64-*
This fixes a failure in 'make check' found by the tinderbox when trying to
build this code on Linux/ppc.  This code is only designed to run on
Intel platforms, so don't even bother building it if we're not in that set.

Found-by: Tinderbox
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2012-01-30 15:20:04 -08:00
Chad Versace
592ac67626 intel: Fix bufmgr_gem->gen for gen > 4
If the pci_device's actual gen was > 4, then we stupidly set
bufmgr_gem->gen = 6. Luckily this caused no bugs, and this fix shouldn't
change any behavior, because all checks against the gen currently have one
of the forms below:
    gen == 2
    gen == 3
    gen >= 4

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2012-01-30 13:03:35 -08:00
Eric Anholt
b643b0713a intel: Add minimal decode for remaining gen7 packets in use.
This just gets packet name and length in place, with the remainder
unfinished.  I've long since finished the work that got me started
fixing up the decode.
2012-01-27 13:21:20 -08:00
Eric Anholt
54b12a085f intel: Add decode for gen7 constant buffer packets.
2012-01-27 13:21:20 -08:00
Eric Anholt
938df6be48 intel: Add decode for gen7 state pointers.
Since CC_STATE_POINTERS for gen6 and 7 are quite different but use the
same opcode, move gen6 out to a helper function too, so we can use a
helper function for gen7.
2012-01-27 13:21:20 -08:00
Eric Anholt
6a0b25e66b intel: Add support for parsing gen7 URB packets.
2012-01-27 13:21:20 -08:00
Eric Anholt
ba8ce2da04 intel: Make most of the logic for 965 3d packet length checks table-driven.
This puts the error message in a consistent location relative to the
packet, and while I'm here I made the error message a bit more
informative.

Now, most static length packets need to just declare their length in
the table and not worry.
2012-01-27 13:21:20 -08:00
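A table-driven length check of the kind the commit above describes might look like the sketch below. The struct layout, the function names, and especially the opcodes and lengths are placeholders invented for the illustration; they do not correspond to real 965 packets or to the table in intel_decode.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry; opcodes and lengths below are placeholders. */
struct packet_info {
    uint32_t opcode;
    unsigned int min_len;   /* smallest valid length in dwords */
    unsigned int max_len;   /* largest valid length in dwords */
    const char *name;
};

static const struct packet_info packet_table[] = {
    { 0x1001, 3, 3, "EXAMPLE_FIXED_LENGTH" },     /* static length: min == max */
    { 0x1002, 2, 8, "EXAMPLE_VARIABLE_LENGTH" },
};

/* Find the table entry for an opcode, or NULL if it is unknown. */
const struct packet_info *lookup_packet(uint32_t opcode)
{
    size_t i;

    for (i = 0; i < sizeof(packet_table) / sizeof(packet_table[0]); i++) {
        if (packet_table[i].opcode == opcode)
            return &packet_table[i];
    }
    return NULL;
}

/* Single place for the length check: returns 1 if len is within the
 * range declared in the table, 0 if it is out of range or the opcode
 * is unknown. */
int packet_length_ok(uint32_t opcode, unsigned int len)
{
    const struct packet_info *info = lookup_packet(opcode);

    if (info == NULL)
        return 0;
    return len >= info->min_len && len <= info->max_len;
}
```

Because the check lives in one function, any error message it emits appears at the same position relative to the packet for every opcode, and a fixed-length packet needs nothing beyond its table row.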
Eric Anholt
b129e10af2 intel: Move the logic for getting 965 3d packet length to the packet table.
While I'm touching every line of the table, sort it by opcode.
2012-01-27 13:21:20 -08:00
Eric Anholt
3dcb2d47ee intel: Add support for parsing 965 3d packets using helper functions.
I want to add packets, without contributing to the switch statement of
doom.
2012-01-27 13:21:19 -08:00
Eric Anholt
5a1c10fe6a intel: Parse the correct length for gen7 3DSTATE_MULTISAMPLE.
2012-01-27 13:21:19 -08:00
Eric Anholt
9695eee8a2 intel: Put the "gen" shorthand chipset identifier in the context.
It's a lot nicer than using IS_WHATEVER(devid) all over the place, and
we have this in our other projects too.
2012-01-27 13:21:19 -08:00
Eric Anholt
028715ee70 intel: Avoid the need for most overflow checks by using a scratch page.
The overflow checks were all thoroughly untested, and a bunch of the
ones I'm deleting were pretty broken.  Now, in the case of overflow,
you just decode data of 0xd0d0d0d0, and instr_out prints the warning
message instead.  Note that this still has the same issue of being
under-tested, but at least it's one place instead of per-packet.

A couple of BUFFER_FAIL uses are left where the length to be decoded
could be (significantly) larger than a page, and the decode didn't
just call instr_out (which doesn't dereference data itself unless it's
safe).
2012-01-27 13:21:19 -08:00
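One way to picture the scratch-page idea from the commit above: copy the batch into a larger allocation and fill the tail with the byte 0xd0, so that reading slightly past the end yields 0xd0d0d0d0 instead of touching unmapped memory. The helper name and the fixed 4096-byte size are assumptions for this sketch, not the actual libdrm implementation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_BYTES 4096   /* assumed page size for the sketch */

/* Hypothetical helper: copy count dwords and append a poisoned scratch
 * page. The caller frees the result. Returns NULL on allocation failure. */
uint32_t *copy_with_scratch(const uint32_t *data, size_t count)
{
    size_t data_bytes = count * sizeof(uint32_t);
    unsigned char *copy = malloc(data_bytes + SCRATCH_BYTES);

    if (copy == NULL)
        return NULL;

    memcpy(copy, data, data_bytes);
    /* Every byte of the tail is 0xd0, so any dword read from it is
     * 0xd0d0d0d0 regardless of endianness. */
    memset(copy + data_bytes, 0xd0, SCRATCH_BYTES);

    return (uint32_t *)copy;
}
```

This only protects overreads of up to one page, which matches the commit's remark that a few explicit checks remain where the decoded length could be much larger than a page.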
Eric Anholt
c1d2946da8 intel: Make instr_out take the decode context.
This reduces some of the extra derefs of the pointers.
2012-01-27 13:21:19 -08:00
Eric Anholt
b0371612f4 intel: Use the context to simplify BR01 decode.
Similar to BR00, count was always 1 and was always an index, not a count.
2012-01-27 13:21:19 -08:00
Eric Anholt
62b410344c intel: Use the context to simplify BR00 decode.
The count (actually index) was always 0, because BR00 is dword 0.
2012-01-27 13:21:19 -08:00
Eric Anholt
de49fd41e2 intel: Plumb the context through the decode callchain.
We still deref the context at the start of every call, but that will
change next.
2012-01-27 13:21:19 -08:00
Eric Anholt
a756fa384f intel: Drop the code for counting parsing failures.
Nothing was consuming it.  If something wants this in the future, it
would be done using the decode context anyway.
2012-01-27 13:21:19 -08:00
Eric Anholt
8fb66a7ded intel: Track the current packet location in the decode context.
This is the start of plumbing the context through the decode
callchain instead of the current 4 arguments.
2012-01-27 13:21:19 -08:00
Eric Anholt
b5cb7f88de intel: Add a regression test for 2D decode, which I'm about to refactor.
2012-01-27 13:21:19 -08:00
Jesse Barnes
66518ab565 intel: add sprite ioctl defines and struct for i915 sprite code
2012-01-09 10:22:33 -08:00
Eric Anholt
adf1428915 configure: Bump version for 2.4.30
2012-01-06 08:50:31 -08:00
Eric Anholt
9fb83a49cb intel: Update for new i915_drm.h defines.
2012-01-04 14:51:59 -08:00
Eric Anholt
683855f655 intel: Add regression tests for batch decode.
The .batch was generated using the dump-a-batch branch of

git://people.freedesktop.org/~anholt/mesa

using glxgears on gen7 hardware, using INTEL_DEVID_OVERRIDE for
non-gen7 (this means that offsets in the buffers for non-gen7 are 0!).
The .ref was generated by:

./test_decode tests/gen7-3d.batch -dump.

The .sh exists because you can't supply arguments to tests using the
simple automake tests driver.  Something reasonable could be done
using automake's parallel-tests driver (in fact, a previous version of
the patch did that), but I was concerned that:

1) The parallel-tests driver is documented to be unstable -- they may
   change interfaces on us later.
2) The parallel-tests driver hides the output of tests in .log files
   scattered all over the tree, which was ugly and more painful to
   work with.

v2: Actually add the batch files, add a .gitignore for the *-new.txt
    files added after failures, and fix failure mode for undetected
    chipset name.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
2012-01-04 14:49:44 -08:00
Eric Anholt
ccbc40340b intel: Add a regression test program for intel_decode.c.
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-04 14:49:44 -08:00
Eric Anholt
ea33a231d5 intel: Add an interface for setting the output file for decode.
Consumers often want to choose stdout vs stderr, and for testing I
want to output to an open_memstream file.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-01-04 14:49:44 -08:00
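The testing use case mentioned above, sending decode output to an `open_memstream` file, can be sketched as follows. `open_memstream` is the real POSIX function; the context struct, `decode_set_output`, `decode_dwords` and `decode_to_string` are hypothetical stand-ins and not the libdrm interface.

```c
#define _POSIX_C_SOURCE 200809L   /* for open_memstream() */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical decode context that only carries the output stream. */
struct decode_ctx {
    FILE *out;
};

/* Choose where decode output goes: stdout, stderr, or any other FILE. */
void decode_set_output(struct decode_ctx *ctx, FILE *out)
{
    ctx->out = out;
}

/* Stand-in for real decoding: print each dword on its own line. */
void decode_dwords(struct decode_ctx *ctx, const uint32_t *data, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        fprintf(ctx->out, "0x%08x\n", (unsigned int)data[i]);
}

/* Capture the output in a malloc'd, NUL-terminated string that the
 * caller frees. Returns NULL if the memory stream cannot be created. */
char *decode_to_string(const uint32_t *data, size_t count)
{
    struct decode_ctx ctx;
    char *text = NULL;
    size_t size = 0;
    FILE *mem = open_memstream(&text, &size);

    if (mem == NULL)
        return NULL;

    decode_set_output(&ctx, mem);
    decode_dwords(&ctx, data, count);
    fclose(mem);   /* finalizes text and size */

    return text;
}
```

A regression test can then compare the returned string against a stored reference, while normal consumers pass `stdout` or `stderr` to the same setter.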
Johannes Obermayr
a9dd34a7ee intel/intel_decode.c: Remove #include "intel_decode.h".
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
2011-12-30 21:07:55 -08:00