Commit graph

56298 commits

Author SHA1 Message Date
Chia-I Wu
e6186b0769 ilo: hook up pipe context 3D functions 2013-04-26 16:16:43 +08:00
Chia-I Wu
5b310f6230 ilo: add GEN7 support for 3D pipeline 2013-04-26 16:16:43 +08:00
Chia-I Wu
91ce766c35 ilo: add 3D pipeline for GEN6
The 3D pipeline is a high-level interface to emit 3D commands and states.  It
uses GEN6 GPE to do the real work.
2013-04-26 16:16:43 +08:00
Chia-I Wu
67233b56d6 ilo: add GEN7 GPE 2013-04-26 16:16:43 +08:00
Chia-I Wu
d3602dfac6 ilo: add GEN6 GPE
GEN6 GPE (Graphics Processing Engine) is a low-level interface to emit 3D
commands and states.
2013-04-26 16:16:43 +08:00
Chia-I Wu
72357cf3bb ilo: hook up pipe context query functions
None of the query types are supported yet.
2013-04-26 16:16:43 +08:00
Chia-I Wu
8f949bc1da ilo: hook up pipe context transfer functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
0754ff33e3 ilo: hook up pipe context blit functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
89d1702b9b ilo: hook up pipe context state functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
520af66797 ilo: add functions to manage shaders
This commit adds the shader cache, shader state, shader variants, etc.  It does
not add the shader compiler though.
2013-04-26 16:16:42 +08:00
Chia-I Wu
86940bf41c ilo: hook up pipe context flush function 2013-04-26 16:16:42 +08:00
Chia-I Wu
eed1e5a407 ilo: add command parser
The command parser manages batch buffers and command submissions.
2013-04-26 16:16:42 +08:00
Chia-I Wu
3a4a570c34 ilo: hook up pipe screen resource functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
b50e68cb67 ilo: hook up pipe screen format functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
babb2b5c50 ilo: hook up pipe_screen param and fence functions 2013-04-26 16:16:42 +08:00
Chia-I Wu
e74d67738d ilo: add debug flags settable through ILO_DEBUG 2013-04-26 16:16:42 +08:00
Chia-I Wu
63b5720105 ilo: new pipe driver for Intel GEN6+
This commit adds some boilerplate code.  The header files found under include/
are copied from i965.
2013-04-26 16:16:41 +08:00
Chia-I Wu
380e6875b8 winsys/intel: new winsys for intel
This is a wrapper for libdrm_intel to allow the pipe driver to stay OS
agnostic.
2013-04-26 15:49:00 +08:00
José Fonseca
542c5b3703 gallivm: Fix trivial out-of-bounds indirection in lp_build_cube_lookup().
Courtesy of clang:

  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
           tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
           ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
              rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
                                                       ^   ~
  src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
           LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
           ^
2013-04-26 08:44:37 +01:00
Matt Turner
0c1d87b0d7 i965/vs: Add support for LRP instruction.
Only 13 affected programs in shader-db, but they were all helped.

total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs:     1576 -> 1550 (-1.65%)

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Matt Turner
c0f67a127b i965/vs: Add a function to fix-up uniform arguments for 3-src insts.
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.

With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether it's better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 18:27:39 -07:00
Jerome Glisse
abb96fdea7 winsys/radeon: consolidate tracing into winsys v2
This moves the tracing timeout and printing into the winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).

Many files are touched because of the winsys API changes.

v2: Do not write the lockup file if the ib unique id does not match the last one

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 18:36:31 -04:00
Tom Stellard
53fbae7eac r600g/compute: Removed unused and untested code
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:32:22 -07:00
Tom Stellard
f986087d5c r600g/compute: Use a constant buffer to store kernel parameters v2
v2:
  - Fix usage of set_constant_buffer()
  - Fix typo in comment

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:32:17 -07:00
Tom Stellard
ffadc71afb r600g: Add evergreen_emit_cs_constant_buffers() v2
v2:
  - Bump R600_NUM_ATOMS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-25 13:25:00 -07:00
Tom Stellard
83a00a1de8 r600g/compute: Don't use radeon_winsys::buffer_wait() after dispatching a kernel
The state tracker should be responsible for waiting for the kernel to
finish.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:51 -07:00
Tom Stellard
09e47f7a25 r600g/compute: Fix input buffer size calculation
Buffer size should be in bytes not dwords.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 13:24:24 -07:00
Adam Jackson
904b03824b linux: Don't emit a .note.ABI-tag section anymore (#26663)
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed.  Just remove it.

Note: this is a candidate for the stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2013-04-25 15:51:35 -04:00
Rob Clark
73de07cbbc freedreno: use writecombine buffers
Better than uncached for writes, which are common for vertex buffer
upload, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Rob Clark
f706d4d340 freedreno: don't patch and re-emit same shader as much
New textures or vertex buffers don't always require patching and
re-emitting the shaders.  So do a better job of figuring out when we
actually have to patch the shader.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-25 15:10:56 -04:00
Eric Anholt
578987ce1c i965: Avoid recompiles for fragment clamping on non-clamping APIs.
Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are
due to FBO-rendering size predictions).  We currently expose
GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm
about to send a patch for removing that silly extension in that case.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-25 12:03:00 -07:00
Alex Deucher
b5145ca2a8 radeonsi: add new SI pci ids
Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:22:46 -04:00
Alex Deucher
b3a856dfa9 r600g: add new richland pci ids
Note: this is a candidate for the stable branches.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-04-25 14:21:15 -04:00
José Fonseca
12096f334b draw: Yield zeros for LLVM fetches of non-existing vertex elements.
If a bug in an app/state-tracker causes vertex buffer to fetch vertex
elements that are not bound, simply return zeros instead of crashing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-25 16:16:21 +01:00
José Fonseca
28e6a272fc trace: Only close trace files on exit.
Many applications don't exit cleanly, and others may create and destroy a
screen multiple times, so we only write the </trace> tag and close the file
at exit time.
2013-04-25 14:18:33 +01:00
José Fonseca
74d1153c9c graw: Set the vertex shader constant buffer.
We were setting the fragment shader, which wasn't needed.
2013-04-25 14:06:50 +01:00
José Fonseca
e88a1dba09 graw: Simple utilities to dump and disassemble TGSI tokens.
Useful for core dumps, where calling tgsi_dump() from gdb is not an
option.
2013-04-25 13:03:06 +01:00
José Fonseca
1687932d2b scons: Support clang.
clang supports most gcc options / extensions, with some exceptions.

The biggest advantage of using clang is that compilation times are much
shorter.

One can tell scons to use clang when building by invoking it as

   CC=clang CXX=clang++ scons libgl-xlib
2013-04-25 11:59:01 +01:00
José Fonseca
f0c296773d util/u_sse: Fix _mm_shuffle_epi8 prototype for clang.
Clang does not support __artificial__. Instead match precisely what's
in the clang headers.
2013-04-25 11:59:01 +01:00
José Fonseca
45a60e2e7a scons: Remove redundant code.
-fvisibility=hidden is already set elsewhere for the whole tree.
2013-04-25 11:59:01 +01:00
Chris Forbes
8fd0190278 mesa: fix bogus comment about PrimitiveRestart fields
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-25 20:49:25 +12:00
Chris Forbes
447bf1fb52 i965: report correct sample positions
From low to high bits, the sample positions are packed y0,x0,y1,x1...

Fixes arb_texture_multisample-sample-position piglit.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-04-25 20:47:54 +12:00
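The packing order named in the i965 sample-position commit above (y0, x0, y1, x1, ... from low to high bits) can be illustrated with a small sketch. It is not the driver's code: the 4-bit field width, the 32-bit word holding four samples, and the names `struct sample_pos` and `unpack_sample_pos` are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for one unpacked sample position. */
struct sample_pos {
   unsigned x;
   unsigned y;
};

/*
 * Unpack one sample position from a word packed, from low to high bits,
 * as y0, x0, y1, x1, ...  ASSUMPTION: each coordinate is a 4-bit field,
 * so one sample occupies 8 bits and a 32-bit word holds four samples.
 */
static struct sample_pos
unpack_sample_pos(uint32_t packed, unsigned sample)
{
   struct sample_pos pos;
   unsigned shift;

   assert(sample < 4);
   shift = sample * 8;

   pos.y = (packed >> shift) & 0xf;        /* y sits in the lower field */
   pos.x = (packed >> (shift + 4)) & 0xf;  /* x sits in the field above it */
   return pos;
}
```

Reading the fields in the opposite order (x before y) would swap the two coordinates of every sample, which is the kind of mistake the commit message describes.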
Rob Clark
49a7624973 freedreno: fix bogus IMM const reg index
We were assigning incorrect const register for immediates, and
potentially writing immediate const to the wrong location.  This fixes
an incorrect-rendering bug with xonotic.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
9495ee12c6 freedreno: clear fixes and debugging
Set a few extra registers to make sure we are in proper state for
clearing.  And also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d5d6ec8843 freedreno: fix texture fetch type
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or three valid input components.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
d086bb22bc freedreno: fix temp register usage
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases.  For example, if the dst
register is the same as one of the src registers.

For now, just simplify it and always allocate a new register to use as
an intermediate.  In some cases this will result in more registers used
than required.  I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
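The aliasing problem described in the temp-register commit above can be sketched in plain C. This is only an illustration, not the freedreno compiler: the register file is modeled as a float array, and the hypothetical helpers `mul_add_via_dst` and `mul_add_via_tmp` lower r[dst] = r[src0] * r[src1] + r[src0] into two steps.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Two-step lowering that uses the destination register itself as the
 * intermediate.  Gives a wrong result when dst aliases src0, because
 * the first step overwrites src0 before the second step reads it.
 */
static void
mul_add_via_dst(float *r, size_t dst, size_t src0, size_t src1)
{
   assert(r != NULL);
   r[dst] = r[src0] * r[src1];   /* clobbers src0 when dst == src0 */
   r[dst] = r[dst] + r[src0];
}

/*
 * Same lowering, but with a separately allocated intermediate register
 * (tmp).  Costs one extra register, but stays correct under aliasing.
 */
static void
mul_add_via_tmp(float *r, size_t dst, size_t src0, size_t src1, size_t tmp)
{
   assert(r != NULL);
   assert(tmp != dst && tmp != src0 && tmp != src1);
   r[tmp] = r[src0] * r[src1];
   r[dst] = r[tmp] + r[src0];
}
```

With r = {2, 3, 0, 0} and dst == src0 == 0, the first helper computes 6 + 6 instead of 6 + 2, while the second one produces the intended result at the cost of occupying register 2.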
Rob Clark
7a837da556 freedreno: add noop driver
It is useful for debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
eec37f1cdc freedreno: use u_math macros/helpers more
Get rid of a few self-defined macros:
  ALIGN() -> align()
  min() -> MIN2()
  max() -> MAX2()

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
38d8b02eba freedreno: implement fd_screen_destroy()
Oops, didn't notice that I had left it stubbed out.

Also, make things fail a bit more gracefully when things go wrong.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00
Rob Clark
a64e2d9d9f freedreno: set SWAP bit based on format
Really this should be set based on buffer format, not on color vs
depth/stencil.  Probably there should be more formats that set the bit
as we add support for more render target formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-24 21:09:46 -04:00