Commit graph

64509 commits

Author SHA1 Message Date
Marek Olšák
8c235465cd gallium/radeon: use gpu_address from r600_resource
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:16 +02:00
Marek Olšák
f6c392a270 r600g: use gpu_address from r600_resource
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
1c03a690bf radeonsi: use gpu_address from r600_resource
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
e878e154cd gallium/radeon: store VM address in r600_resource
This will help to get rid of the buffer_get_virtual_address calls.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
43b5c34cc3 r600g: remove useless r600_resource_va calls
R600-R700 don't support virtual memory.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
0e229b8c5a radeonsi: always prefer SWITCH_ON_EOP(0) on CIK
The code is rewritten to take known constraints into account, while always
using 0 by default.

This should improve performance for multi-SE parts in theory.

A debug option is also added for easier debugging. (If there are hangs,
use the option. If the hangs go away, you have found the problem.)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

v2: fix a typo, set max_se for evergreen GPUs according to the kernel driver
2014-08-09 23:41:15 +02:00
Marek Olšák
515269b3a7 radeonsi: fix a hang with instancing in Unigine Heaven/Valley on Hawaii
This isn't documented anywhere, but it's the only thing that works
for this case.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
085a861545 radeon,r200: fix buffer validation after CS flush
This validates all bound buffers (CB, ZB, textures, DMA) at the beginning
of CS. This fixes "bo->space_accounted" assertion failures.

Tested-by: Jochen Rollwagen <joro-2013@t-online.de>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
0b5d88a518 st/mesa: fix blit-based partial TexSubImage for 1D arrays
This fixes piglit spec/EXT_texture_array/render-1darray.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
56286834b8 st/mesa: fix DrawPixels(GL_STENCIL_INDEX)
This bug was probably uncovered recently by Jason's commits, which broke
this.

The problem is _mesa_base_tex_format(GL_STENCIL_INDEX) returns -1.

Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-09 23:41:15 +02:00
Marek Olšák
88e0a2f88b st/mesa: dump TGSI before calling into the driver
If the driver crashes in create_xx_shader, you want to see the shader.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-08-09 23:41:15 +02:00
Jon TURNEY
a2e1dc0cce configure.ac: Use LIBS rather than LDFLAGS to add -ldl to dladdr check
ec8ebff "Check for dladdr()" erroneously uses LDFLAGS rather than LIBS to add
-ldl to the dladdr check.

Replace the workaround in 39a4cc4 of explicitly checking in libdl, with a more
correct approach of using LIBS.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-08-09 11:18:31 +01:00
Eric Anholt
7b4b60b7e5 vc4: Add support for the COS instruction.
2014-08-08 18:59:47 -07:00
Eric Anholt
663ffff0e7 vc4: Add support for the SIN instruction.
v2: Rebase on helpers.
2014-08-08 18:59:47 -07:00
Eric Anholt
d815b2490b vc4: Fix register aliasing for packing of scaled coordinates.
Fixes glean fragProg1's "ADD test" and likely many others.
2014-08-08 18:59:47 -07:00
Eric Anholt
9492eb588d vc4: Add some debug code for forcing fragment shader output color.
2014-08-08 18:59:47 -07:00
Eric Anholt
961715eab2 u_primconvert: Copy min/max_index from the original primitive.
These values are supposed to be the minimum/maximum index values used to
read from the vertex buffers.  This code either copies index values out of
the old IB (so, same min/max as the original draw call), or generates a
new IB (using index values between the start and the start + count of the
old array draw info, which just happens to be what min/max_index are set
to by st_draw.c).

We were incorrectly setting the max_index in the
converting-from-glDrawArrays case to the start vertex plus the number of
vertices generated in the new IB, which broke QUADS primitive conversion
on VC4 (where max_index really has to be correct, or the kernel might
reject your draw call due to buffer overflow).

Reviewed-by: Rob Clark <robclark@freedesktop.org> (from verbal description
             of the patch)
2014-08-08 18:59:47 -07:00
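The min/max_index fix described in 961715eab2 can be illustrated with a minimal sketch. It assumes a simplified `struct draw_info` and a hypothetical `quads_to_tris` helper; neither is Mesa's actual `u_primconvert` code, and the triangle vertex order is chosen only for illustration. The point it shows is that a generated index buffer only references vertices of the original array draw, so the original bounds are copied rather than derived from the new index count.

```c
/* Hypothetical, simplified stand-in for the draw parameters (not Mesa's struct). */
struct draw_info {
    unsigned start;      /* first vertex (array draw) or first index (indexed draw) */
    unsigned count;      /* number of vertices or indices to draw */
    unsigned min_index;  /* lowest vertex index read from the vertex buffers */
    unsigned max_index;  /* highest vertex index read from the vertex buffers */
};

/*
 * Convert an array draw of quads into an indexed draw of triangles.
 * The caller provides room for (src->count / 4) * 6 entries in out_indices.
 */
static void quads_to_tris(const struct draw_info *src,
                          unsigned *out_indices,
                          struct draw_info *dst)
{
    unsigned quads = src->count / 4;
    unsigned n = 0;

    for (unsigned q = 0; q < quads; q++) {
        unsigned v = src->start + q * 4;

        /* Two triangles per quad: (0, 1, 2) and (0, 2, 3). */
        out_indices[n++] = v + 0;
        out_indices[n++] = v + 1;
        out_indices[n++] = v + 2;
        out_indices[n++] = v + 0;
        out_indices[n++] = v + 2;
        out_indices[n++] = v + 3;
    }

    dst->start = 0;
    dst->count = n;

    /*
     * Every generated index points at a vertex of the original draw, so the
     * bounds are copied. Using src->start + n for max_index would overstate
     * the range, because n counts indices, not vertices.
     */
    dst->min_index = src->min_index;
    dst->max_index = src->max_index;
}
```

For a draw of 8 vertices starting at vertex 4, this produces 12 indices while max_index stays at 11, where the incorrect formula would have reported 16.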
Eric Anholt
1d03692f78 vc4: Fix using and emitting the 1/W from the vertex/coord shaders.
v2: Rebase on helpers change.
2014-08-08 18:59:47 -07:00
Eric Anholt
88bc5baa00 vc4: Add support for swizzles of 32 bit float vertex attributes.
Some tests start working (useprogram-flushverts, for example) due to
getting the right vertices now.  Some that used to pass start failing with
memory overflow during binning, which is weird (glsl-fs-texture2drect).
And a couple stop rendering correctly (glsl-fs-bug25902).

v2: Move the attribute format setup in the key from after search time to
    before the search.
v3: Fix reading of attributes other than position (I forgot to respect
    attr and stored everything in inputs 0-3, i.e. position).
2014-08-08 18:59:47 -07:00
Eric Anholt
f069367f39 vc4: Add support for the TGSI FRC opcode.
v2: Rebase on helpers.
2014-08-08 18:59:47 -07:00
Eric Anholt
bf542cd372 vc4: Add support for the TGSI TRUNC opcode.
v2: Rebase on helpers.
2014-08-08 18:59:47 -07:00
Eric Anholt
399285403a vc4: Crank up the tile allocation BO size
This avoids a simulator assertion failure with glamor.  I need to actually
support resize, though.
2014-08-08 18:59:47 -07:00
Eric Anholt
75afa64ef8 vc4: Add support for multiple attributes
2014-08-08 18:59:47 -07:00
Eric Anholt
32948ca768 vc4: Add more useful debug for the undefined-source case
We could get undefined sources in real programs from the wild, so we'll
need to turn off this debug eventually.  But for now, using undefined
sources is typically me just mistyping something.
2014-08-08 18:59:47 -07:00
Eric Anholt
6ff2129d58 vc4: Add support for the lit opcode.
v2: Fix how it was using the X channel for the real work of the opcode,
    instead of Y.  Fixes glean's LIT test.
v3: Rebase on the helpers.
2014-08-08 18:59:47 -07:00
Eric Anholt
63e49da0a5 vc4: Add support for the POW opcode
v2: Rebase on helpers.
2014-08-08 18:59:47 -07:00
Eric Anholt
0e182e7d8f vc4: Refactor uniform handling.
I wanted an easy way to set up new uniforms every time, so I could handle
texture-sampler-related uniforms.

v2: Rebase on helpers change.
2014-08-08 18:59:47 -07:00
Eric Anholt
6c185bd263 vc4: Add support for the LRP opcode.
v2: Rebase on helpers, cutting out most of the code in this change.
2014-08-08 18:59:47 -07:00
Eric Anholt
ec9da314ba vc4: Add copy propagation between temps.
We put in a bunch of extra MOVs for program outputs, and this can clean
those up.  We should do uniforms, too, though.

v2: Fix missing flagging of progress when we actually optimize.  Caught by
    Aaron Watry.
2014-08-08 18:59:47 -07:00
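As a rough illustration of what ec9da314ba describes, the sketch below runs copy propagation over a tiny scalar SSA instruction list. The `struct inst` layout, the opcode names, and `copy_propagate` are all hypothetical and much simpler than the real QIR; the sketch assumes each temp is written exactly once. It also returns a progress flag, which is the detail the v2 note says was originally missed.

```c
#include <stdbool.h>

/* Hypothetical miniature IR; the real QIR is richer than this. */
enum op { OP_MOV, OP_ADD, OP_MUL };

struct inst {
    enum op op;
    int dst;     /* temp written by this instruction (SSA: written once) */
    int src[2];  /* temps read; src[1] is unused for OP_MOV */
};

/*
 * For each MOV between temps, rewrite later reads of the MOV's destination
 * to read the MOV's source instead. Returns true if anything changed, so a
 * driver loop knows whether to run the optimization passes again.
 */
static bool copy_propagate(struct inst *insts, int count)
{
    bool progress = false;

    for (int i = 0; i < count; i++) {
        if (insts[i].op != OP_MOV)
            continue;

        for (int j = i + 1; j < count; j++) {
            int nsrc = insts[j].op == OP_MOV ? 1 : 2;

            for (int s = 0; s < nsrc; s++) {
                if (insts[j].src[s] == insts[i].dst) {
                    insts[j].src[s] = insts[i].src[0];
                    progress = true;
                }
            }
        }
    }

    return progress;
}
```

The MOV itself is left in place; once nothing reads its destination, a separate dead code pass can remove it.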
Eric Anholt
d9d1c14430 vc4: Add dead code elimination.
This cleans up a bunch of noise in the compiled coordinate shaders (since
we don't need the varying outputs), and also from writemasked instructions
with negated src operands.
2014-08-08 18:59:47 -07:00
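The dead code elimination in d9d1c14430 can be sketched in the same spirit. This is a minimal example on a hypothetical straight-line SSA instruction list (the `struct inst` layout, `MAX_TEMPS`, and `dead_code_eliminate` are assumptions, not the vc4 implementation): an instruction is dropped when its result is neither a shader output nor read by a later live instruction.

```c
#include <stdbool.h>

#define MAX_TEMPS 64  /* assumed upper bound on temp indices in this sketch */

/* Hypothetical miniature IR; the real QIR is richer than this. */
enum op { OP_NOP, OP_MOV, OP_ADD, OP_MUL };

struct inst {
    enum op op;
    int dst;     /* temp written by this instruction */
    int src[2];  /* temps read; src[1] is unused for OP_MOV */
};

/*
 * Walk the instructions backwards. A result that is not an output and has
 * not been read by any live later instruction is dead, so the instruction
 * is turned into a NOP. Walking backwards removes whole dead chains in a
 * single pass, because a dead instruction never marks its sources as used.
 * is_output must have MAX_TEMPS entries. Returns true if anything changed.
 */
static bool dead_code_eliminate(struct inst *insts, int count,
                                const bool *is_output)
{
    bool used[MAX_TEMPS] = { false };
    bool progress = false;

    for (int i = count - 1; i >= 0; i--) {
        struct inst *in = &insts[i];

        if (in->op == OP_NOP)
            continue;

        if (!used[in->dst] && !is_output[in->dst]) {
            in->op = OP_NOP;
            progress = true;
            continue;
        }

        int nsrc = in->op == OP_MOV ? 1 : 2;
        for (int s = 0; s < nsrc; s++)
            used[in->src[s]] = true;
    }

    return progress;
}
```

In a coordinate shader, only the position-related temps would be flagged in `is_output`, so everything that feeds only the varyings falls away.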
Eric Anholt
1d23d55ae9 vc4: Add an initial pass of algebraic optimization.
There was a lot of extra noise in my piglit shader dumps because of silly
CMPs.
2014-08-08 18:59:47 -07:00
Eric Anholt
4c53087c67 vc4: Add support for CMP.
This took a couple of tries, and this is the squash of those attempts.

v2: Fix register file conflicts on the args in the
    destination-is-accumulator case.
v3: Rebase on helper change and qir_inst4 change.
2014-08-08 18:59:47 -07:00
Eric Anholt
eea1d36915 vc4: Make scheduling of NOPs a separate step from QIR -> QPU translation.
This should also be used as a way to pair QIR instructions into QPU
instructions later.
2014-08-08 18:59:46 -07:00
Eric Anholt
c293927511 vc4: Add WIP support for varyings.
It doesn't do all the interpolation yet, but more tests can run now.

v2: Rebase on helpers.
2014-08-08 18:59:46 -07:00
Eric Anholt
db9f41ea88 vc4: Use r3 instead of r5 for temps, since r5 only has 32 bits of storage
Reserving a whole accumulator for temps is awful in the first place, but
I'll fix that later.
2014-08-08 18:59:46 -07:00
Eric Anholt
23b2bad991 vc4: Fix emit of ABS
v2: Rebase on qir helpers.
2014-08-08 18:59:46 -07:00
Eric Anholt
cf2d777fbe vc4: Add shader variant caching to handle FS output swizzle.
2014-08-08 18:59:46 -07:00
Eric Anholt
6cf86dd487 vc4: Load the tile buffer before incrementally drawing.
We will want to occasionally disable this again when we do clear support.

v2: Squash with the previous commit (I accidentally committed at two
    stages of writing the change)
2014-08-08 18:59:46 -07:00
Eric Anholt
c3f96060a8 vc4: Don't reallocate the tile alloc/state bos every frame.
This was a problem for the simulator since we don't free memory back to
it, and it would soon just run out.
2014-08-08 18:59:46 -07:00
Eric Anholt
21db430210 vc4: Add VC4_DEBUG env option
v2: Fix an accidental deletion of some characters from the copyright
    message (caught by Ilia Mirkin)
2014-08-08 18:59:46 -07:00
Eric Anholt
2e35981d4d vc4: Add support for SNE/SEQ/SGE/SLT.
2014-08-08 18:59:46 -07:00
Eric Anholt
7108c24fd0 vc4: Use the user's actual first vertex attribute.
This is hardcoded to read it as RGBA32F so far, but starts to get more
tests working.
2014-08-08 18:59:46 -07:00
Eric Anholt
427f934f9e vc4: Fix UBO allocation when no uniforms are used.
We do rely on a real BO getting allocated, so make sure we ask for a non-zero size.
2014-08-08 18:59:46 -07:00
Eric Anholt
db8712bcbc vc4: Add initial support for math opcodes
2014-08-08 18:59:46 -07:00
Eric Anholt
792d1c92df vc4: Switch to actually generating vertex and fragment shader code from TGSI.
This introduces an IR (QIR, for QPU IR) to do optimization on.  It's a
scalar, SSA IR in general.  It looks like optimization is pretty easy this
way, though I haven't figured out if it's going to be good for our weird
register allocation or not (or if I want to reduce to basically QPU
instructions first), and I've got some problems with it having some
multi-QPU-instruction opcodes (SEQ and CMP, for example) which I probably
want to break down.

Of course, this commit mostly doesn't work, since many other things are
still hardwired, like the VBO data.

v2: Rewrite to use a bunch of helpers (qir_OPCODE) for emitting QIR
    instructions into temporary values, and make qir_inst4 take the 4 args
    separately instead of an array (all later callers wanted individual
    args).
2014-08-08 18:59:46 -07:00
Eric Anholt
e59890aebb vc4: Start converting the driver to use vertex shaders.
Note: This is the cutoff point where I switched from developing primarily
on the Pi to developing on the simulator.  As a result, from this point on
the code is untested on the Pi (the kernel code I have currently wasn't
rendering anything at this commit, though the simulator renders
successfully, suggesting kernel bugs).
2014-08-08 18:59:46 -07:00
Eric Anholt
1850d0a1cb vc4: Initial skeleton driver import.
This mostly just takes every draw call and turns it into a sequence of
commands that clear the FBO and draw a single shaded triangle to it,
regardless of the actual input vertices or shaders.  I copied the initial
driver skeleton mostly from freedreno, and I've preserved Rob Clark's
copyright for those.  I also based my initial hardcoded shaders and
command lists on Scott Mansell (phire)'s "hackdriver" project, though the
bit patterns of the shaders emitted end up being different.

v2: Rebase on gallium megadrivers changes.
v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
v4: Rely on simpenrose actually being installed when building for
    simulation.
v5: Add more header duplicate-include guards.
v6: Apply Emil's review (protection against vc4 sim and ilo at the same
    time, and dropping the dricommon drm bits) and fix a copyright header
    (thanks, Roland)
2014-08-08 18:59:46 -07:00
Roland Scheidegger
f017e32c0a draw: (trivial) use information about gs being present from variant key
This is a purely cosmetic change.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-09 03:52:58 +02:00
Roland Scheidegger
6d2ecdb4a6 draw: don't use clipvertex output if user plane clipping is disabled
The non-llvm path made sure that both clip and pre_clip_pos point to the data
output by position, not clipvertex, if user based clipping is disabled.
However, the llvm path did not, which apparently led to failures if
gl_ClipVertex was written but user plane clipping not enabled (bug 80183).
Why I have no idea really, but just make it match the non-llvm behavior...

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-08-09 03:52:58 +02:00
Chris Forbes
0f4c5a70c6 i965: Get rid of backend_instruction::sampler
The generators no longer use this.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-08-09 13:12:35 +12:00