Commit graph

66874 commits

Author SHA1 Message Date
Rob Clark
d595987ea3 freedreno/a3xx: refactor/optimize emit
Because we reuse various bits of emit code (for state/vertex/prog/etc)
for both regular draws and internal draws (gmem<->mem, clear, etc), the
number of parameters getting passed around has been growing.  Refactor
to group these into fd3_emit.  This simplifies fxn signatures, avoids
passing around shader key on the stack, etc.  It also gives us a nice
place to cache shader-variant lookup to avoid looking up shader variants
multiple times per draw (without having to *also* pass them around as
fxn args everywhere).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Rob Clark
d5d80b3739 freedreno/a3xx: refactor vertex state emit
Get rid of fd3_vertex_buf and use fd_vertex_state directly for all
draws.  Removes a tiny bit of CPU overhead for munging around the vertex
state every time it is emitted, but more importantly it cleans things up
for later optimizations, so the emit paths don't have to special case
internal draws (gmem<->mem, clears, etc) with regular draws.

Instead of constructing fd3_vertex_buf array each time for internal
draws, and context init time pre-create solid_vbuf_state and
blit_vbuf_state.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-15 15:49:48 -04:00
Eric Anholt
57de9bbb63 vc4: Fix the uniform debug output.
I dropped the shader index when moving to the compiled shader struct, but
didn't update the format string here.
2014-10-15 18:12:03 +01:00
Eric Anholt
201d4c0b2a vc4: Add support for user clip plane and gl_ClipVertex.
Fixes about 15 piglit tests about interpolation and clipping.
2014-10-15 18:11:46 +01:00
Eric Anholt
6a0bf67048 vc4: Move the output semantics setup to a helper.
I want to reuse it elsewhere to set up outputs that aren't in the TGSI.
2014-10-15 18:11:46 +01:00
Kenneth Graunke
39a5a60b57 i965: Allow CSE on Gen4-5 unary math.
Due to the implicit move-from-GRF, unary math looks a lot like the Gen6+
math instruction: it's a single instruction (SEND) with a GRF source.
The difference is that it also implicitly clobbers a message register.

The only visible effect is that CSE will remove the MRF-clobbering from
later math operations.  This should be fine; compute_to_mrf and
remove_redundant_mrf_writes don't look at the values populated by
implied writes, so they can't rely on those values being present.
Less interference may actually help those passes make more progress.

Binary math is still problematic, since it involves a separate MOV
instruction to load the second operand.  We continue disallowing CSE for
binary math operations.

total instructions in shared programs: 3340303 -> 3340100 (-0.01%)
instructions in affected programs:     26927 -> 26724 (-0.75%)
Nothing hurt, gained, or lost.  ~6% reduction on a few shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-10-15 08:44:54 -07:00
Michel Dänzer
159f93cf39 r600g,radeonsi: Only set use_staging_texture = TRUE once
No need to check for setting the flag after we set it already.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:30 +09:00
Michel Dänzer
87da286755 r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled
We set the NO_CPU_ACCESS flag for BO allocation in that case, so direct CPU
access may not work.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:26:14 +09:00
Michel Dänzer
3ede67a4c6 winsys/radeon: Use separate caching buffer manager for each set of flags
Otherwise the caching buffer manager may return a buffer which was created
with a different set of flags, which can cause trouble.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-10-15 16:11:40 +09:00
Andres Gomez
657764c21c configure.ac: check for libexpat when no pkg-config is available
Previously, when no pkg-config was available for
libexpat we would just add the needed linking
flags without any extra check.

Now, we check that the library and the headers are
also installed in the building environment.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-15 08:59:12 +02:00
Tom Stellard
8cf6482c3d clover: Fix regression in module serialization
We need to serialize semantic information for arguments, which was added
in 06139c56fa.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-10-14 17:58:06 -04:00
Jason Ekstrand
3435aa49f4 i965/fs: Use the correct regs_written on unspill instructions
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-10-14 12:39:45 -07:00
Ilia Mirkin
742158b51e st/gbm: fix order of arguments passed to is_format_supported
Reported by Coverity

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Cc: mesa-stable@lists.freedesktop.org
2014-10-14 12:33:38 -04:00
Ilia Mirkin
5524af8136 nouveau: 3d textures are unsupported, limit 3d levels to 1
Ideally there would be a swrast fallback, but the driver isn't ready for
that. This should avoid crashes if someone tries to use 3d textures
though.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: mesa-stable@lists.freedesktop.org
2014-10-14 12:33:38 -04:00
Rob Clark
abe3b3d1e0 freedreno: use tgsi_lowering
Now that the freedreno_lowering code is moved to tgsi_lowering, remove
our private copy and switch over to using the common version.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-10-14 12:30:08 -04:00
David Heidelberger
d2c1d9693f r300/compiler: remove useless check
This code is already in if (!variable->C->is_r500) so no need check
twice.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: David Heidelberger <david.heidelberger@ixit.cz>
2014-10-14 12:18:32 -04:00
Nick Sarnie
e5bf8d38db ilo: Build pipe-loader for ilo
Trivial patch to create the pipe loader for ilo. All the code was already there.

Signed-off-by: Nick Sarnie <commendsarnex@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Emil Velikov
af897df508 automake: explicitly set TARGET_RADEON_{WINSYS,COMMON}
Originally the variables were set only once via the ?= operator but
that causes issues when doing incremental builds. They appear to be
undefined and missing from the dependency list despite their addition
to LIBADD.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84807
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-14 16:16:08 +01:00
Eric Anholt
a2d8b6dbd5 vc4: Fix render target NPOT alignment at small miplevels.
The texturing hardware takes the POT level 0 width/height and minifies
those.  This is different from what we were doing, for example, for
273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16.

Fixes piglit-depthstencil-render-miplevels 273.
2014-10-14 14:57:50 +01:00
Eric Anholt
b5fc9d5664 vc4: Add support for having 0 vertex elements used.
You have to load at least 1, according to the simulator.  Fixes 4 piglit
tests and even more ES2 conformance tests.
2014-10-14 11:29:48 +01:00
Vinson Lee
a2fd55cfb6 auxilary/os: Add DragonFly BSD support in os_get_total_physical_memory.
This patch fixes this build error on DragonFly BSD.

  CC       os/os_misc.lo
os/os_misc.c: In function 'os_get_total_physical_memory':
os/os_misc.c:132:2: error: #error Unsupported *BSD

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-13 23:40:46 -07:00
Daniel Manjarres
291be28476 glx: Fix glxUseXFont for glxWindow and glxPixmaps
The current implementation of glxUseXFont requires creating
a temporary pixmap and graphics context, which requires a real
old-school X11 Window, not a glxDrawable. This patch changes
things so that glxUseXFont will also accept a glxWindow or
glxPixmap, and lookup the underlying X11 Drawable. Without
this patch glxUseXFont generates a giant stream of Xerrors
about bad drawables and bad graphics contexts.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54372

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-10-13 22:05:58 -06:00
Chia-I Wu
4e2cf84b1f ilo: clear writer pointer after unmapping
It does not look like an issue now but it is good to be future proof.  Spotted
by Courtney Goeltzenleuchter.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
2014-10-14 08:54:25 +08:00
Eric Anholt
615bbf0ca6 vc4: Write the VPM read setup multiple times to queue all the inputs.
There's a 4-element fifo, and the size (number of dwords per vertex) field
is just 4 bits.

Fixes glsl-routing on sim.
2014-10-13 17:16:05 +01:00
Eric Anholt
e1d1c39626 vc4: Add support for the TXL opcode.
There's a bit at the bottom of cube map stride (which has some formatting
bugs in the docs) which flips the bias coordinate to being an absolute
LOD.
2014-10-13 17:15:47 +01:00
Eric Anholt
5bc91b6e32 vc4: Improve the accuracy of SIN and COS.
This gets them to pass glsl-sin/cos.  There was an obvious problem that I
was using the FRC code on the scaled input value, which means that we had
a range in [0, 1], while our taylor is most accurate across [-0.5, 0.5].
We can just slide things over, but that means flipping the sign of the
coefficients.  After that, it was just a matter of stuffing more
coefficients in.
2014-10-13 17:15:47 +01:00
Kenneth Graunke
02ca66fbc3 i965: Use unsynchronized maps for the program cache on LLC platforms.
There's no reason to stall on pwrite - the CPU always appends to the
buffer and never modifies existing contents, and the GPU never writes
it.  Further, the CPU always appends new data before submitting a batch
that requires it.

This code predates the unsynchronized mapping feature, so we simply
didn't have the option when it was written.

Ideally, we would do this for non-LLC platforms too, but unsynchronized
mapping support only exists for LLC systems.

Saves a bunch of stall avoidance copies when uploading shaders.

v2: Rebase on changes to previous patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
2014-10-13 06:55:08 -07:00
Kenneth Graunke
4c766c7959 i965: Issue performance warnings when copying the program cache BO.
We don't really want unnecessary buffer copying, so it'd be nice to know
when it's happening.

v2: Drop stall warnings when doing a read-only CPU mapping of the cache
    BO.  The GPU also uses it in a read-only fashion, so there won't be
    any stalls, even though the buffer is busy.  (Thanks to Chris Wilson
    for catching this mistake.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> [v1]
2014-10-13 06:53:54 -07:00
Kenneth Graunke
a4fe727a5d i965: Issue performance warnings on MapBufferRange stalls.
This is easy: we just need to use brw_map_bo instead of mapping it
directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-10-13 06:44:05 -07:00
Eric Anholt
5d72a1c956 vc4: Match VS outputs to FS inputs.
If the VS doesn't output a value that the FS needs, we still need to read
the right contents for the remaining FS inputs, by emitting padding.  And
if the VS outputs something the FS doesn't need, we shouldn't put it in
the VPM at all (so the code producing it can get DCEed).

Fixes 77 piglit tests.
2014-10-13 13:23:48 +01:00
Christian König
d561a42bc1 configure: use $libdir/dri as default for VA-API
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:59 +02:00
Christian König
966ae170b0 configure: remove superflous VA-API line from configure.ac
We don't have GALLIUM_STATE_TRACKERS_DIRS any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:54 +02:00
Christian König
d3004a267a configure: respect $libdir for the OMX installation dir
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:49 +02:00
Christian König
5ce06d12ff configure: Revert "ask vdpau.pc for the default location of the vdpau drivers"
This reverts commit bbe6f7f865.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-10-13 12:43:05 +02:00
Eric Anholt
83365a5b57 vc4: Add support for the CEIL opcode.
Not as big of a deal as SSG, but still +9 piglit tests.
2014-10-13 08:06:48 +01:00
Eric Anholt
926eaa9af4 vc4: Add support for the SSG opcode. 2014-10-13 08:06:48 +01:00
Emil Velikov
b86f814afd docs: add news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-10-13 02:14:02 +01:00
Emil Velikov
fc6345a916 docs: Add sha256 sums for the 10.3.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit fa98c74692)
2014-10-13 02:06:29 +01:00
Emil Velikov
04fae07f0e Add release notes for the 10.3.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 088d350178)
2014-10-13 02:06:20 +01:00
Emil Velikov
66ea8a581d docs: Add sha256 sums for the 10.2.9 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 52bd154980)
2014-10-13 02:05:53 +01:00
Emil Velikov
f5e61295cd Add release notes for the 10.2.9 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 9f1149876f)
2014-10-13 02:05:22 +01:00
Glenn Kennard
a327fa3a06 r600g: Implement GL_ARB_sample_shading
Also fixes two sided lighting which was broken at least
on pre-evergreen by commit b1eb00.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
75e97e2e3f radeonsi: use tgsi_shader_info in si_llvm_emit_fs_epilogue
This is the last use tgsi_parse_token in radeonsi.

It looks ugly because the code was re-indented, but there is really no change
in behavior.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
558f7770a7 radeonsi: remove si_shader_output_values::index
It's redundant now.

It led to a simplification in si_llvm_emit_streamout, because outidx == reg.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
ec0d16872b radeonsi: use tgsi_shader_info in si_llvm_emit_vs_epilogue
That code was really ugly.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
8067732740 radeonsi: remove shader->input[] and output[] arrays and dependencies
They were reinventing tgsi_shader_info. They are unused now.

radeon_llvm_context::load_input can be NULL if input fetching is implemented
in some other way.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
8b057ddaea radeonsi: move param_offset out of shader->input[] and output[]
Those are going away.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:57 +02:00
Marek Olšák
02134cfaae radeonsi: use tgsi_shader_info to get a list of GS outputs
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
101905d3f7 radeonsi: use tgsi_shader_info in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00
Marek Olšák
6f04cf7fac radeonsi: simplify dereferences in si_update_spi_map
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-10-12 23:53:54 +02:00