56615 commits

Author SHA1 Message Date
Vadim Girlin
7d555f2f4c r600g: mask unused source components for SAMPLE
This results in more clean shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-30 21:50:47 +04:00
Eric Anholt
df410863d7 intel: Remove the last spans code!
The remaining bits happen to do nothing that
_swrast_span_render_start()/finish() don't do.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
526cf46666 intel: Move the S8 offset calc function near its remaining usage.
It's not really span code ever since we stopped using spans for S8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
e7c5e9949b intel: Ensure renderbuffers are current when mapping them.
In the case of rendering to windows in X, we would render to stale buffers
(or not render at all!) if you hit a MapRenderbuffer as the first thing
done to your window after new buffers are ready to be collected in DRI2.

I think this also covers the weird comment about irb->mt being missing
sometimes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
0e8ef74c5f mesa: Add a clarifying comment about rowStride of compressed textures.
I always forget how we do this for compressed textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:45 -07:00
Eric Anholt
3750ff9e5f mesa: Remove the Map field from texture images.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
adf958d9c2 swrast: Always use MapTextureImage for mapping textures for swrast.
Now that everything goes through ImageSlices[], we can rely on the
driver's existing texture mapping function.

A big block of code goes away on Radeon that looks like it was to deal with
the validate that happened at SpanRenderStart, which no longer occurs since we
don't need validation for the MapTextureImage hook.

v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up
    unmap loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
ea05e259c9 nouveau: Replace swrast_texture_image->Map usage with ->Buffer.
This code is trying to deal with providing a map in the case that
AllocTexImageBuffer was called, which is hooked up to the swrast variant.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
b78e48289f nouveau: Just use MapTextureImage instead of duplicating the logic.
MapTextureImage has the exact same logic, except it can also handle
swrast-allocated buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
f91823f026 swrast: Make a teximage's stored RowStride be in terms of bytes per row.
For hardware drivers with pitch alignment requirements, a
non-power-of-two-sized texture format won't end up being an integer number
of pixels per row.  This also avoids having to change our units between
MapTextureImage's rowStride and swrast's RowStride.

This doesn't fully convert the compressed texel fetch path, but does make
sure we don't drop any bits (not that we'd expect to).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
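
To illustrate the RowStride change in f91823f026 above, here is a minimal sketch of why a stride kept in bytes is safer than one kept in texels. The helper names (align_up, row_stride_bytes, texel_address) and the 64-byte pitch alignment are assumptions made for the example, not code from swrast.

```c
#include <stddef.h>

/* Round value up to the next multiple of alignment (alignment > 0). */
static size_t align_up(size_t value, size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

/* Hypothetical helper: byte stride of one row of `width` texels,
 * padded to the hardware's pitch alignment. */
static size_t row_stride_bytes(size_t width, size_t bytes_per_texel,
                               size_t pitch_align)
{
    return align_up(width * bytes_per_texel, pitch_align);
}

/* Hypothetical helper: address of texel (x, y) when the stride is in bytes.
 * No division by the texel size is needed, so padding that is not a whole
 * number of texels is handled exactly. */
static unsigned char *texel_address(unsigned char *map, size_t stride_bytes,
                                    size_t bytes_per_texel,
                                    size_t x, size_t y)
{
    return map + y * stride_bytes + x * bytes_per_texel;
}
```

With a 3-byte texel, a width of 10 and a 64-byte pitch alignment, the row stride is 64 bytes, which is not a whole number of texels; a stride stored in texels would have to round it.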
Eric Anholt
35e179b18c swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0].
This gets us ready for the Map field to die.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:44 -07:00
Eric Anholt
0c883e46d8 swrast: Replace ImageOffsets with an ImageSlices pointer.
This is a step toward allowing drivers to use their normal mapping paths,
instead of requiring that all slice mappings come from an aligned offset
from the first slice's map.

This incidentally fixes missing slice handling in FXT1 swrast.

v2: Use slice height helper function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
e7ecc11311 swrast: Reuse _swrast_free_texture_image_buffer from drivers.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
0a484f1006 swrast: Move ImageOffsets allocation to shared code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
f709c31c67 swrast: Clean up and explain the mapping process.
v2: Move slice height calculation to a helper function (recommended by Brian).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:43 -07:00
Eric Anholt
741e540055 swrast: Factor out texture slice counting.
This function is going to get used a lot more in upcoming patches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
dca4178130 radeon: Remove some dead teximage mapping code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Eric Anholt
0de08fb594 radeon: Add missing swrast field initialization.
This is the equivalent of intel's 80513ec8b4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-30 10:40:42 -07:00
Vincent Lejeune
a6a4b70e2d r600g/llvm: Fix opencl build
2013-04-30 16:38:47 +02:00
Alexander von Gluck IV
f1361ed084 Gallium: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
60cc73c333 Mapi: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Alexander von Gluck IV
39bdf08628 Mesa: Use mmap on Haiku for executable memory vs malloc
* Haiku now has DEP enabled by default.
2013-04-29 23:22:35 -05:00
Vincent Lejeune
51e9bfdc48 r600g/llvm: get use_kill from compiler shader
2013-04-30 02:17:18 +02:00
Eric Anholt
a79786af64 i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm
This could be used by shader-db for hopefully more accurate regression
testing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:44:35 -07:00
Eric Anholt
61ca2c4f73 i965/fs: Allow LRPs with uniform registers.
Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62).

v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
2013-04-29 11:41:35 -07:00
Eric Anholt
de7e8b1d01 intel: Be more conservative in disabling tiling to save memory.
Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10)
and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888
cubemap going from untiled to tiled.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2013-04-29 11:41:34 -07:00
Eric Anholt
73bc6061f5 i965: Disable Z16 on contexts that don't require it.
It appears that Z16 on Intel hardware is in fact slower than Z24, so
people are getting surprisingly hurt when trying to use Z16 as a
performance-versus-precision tradeoff, or when they're targeting GLES2 and
that's all you get.

GL 3.0+ have Z16 on the list of required exact format sizes, but GLES
doesn't, so choose the better-performing layout in that case.  Improves
GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB
system.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
e409889213 intel: Report FBO incompleteness causes through GL_ARB_debug_output.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
6ae473221a intel: Fold the one last function intel_tex_format.c into the caller.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-29 11:41:34 -07:00
Eric Anholt
40b207b62f mesa: Fix error checking for GS UBO getters.
These are supposed to be present if both things are available, but we were
enabling them if either one was.
2013-04-29 11:41:34 -07:00
Eric Anholt
072709da91 mesa: Add a clarifying comment about EXTRA_ error checking.
2013-04-29 11:41:34 -07:00
Eric Anholt
eac1199604 mesa: Add an extra clarifying set of braces to getter checking.
For this multi-page single statement, my thought at the end was that the
next block was mis-indented, rather than that the dropped indentation
actually indicated the end of the loop.
2013-04-29 11:41:33 -07:00
Eric Anholt
2534f0a57d mesa: Fix error checking for getters consisting of only API versions.
In almost all of our cases, getters that are turned on for only some API
variants will have an extension listed as one of the things that can
enable it, and thus api_check gets set.  For extra_gl30_es3 (used for
NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though,
we would check twice, not find either one, but never actually throw the
error.
2013-04-29 11:41:33 -07:00
Eric Anholt
d63a10afcc mesa: Clarify the names of error checking variables for glGet.
There's no reason to actually count these things, so the integer ++
behavior was just confusing.
2013-04-29 11:41:33 -07:00
Eric Anholt
4df1b986d3 i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode.
This brings the driver up to GL 2.1.
2013-04-29 11:41:33 -07:00
Eric Anholt
97217a40f9 i915: Always enable GL 2.0 support.
There's no point in shipping a non-GL2 driver today.
2013-04-29 11:41:33 -07:00
Eric Anholt
eb062ab07f i915: Correctly set the OQ counter bits.
While we may provide the extension, we need to tell applications that they
can't actually use it:

            An implementation can either set QUERY_COUNTER_BITS_ARB to the
            value 0, or to some number greater than or equal to n.  If an
            implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the
            occlusion queries will always return that zero samples passed the
            occlusion test, and so an application should not use occlusion
            queries on that implementation.
2013-04-29 11:41:33 -07:00
Kenneth Graunke
5e46482993 i965: Move is_math/is_tex/is_control_flow() to backend_instruction.
These are entirely based on the opcode, which is available in
backend_instruction.  It makes sense to only implement them in one
place.

This changes the VS implementation of is_tex() slightly, which now
accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD.  However, since those
aren't generated in the VS anyway, it should be fine.

This also makes is_control_flow() available in the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2013-04-29 11:10:50 -07:00
Zack Rusin
a6e7c22664 draw/so: fix overflow calculation
Only report overflow for missing targets if they're actually being
used. If the targets are missing but are not being used by any
slot in the stream output declaration, we should correctly just
ignore them.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 03:48:36 -04:00
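
The rule described in a6e7c22664 above can be sketched roughly as follows. This is only an illustration: the function name and the flat arrays (target_bound, slot_target) are hypothetical stand-ins for the draw module's stream output state, not its real data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: target_bound[t] says whether stream output target t
 * is set, and slot_target[i] names the target that output slot i of the
 * stream output declaration writes to.
 *
 * A missing target counts as an overflow only if some slot refers to it. */
static bool so_missing_target_overflow(const bool target_bound[],
                                       size_t num_targets,
                                       const unsigned slot_target[],
                                       size_t num_slots)
{
    for (size_t i = 0; i < num_slots; i++) {
        unsigned t = slot_target[i];
        if (t >= num_targets || !target_bound[t])
            return true;   /* a slot needs a target that is not there */
    }
    return false;          /* missing targets exist at most unused */
}
```

In this model, a declaration whose slots all write to a bound target reports no overflow even when other targets are unbound.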
José Fonseca
220ef8295c llvmpipe: Fix queries when screen->num_threads == 0.
That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
2013-04-29 15:40:06 +01:00
José Fonseca
c4bea00fb3 Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents"
This reverts commit 5649f886f7.

It causes segfaults when size is zero.
2013-04-29 15:13:57 +01:00
Jerome Glisse
c7a13dc5f5 r600g: force full cache for hyperz
It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.

Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2013-04-29 10:06:29 -04:00
Rob Clark
3900a0e4df freedreno: fix rebase screw-up
Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-04-29 07:36:27 -04:00
Chris Forbes
79f786f936 i965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.

Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistent values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.

Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
2013-04-30 06:50:16 +12:00
Matt Turner
a8eed0299d i965/vs: Fix order of source arguments to LRP.
The order of arguments matches DirectX, and is backwards from GLSL's
mix() built-in.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983
2013-04-28 14:38:14 -07:00
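
The argument-order mismatch fixed in a8eed0299d above can be modelled in scalar C. The sketch assumes that LRP takes the interpolation factor first and computes src0 * src1 + (1 - src0) * src2; glsl_mix and lrp are illustrative names for the example, not driver functions.

```c
/* GLSL mix(x, y, a): returns x when a == 0 and y when a == 1. */
static float glsl_mix(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Assumed LRP semantics: the interpolation factor comes first, followed by
 * the value selected at factor 1 and then the value selected at factor 0. */
static float lrp(float src0, float src1, float src2)
{
    return src0 * src1 + (1.0f - src0) * src2;
}
```

Under that assumption, mix(x, y, a) corresponds to lrp(a, y, x), so passing the operands in GLSL order gives a different result.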
Zack Rusin
3bba787879 llvmpipe: stop crashing when one of the so targets is null
Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:12 -04:00
Zack Rusin
0031cde1e1 draw/so: indicate overflow when buffer is missing
We were crashing if one of the buffers wasn't set; we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:19:07 -04:00
Zack Rusin
f9f57312de gallivm: fix indirect addressing of temps in soa mode
We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-27 01:18:51 -04:00
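
A scalar model of the indexing problem described in f9f57312de above. The memory layout (register, then channel, then lane), the constants SOA_LANES and SOA_CHANNELS, and the function name are all assumptions chosen for the illustration; gallivm builds the equivalent index vectors in LLVM IR.

```c
#define SOA_LANES    4   /* assumed number of vertices/pixels per SoA vector */
#define SOA_CHANNELS 4   /* x, y, z, w */

/* Gather one channel of an indirectly addressed temporary for every lane.
 * temps is laid out as temps[reg][chan][lane], flattened into one array. */
static void gather_indirect(const float *temps,
                            const int reg_index[SOA_LANES],
                            int chan, float out[SOA_LANES])
{
    for (int lane = 0; lane < SOA_LANES; lane++) {
        int base = (reg_index[lane] * SOA_CHANNELS + chan) * SOA_LANES;
        /* "+ lane" is the SoA offset; without it every lane would read
         * the value belonging to the first lane. */
        out[lane] = temps[base + lane];
    }
}
```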
Zack Rusin
3093ac6f4f tgsi/ureg: Add a function to return the number of outputs
We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-26 23:05:45 -04:00
Zack Rusin
53d36d5fb0 draw/so: Fix overflow calculations
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-04-26 23:04:26 -04:00
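
As a rough sketch of the check described in 53d36d5fb0 above, the following takes the buffer offset, the destination offset and the stride into account before a write. The so_write struct and its field names are hypothetical, and the arithmetic ignores integer wrap-around for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one stream output write. */
struct so_write {
    size_t buffer_size;    /* total size of the target buffer, in bytes */
    size_t buffer_offset;  /* where writing starts inside the buffer */
    size_t dst_offset;     /* offset of this output within one vertex */
    size_t stride;         /* bytes between consecutive vertices */
    size_t num_bytes;      /* bytes written for this output */
};

/* True when writing the output for vertex `vertex_index` would run past
 * the end of the buffer. */
static bool so_write_overflows(const struct so_write *w, size_t vertex_index)
{
    size_t start = w->buffer_offset + vertex_index * w->stride + w->dst_offset;
    return start + w->num_bytes > w->buffer_size;
}
```

For a 64-byte buffer with buffer offset 16, destination offset 4, stride 16 and 12 bytes per write, vertex 2 ends exactly at byte 64 and still fits, while vertex 3 overflows.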