Commit graph

47510 commits

Author SHA1 Message Date
Vadim Girlin
c15f8569fd r600g: precalculate semantic indices for SPI setup
There is no need to duplicate semantic mapping which is done in hw, so get
rid of r600_find_vs_semantic_index.

TGSI name/sid pair is mapped to the 8-bit semantic index for SPI.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2011-11-04 16:23:19 -04:00
José Fonseca
2df15d07c0 svga: Tighten the register file assertions.
Untested. But should fix fdo 42576.
2011-11-04 20:10:01 +00:00
Dave Airlie
26ebf9c5e1 radeon/r200: strip texture borders.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-11-04 17:17:17 +00:00
Dave Airlie
71f1d468b4 radeon/r200: fix r100/r200 blit to use the offsets.
This is needed to do proper renderbuffer operation on mipmaps.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-11-04 17:17:17 +00:00
Dave Airlie
2431c992cb radeon: drop mtface/mtlevel, use ones in base class.
This just uses the base class copies.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-11-04 17:17:17 +00:00
Marek Olšák
85c151f3d9 u_vbuf_mgr: avoid one call to pipe_resource_reference in most cases
2011-11-04 18:11:01 +01:00
José Fonseca
f800a29ee2 swrast: Avoid void * arithmetic.
An error with MSVC.
2011-11-04 08:54:55 +00:00
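The fix in f800a29ee2 rests on the fact that ISO C does not define arithmetic on `void *`; GCC accepts it as an extension, while MSVC rejects it. A minimal sketch of the usual workaround follows. The helper name `ptr_add_bytes` is hypothetical and is not taken from the Mesa tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): advance a generic pointer by
 * a byte offset without relying on the GNU void-pointer arithmetic
 * extension. */
static void *
ptr_add_bytes(void *base, size_t offset)
{
   assert(base != NULL);
   /* Cast to a byte-sized type first so the addition is well defined
    * in ISO C and compiles on every compiler. */
   return (uint8_t *) base + offset;
}
```

The result converts back to `void *` implicitly, so callers do not need to change.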
Eric Anholt
eab201bad4 i965/gen6: Improve glReadPixels() performance by blitting to a linear temp.
The readpixels microbenchmark in mesa-demos goes from 47Mpix/sec at
1000x1000 to 450Mpix/sec.  The 10x10 sizes stay about the same.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:48:07 -07:00
Eric Anholt
a1488eec38 intel: Add safety asserts for the blit engine's pitch alignment requirements.
Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
ac6a376f52 intel: Don't force a batchbuffer flush in readpixels.
Renderbuffer mapping handles flushing the batchbuffer if required, so
all we need to do is make sure any pending rendering has reached the
batchbuffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
e7349a55f7 radeon: Remove early dereference of src/dst width in glCopyTexSubImage.
There doesn't appear to be any particular reason for this -- it's not
like the width is changing between the deref and the use.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
d9f2add181 swrast: Drop the global mapping of buffers across glReadPixels().
Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
5b1ad74824 swrast: Drop the remaining GetRow-based glReadPixels() fast-path.
In all of piglit, only two tests hit it (reading to RGBA float, where
GetRow would drop floats into place from R, RG, or RGB).  Mostly this
is because _ColorReadClamp has been causing transferOps to always be
set, skipping any fast-paths anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
91b2ce85d1 swrast: Remove dead _swrast_read_depth_span_uint().
All the code using it is converted to MapRenderbuffer and the core
unpack functions.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:53 -07:00
Eric Anholt
345fc41619 swrast: Convert color glReadPixels slow path to using MapRenderbuffer.
This may be a bit slower than before because we're switching from
per-format compiled loops in GetRow to
_mesa_unpack_rgba_block_unpack's loop around a callback to unpack a
pixel.  The solution there would be to make _mesa_unpack_rgba_block
fold the span loop into the format handlers.

(On the other hand, function call overhead will hardly matter if
MapRenderbuffer means the driver gets the data into cacheable memory
instead of uncached).

The adjust_colors code should no longer be required, since the unpack
function does the 565 to float conversion in a single pass instead of
converting it (poorly) through 8888 as apparently happened in the
past.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
3e51ef0990 swrast: Skip _swrast_validate_derived in _swrast_ReadPixels().
None of the callgraph below this uses derived state (almost nothing
even dereferences the swrast context).

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
2e82daa31b swrast: Add a readpixels fast-path based on memcpy and MapRenderbuffer.
v2: Move _mesa_get_format_bytes out of the loop.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
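The fast path named in 2e82daa31b can be pictured as a row-by-row `memcpy` between a mapped renderbuffer and the destination image, with the per-pixel size looked up once rather than on every row (the v2 note). The sketch below is an assumption about the general shape only: `copy_rows` and its parameters are hypothetical, and the real code obtains the mapping and the byte count from Mesa's own helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: copy a width x height block of pixels between two
 * buffers that may have different row strides (in bytes). */
static void
copy_rows(uint8_t *dst, ptrdiff_t dst_stride,
          const uint8_t *src, ptrdiff_t src_stride,
          size_t width, size_t height, size_t bytes_per_pixel)
{
   /* Computed once, outside the loop: the row length does not change. */
   const size_t row_bytes = width * bytes_per_pixel;

   assert(dst != NULL && src != NULL);

   for (size_t y = 0; y < height; y++) {
      memcpy(dst, src, row_bytes);
      dst += dst_stride;
      src += src_stride;
   }
}
```

A path like this only applies when the source and destination layouts match exactly and no pixel transfer operations are requested.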
Eric Anholt
5f140bceda mesa: Add a function for comparing gl_format to format/type.
This should be useful in making more generic fast paths in the pixel
paths.

v2: Add note about PACK_SWAP_BYTES, and fix up for endianness by
    synchronizing with memcpy_texture paths in texstore.c.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
e887df9bf5 swrast: Switch the remaining depth readpixels to MapRenderbuffer.
This avoids the wrapper, which should improve performance on packed
depth/stencil drivers.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
d072a5f545 swrast: Switch the remaining depth/stencil readpixels path to MapRenderbuffer.
Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
f53680857a swrast: MapRenderbuffer in separate depth/stencil readpixels fastpath
This introduces two new span helper functions we'll want to use in
several places as we move to MapRenderbuffer, which pull out integer
depth and stencil values from a renderbuffer mapping based on the
renderbuffer format.

v2: Use format_unpack helper for stencil read.
v3: Clean up comment after conversion to format_unpack.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
e452fbe871 swrast: Calculate image address/stride once for depth/stencil readpixels.
The fast and slow paths were doing these separately before.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Eric Anholt
b832ac974f swrast: Make the packed depth/stencil read fastpath use MapRenderbuffer.
This also makes it handle 24/8 vs 8/24, fixing piglit
depthstencil-default_fb-readpixels-24_8 on i965.  While here, avoid
incorrectly fast-pathing if packing->SwapBytes is set.

v2: Move the unpack code to format_unpack.c, fix BUFFER_DEPTH typo
v3: Fix signed/unsigned comparison.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
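The "24/8 vs 8/24" handling mentioned in b832ac974f comes down to where the stencil byte sits inside each 32-bit texel. A minimal sketch, assuming two hypothetical layout names and a destination that wants depth in the upper 24 bits and stencil in the lower 8 bits; the enum and function below are illustrative and are not the code in format_unpack.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout tags (assumed bit positions, for illustration). */
enum packed_ds_layout {
   LAYOUT_Z24_S8,   /* depth in bits 31..8, stencil in bits 7..0 */
   LAYOUT_S8_Z24    /* stencil in bits 31..24, depth in bits 23..0 */
};

/* Return the texel with depth in the high 24 bits and stencil in the
 * low 8 bits, whatever the source layout. */
static uint32_t
unpack_to_uint_24_8(uint32_t texel, enum packed_ds_layout layout)
{
   switch (layout) {
   case LAYOUT_Z24_S8:
      return texel;                          /* already in place */
   case LAYOUT_S8_Z24:
      return (texel << 8) | (texel >> 24);   /* rotate stencil to the bottom */
   }
   assert(!"unknown packed depth/stencil layout");
   return 0;
}
```

Only the first case can be served by a plain copy, which is why a fast path has to check the format before taking it.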
Eric Anholt
ff27e058bc swrast: Directly map the stencil buffer in read_stencil_pixels.
This avoids going through the wrapper that has to rewrite the data for
packed depth/stencil.  This isn't done in _swrast_read_stencil_span
because we don't want to map/unmap for each span.

v2: Move the unpack code to format_unpack.c.
v3: Fix signed/unsigned comparison.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-11-03 23:29:52 -07:00
Vinson Lee
492d223590 radeon: Fix variable initialization typo.
Fixes Coverity uninitialized scalar variable defect.
2011-11-03 20:34:02 -07:00
Paul Berry
8fad0f9998 i965: Fix constant propagation into 32-bit integer MUL.
i965's MUL instruction can't take an immediate value as its first
argument.  So normally, if constant propagation wants to propagate a
constant into the first argument of a MUL instruction, it swaps the
order of the two arguments.

This doesn't work for 32-bit integer (and unsigned integer)
multiplies, because the MUL operation is asymmetric in that case (it
multiplies 16 bits of one operand by 32 bits of the other).

Fixes piglit tests {vs,fs}-multiply-const-{ivec4,uvec4}.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-11-03 18:18:34 -07:00
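The rule described in 8fad0f9998 can be summarised as a small decision function. This is a simplified sketch, not the i965 compiler code: the enum, the function name, and the reduction of the operand type to a single flag are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of trying to propagate a constant into a MUL. */
enum imm_action {
   IMM_REJECT,          /* leave the instruction alone */
   IMM_PLACE,           /* write the immediate into the requested slot */
   IMM_SWAP_AND_PLACE   /* commute the operands, then place the immediate */
};

/* arg is the source operand index (0 or 1) the constant would replace. */
static enum imm_action
mul_immediate_action(int arg, bool is_32bit_int)
{
   assert(arg == 0 || arg == 1);

   /* The second operand may be an immediate. */
   if (arg == 1)
      return IMM_PLACE;

   /* The first operand may not. Commuting is only safe when MUL is
    * symmetric, which is not the case for 32-bit integer multiplies
    * (16 bits of one operand times 32 bits of the other). */
   return is_32bit_int ? IMM_REJECT : IMM_SWAP_AND_PLACE;
}
```

A real pass would also need to check that the other operand is not already an immediate before swapping.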
Brian Paul
df73a70fba svga: use the draw-module's sprite stage depending on FS inputs
If we're drawing sprites and the fragment shader needs both auto-
generated texcoords and user-defined varying vars we need to use
this fallback path.
The reason is when we enable auto texcoord generation, it gets
enabled for all texcoord sets.  And that clobbers the user-defined
varying vars.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2011-11-03 17:41:08 -06:00
Brian Paul
022e270b1b svga: pass fragment shader to draw module
If we use the draw-module for wide point/line/etc drawing we'll need
a fragment shader too (like we pass in the vertex shader).

This fixes sprite point rendering when forcing the swtnl path.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2011-11-03 17:41:08 -06:00
Brian Paul
58ea42b7db svga: implement generic variable index remapping
The state tracker may generate shaders that use generic vs outputs /
fs inputs like:

DCL IN[0], GENERIC[0]
DCL IN[1], GENERIC[10]
DCL IN[2], GENERIC[11]

This patch remaps 0, 10, 11 to small integers like 1, 2, 3 so that we
stay inside the SVGA3D limit (8).

The remapping is done to both the vertex shader outputs and the
fragment shader inputs.  The same mapping must be used for a vs/fs
pair.

Note that 'union svga_compile_key' is now 'struct svga_compile_key'
because we needed to add the register remapping table.  The change in
size isn't really significant though (it's not a search key).

Also, add assertions when building up SVGA3D src/dst registers so we
don't try to store too large a value for the bitfield size.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2011-11-03 17:41:08 -06:00
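The remapping described in 58ea42b7db (0, 10, 11 becoming 1, 2, 3) amounts to handing out small indices in first-seen order and remembering them in a table shared by the vs/fs pair. A minimal sketch under stated assumptions: the struct, the function names, the table size of 32, and the choice to start numbering at 1 are hypothetical, picked only to match the example in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_GENERIC_INDEX 32   /* assumed upper bound on GENERIC[n] */
#define HW_GENERIC_LIMIT   8   /* the SVGA3D limit quoted above */

/* Hypothetical remap table; -1 marks an index not seen yet. */
struct generic_remap {
   int8_t table[MAX_GENERIC_INDEX];
   unsigned next;   /* next small index to hand out */
};

static void
generic_remap_init(struct generic_remap *r)
{
   for (unsigned i = 0; i < MAX_GENERIC_INDEX; i++)
      r->table[i] = -1;
   r->next = 1;
}

/* Return the small index for a generic index, assigning one on first use. */
static int
generic_remap_get(struct generic_remap *r, unsigned generic_index)
{
   assert(generic_index < MAX_GENERIC_INDEX);
   if (r->table[generic_index] < 0) {
      assert(r->next < HW_GENERIC_LIMIT);   /* stay inside the hw limit */
      r->table[generic_index] = (int8_t) r->next++;
   }
   return r->table[generic_index];
}
```

Because the vertex shader outputs and fragment shader inputs must agree, both shaders of a pair would consult the same table instance.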
Brian Paul
e814d57725 draw: assert that we have non-null fragment shader
Instead of just segfaulting.  Recently ran into this.
2011-11-03 16:56:11 -06:00
nobled
ac0ec07e6c texgetimage: add missing return on error
Missed this back in the arb_robustness branch
<6b329b9274b18c50f4177eef7ee087d50ebc1525>.

NOTE: This is a candidate for the 7.11 branch.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-11-03 15:27:08 -07:00
Brian Paul
bf5255fb30 mesa: fix texture target mix-up in NV_fragment_program parser
The returned value should be a texture target index, not a bit.
I spotted this from seeing a new compiler warning caused by the increase
in the number of texture targets.  This has been broken for a long time.

Note: This is a candidate for the 7.11 branch.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-11-03 15:32:19 -06:00
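The index/bit mix-up fixed in bf5255fb30 is easy to reproduce in miniature. The enum and macro below are hypothetical stand-ins, not Mesa's definitions; they only show why a bit value must not be used where an index is expected.

```c
#include <assert.h>

/* Hypothetical texture target indices (illustration only). */
enum tex_target_index {
   TEX_INDEX_1D,
   TEX_INDEX_2D,
   TEX_INDEX_3D,
   TEX_INDEX_CUBE,
   TEX_INDEX_COUNT
};

/* The matching bit for a target index, for use in bitmasks. */
#define TEX_BIT(index) (1u << (index))

/* Look up a name by target index; passing a bit here would read the
 * wrong entry or run past the end of the array. */
static const char *
tex_target_name(enum tex_target_index index)
{
   static const char *const names[TEX_INDEX_COUNT] = {
      "1D", "2D", "3D", "CUBE"
   };
   assert(index < TEX_INDEX_COUNT);
   return names[index];
}
```

With these definitions `TEX_BIT(TEX_INDEX_CUBE)` is 8, which is not a valid index into a four-entry table.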
Ian Romanick
f37b1ad937 linker: Check that initializers for global variables match
This requires tracking a couple extra fields in ir_variable:

 * A flag to indicate that a variable had an initializer.

 * For non-const variables, a field to track the constant value of the
   variable's initializer.

For variables with non-constant initializers, ir_variable::has_initializer
will be true, but ir_variable::constant_initializer will be NULL.  The
linker can use the values of these fields to check adherence to the
GLSL 4.20 rules for shared global variables:

    "If a shared global has multiple initializers, the initializers
    must all be constant expressions, and they must all have the same
    value. Otherwise, a link error will result. (A shared global
    having only one initializer does not require that initializer to
    be a constant expression.)"

Previous to 4.20 the GLSL spec simply said that initializers must have
the same value.  In the case of non-constant initializers, this was
impossible to determine.  As a result, no vendor actually implemented
that behavior.  The 4.20 behavior matches the behavior of NVIDIA's
shipping implementations.

NOTE: This is a candidate for the 7.11 branch.  This patch also needs
the preceding patch "glsl: Refactor generate_ARB_draw_buffers_variables
to use add_builtin_constant"

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34687
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2011-11-03 13:36:00 -07:00
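The GLSL 4.20 rule quoted in f37b1ad937 can be expressed as a pairwise check over two declarations of the same global. The sketch below is an assumption about the logic, not the linker's code: `struct global_var` is a hypothetical stand-in that mirrors the two `ir_variable` fields named in the message, and the constant value is reduced to a single `int` for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields described above. */
struct global_var {
   bool has_initializer;              /* any initializer at all */
   const int *constant_initializer;   /* NULL if not a constant expression */
};

/* Return true if two declarations of one shared global may be linked. */
static bool
initializers_compatible(const struct global_var *a,
                        const struct global_var *b)
{
   assert(a != NULL && b != NULL);

   /* A single initializer need not be a constant expression. */
   if (!a->has_initializer || !b->has_initializer)
      return true;

   /* With multiple initializers, all must be constant expressions... */
   if (a->constant_initializer == NULL || b->constant_initializer == NULL)
      return false;

   /* ...and all must have the same value. */
   return *a->constant_initializer == *b->constant_initializer;
}
```

A failed check corresponds to the link error the specification text requires.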
Ian Romanick
d3b39194dc glsl: Refactor generate_ARB_draw_buffers_variables to use add_builtin_constant
v2: Remove int cast based on feedback from Ken.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2011-11-03 13:36:00 -07:00
Ian Romanick
22af08b410 glsl: Put all bitfields in ir_variable together for better packing
The diff looks weird because ir_variable::depth_layout was between the
last two bitfields in the structure.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2011-11-03 13:36:00 -07:00
Ian Romanick
46173f9079 linker: Fix the indentation of a block in cross_validate_globals
I suspect the indentation got messed up during a code merge.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Paul Berry <stereotype441@gmail.com>
2011-11-03 13:36:00 -07:00
Eric Anholt
9954a93ab7 radeon: Check an error return instead of assigning it to a dead variable.
Fixes gcc set-but-unused-variable warning.

Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
2011-11-03 09:13:46 -07:00
Marek Olšák
ca0f515f85 r300g: force buffer placements to GTT on big endian machines
2011-11-03 16:39:40 +01:00
Maarten Lankhorst
eadbcb221d state_trackers/vdpau: Add support for VC-1 decoding
Add a struct with all the fields.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:01 +01:00
Maarten Lankhorst
91d33b5c58 state_trackers/vdpau: Add mpeg4 part2 to PipeToProfile and ProfileToPipe
So it can actually be used when someone implements it. :)

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:01 +01:00
Maarten Lankhorst
12bf452945 state_trackers/vdpau: Add support for MPEG4 Part 2
Just the support patch, no decoder implements it currently.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:01 +01:00
Maarten Lankhorst
1eb48c5500 state_trackers/vdpau: Test if profile is supported first before trying to create decoder
So a nicer error message is returned.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:01 +01:00
Maarten Lankhorst
c4d47f065a state_trackers/vdpau: Add num_slices to mpeg12 picture structure
Bitstream parsers might need that field.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:01 +01:00
Maarten Lankhorst
c9c6eec1c6 state_trackers/vdpau: Implement VdpGenerateCSCMatrix
With the smpte240 profile, which was missing.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2011-11-03 13:52:00 +01:00
Christian König
8a7e645c9b g3dvl: remove some stale variable increment
Incrementing "td" before initializing it is
pointless and just leads to an uninitialized
variable warning with MSVC.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2011-11-03 13:52:00 +01:00
Dave Airlie
c6a3026472 r600g: more integer support
just some more trivial integer changes for r600/r700.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-11-03 10:11:32 +00:00
Dave Airlie
d546dcbb1b radeon: fix some regressions in texturing code.
On a piglit run vs 7.11 this fixes 23 tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-11-03 10:10:07 +00:00
José Fonseca
97213fd741 scons: Use -static-libstdc++ on 32bits builds w/ Mingw-w64 too.
2011-11-03 09:59:34 +00:00
José Fonseca
3276c3d42b libgl-gdi: Mingw-w64 in 32bit mode matches the Mingw32's .DEF semantics.
2011-11-03 09:59:34 +00:00
Chia-I Wu
a56951139a docs: list GL_OES_EGL_image_external in 7.12 release notes
2011-11-03 15:09:45 +08:00