32791 commits

Author SHA1 Message Date
Luca Barbieri
4e2080a86e nvfx: new 2D: unify textures and buffers
Stop using the vtbl, and use real transfers for buffers too.
2010-08-21 20:42:14 +02:00
Luca Barbieri
0481ed25c9 nvfx: new 2D: use a CPU copy for up to 4 pixels, up from 0
Seems a reasonable threshold for now.

Significantly speeds up Piglit's 1x1 glReadPixels (but, you know,
reading pixels in 1x1 blocks is NOT a good idea, especially if you
might be running on a less-than-perfect driver).
2010-08-21 20:42:14 +02:00
Luca Barbieri
28eb392a85 nvfx: new 2D: new render temporaries with resources
This patch adds support for creating temporary surfaces to allow
rendering to surfaces that cannot be rendered to.
It uses the _second_ version of the render temporary infrastructure.

This is necessary for swizzled 3D textures and small mipmaps of
swizzled 2D textures.

This version of the patch creates a resource to use as a temporary
instead of a raw BO, making the code simpler.
2010-08-21 20:42:14 +02:00
Luca Barbieri
ff74143fcc nv30: new 2D: support ARB_texture_rectangle
This uses nv30's _RECT formats.
2010-08-21 20:42:14 +02:00
Luca Barbieri
4793f48a19 nvfx: new 2D: optimize fragtex format lookup
Use an array indexed by the pipe format instead of doing a linear scan.
2010-08-21 20:42:14 +02:00
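
The lookup change described in 4793f48a19 can be illustrated with a minimal sketch. It is not the driver's code: the enum, the descriptor struct, and the hardware format codes below are hypothetical stand-ins, assumed only to show a table indexed directly by format in place of a linear scan.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Gallium's pipe_format values; not the real enum. */
enum sketch_format {
    SKETCH_FORMAT_NONE = 0,
    SKETCH_FORMAT_A8R8G8B8,
    SKETCH_FORMAT_R5G6B5,
    SKETCH_FORMAT_L8,
    SKETCH_FORMAT_COUNT
};

/* Hypothetical hardware texture-format descriptor. */
struct sketch_texture_format {
    int supported;      /* nonzero when the entry is valid */
    unsigned hw_format; /* made-up hardware format code */
};

/* Designated initializers leave unlisted formats zeroed, i.e. unsupported. */
static const struct sketch_texture_format sketch_format_table[SKETCH_FORMAT_COUNT] = {
    [SKETCH_FORMAT_A8R8G8B8] = { 1, 0x12 },
    [SKETCH_FORMAT_R5G6B5]   = { 1, 0x11 },
    [SKETCH_FORMAT_L8]       = { 1, 0x01 },
};

/* Constant-time lookup: index by format, return NULL when unsupported. */
static const struct sketch_texture_format *
sketch_lookup_format(enum sketch_format format)
{
    const struct sketch_texture_format *entry;

    assert((unsigned)format < SKETCH_FORMAT_COUNT);
    entry = &sketch_format_table[format];
    return entry->supported ? entry : NULL;
}
```

The table costs one entry per possible format, which is the usual trade for dropping the scan.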
Luca Barbieri
d983701267 nvfx: new 2D: enable swizzling for all surfaces
Now that the new 2D code is in place, swizzling can be safely enabled.

Render temporaries are needed in some cases, so this may degrade nv30
a bit until it gets render temporaries too.
2010-08-21 20:42:14 +02:00
Luca Barbieri
9ed0686e8e nvfx: new 2D: use new 2D engine in Gallium
This patch implements nv04_surface_copy/fill using the new 2D engine module.
It supports falling back to the 3D engine using the u_blitter module, which will be
added in a later patch.

Also adds support for using the 3D engine, reusing the u_blitter module
created for r300.
This is used for unswizzling and copies between swizzled surfaces.
2010-08-21 20:42:14 +02:00
Luca Barbieri
24a4ea003f nv04-nv40: new 2D: add new Gallium-independent 2D engine
This patch adds a brand new nv04-nv40 2D engine module.
It should correctly implement all operations involving swizzled and 3D-swizzled surfaces.

This code is independent from the Gallium framework and can thus be reused in the DDX and classic Mesa drivers (it's only likely to be useful in the latter, though).

Currently, surface_copy and surface_fill are broken for 3D textures, for swizzled source textures, and possibly for some misaligned cases.

The code is based around the new nv04_region structure, which encapsulates the information from pipe_surface needed for the 2D engine and CPU copies.
The use of nv04_region makes the code independent of the Gallium framework and allows the region to be transformed without clobbering the original pipe_surface.
The existing M2MF, blitter, and SWIZZLED_SURFACE paths have been improved and a new CPU path has been added.
There is also support to tell the caller to use the 3D engine.

The main feature of the copy/fill setup algorithm is linearization/contiguous-linearization of swizzled surfaces.
The idea of linearization is that some swizzled surfaces are laid out like linear ones (1xN, 2xN, Nx1) and can thus be used as such (e.g. useful for copying single pixels).
Also, some rectangles (e.g. the whole surface) are contiguous in memory. If both the source and destination rectangles are swizzled but contiguous, then both can be regarded as linear: this is the idea of "contiguous linearization".
This, for instance, allows using the 2D engine to duplicate the content of a swizzled surface to another swizzled surface, by pretending they are actually linear.
After linearization, the result may not be 64-byte aligned. Another transformation is done to enlarge the linear surface so that it becomes 64-byte aligned.
This is also used to 64-byte align swizzled texture mipmaps.

The inner loop of the CPU path is as optimized as possible without using SSE/SSE2.
Future improvements could include SSE/SSE2 support, and possibly a faster coordinate swizzling algorithm (which is however not used in the inner loop).
It may be a good idea to autogenerate swizzling code at least for all possible POT 2D texture dimensions (less than 256), maybe for all 3D ones too (less than 4096).
Also, it would be a very good idea to make a copy with the GPU first if the source surface is in uncached memory.
2010-08-21 20:42:14 +02:00
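
The linearization idea in 24a4ea003f (1xN, 2xN and Nx1 swizzled surfaces are laid out like linear ones) can be checked with a small sketch. It assumes a Z-order style swizzle that interleaves x and y bits, x first, with a dimension no longer contributing bits once its size is exhausted. That layout and every name below are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>

/* Assumed layout: interleave x and y bits (x first); a dimension stops
 * contributing bits once its size is exhausted. width and height must be
 * powers of two. Returns a pixel index, not a byte offset. */
static unsigned sketch_swizzle_2d(unsigned x, unsigned y,
                                  unsigned width, unsigned height)
{
    unsigned index = 0;
    unsigned out_bit = 0;
    unsigned bit;

    assert(width && height);
    assert((width & (width - 1)) == 0 && (height & (height - 1)) == 0);
    assert(x < width && y < height);

    for (bit = 1; bit < width || bit < height; bit <<= 1) {
        if (bit < width) {
            if (x & bit)
                index |= 1u << out_bit;
            out_bit++;
        }
        if (bit < height) {
            if (y & bit)
                index |= 1u << out_bit;
            out_bit++;
        }
    }
    return index;
}

/* Returns 1 when every pixel's swizzled index equals its linear index. */
static int sketch_layout_is_linear(unsigned width, unsigned height)
{
    unsigned x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            if (sketch_swizzle_2d(x, y, width, height) != y * width + x)
                return 0;
    return 1;
}
```

Under this assumed layout, sketch_layout_is_linear() holds for 1xN, 2xN and Nx1 sizes and fails for sizes such as 4x4, which matches the cases the message lists.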
Luca Barbieri
23639dc046 nvfx: new 2D: rewrite transfer code to use staging transfers
This greatly simplifies the code, and avoids ad-hoc copy code.

Also, these new transfers work for buffers too, even though they
are still used for miptrees only.
2010-08-21 20:42:14 +02:00
Luca Barbieri
ed2930e7e2 nvfx: new 2D: rewrite miptree code, adapt transfers
Changes:
- Disable swizzling on non-RGBA 2D textures, since the current 2D
  code is mostly broken in those cases. A later patch will fix this.
  Thanks to Andrew Randrianasulu who reported this.
- Fix compressed texture transfers and hack around the current 2D
  code's inability to copy compressed textures by using direct access.
  Thanks to Andrew Randrianasulu who reported this.

This patch rewrites all the miptree layout and transfer code in the
nvfx driver.

The current code is broken in several ways:
1. 3D textures are laid out first by face, then by level, which is
incorrect
2. Cube maps should have 128-byte aligned faces
3. Swizzled textures have a strange alignment test that seems
unnecessary
4. We store the image_offsets for each face/slice but they can be
easily computed instead
5. "Swizzling" is not supported for compressed formats. They can be
"swizzled" but swizzling only means that there are no gaps (pitch is
level-dependent) and the layout is still linear
6. Swizzling is not supported for non-RGBA formats. All formats (except
possibly depth) can be swizzled according to my testing.

The miptree layout is rewritten based on my empirical testing, which I
posted in the "miptree findings" mail.
The image_offset array is removed, since it can be calculated with a
simple multiplication; the only array in the miptree structure is now
the one for mipmap level starts, which it seems cannot be easily
computed in constant time.

Also, we now directly store a nouveau_bo instead of a pipe_buffer in
the miptree structure, like nv50 does.

Support for render temporaries is removed, and will be readded in a
later patch.

Note that the current temporary code is broken, because it does not
copy the temporary back on render cache flushes.
2010-08-21 20:42:14 +02:00
Luca Barbieri
ac97e8dba6 nvfx: add nouveau_resource_on_gpu
Add a function to get whether a resource is likely on the GPU or not.

Currently always returns TRUE.
2010-08-21 20:42:14 +02:00
Luca Barbieri
37fa0cf4ea nvfx: add linear flag for buffers 2010-08-21 20:42:14 +02:00
Luca Barbieri
6a73d99a52 nvfx: properly unreference bound objects on context destruction 2010-08-21 20:42:13 +02:00
Luca Barbieri
e189823eb4 nvfx: reference count bound objects 2010-08-21 20:42:13 +02:00
Luca Barbieri
95acfd0c8a nvfx: fix format support code for compressed texture
A source line was put in the wrong place.
2010-08-21 20:42:13 +02:00
Luca Barbieri
e6ff995d14 gallium/auxiliary: add semantic linkage utility code 2010-08-21 20:42:13 +02:00
Luca Barbieri
bed9dff9d9 u_debug_describe: use switch instead of if chain 2010-08-21 12:47:18 +02:00
Luca Barbieri
061c2a7cb3 u_debug_describe: add PIPE_TEXTURE_RECT 2010-08-21 12:45:39 +02:00
Luca Barbieri
fa32fde26c auxiliary: add copyright headers
Thanks to Jose Fonseca for pointing out they were missing.
2010-08-21 12:37:39 +02:00
José Fonseca
121aa3cfcb util: Match printf format to silence warning. 2010-08-21 10:38:22 +01:00
José Fonseca
a5888d3113 mesa: Remove unused local variable. 2010-08-21 10:34:57 +01:00
José Fonseca
04c2a22175 util: Make the reference debugging code more C++ friendly.
C++ doesn't accept function <-> void* conversions without putting up a
fight.
2010-08-21 10:34:42 +01:00
José Fonseca
7a40d15e6c util: Remove the x86 exception handlers.
Unused now that check_os_katmai_support was removed.
2010-08-21 10:07:12 +01:00
Alex Corscadden
ce3a07c392 trace: Don't immediately destroy the pipe's sampler view in the trace driver.
The trace driver's implementation of sampler_view_destroy was calling
directly into the underlying pipe's sampler_view_destroy implementation.
This causes problems for pipes that keep references to sampler views
even after the state tracker has released them.  Instead, we'll simply
drop the trace driver's reference to the pipe's sampler view.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2010-08-21 09:45:43 +01:00
Alex Corscadden
29dde59ea7 trace: Trace the correct version of the resource when setting the index buffer.
The trace driver was tracing the unwrapped version of the index buffer
when setting the index buffer.  This caused an assert validating that
a resource belonged to the trace driver to fail.  Instead, we'll log
the unmodified index buffer structure when setting the index buffer.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2010-08-21 09:44:12 +01:00
Vinson Lee
f7188ac9ce generate_builtins.py: Remove unused import sys. 2010-08-21 01:13:16 -07:00
Chia-I Wu
29cff9ce2e mapi: Use MAPI_EXPORT to export public functions.
mapi.h is included by vgapi and st/vega.  On win32, the macro expands to
dllexport and dllimport respectively.
2010-08-21 14:13:59 +08:00
Chia-I Wu
df98423f24 mapi: Prefix functions in u_current.h by u_current.
That is, replace the old _glapi_* names by new names that start with
u_current_.  When MAPI_GLAPI_CURRENT is defined, u_current.h defines
rename macros to restore the old names.  That is done for ABI
compatibility.
2010-08-21 14:13:59 +08:00
Chia-I Wu
760451baae glapi: Move public function/variable declarations to glapi.h.
glapi defines an interface that is used by DRI drivers.  It must not be
changed in an ABI incompatible way.  This commit moves all
functions/variables belonging to the interface to glapi.h.  Instead of
including u_current.h from glapi.h, u_current.h now includes glapi.h.
2010-08-21 14:13:59 +08:00
richard
0eac4b8740 evergreen: initial support driver code. 2010-08-20 19:28:47 -04:00
Eric Anholt
501c9dc627 i965: Rename nr_depth_regs to nr_payload_regs.
Only 8 out of the up to 13 regs are for source/dest depth, so the name
wasn't particularly appropriate.  Note that this doesn't count the
constant or URB payload regs.  Also, don't pre-divide by 2, so it's
actually a number of registers.
2010-08-20 16:17:40 -07:00
Eric Anholt
e6ec500e19 i965: Also use the SIMD8 FB writes for SIMD8 mode on non-SNB. 2010-08-20 16:17:40 -07:00
Zhenyu Wang
5266c0a0c8 i965: Add support for FB writes on Sandybridge. 2010-08-20 16:17:40 -07:00
Zhenyu Wang
3ce2eccbfb i965: Set the destination horiz stride even for da16, as SNB seems to need it. 2010-08-20 16:17:40 -07:00
Zhenyu Wang
35c127362f i965: Set the maximum number of threads on Sandybridge. 2010-08-20 16:17:40 -07:00
Zhenyu Wang
93ba0055c3 i965: Add AccWrCtl support on Sandybridge.
Whenever the accumulator results are needed, this bit must be set.
2010-08-20 16:17:39 -07:00
Zhenyu Wang
ffb5095d56 i965: Mention the mlen and rlen for URB reads. 2010-08-20 16:17:39 -07:00
Zhenyu Wang
da1502494b i965: Sandybridge doesn't have Compr4 mode, since it's not needed any more. 2010-08-20 16:17:39 -07:00
Zhenyu Wang
0e2d0cc577 i965: Adjust disasm of subreg numbers to be in units of the register type.
This makes reading the code easier when matching up to the specs,
which also use this format.
2010-08-20 16:17:39 -07:00
Eric Anholt
b7004350fa i965: Fix DP write channel ordering on Sandybridge.
The SIMD16 message no longer has the goofy interleaved format that
made Compr4 compression necessary before.
2010-08-20 16:17:39 -07:00
Luca Barbieri
132b9439e2 os_stream: fix bugs in allocation path 2010-08-21 00:51:29 +02:00
Luca Barbieri
9960200d5e p_compiler: add replacement va_copy
This might technically not always be correct, because va_copy might
be a function, or a system might lack va_copy and also not work with
plain assignment.

Hopefully this is never the case.
Without configure tests, it doesn't seem possible to do better.
2010-08-21 00:51:29 +02:00
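
The fallback described in 9960200d5e can be sketched as below. This is an assumed shape, not the contents of p_compiler.h: the macro is defined only when the toolchain provides no va_copy macro, and it uses plain assignment, which is exactly the case the message warns may not work everywhere (for instance where va_list is an array type). The helper sketch_sum_twice is hypothetical and only exercises the macro.

```c
#include <assert.h>
#include <stdarg.h>

/* Assumed fallback: only used when va_copy is not already a macro.
 * Plain assignment is invalid where va_list is an array type. */
#ifndef va_copy
#define va_copy(dest, src) ((dest) = (src))
#endif

/* Hypothetical helper: sums `count` int arguments twice, walking the
 * argument list once directly and once through a copy. */
static int sketch_sum_twice(int count, ...)
{
    va_list args, args_copy;
    int total = 0;
    int i;

    assert(count >= 0);

    va_start(args, count);
    va_copy(args_copy, args);
    for (i = 0; i < count; i++)
        total += va_arg(args, int);
    for (i = 0; i < count; i++)
        total += va_arg(args_copy, int);
    va_end(args_copy);
    va_end(args);
    return total;
}
```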
Kenneth Graunke
7f80041efa Delete more vestiges of the old shader compiler. 2010-08-20 13:06:02 -07:00
Kenneth Graunke
d6cc7191da glsl: Remove bogus "ambient" field from vec4 gl_TextureEnvColor. 2010-08-20 13:01:12 -07:00
Luca Barbieri
c3e3793c32 glsl: add missing ambient field to gl_LightModel
Again, this is a one-element struct that was incorrectly missing the
field.
2010-08-20 13:01:09 -07:00
Luca Barbieri
fc76d72763 glsl: don't crash if a field is specified for a non-struct uniform
This was triggered by the previous bug, but is a separate problem
in the general sense.
2010-08-20 13:01:07 -07:00
Luca Barbieri
c108a7927d glsl: add missing sceneColor field to gl_{Front, Back}LightModelProduct
According to both GLSL 1.20 and 4.0, these are a struct with one field
called "sceneColor".

Fixes a crash on loading in FlightGear.
2010-08-20 13:01:04 -07:00
Eric Anholt
27e6552a8f intel: Don't try to do work for BufferSubData with a size of 0.
If we hit the linear blit path, we'd come up with a pitch of 0, then
divide by zero.

Fixes vbo-subdata-zero, made for bug #28931 (warsow).
2010-08-20 12:36:34 -07:00
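
The failure mode in 27e6552a8f (a zero size leading to a zero pitch and then a division by zero) can be illustrated with a sketch. The helper below is hypothetical and is not the intel driver's blit code; it only shows why returning early for a size of 0 avoids the division.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of rows needed to upload `size` bytes when a
 * row holds at most `max_pitch` bytes. Without the early return, a size of
 * 0 would give a pitch of 0 and the division below would trap. */
static size_t sketch_rows_for_upload(size_t size, size_t max_pitch)
{
    size_t pitch;

    assert(max_pitch > 0);

    if (size == 0)
        return 0; /* nothing to do, and no pitch to divide by */

    pitch = size < max_pitch ? size : max_pitch;
    return (size + pitch - 1) / pitch;
}
```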
Nick Bowler
5482eaba6e mesa: Fix GetUniformLocation while compiling display lists.
This function was apparently missing from the display list dispatch
table, causing the generic no-op function to be called instead.  To make
matters worse, the no-op function is indistinguishable from a successful
call to GetUniformLocation.  GL specifies that GetUniformLocation is
executed immediately when compiling display lists.

Fixes fdo bug 29622.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
2010-08-20 10:55:50 -07:00
Eric Anholt
284ce20901 Remove remnants of the old glsl compiler. 2010-08-20 10:55:42 -07:00