46866 commits

Author SHA1 Message Date
Brian Paul
e75051d196 mesa: fix error check for zero-sized compressed subtexture
For glCompressedTexSubImage, width or height = 0 is legal.
Fixes a failure in piglit's s3tc-errors test.

This is for the 9.0 and 8.0 branches.  Already fixed on master.
2012-10-09 07:47:43 -06:00
Brian Paul
32faf7ab0d mesa: don't call TexImage driver hooks for zero-sized images
This simply avoids some failed assertions but there's no reason to
call the driver hooks for storing a tex image if its size is zero.

Note: This is a candidate for the stable branches.
(cherry picked from commit 91d8409649)
2012-10-09 07:47:43 -06:00
Ian Romanick
e5fdeef1e0 mesa: Bump version number to 9.0 (final)
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2012-10-08 14:58:35 -07:00
Anuj Phogat
ad4b3b93de _mesa_meta_GenerateMipmap: Support all texture targets by generating shaders at runtime
This is a squash for the following 7 commits.  The first introduces the
functionality, and the remaining six fix various bugs.

Patch 1:
    _mesa_meta_GenerateMipmap: Support all texture targets by generating shaders at runtime

    glsl path of _mesa_meta_GenerateMipmap() function would require different fragment
    shaders depending on the texture target. This patch adds the code to generate
    appropriate fragment shader programs at run time.
    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=54296

    V2: Removed the code for integer textures as ARB is planning to
        disallow automatic mipmap generation for integer textures.
        Now using ralloc_asprintf in setup_glsl_generate_mipmap().

    NOTE: This is a candidate for stable branches.

    Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    (cherry picked from commit 299acac849)

Patch 2:
    _mesa_meta_GenerateMipmap: Generate separate shaders for glsl 120 / 130

    glsl version of _mesa_meta_GenerateMipmap() would require separate
    shaders for glsl 120 and 130.

    V2: Removed the code for integer textures as ARB is planning to
        disallow automatic mipmap generation for integer textures.

    NOTE: This is a candidate for stable branches.

    Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    (cherry picked from commit 15bf3103b4)

Patch 3:
    meta: Add on demand compilation of per target shader programs

    A call to glGenerateMipmap() follows the generation of a relevant
    shader program in setup_glsl_generate_mipmap().

    To support all texture targets and to avoid compiling shaders
    every time, per target shader programs are compiled on demand
    and saved for the next call.

    Fixes float-texture(mipmap.manual):
    See Comment 6: https://bugs.freedesktop.org/show_bug.cgi?id=54296

    NOTE: This is a candidate for stable branches.

    Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    (cherry picked from commit eb1d87fb94)

Patch 4:
    meta: make mem_ctx non-global.

    I can't see any external users, and this is a global symbol,

    Reviewed-by: Matt Turner <mattst88@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
    (cherry picked from commit 36639ec6e9)

Patch 5:
    meta: Remove unsafe global mem_ctx pointer

    NOTE: This is a candidate for the 9.0 branch.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    (cherry picked from commit ab097dde0c)

Patch 6:
    meta: Rearrange shader creation in setup_glsl_generate_mipmap

    The diff looks weird, but this moves the code from the first 'if
    (ctx->Const.GLSLVersion < 130)' block down into the second block.  It
    also moves some variable declarations closer to their use.

    NOTE: This is a candidate for the 9.0 branch.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    (cherry picked from commit 3308c079bd)

Patch 7:
    meta: Don't use GLSL 1.30 shader on OpenGL ES 2

    Fixes GLES2 CoverageGL conformance test.

    NOTE: This is a candidate for the 9.0 branch.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Brian Paul <brianp@vmware.com>
    Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
    (cherry picked from commit 0242381f06)
2012-10-07 20:38:14 -07:00
Marek Olšák
7851d398de r600g: fix possible issue with stencil mipmap rendering
Somehow I only hit this issue with my latest libdrm changes.
This won't be needed with DB texturing.

NOTE: This is a candidate for the 9.0 branch.
(cherry picked from commit 9dfca930d7)
2012-10-06 05:40:09 +02:00
Brian Paul
19a15cd5ba mesa: remove bogus compressed texture size checks
A compressed texture image size doesn't have to be a multiple of the
compressed block size (only sub-images do).  Fixes issues when building
compressed mipmaps because we often wind up with non-block-size images
for the higher mipmap levels.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=55445

Note: This is a candidate for the stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Sven Arvidsson <sa@whiz.se>
(cherry picked from commit df4a88ac43)
2012-10-05 15:55:47 -07:00
Anuj Phogat
c566267f5c intel/i965: Disable SampleAlphaToOne if dual source blending enabled
From SandyBridge PRM, volume 2 Part 1, section 12.2.3, BLEND_STATE:
DWord 1, Bit 30 (AlphaToOne Enable):
"If Dual Source Blending is enabled, this bit must be disabled"

Note: This is a candidate for stable branches.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ea0d088727)
2012-10-05 15:55:47 -07:00
Kenneth Graunke
8491e03b2b mesa: Flag _NEW_VARYING_VP_INPUTS when TexEnv programs are active.
The idea here is to not flag _NEW_VARYING_VP_INPUTS when shaders (either
GLSL or ARB vp/fp) are in use.  If either TNL or TexEnv programs are
active, at least one stage is using fixed function.

On Pineview, fixes 20 Piglit, 60 oglconforms, and 7 ES 1.1 conformance
tests, as well as missing textures in Xonotic.  These were all
regressions since commit fb4a34e60e.

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49127
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54807
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7fa0f10cd8)
2012-10-05 15:55:47 -07:00
Paul Berry
78c9adb17e mesa: don't enable glVertexPointer() when using API_OPENGLES2.
This function is only present in GLES1 and in the OpenGL compatibility
profile.

Fixes the following "make check" failure:

    [----------] 1 test from DispatchSanity_test
    [ RUN      ] DispatchSanity_test.GLES2
    Mesa warning: couldn't open libtxc_dxtn.so, software DXTn
    compression/decompression unavailable
    dispatch_sanity.cpp:122: Failure
    Value of: table[i]
       Actual: 0x4de54e
    Expected: (_glapi_proc) _mesa_generic_nop
    Which is: 0x41af72
    i = 321
    [  FAILED  ] DispatchSanity_test.GLES2 (4 ms)
    [----------] 1 test from DispatchSanity_test (4 ms total)

NOTE: This is a candidate for stable release branches.

Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Tested-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 8f0b81bf7d)
2012-10-05 15:55:47 -07:00
Robert Bragg
e1cb50b15d SwapBuffersRegionNOK: invert rectangles on y axis
The EGL_NOK_swap_region2 spec states that the rectangles are specified
with a bottom-left origin within a surface coordinate space also with a
bottom left origin, so this patch ensures the rectangles are flipped
before passing them on to dri2_copy_region.

Fixes piglit's egl-nok-swap-region test.

Tested-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0a523a8820)
2012-10-05 15:03:56 -07:00
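
The y-axis inversion described in the SwapBuffersRegionNOK commit above can be sketched as follows. This is only an illustration, not the code from the patch: `Rect` and `flip_rect_y` are hypothetical names, and the sketch assumes the receiving side expects rectangles with a top-left origin.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Hypothetical rectangle type: origin corner plus size, in pixels."""
    x: int
    y: int
    width: int
    height: int


def flip_rect_y(rect: Rect, surface_height: int) -> Rect:
    """Convert a rectangle between bottom-left and top-left origin conventions.

    The far edge of the rectangle lies at rect.y + rect.height from the
    original origin, so its distance from the opposite edge of the surface
    is surface_height - rect.y - rect.height.
    """
    flipped_y = surface_height - rect.y - rect.height
    return Rect(rect.x, flipped_y, rect.width, rect.height)
```

Applying the flip twice returns the original rectangle, which is a convenient sanity check for this kind of conversion.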
Tom Stellard
542f6feda9 radeon/llvm: Remove R600InstrInfo.td from TD_FILES
Fixes build bug introduced by
cebbdd4ac2
(cherry picked from commit 2baaa5c7eb)
2012-10-05 15:45:43 +02:00
Tom Stellard
71b5503164 radeon/llvm: Cleanup makefile
Hopefully, this will fix all the parallel make problems people have
been having.
(cherry picked from commit cebbdd4ac2)
2012-10-04 10:10:37 -04:00
Eric Anholt
b2048c5e90 i965: Use visibility cflags on the driver code.
(cherry picked from commit 837f06b42f)

The only symbols that need to be public (those in intel_screen.c that the
loader looks for) are already marked public.  Saves 100k of compiled driver
size.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-10-03 15:24:19 -07:00
Matt Turner
ddb9ecca3b build: Don't build libdricore if not building classic drivers
(cherry picked from commit 523c015246)
2012-10-03 15:24:18 -07:00
Matt Turner
76732c9ca5 build: Add visibility CFLAGS to OSMesa
(cherry picked from commit 24ded89876)
2012-10-03 15:24:18 -07:00
Matt Turner
c8669c7ba7 build: Link OSMesa with glapi, libdl, libstdc++
(cherry picked from commit 1762ec28db)

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=399813
          https://bugs.freedesktop.org/show_bug.cgi?id=53179
2012-10-03 15:24:11 -07:00
Matt Turner
f9a8673a7a build: Set visibility CFLAGS in dri/swrast
(cherry picked from commit 4cfff7211c)
2012-10-03 14:00:52 -07:00
Matt Turner
f88046a1f0 build: Set visibility CFLAGS in dri/r200
(cherry picked from commit 3628402707)
2012-10-03 14:00:46 -07:00
Matt Turner
8ebcf34d87 build: Set visibility CFLAGS in dri/radeon
(cherry picked from commit 55d45efdd8)
2012-10-03 14:00:40 -07:00
Matt Turner
3b794e4a56 build: Set visibility CFLAGS in dri/nouveau
(cherry picked from commit 340637d54d)
2012-10-03 14:00:32 -07:00
Matt Turner
0470fa395f build: Set visibility CFLAGS in dri/i915
(cherry picked from commit 381d120b8a)
2012-10-03 14:00:26 -07:00
Matt Turner
6512610f9a build: Set visibility CFLAGS in dri/common
(cherry picked from commit d2872b5612)
2012-10-03 14:00:14 -07:00
Matt Turner
03cfc8d660 build: Build src/glsl with visibility CFLAGS
(cherry picked from commit 8746f641bb)
2012-10-03 13:59:52 -07:00
Matt Turner
a1f1add42d build: Turn on visibility CFLAGS for core mesa
(cherry picked from commit 710a90ccaf)
2012-10-03 13:59:43 -07:00
Matt Turner
a2f28ceea2 build: Use AX_PTHREAD's HAVE_PTHREAD preprocessor definition
(cherry picked from commit 814345f54b)

Conflicts:

	src/mapi/glapi/gen/gl_x86-64_asm.py
	src/mapi/glapi/gen/gl_x86_asm.py
2012-10-03 13:59:08 -07:00
Matt Turner
421dda800d build: Use PTHREAD_LIBS and PTHREAD_CFLAGS
(cherry picked from commit b6651ae6ad)

Conflicts:

	src/mesa/main/tests/Makefile.am
2012-10-03 13:56:24 -07:00
Matt Turner
89e76252ca dri drivers: Link dricommon before dynamic libraries
I think libtool should be handling this for us, but the build fails for
Jordan because libdricommon (a static library, which uses expat) appears
before -lexpat on the linker command.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 31ab61cac1)

Conflicts:

	src/mesa/drivers/dri/i965/Makefile.am
2012-10-03 13:46:03 -07:00
Oliver McFadden
f2b4f588f5 Revert "i965: Implement guardband clipping on Ivybridge."
This reverts commit 610910a66d.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55523
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-03 13:32:32 +03:00
Oliver McFadden
dbe13c105f Revert "i965: Implement guardband clipping on Sandybridge."
This reverts commit 85cd30406f.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55523
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-10-03 13:32:21 +03:00
Brian Paul
604cd6b966 mesa: fix glCompressedTexSubImage assertion/segfault
If the destination texture image doesn't exist we'd hit an assertion
(or crash in a release build).  The piglit/s3tc-errors test hits this.
This has already been fixed in master by the error checking code
consolidation.

Note: This is a candidate for the 8.0 branch.
2012-10-01 08:24:36 -06:00
Brian Paul
e642d61d13 scons: add new -p (prefix) options for yacc
These were recently added to the Makefiles.
(cherry picked from commit e78ebbc5f9)
2012-09-30 11:43:58 -07:00
Marek Olšák
d9197f9037 r600g: fix EXP on Cayman
NOTE: This is a candidate for the stable branches.
(cherry picked from commit 96f50d0cf7)
2012-09-30 05:31:49 +02:00
Marek Olšák
fc62ee7e0d r600g: fix RSQ of negative value on Cayman
NOTE: This is a candidate for the stable branches.
(cherry picked from commit fd5c538464)
2012-09-30 05:31:42 +02:00
Marek Olšák
50ba62e231 r600g: fix instance divisor on Cayman
Not sure if this is the best way to fix it.

NOTE: This is a candidate for the stable branches.
(cherry picked from commit 836325bf7e)
2012-09-30 05:31:35 +02:00
Kenneth Graunke
549129838c meta: Use float for temporary images, not (un)signed normalized.
In commit 091eb15b69, Jordan changed get_temp_image_type() to use
_mesa_get_format_datatype() instead of returning GL_FLOAT.  That has
several possible return values: GL_FLOAT, GL_INT, GL_UNSIGNED_INT,
GL_SIGNED_NORMALIZED, and GL_UNSIGNED_NORMALIZED.

We do want to use GL_INT/GL_UNSIGNED_INT for integer formats.  However,
we want to continue using GL_FLOAT for the normalized fixed-point types.
There isn't any code in pack.c to handle GL_(UN)SIGNED_NORMALIZED.

Fixes oglconform's fboarb advanced.blit.copypix, which was regressed by
commit 091eb15b69.

NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53573
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 3767b25bd3)
2012-09-28 15:56:11 -07:00
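
A minimal sketch of the type selection described in the meta commit above, using plain strings in place of GL enums. The function name `temp_image_type` is hypothetical and this is not the code from meta.c; it only mirrors the rule stated in the message: integer formats keep their datatype, everything else (including the normalized types) uses GL_FLOAT.

```python
def temp_image_type(format_datatype: str) -> str:
    """Pick the datatype for a temporary image from a format's datatype.

    Integer formats keep GL_INT / GL_UNSIGNED_INT. Normalized fixed-point
    and float formats all map to GL_FLOAT, since the commit notes that
    pack.c has no handling for GL_(UN)SIGNED_NORMALIZED.
    """
    if format_datatype in ("GL_INT", "GL_UNSIGNED_INT"):
        return format_datatype
    return "GL_FLOAT"
```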
Eric Anholt
0586a94929 i965: Remove broken non-interleaved-to-interleaved upload code.
This failed when all the uploads to occur were uniform-type vertex data (like
glColor4f being active across a DrawArrays), because it would upload 1 element
instead of 1 element per vertex.  There was no citation for how this code
helped any particular application, and it breaks ETQW, so just remove it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47170
NOTE: This is a candidate for the 9.0 and 8.0 branches.
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 0334e8dc25)
2012-09-28 15:55:28 -07:00
Kenneth Graunke
fdabc7d9f6 meta: Don't _mesa_set_enable() invalid targets in ES 1.
GL_TEXTURE_1D, GL_TEXTURE_3D, GL_TEXTURE_RECTANGLE, and
GL_TEXTURE_GEN_S/T/R/Q don't exist in ES 1 contexts, so any meta ops
that used _mesa_meta_begin with MESA_META_TEXTURE would trigger GL
errors.  One such operation is _mesa_meta_Clear().

On ES 1, we want to disable GL_TEXTURE_GEN_STR_OES instead.

Fixes the ES1 conformance test miplin.c, which was regressed by commit
08be1d288f.

NOTE: This is a candidate for the 9.0 branch.

v2: Also blacklist GL_TEXTURE_3D, per Brian's comment.
v3: Disable GL_TEXTURE_GEN_STR_OES, per Ian's comment.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54297
Reviewed-by: Brian Paul <brianp@vmware.com> [v1]
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 679c93ff89)
2012-09-28 15:54:18 -07:00
Matt Turner
cb84fe5e10 build: Link libglapi with pthreads
NOTE: This is a candidate for the 9.0 branch.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=839060
          https://bugs.gentoo.org/show_bug.cgi?id=435152
Reviewed-by: Adam Jackson <ajax@redhat.com>
(cherry picked from commit 9ed00075d8)
2012-09-28 15:48:30 -07:00
Brian Paul
5ef472dd83 mesa: fix incorrect error for glCompressedSubTexImage
If a subtexture region isn't aligned to the compressed block size,
return GL_INVALID_OPERATION, not GL_INVALID_VALUE.

NOTE: This is a candidate for the stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1f586684d6)
2012-09-28 15:47:57 -07:00
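
The compressed sub-image rules touched by the texture commits in this log (zero-sized regions are legal, unaligned regions yield GL_INVALID_OPERATION) can be sketched as below. This is a hedged illustration, not Mesa's error-checking code: `check_compressed_subimage` is a hypothetical helper, the 4x4 block size is an assumption matching S3TC, and the exception for regions that reach the texture edge is an assumption based on S3TC-style rules rather than anything stated in these commit messages. Negative sizes and out-of-range offsets are not covered.

```python
from typing import Optional


def check_compressed_subimage(xoffset: int, yoffset: int,
                              width: int, height: int,
                              tex_width: int, tex_height: int,
                              block_w: int = 4, block_h: int = 4) -> Optional[str]:
    """Return None if the sub-image region is acceptable, else a GL error name."""
    # Offsets must start on a block boundary.
    if xoffset % block_w != 0 or yoffset % block_h != 0:
        return "GL_INVALID_OPERATION"
    # A partial block is only tolerated when the region ends exactly at the
    # texture edge (assumed S3TC-style rule). Zero passes since 0 % n == 0.
    if width % block_w != 0 and xoffset + width != tex_width:
        return "GL_INVALID_OPERATION"
    if height % block_h != 0 and yoffset + height != tex_height:
        return "GL_INVALID_OPERATION"
    return None
```

Note that the whole-image case needs no such check: as the "remove bogus compressed texture size checks" commit says, a full compressed image does not have to be a multiple of the block size.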
Ian Romanick
7c60a95a0e i915: Don't free the intel_context structure when intelCreateContext fails.
intelDestroyContext will eventually be called, and it will clean things up.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
(cherry picked from commit de958de71b)
2012-09-28 15:41:34 -07:00
Ian Romanick
8aaef12a59 i965: Don't free the intel_context structure when intelCreateContext fails.
This squashes two commits from master:

    i965: Don't free the intel_context structure when intelCreateContext fails.

    intelDestroyContext will eventually be called, and it will clean things
    up.  The call to brwInitVtbl is moved earlier so that
    intelDestroyContext can call the device-specific destructor.  This also
    makes the code look more like the i915 code.

    NOTE: This is a candidate for the 9.0 branch.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    Reviewed-by: Eric Anholt <eric@anholt.net>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
    (cherry picked from commit 87f26214d6)

And:

    i965: brwInitVtbl needs to know the chipset generation

    Fixes major regressions since de958de.

    Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
    (cherry picked from commit e87c63f288)

The second commit message should have read 'since 87f2621', of course.
2012-09-28 15:40:20 -07:00
Ian Romanick
a87b0110b9 intel: Don't call intelDestroyContext if there is no context to destroy
Some error paths in the device-specific context creation functions can exit
before the intel_context structure is allocated.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
(cherry picked from commit 22897c7497)
2012-09-28 15:07:02 -07:00
Ian Romanick
5174eed793 dri_util: Use calloc to allocate __DRIcontext
The __DRIcontext contains some pointers, and some drivers check for them to be
NULL in some failure paths.  Instead of sprinkling NULL assignments across the
various drivers, just zero out the whole thing.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Lu Hua <huax.lu@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=53618
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54301
(cherry picked from commit f93cb0bebb)
2012-09-28 15:06:58 -07:00
Kenneth Graunke
8c1c18769e i965/blorp: Add support for blits between SRGB and linear formats (fixed).
This is a squash of 2 commits from master.
The first commit is:

i965/blorp: Add support for blits between SRGB and linear formats.

Fixes colorspace issues in L4D2 when multisampling is enabled (the
scene was far too dark, but the flashlight area was way too bright).

The nVidia and AMD binary drivers both allow this kind of blit.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit e2249e8c4d)

The second commit is:

i965/blorp: Fix sRGB MSAA resolves.

Commit e2249e8c4d (i965/blorp: Add
support for blits between SRGB and linear formats) changed blorp to
always configure surface states in linear format (even if the
underlying surface is sRGB).  This allowed sRGB-to-linear and
linear-to-sRGB blits to occur without causing the image to be
inappropriately brightened or darkened.

However, it broke sRGB MSAA resolves, since they rely on the
destination buffer format being sRGB in order to ensure that samples
are averaged together in sRGB-correct fashion.

This patch fixes the problem by instead configuring the source buffer
to use the *same* format as the destination buffer.  This ensures that
the image won't be brightened or darkened, but preserves proper sRGB
averaging.

Fixes piglit tests "EXT_framebuffer_multisample/accuracy srgb".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55265

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 124b214f09)
2012-09-28 11:20:40 -07:00
Paul Berry
849a3d243d i965: Don't spill "smeared" registers.
Fixes an assertion failure when compiling certain shaders that need both
pull constants and register spilling:

brw_eu_emit.c:204: validate_reg: Assertion `execsize >= width' failed.

NOTE: This is a candidate for the 8.0 release branch.

Signed-off-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit ab5ce2789f)
2012-09-28 11:20:40 -07:00
Paul Berry
36bc0fe4f2 i965/blorp: Increase Y alignment for multisampled stencil blits.
This patch is a band-aid fix for a bug in commit 5fd67fa (i965/blorp:
Reduce alignment restrictions for stencil blits), which causes
multisampled stencil blits to work incorrectly on Sandy Bridge.

When blitting to or from a normal stencil buffer, we have to use a
coordinate transformation that swizzles coordinates to account for the
fact that stencil buffers use W tiling, but the most similar tiling
format available for textures and render targets is Y tiling.  The
differences between W and Y tiling cause pixels to be scrambled within
a block of size 8x4 (width x height) as measured relative to a W tile,
or 16x2 as measured relative to a Y tile.  So in order to make sure
that pixels at the edges of the blit aren't lost, we need to align the
rendering rectangle (and the buffer sizes) to multiples of the 8x4
block size.  This alignment happens in the brw_blorp_blit_params
constructor, whereas the determination of how to swizzle the
coordinates happens during code generation, in the
brw_blorp_blit_program class.

When blitting to or from a multisampled stencil buffer, the coordinate
swizzling is more complex, because it has to account for the
interleaving pattern of samples, which uses 4x4 blocks for 4x MSAA and
8x4 blocks for 8x MSAA.  The end result is that if multisampling is in
use, the 16x2 block size (relative to a Y tile) needs to be expanded
to 16x4, and the corresponding size relative to a W tile expands to
8x8.

The problem doesn't affect Ivy Bridge severely enough to crop up in
Piglit tests because on Ivy Bridge we have to disable multisampling
when blitting *to* a multisampled stencil buffer (the blorp compiler
generates code to compensate for the fact that multisampling is
disabled).  However I suspect a bug is still present because we don't
disable multisampling when blitting *from* a multisampled stencil
buffer.

This patch fixes the problem by doubling the vertical alignment
requirement when blitting to or from a multisampled stencil buffer,
and multisampling has not been disabled.

In the long run I would like to rework the brw_blorp_blit_params
constructor--it's difficult to follow and has had several subtle bugs
like this one.  However this band-aid fix should be suitable for
cherry-picking to release branches.

Fixes Piglit tests "unaligned-blit {2,4} stencil {msaa,upsample}" on
Sandy Bridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit a33ce665a5)
2012-09-28 11:20:39 -07:00
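
The alignment rule from the commit above can be illustrated with a small sketch. It assumes coordinates measured relative to a W tile, where the message gives a block size of 8x4, growing to 8x8 when multisampling is in use. The helper names are hypothetical and this is not the brw_blorp_blit_params code.

```python
def align_down(value: int, alignment: int) -> int:
    """Round value down to a multiple of alignment."""
    return (value // alignment) * alignment


def align_up(value: int, alignment: int) -> int:
    """Round value up to a multiple of alignment."""
    return -(-value // alignment) * alignment


def align_stencil_rect(x0: int, y0: int, x1: int, y1: int,
                       multisampled: bool) -> tuple:
    """Expand a rectangle so no scrambled block is cut at its edges.

    Blocks are 8 wide; the height doubles from 4 to 8 when blitting to or
    from a multisampled stencil buffer with multisampling still enabled.
    """
    block_w = 8
    block_h = 8 if multisampled else 4
    return (align_down(x0, block_w), align_down(y0, block_h),
            align_up(x1, block_w), align_up(y1, block_h))
```

For example, the rectangle (3, 5)-(17, 9) grows to (0, 4)-(24, 12) in the single-sampled case and to (0, 0)-(24, 16) in the multisampled case.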
Paul Berry
76c1c34c4a i965/blorp: Fix offsets and width/height for stencil blits.
Fixes piglit test "framebuffer-blit-levels draw stencil".

Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1a5d4f7cb2)
2012-09-28 11:20:39 -07:00
Paul Berry
21e9850d53 i965/blorp: Reduce alignment restrictions for stencil blits.
Previously, we aligned all stencil blit operations to multiples of the
size of a tile, since stencil buffers use W-tiling, and blorp has to
approximate this by configuring the 3D pipeline for Y-tiling and
swizzling coordinates.

However, this was unnecessarily conservative; it turns out that the
differences between W-tiling and Y-tiling are confined to 32-byte
sub-tiles within the 4k tiling pattern; the layout of these 32-byte
sub-tiles within the larger 4k tile is the same (8 sub-tiles across by
16 sub-tiles down, in column-major order).  Therefore we only need to
align stencil blit operations to multiples of the sub-tile size.

Note: although the performance improvement of this change is probably
quite small, the fact that W-tiling and Y-tiling formats only differ
within 32-byte sub-tiles will be essential in a future patch to ensure
that stencil blits work correctly between parts of the miptree other
than level/layer 0.  Making this change provides handy documentation
(and validation) of this fact.

Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 5fd67fac14)
2012-09-28 11:20:39 -07:00
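
As a sketch of the layout described in the commit above, the byte offset of a 32-byte sub-tile inside a 4k tile follows from the stated arrangement of 8 sub-tiles across by 16 down in column-major order. The constants come from the commit message; the function name is hypothetical, and the contents of each sub-tile (where W and Y tiling differ) are deliberately not modeled.

```python
SUBTILE_BYTES = 32    # size of one sub-tile
SUBTILES_ACROSS = 8   # columns of sub-tiles per 4k tile
SUBTILES_DOWN = 16    # rows of sub-tiles per 4k tile


def subtile_offset(col: int, row: int) -> int:
    """Byte offset of sub-tile (col, row) within its 4k tile.

    Column-major order means consecutive sub-tiles in memory walk down a
    column before moving to the next column.
    """
    if not (0 <= col < SUBTILES_ACROSS and 0 <= row < SUBTILES_DOWN):
        raise ValueError("sub-tile index out of range")
    return (col * SUBTILES_DOWN + row) * SUBTILE_BYTES
```

The totals are consistent with the message: 8 x 16 sub-tiles of 32 bytes each fill exactly 4096 bytes.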
Paul Berry
62bc4af0e1 i965/blorp: don't reduce stencil alignment restrictions when multisampling.
When blitting to a stencil buffer, we need to align the rectangle we
send down the rendering pipeline, to account for the fact that the
stencil buffer uses a W-tiled layout, but we are configuring its
surface state as Y-tiled.

Previously, when the stencil buffer was multisampled, we assumed that
we could reduce the amount of alignment that was necessary, since each
pixel occupies a block of 2x2 or 4x2 samples in the stencil buffer.
That would have been correct if the coordinates we were adjusting were
measured in pixels.  However, the conversion from pixel coordinates to
coordinates within the interleaved buffer has already been done;
therefore the full alignment restriction applies.

Note: the reason this mistake wasn't previously uncovered by piglit
tests is because it is being masked by another mistake: the blorp
engine is using overly conservative alignment restrictions when doing
stencil blits.  The overly conservative alignment restrictions will be
removed in the patch that follows.  Doing this fix now will prevent
the subsequent patch from introducing regressions.

Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 1a75063d5f)
2012-09-28 11:20:39 -07:00
Paul Berry
68da5dfc2c intel: Add map_stencil_as_y_tiled to intel_region_get_aligned_offset.
This patch modifies intel_region_get_aligned_offset() to make the
appropriate calculation when the blorp engine sets up a W-tiled
stencil buffer using a Y-tiled SURFACE_STATE.

Acked-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit b760c9913d)
2012-09-28 11:20:39 -07:00