Commit graph

57432 commits

Author SHA1 Message Date
Marek Olšák
94d294137e r600g: don't read back the MSAA depth buffer if the read flag is not set
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
141b892620 r600g: don't flush the context in texture_transfer_map
the winsys does this automatically

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
ae87aae0c4 r600g: fix texture offset computation for mapped MSAA depth buffers
It was wrong, because the offset shouldn't be applied to MSAA depth buffers.
This small cleanup should prevent such issues in the future.

This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n".

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
a3263cca59 r600g: fix color resolve for RGBX8 and RGBX16 integer formats
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
b1a061b81e r600g: enable fast MSAA color clear for array/3D/cube textures
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Marek Olšák
87669c3654 r600g: implement fast MSAA color clear for integer textures
this also fixes the fast clear with multiple colorbuffers and each having
a different format

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-07-08 20:25:18 +02:00
Christian König
085c695488 r600/uvd: fix check for UVD 2.x
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-07-08 19:51:20 +02:00
Chris Forbes
1415a1884c i965: fix alpha test for MRT
Include src0 alpha in the RT write message when using MRT, so it is used
for the alpha test instead of the normal per-RT alpha value.

Fixes broken rendering in Dota2 under Wine [FDO #62647].

No Piglit regressions on Ivybridge.

V2: reuse (and simplify) existing sample_alpha_to_coverage flag in
the FS key, rather than adding another redundant one.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewd-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647
NOTE: This is a candidate for the stable branches.
2013-07-06 12:41:54 +12:00
Roland Scheidegger
9ef49cfd84 gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch
The logic for choosing number of lods was bogus.
(The code should ultimately handle the case of only one lod even with multiple
quads but currently can't.)
2013-07-05 18:07:51 +02:00
José Fonseca
45f174ce40 gallivm: Remove bogus assert.
It is perfectly valid for the swizzle to be bigger than 2. For example the
texel offsets could be

  SAMPLE ..., IMM[0].zzz

What is not correct is for chan_index to be bigger than 2.

Trivial.
2013-07-05 14:35:54 +01:00
Ben Skeggs
c29c6b2b2e nvc0: enable very initial support for nvf0 (GK110)
Shaders need a lot of work still.  Basic stuff generally works, so this
is basically just fine for gnome-shell, OA etc at this point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2013-07-05 14:15:04 +10:00
Roland Scheidegger
4dbca8672b gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources
The assertion was always broken but the code unused until enabling the
per-element lod code. Fixes piglit texelFetch vs isampler1D and similar
tests (only run with GL 3.0 version override).
2013-07-05 01:19:23 +02:00
Roland Scheidegger
f3bbf65929 gallivm: do per-pixel lod calculations for explicit lod
d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too - at least
if they are used from vertex or geometry shaders (so doesn't apply to lod
bias) this doesn't just affect neighboring pixels.
Some code was already there to handle this so fix it up and enable it.
There will no doubt be a performance hit unfortunately, we could do better
if we'd knew we had a real vector shift instruction (with variable shift
count) but this requires AVX2 on x86 (or a AMD Bulldozer family cpu).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them neither.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-07-04 19:42:04 +02:00
Zack Rusin
bbd1e60198 draw: fix overflows in the indexed rendering paths
The semantics for overflow detection are a bit tricky with
indexed rendering. If the base index in the elements array
overflows, then the index of the first element should be used,
if the index with bias overflows then it should be treated
like a normal overflow. Also overflows need to be checked for
in all paths that either the bias, or the starting index location.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:30 -04:00
Zack Rusin
09820902d7 draw/llvm: index overflows if it's greater than elt max
The comparison, incorrectly, was greater-than-or-equal to
elt max.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-07-03 09:06:24 -04:00
Kenneth Graunke
764afc48cf i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.
The texture alignment unit functions are called from brw_tex_layout.c,
so it makes sense to put them there.  Since the only caller of
intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be
made into a static function.  However, this patch instead simply folds
it into the caller, as it's only two lines anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
466aa712b6 i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout
intel_miptree_create_layout() calls intel_get_texture_alignment_unit()
and then immediately calls brw_miptree_layout().  There are no other
callers.

intel_get_texture_alignment_unit() populates the miptree's alignment
unit fields, which are used by brw_miptree_layout() to determine where
to place each miplevel.  Since brw_miptree_layout() needs those to be
present, it makes sense to have it initialize them as the first step.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
c4c3c0dc94 i965: Declare for-loop counters in the loop in brw_tex_layout.c.
The driver is compiled in C99 mode, so this is not a problem.  It's
slighlty tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ccf312fd12 i965: Remove use of GLuint/GLint in brw_tex_layout.c.
Using GL types is silly; this isn't even remotely API-facing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
ed95e396f3 i965: Tidy the brw_tex_layout.c copyright and file header comments.
This uses Doxygen style for the file comments, and generally makes it
more consistent with the rest of the driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
2ea87fde31 i965: Move i945_texture_layout_2d to brw_tex_layout.c
This consolidates the miptree layout logic in a single file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:15 -07:00
Kenneth Graunke
1920209970 i965: Remove fallthrough for Gen4 cube map layout.
Now that both 2DArray and Cube layouts are taken care of by helper
functions, it's easy to just call the right function for each
generation.  This is a little cleaner than falling through.

This also reworks the comments.  Referencing "Volume 1" of the BSpec
isn't very helpful, since that's only available inside Intel, and it
doesn't even use volume numbers.  Also, "Ironlake...finally" sounds a
bit strange considering that almost all hardware uses the 2D array
approach.  At this point, Gen4 is the only special case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7e4007a1b3 i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.
These do the exact same thing; combining them is tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
bc51f15b32 i965: Pull 3D texture layout code out into a helper function.
A bit cleaner than having it in one giant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
abc2bdffd6 i965: Replace maxBatchSize variable with BATCH_SZ define.
maxBatchSize was only ever initialized to BATCH_SZ, and a few places
used BATCH_SZ directly anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
2c602d2adf i965: Move annotate_aub out of the vtable.
brw_annotate_aub() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
f05f8793c8 i965: Move debug_batch hook out of the vtable.
brw_debug_batch() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
749160aab3 i965: Remove render_target_supported from the vtable.
brw_render_target_supported() is the only implementation of this
function, so it makes sense to just call it directly.

Rather than adding an #include of brw_wm.h, this patch moves the
prototype to brw_context.h.  Prototypes seem to be in rather arbitrary
places at the moment, and either place seems as good as the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7c5279e554 i965: Move is_hiz_depth_format out of the vtable.
brw_is_hiz_depth_format() is the only implementation of this function,
so it makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
607338f1cb i965: Remove the invalidate_state() vtable hook.
The hook was a noop.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
251cdcf059 i965: Replace fprintfs with assertions in GLenum comparison translators.
These functions translate GLenum comparison operations into the hardware
enumerations.  They should never be passed something other than a GL
comparison operator, or something is very broken.

Assertions seem more appropriate than fprintf.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:14 -07:00
Kenneth Graunke
7ee616f1bf i965: Replace intel_state.c enums with those from brw_defines.h.
Both intel_context.h and brw_defines.h have #defines for comparison
functions, stencil ops, blending logic ops, and blending factors.
They're exactly the same values, so it makes sense to pick one.

brw_defines.h is the logical place for this kind of stuff, so this patch
converts intel_state.c to use the set defined there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
c9db037dc9 i965: Delete pre-DRI2.3 viewport hacks.
The __DRI_USE_INVALIDATE extension was added in May 11th, 2010 by commit
4258e3a2e1.  At this point, it's unlikely that anyone's using the
right mix of new and old components to hit this path.  Deleting it
removes an untested code path and cleans up the driver a bit.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Keith Packard <keithp@keithp.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
cbb37b7586 i965: Remove "There are probably better ways" comment.
There are always better ways to do things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7115bee993 i965: Delete brw_print_reg() function.
This wasn't called from anywhere; presumably it was used to examine
brw_regs when debugging shader assembly.  However, it prints registers
in a different notation than brw_disasm.c which everyone is used
to...which means I doubt anyone will want to use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
bc8b62e3a0 i965: Move contents of intel_clear.h to intel_context.h.
Having a header file for a single prototype seems rather excessive.
Plus, the actual function is in brw_clear.c, not intel_clear.c, so
there isn't even the .c/.h filename symmetry one might expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d8e70f301 i965: Move contents of intel_extensions.h to intel_context.h.
Having an entire header file for a single prototype seems a bit
excessive.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
7d119880e8 i965: Remove some dead code.
A random smattering of things that just aren't used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
d245e795cf i965: Delete dead intel_buffer_object::range_map_size field.
Nothing uses this, apparently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
1f6ebdd43f i965: Remove intel_buffer_object::source.
This was only used for BOs backed by system memory on i915.  With that
gone, there's nothing that even sets source to non-zero, so this is
purely dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:13 -07:00
Kenneth Graunke
6e5b80ee5a i965: Fix buffer object segfault since removal of system memory BOs.
Commit cf31a19300 removed support for BOs
backed by system memory, as it was only useful for i915.  However, it
removed a little too much code: intel_bufferobj_buffer() used to call
intel_bufferobj_alloc_buffer(), and after that commit, it didn't.

This led to NULL pointer dereferences in several test cases, such as
es3conform's transform_feedback_state_variables test.

This commit restores the allocation, preserving the original behavior.
It may not be the cleanest approach, but tidying should come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-03 10:48:12 -07:00
Matthew McClure
012ba47076 postprocess: move second temporary assertion into isolated configuration
With this patch we will only assert that the second temporary is allocated,
when there are more than two active filters.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423

Signed-off-by: Brian Paul <brianp@vmware.com>
2013-07-03 09:19:04 -06:00
José Fonseca
9b6788eb15 glsl: Ensure snprintf is defined on MSVC builds.
Should fix:

  src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found
  ...
2013-07-03 08:26:08 +01:00
Ilia Mirkin
4bc8e3c3e4 targets/xvmc-nouveau: add in missing nv30 lib
Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so
that it may be dlopen'd.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2013-07-03 09:02:40 +02:00
Marek Olšák
30c3e8718d mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies
Not needed with do_dead_builtin_varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
74edd56927 st/mesa: disable EXT_separate_shader_objects
The extension disallows elimination of set-but-unused varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
b3d8b4c0b4 glsl/linker: eliminate unused and set-but-unused built-in varyings
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
3c555827c3 glsl/linker: check against varying limit after unused varyings are eliminated
We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
284d954912 glsl/linker: link shaders in the opposite order (from fragment to vertex)
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or less
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
030ca230e2 mesa: renumber shader indices according to their placement in pipeline
See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00