57386 commits

Author SHA1 Message Date
Marek Olšák
b3d8b4c0b4 glsl/linker: eliminate unused and set-but-unused built-in varyings
This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
3c555827c3 glsl/linker: check against varying limit after unused varyings are eliminated
We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
Marek Olšák
284d954912 glsl/linker: link shaders in the opposite order (from fragment to vertex)
This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
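The chain reaction described in 284d954912 above can be illustrated with a small backward pass over the pipeline. This is only a sketch under stated assumptions, not the linker's actual code: `producer_stage`, `deps` and `mark_live_outputs` are hypothetical names, varyings are modeled as bits in a mask, and output bit N of one stage is assumed to feed input bit N of the next.

```c
#include <stdint.h>

#define MAX_VARYINGS 32

/* Hypothetical model of a stage that writes varyings (e.g. VS or GS).
 * deps[o] is the mask of this stage's inputs needed to compute output o. */
struct producer_stage {
   uint32_t deps[MAX_VARYINGS];
};

/* Walk from the last producer back to the first. used_by_last_consumer is
 * the mask of inputs the final stage (e.g. the FS) actually reads.
 * live_outputs[i] receives the outputs of stage i that must be kept. */
static void
mark_live_outputs(const struct producer_stage *stages, unsigned num_producers,
                  uint32_t used_by_last_consumer, uint32_t *live_outputs)
{
   uint32_t live = used_by_last_consumer;

   for (unsigned i = num_producers; i-- > 0;) {
      uint32_t needed_inputs = 0;

      live_outputs[i] = live;

      /* Only inputs feeding a live output stay live; these become the
       * live outputs of the previous stage. */
      for (unsigned o = 0; o < MAX_VARYINGS; o++) {
         if (live & (1u << o))
            needed_inputs |= stages[i].deps[o];
      }
      live = needed_inputs;
   }
}
```

With a VS and a GS as producers, an FS that reads only input 0 leaves only GS output 0 live, and therefore only the VS outputs that GS output 0 depends on. Processing the stages in this order is what lets a single unused FS input propagate all the way back.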
Marek Olšák
030ca230e2 mesa: renumber shader indices according to their placement in pipeline
See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-02 17:02:14 +02:00
José Fonseca
84f367e69a gallivm: Simplify intrinsic name construction.
Just noticed this could be slightly shortened when fixing MSVC build.

Trivial.
2013-07-02 13:12:31 +01:00
Kenneth Graunke
15ca0ca1b6 glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.
This patch adds texture() for isamplerCubeArray and usamplerCubeArray,
which were entirely missing.

It also makes texture() with a LOD bias fragment shader specific.  The
main GLSL specification explicitly says that texturing with LOD bias
should not be allowed for vertex shaders.

Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert,
which tries to use bias in a vertex shader.  Currently, it expects this
to pass (so this patch regresses the test), but I've sent a patch to
reverse the expected behavior (so this patch would fix the updated test):
http://lists.freedesktop.org/archives/piglit/2013-June/006123.html

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2013-07-02 01:01:30 -07:00
José Fonseca
4c859901ce gallivm: Fix MSVC build. 2013-07-02 06:41:32 +01:00
José Fonseca
e621ec816d gallivm: Fix indirect immediate registers.
If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fallback to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2013-07-02 06:30:06 +01:00
Zack Rusin
70bc43acdb gallium/tests: fix the translate test 2013-06-28 09:43:17 -04:00
Anuj Phogat
722721d718 i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
This patch enables the ext_framebuffer_multisample_blit_scaled extension
on Intel h/w >= gen6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
Anuj Phogat
6fc3da2da0 i965/blorp: Add bilinear filtering of samples for multisample scaled blits
The current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Nearest
filtering produces blocky artifacts and negates the benefits of MSAA,
which is why the extension was not enabled on i965.

This patch implements bilinear filtering of samples in the blorp engine.
Images generated with this patch are free from blocky artifacts and show
a big improvement in visual quality.

Observed no piglit or gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

V4:
- Clip texcoords and use conditional MOVs.
- Send texture dimensions as push constants.
- Remove the optimization in case of scaled multisample blits.

V5:
- Move mcs_fetch() inside the 'for' loop after computing pixel coordinates.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-07-01 15:21:25 -07:00
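For readers unfamiliar with the filtering mentioned in 6fc3da2da0 above, here is a minimal scalar sketch of bilinear interpolation between four neighbouring samples. It is not the blorp implementation, which generates GPU code and works on multisample surfaces; `mix_f` and `bilinear` are hypothetical names, and the samples are assumed to sit on a rectangular grid, as the V3 note states.

```c
/* Linear interpolation between a and b with weight t in [0, 1]. */
static float
mix_f(float a, float b, float t)
{
   return a + (b - a) * t;
}

/* Blend four samples on a rectangular grid. s00/s10 are the top-left and
 * top-right samples, s01/s11 the bottom-left and bottom-right ones.
 * fx and fy are the fractional positions between them, each in [0, 1]. */
static float
bilinear(float s00, float s10, float s01, float s11, float fx, float fy)
{
   float top = mix_f(s00, s10, fx);
   float bottom = mix_f(s01, s11, fx);

   return mix_f(top, bottom, fy);
}
```

Nearest filtering would pick exactly one of the four samples, which is where the blocky look comes from. The bilinear version blends all four by distance.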
Ian Romanick
27f2df2507 docs: Import 9.1.4 release notes, add news item.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2013-07-01 14:48:58 -07:00
Zack Rusin
1c2e5c223d draw/translate: fix instancing
We were incorrectly computing the buffer offset when using
instances. The buffer offset is always equal to:
  start_instance * stride + (instance_num / instance_divisor) * stride
We were completely ignoring the start instance, quite often
producing instances that were completely wrong. For example, if
start instance = 5 and instance divisor = 2, then on the first
iteration the offset should be 5 * stride, not (5/2) * stride as
we'd have currently, and if start instance = 1 and instance
divisor = 3, then on the first iteration it should be 1 * stride,
not 0 as we'd have.
This fixes it and adjusts all the code to the changes.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 05:21:20 -04:00
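The offset formula from 1c2e5c223d above can be written out as a small helper. This is a sketch for illustration only: `instance_buffer_offset` is a hypothetical name, not a function from the draw or translate modules, and a divisor of 0 is assumed to be handled elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper; computes
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * The arithmetic is widened to 64 bits to keep the example simple;
 * overflow handling is out of scope here. */
static uint64_t
instance_buffer_offset(uint32_t start_instance, uint32_t instance_num,
                       uint32_t instance_divisor, uint32_t stride)
{
   /* Assumption: per-vertex data (divisor 0) never reaches this helper. */
   assert(instance_divisor != 0);

   uint64_t step = instance_num / instance_divisor;

   return ((uint64_t)start_instance + step) * stride;
}
```

With a stride of 16, the two cases from the commit message give `instance_buffer_offset(5, 0, 2, 16) == 80` (5 * stride) and `instance_buffer_offset(1, 0, 3, 16) == 16` (1 * stride). The start instance is added after the division, never divided itself.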
Zack Rusin
df4ab7974a draw: fix incorrect clipper invocation statistics
Clipper invocations are computed earlier (of course
before the emission) so this code was adding bogus
numbers to already computed clipper invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:29 -04:00
Zack Rusin
34546d61c1 draw/gallivm: export overflow arithmetic to its own file
We'll be reusing this code so let's put it in a common file
and use it in the draw module.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:24 -04:00
Zack Rusin
88de009cc1 draw: check for integer overflows in instance computation
Integers could easily overflow if the starting instance
was large enough. Instead of letting bogus counts through,
set the instance to max if it overflowed and let our
regular buffer overflow computation handle it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:20 -04:00
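The "set the instance to max" approach from 88de009cc1 above amounts to a saturating add. Here is a minimal sketch in plain C, assuming 32-bit unsigned instance indices. `saturating_instance` is a hypothetical name, and the real change operates on generated LLVM IR rather than C code like this.

```c
#include <stdint.h>

/* Hypothetical helper: add an instance step to the start instance, and
 * clamp to UINT32_MAX instead of wrapping around on overflow. */
static uint32_t
saturating_instance(uint32_t start_instance, uint32_t instance_step)
{
   uint32_t sum = start_instance + instance_step;

   /* Unsigned addition wrapped if the result is smaller than an operand. */
   if (sum < start_instance)
      return UINT32_MAX;

   return sum;
}
```

A clamped index is far enough out of range that an ordinary buffer bounds check rejects it, so no separate error path is needed.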
Zack Rusin
2f13f28120 draw: check for an integer overflow when computing stride
Our buffer overflow arithmetic was susceptible to integer
overflows, which caused the buffer overflow logic to break.
Let's use the LLVM overflow intrinsics to check for integer
overflows while computing the stride/needed buffer size.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:16 -04:00
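As a rough C analogue of the check described in 2f13f28120 above: the helpers below return a wrapped result plus an overflow flag, which is the same contract as LLVM's `llvm.umul.with.overflow` and `llvm.uadd.with.overflow` intrinsics. All function names here are hypothetical, and the `index * stride + elem_size` formula for the needed size is an assumption made for the example, not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiply with overflow flag: *result gets the wrapped product. */
static bool
umul_overflow(uint32_t a, uint32_t b, uint32_t *result)
{
   uint64_t wide = (uint64_t)a * b;

   *result = (uint32_t)wide;
   return wide > UINT32_MAX;
}

/* Add with overflow flag: *result gets the wrapped sum. */
static bool
uadd_overflow(uint32_t a, uint32_t b, uint32_t *result)
{
   *result = a + b;
   return *result < a;
}

/* Assumed formula: needed = index * stride + elem_size.
 * Returns false if any intermediate step overflowed. */
static bool
needed_buffer_size(uint32_t index, uint32_t stride, uint32_t elem_size,
                   uint32_t *needed)
{
   uint32_t offset;
   bool overflow = umul_overflow(index, stride, &offset);

   overflow |= uadd_overflow(offset, elem_size, needed);
   return !overflow;
}
```

Without the flags, `0x10000 * 0x10000` would wrap to 0 and pass a later bounds check. Carrying the overflow bit along lets the caller treat such a fetch as out of bounds.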
Zack Rusin
e742f7788e draw: account for elem size when computing overflow
We weren't taking into account the size of the element
to be fetched, which meant that it was possible
to overflow the buffer reads if the stride was very
close to the end of the buffer, e.g. stride = 3, buffer
size = 4, and the size of the element to be read = 4. This
should be properly detected as an overflow.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-06-28 04:24:12 -04:00
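The example in e742f7788e above can be checked with a short bounds test. This is a sketch assuming byte offsets and sizes, and `fetch_overflows` is a hypothetical name. The comparison is written as a subtraction so the check itself cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does reading elem_size bytes starting at offset
 * run past the end of a buffer of buffer_size bytes? */
static bool
fetch_overflows(uint32_t offset, uint32_t elem_size, uint32_t buffer_size)
{
   if (elem_size > buffer_size)
      return true;

   /* Equivalent to offset + elem_size > buffer_size, without wrapping. */
   return offset > buffer_size - elem_size;
}
```

With stride = 3 the second element starts at offset 3, which is inside a 4-byte buffer, so a check on the offset alone passes. Reading 4 bytes from there ends at byte 7, and `fetch_overflows(3, 4, 4)` reports the overflow.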
Vinson Lee
7214fe3cc4 i965: Initialize brw_blorp_const_color_program member variables.
Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-07-01 10:16:16 -07:00
Ross Burton
2c6186390c eglplatform: use unsigned long instead of 32-bit ints in generic platform
In the generic Unix case use the "unsigned long" type instead of 32-bit
integers so that the type sizes are consistent on 64-bit machines between X11
and not-X11.

Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:24 -07:00
Ross Burton
1a7275de9a build: fix EGL build when no X11 headers are present
eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a
build without any X11 support tell it so that we don't try including headers
that don't exist.

Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications
using EGL don't try to pull in X11 headers on systems where EGL was configured
without X11 support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-07-01 10:06:11 -07:00
José Fonseca
acc6a141b8 tools/trace: Return dummy fence object to silence warnings. 2013-07-01 12:06:58 +01:00
José Fonseca
0fd71ac9eb tools/trace: Don't crash if a trace has no timing information. 2013-07-01 12:05:57 +01:00
José Fonseca
fa3040c117 scons: Fix dependencies of enums.c and api_exec.c. 2013-07-01 12:04:59 +01:00
Maarten Lankhorst
bf95ca7de0 nvc0: allow frame dropping in h264
The only reason the checks existed was paranoia: when I first
wrote the code, I wasn't sure it was correct. Now that I am, and
since the asserts triggered when XBMC was dropping frames, remove them.

NOTE: This is a candidate for the 9.1 branch.
2013-07-01 08:47:49 +02:00
Tom Stellard
24fa43675f r300g/compiler: Prevent regalloc from swizzling texture operands v2
https://bugs.freedesktop.org/show_bug.cgi?id=63520

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
e2c3640540 r300g/compiler/tests: Add an assembly parser
The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:38:57 -07:00
Tom Stellard
ab40d8d56f r300g: Fix make check
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2013-06-30 21:24:55 -07:00
Grigori Goronzy
30004b20c2 r600g: implement fast color clears for MSAA on evergreen+
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.

Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMASK is required, we have to be able to create a clear
color value for the format and the texture mustn't contain multiple
images. Technically, it should be possible to support array textures
and cubemaps if all images are attached to the framebuffer,
but this does not appear to be common.

v2: fix fast clear check
v3: Marek: - disable fast clear with 128-bit formats, which are unsupported
           - set tex->dirty_level_mask in r600_clear, so that the driver knows
             the resource must be decompressed/expanded
           - return early from r600_clear if there's nothing else to do

Signed-off-by: Marek Olšák <maraeo@gmail.com>
2013-07-01 03:02:43 +02:00
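The fast clear conditions listed in 30004b20c2 above can be read as a predicate over all bound colorbuffers. The sketch below is only a paraphrase of the commit message, not the r600g code: `colorbuffer_info` and `can_fast_clear` are hypothetical names, and "a clear color value can be created for the format" is approximated by the 128-bit exclusion from the v3 note.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties the commit message mentions. */
struct colorbuffer_info {
   bool has_cmask;          /* MSAA colorbuffers get a CMASK automatically */
   unsigned bits_per_pixel; /* 128-bit formats are excluded (v3 note) */
   unsigned num_images;     /* array textures and cubemaps have several */
};

/* Fast clear is allowed only if every bound colorbuffer qualifies. */
static bool
can_fast_clear(const struct colorbuffer_info *cbufs, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      if (!cbufs[i].has_cmask)
         return false;
      if (cbufs[i].bits_per_pixel >= 128)
         return false;
      if (cbufs[i].num_images != 1)
         return false;
   }
   return true;
}
```

When the predicate holds, the commit sets the clear color and the CMASK to the cleared state instead of writing the buffer contents.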
Marek Olšák
b1693194ee r600g/compute: disable unused colorbuffer slots
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
2013-07-01 03:02:43 +02:00
Marek Olšák
f83e220d36 st/mesa: handle SNORM formats in generic CopyPixels path
v2: check desc->is_mixed in util_format_is_snorm
2013-06-30 22:14:37 +02:00
Matt Turner
adf8afa168 i965: NULL check depth_mt to quiet static analysis.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-06-29 15:19:08 -07:00
Roland Scheidegger
7d430bfab9 llvmpipe: fix timer query if there's no bins
b04a295a4a removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization query code filling this in is still
never executed.
So fix this up by filling in some timestamp, but do it at EndQuery time
not GetQuery time which should be more appropriate.
Makes piglit arb_timer_query-timestamp-get happy again.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-29 16:58:02 +02:00
Tom Stellard
5a925cc550 clover: Don't segfault when compiling a program with no kernel 2013-06-28 15:19:06 -07:00
Eric Anholt
d7361f2943 mesa: Remove unused allow_large_textures driconf from classic drivers.
This option hasn't been used since the introduction of DRI2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:27 -07:00
Kenneth Graunke
03600660a1 i915: Remove GLES 3.0 sRGB workaround.
Gen3 doesn't support GLES 3.0, so there's no need for it.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
dc8796506e i965: Remove is_945.
Only relevant on Gen3.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
a4e31956ac i965: Delete hw_stencil flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
4299e35888 i965: Remove hw_stipple flag.
This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1a5dca38e9 i965: Remove use_early_z option.
This was only used by i965+.

v2: Also remove the option from the driconf list. (change by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2cc5724db2 i965: Remove unused SUBPIXEL_* macros.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
2e9fe0ca12 i965: Remove redundant Gen3 PCI IDs.
Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Kenneth Graunke
1811f5c43d intel: Remove unused INTEL_MAX_FIXUP macro.
v2: Remove it from i915, too (change by anholt)

Acked-by: Eric Anholt <eric@anholt.net>
2013-06-28 13:35:26 -07:00
Eric Anholt
0ac0a1b02e i965: Drop i915 register/instruction definitions.
v2: Remove unused DV_PF_* macros, too. (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:26 -07:00
Eric Anholt
1b67cd29a1 i965: Drop code for calling the empty brw_update_draw_buffers() hook.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
7c232189c5 i965: Drop dead i915 blend state code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
d58d0a3754 i965: Drop i915-specific blit clear code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
cf31a19300 i965: Drop the system-memory VBO support for i915.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
814440aadd i965: Drop i915 swtnl code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00
Eric Anholt
bb2e312d4d i965: Drop i915-specific vtbl entries.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-06-28 13:35:25 -07:00