Commit graph

57138 commits

Author SHA1 Message Date
Rob Clark
ba6a490bbc freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-06-13 15:20:34 -04:00
Rob Clark
3394900dd3 freedreno: try for more squarish tile dimensions
Worth about ~0.5fps in xonotic, for example.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-06-13 15:20:34 -04:00
Rob Clark
6aeeb706d2 freedreno: fix for null textures
Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot.  In particular 0ad triggers
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-06-13 15:20:34 -04:00
Roland Scheidegger
2ea8e2fccf llvmpipe: increase number of queries which can be binned simultaneously to 64
Gallium (but not OpenGL) does allow nesting of queries, but there's no
limit specified (d3d10 has no limit neither). Nevertheless, for practical
purposes we need some limit in llvmpipe, otherwise we'd need more complex
handling of queries as we need to keep track of all binned queries (this
only affects queries which gather data past setup). A limit of 16 is too
small though, while 64 would suffice.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-06-13 20:08:39 +02:00
Bruno Jiménez
03aab2af16 radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
v2:
    Add RADEON_INFO_ACTIVE_CU_COUNT as a define, as suggested by
    Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-06-13 10:59:30 -04:00
Neil Roberts
b8d15ca5e8 Remove _mesa_is_type_integer and _mesa_is_enum_format_or_type_integer
The comment for _mesa_is_type_integer is confusing because it says that it
returns whether the type is an “integer (non-normalized)” format. I don't
think it makes sense to say whether a type is normalized or not because it
depends on what format it is used with. For example, GL_RGBA+GL_UNSIGNED_BYTE
is normalized but GL_RGBA_INTEGER+GL_UNSIGNED_BYTE isn't. If the normalized
comment is just a mistake then it still doesn't make much sense because it is
missing the packed-pixel types such as GL_UNSIGNED_INT_5_6_5. If those were
added then it effectively just returns type != GL_FLOAT.

That function was only used in _mesa_is_enum_format_or_type_integer. This
function effectively checks whether the format is non-normalized or the type
is an integer. I can't think of any situation where that check would make
sense.

As far as I can tell neither of these functions have ever been used anywhere
so we should just remove them to avoid confusion.

These functions were added in 9ad8f431b2.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-06-13 15:54:46 +01:00
Bruno Jiménez
2a0dffa0c9 clover: query driver for the max number of compute units
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-06-12 19:09:32 -04:00
Bruno Jiménez
8f4d37889c gallium: Add PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-06-12 19:08:06 -04:00
Bruno Jiménez
4f70d83089 r600g/compute: solve a bug introduced by 2e01b8b440
That commit made possible that the items could be one just
after the other when their size was a multiple of ITEM_ALIGNMENT.
But compute_memory_prealloc_chunk still looked to leave a gap
between items. Resulting in that we got an infinite loop when
trying to add an item which would left no space between itself and
the next item.

Fixes piglit test: cl-custom-r600-create-release-buffer-bug
And the test for alignment I have just sent:
http://lists.freedesktop.org/archives/piglit/2014-June/011135.html

Sorry about this.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-06-12 15:52:08 -04:00
Niels Ole Salscheider
607bc89970 egl/gallium: Set defines for supported APIs when using automake
This fixes automake builds which are broken since
b52a530ce2.

v2: This patch also adds the FEATURE_* defines back to targets/egl-static for
Android and Scons that have been removed in the mentioned commit.

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79885
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-12 18:07:20 +01:00
Courtney Goeltzenleuchter
0406f59eeb mesa: glx: Reduce error log level
The code that parses LIBGL_DRIVERS_PATH was printing an
error for every attempted dlopen. It's not an error to
have to check multiple items in the path, only an error if
no suitable library is found. Reduced the load error to
a warning to match behavior of dynamic linker.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-12 10:19:00 -06:00
Brian Paul
33f273778b cso: fix stream-out clean up in cso_release_all()
Use the has_streamout flag as we do elsewhere to check if we need
to call pipe->set_stream_output_targets().  The driver might implement
the set_stream_output_targets() function, but not for all hardware
configurations.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-06-12 13:23:56 +01:00
Neil Roberts
765efeef88 i965: Set the fast clear color value for texture surfaces
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.

https://bugs.freedesktop.org/show_bug.cgi?id=79729

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
2014-06-12 11:24:04 +01:00
Chris Forbes
2c79aa8272 glsl: Fix typo in comment.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-06-12 21:19:24 +12:00
Kenneth Graunke
3e71258023 i965: Fix disassembly of BLORP clear programs.
Too many levels of indirection.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-12 00:56:08 -07:00
Kenneth Graunke
b207caf9bc i965/fs: Move FB write default state mashing in a level.
We only need to alter the default state if we're emitting MOVs for
header related fields.  So, we can simply move the push/pop of state in
to the if (header_present) block, bypassing it in the common case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
2014-06-12 00:56:08 -07:00
Kenneth Graunke
a2ad771671 i965: Fix Haswell discard regressions since Gen4-5 line AA fix.
In commit dc2d3a7f5c, Iago accidentally
moved fire_fb_write() above the brw_pop_insn_state(), which caused the
SEND to lose its predication and change from WE_normal to WE_all.
Haswell uses predicated SENDs for discards, so this broke Piglit's
tests for discards.

We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked,
but the actual FB write itself should respect those.  So, pop state
first, and force it again around the single MOV.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
2014-06-12 00:56:08 -07:00
Michel Dänzer
be5e5b6c93 gbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR
GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to
build, but it no longer rejects widths or heights other than 64.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-12 16:13:39 +09:00
Matt Turner
2c8520c03d i965: Use brw->gen in some generation checks.
Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2014-06-11 20:57:10 -07:00
Matt Turner
f51a7e00da i965/fs: Clean up tabs in brw_fs_cse.cpp.
I'm adding vec4 CSE, and I want to diff the files.
2014-06-11 20:09:22 -07:00
Robert Bragg
c6f118484c meta: save and restore swizzle for _GenerateMipmap
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap otherwise we may loose components and effectively
apply the swizzle twice by the time these levels are sampled.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-11 21:38:01 +01:00
Ian Romanick
63117ac329 i965/vec4: Emit smarter code for b2f of a comparison
Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float.  Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.

No piglit regressions on Ivybridge.

total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs:     136533 -> 133671 (-2.10%)
GAINED:                                0
LOST:                                  0

Programs that are affected appear to save between 1 and 5 instuctions
(just by skimming the output from shader-db report.py.

v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes).  Remove
extraneous fix_3src_operand (suggested by Matt).  The latter change
required swapping the order of the operands and using predicate_inverse.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-11 12:00:24 -07:00
Ian Romanick
be0452b049 i965/vec4: Silence a couple unused parameter warnings
brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter]
brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-11 12:00:20 -07:00
Ian Romanick
014d45f137 glsl: Store gl_uniform_driver_storage::format as the actual type
And delete the incorrect comment.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-06-11 11:26:05 -07:00
Dave Airlie
0d89448662 softpipe: fix pt->resource assert placement
oops meant to move this.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 14:03:11 +10:00
Dave Airlie
9bc12ef241 softpipe: enable AMD_vertex_shader_layer.
This passes tests now on softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:21 +10:00
Dave Airlie
8dede2fa6c softpipe: enable GLSL 3.30 support.
This enables GL3.3 on softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:17 +10:00
Dave Airlie
c82d227edd softpipe: bump the softpipe geometry limits
This just aligns the limits with llvmpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:08 +10:00
Dave Airlie
7ea04f089b tgsi_exec: use defines for max inputs/outputs
This fixes the limits for GL 3.2, and subsequently fixes
some segfaults in some varying packing tests and max varying tests
after the limits bumped.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:04 +10:00
Dave Airlie
740d5bed77 softpipe: add layered rendering support.
This adds support for GL 3.2 layered rendering to softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:30 +10:00
Dave Airlie
dc8fc39ada softpipe: add layering to the surface tile cache.
This adds the layer info to the tile cache.

This changes clear_flags to be dynamically allocated as
MAX_LAYERS seems like a too big step.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:30 +10:00
Dave Airlie
5a57248541 softpipe: add depth clamping support. (v2)
This passes the piglit depth clamp tests.

this is required for GL 3.2.

v2: move min/max up one level, could go further, thanks
to Roland for suggestion.

v1: Reviewed-by: Brian Paul <brianp@vmware.com>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:07 +10:00
Dave Airlie
a4670de0a0 tgsi/gs: bound max output vertices in shader
This limits the number of emitted vertices to the shaders max output
vertices, and avoids us writing things into memory that isn't big
enough for it.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:19:37 +10:00
Jon Ashburn
10e8d55799 i965: Add GPU BLIT of texture image to PBO in Intel driver
Add Intel driver hook for glGetTexImage to accelerate the case of reading
texture image into a PBO.  This case gets huge performance gains by using
GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed
by memcpy.

No regressions on Piglit tests  with Intel driver.
Performance gain (1280 x 800 FBO, Ivybridge):
glGetTexImage + glMapBufferRange  with patch 1.45 msec
glGetTexImage + glMapBufferRange without patch 4.68 msec

v3: (by Kenneth Graunke)
 - Fix compile after Eric's change to drop the tiling argument
   to intel_miptree_create_for_bo.
 - Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit
   regressions.
 - Squash in several whitespace and coding style fixes.
2014-06-10 18:36:44 -07:00
Kenneth Graunke
237aac39b1 i965: Invalidate live intervals when inserting Gen4 SEND workarounds.
We need to invalidate the live intervals when inserting new
instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-06-10 16:38:27 -07:00
Kenneth Graunke
ecc78eab11 i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.

Fixes random crashes, as well as valgrind errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-06-10 16:38:27 -07:00
Kenneth Graunke
fc19c4aaf1 meta: Label the meta GLSL clear program.
Giving the meta clear program a meaningful name makes it easier to find
in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time.  We already
did so for integer programs, but neglected to label the primary program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-10 16:38:27 -07:00
Kenneth Graunke
2bcd24c9f0 i965/fs: Combine generate_math[12]_gen6 methods.
These used to call different math emitters (brw_math vs. brw_math2).
Now that they both call gen6_math, they're virtually identical.

When unrolling SIMD16 to multiple SIMD8 operations, we should take care
not to apply sechalf to brw_null_reg for src1.  Otherwise, we'd end up
with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's
valid.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:27 -07:00
Kenneth Graunke
35e48bd618 i965/fs: Drop the generate_math[12]_gen7 methods.
These functions are basically identical, so we should combine them.
However, they're so trivial, we may as well just fold them into their
only call sites.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
f3ddd71f28 i965/vec4: Combine generate_math[12]_gen6 methods.
These are trivial to combine: we should just avoid checking the second
operand if it's brw_null_reg.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
5260a26e92 i965/vec4: Drop the generate_math2_gen7() method.
It's now a single line of code, so we may as well fold it into the
caller.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
b003fc265f i965: Rename brw_math to gen4_math.
Usually, I try to use "brw" for functions that apply to all generations,
and "gen4" for dead end/legacy code that is only used on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
de65ec2fde i965: Split Gen4-5 and Gen6+ MATH instruction emitters.
Our existing functions, brw_math and brw_math2, had unclear roles:

Gen4-5 used brw_math for both unary and binary math functions; it never
used brw_math2.  Since operands are already in message registers, this
is reasonable.

Gen6+ used brw_math for unary math functions, and brw_math2 for binary
math functions, duplicating a lot of code.  The only real difference was
that brw_math used brw_null_reg() for src1.

This patch improves brw_math2's assertions to allow both unary and
binary operations, renames it to gen6_math(), and drops the Gen6+ code
out of brw_math().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
7b9cf79790 i965: Make src_reg::equals() take a constant reference, not a pointer.
This is more typical C++ style.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
000f4a33c0 i965: Don't set the "switch" flag on control flow instructions on Gen6+.
Thread switching on control flow instructions is a documented workaround
for Gen4-5 errata.  As far as I can tell, it hasn't been needed since
Sandybridge.  Thread switching is not free, so in theory this may help
performance slightly.

Flow control instructions with the "switch" flag cannot be compacted, so
removing it will make these instructions compactable.  (Of course, we
still have to implement compaction for flow control instructions...)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke
3a439534de i965/fs: Allow CSE on math opcodes on Gen6+.
total instructions in shared programs: 2081469 -> 2081248 (-0.01%)
instructions in affected programs:     22606 -> 22385 (-0.98%)
No programs were hurt by this patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-06-10 16:38:25 -07:00
Thomas Helland
2c9a1518a1 glsl: Remove unused include in expr.flatt.
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:52 -07:00
Thomas Helland
10e00611c2 glsl: Remove unused include in ir.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland
8e1e68119c glsl: Remove unused include from ir_constant_expression.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland
068d30655c glsl: Remove unused include from ir_basic_block.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00