Commit graph

59400 commits

Author SHA1 Message Date
Matt Turner
20d0297ff2 i965/fs: Add reads_flag() and writes_flag() to fs_inst.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Matt Turner
f768f998e0 i965/fs: Add is_null() method to fs_reg.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 19:49:27 -07:00
Eric Anholt
8dfc9f038e i965/fs: Use the gen7 scratch read opcode when possible.
This avoids a lot of message setup we had to do otherwise.  Improves
GLB2.7 performance with register spilling force enabled by 1.6442% +/-
0.553218% (n=4).

v2: Use BRW_PREDICATE_NONE, improve a comment (by Paul).

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:19 -07:00
Eric Anholt
6032261682 i965: Merge together opcodes for SHADER_OPCODE_GEN4_SCRATCH_READ/WRITE
I'm going to be introducing gen7 variants, and the previous naming was
going to get confusing.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:17 -07:00
Eric Anholt
32182bb004 i965/fs: Fix register unspills from a reg_offset.
We were clearing the reg_offset before trying to use it.  Oops.  Fixes
glsl-fs-texture2drect with the reg spilling debug enabled.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:15 -07:00
Eric Anholt
0e20051f54 i965/fs: Fix register spilling for 16-wide.
Things blew up when I enabled the debug register spill code without
disabling 16-wide, so I decided to just fix 16-wide spilling.

We still don't generate 16-wide when register spilling happens as part of
allocation (since we expect it to be slower), but now we can experiment
with allowing it in some cases in the future.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:51:10 -07:00
Eric Anholt
537f183fe6 i965/fs: Exit the compile if spilling would overwrite in-use MRFs.
I believe this will never happen in SIMD8 mode, but it could for SIMD16
when we fix it.

v2: Fix off-by-one in my register counting comment (caught by Paul).

Reviewed-by: Paul Berry <stereotype441@gmail.com> (v1)
2013-10-30 17:51:02 -07:00
Eric Anholt
44ec2f1751 i965/fs: Fix broken register spilling debug code.
Now that reg spilling generates new vgrfs, we were looping forever if you
ever turned it on.

Instead, move the debug code into the register allocator right near where
we'd be doing spilling anyway, which should more accurately reflect how
register spilling occurs in the wild.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:59 -07:00
Eric Anholt
b3f6690406 i965/fs: Split "find what MRFs were used" to a helper function.
I'm going to need to reuse this for fixing register spilling on SIMD16.
Note that BRW_MAX_MRF is 16, which is the same as BRW_MAX_GRF -
GEN7_MRF_HACK_START.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:56 -07:00
Eric Anholt
32ac5634d6 i965/fs: Update an ancient, wrong comment about reg_offset.
This hasn't been true since SIMD16 mode was added.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 17:50:51 -07:00
Kai Wasserbäch
bbb77fc2f1 radeonsi: Allow longer intrinsic names
Fixes a boat load of Piglit tests for me, which crashed like fdo#70913
before.

Thanks to Michel Dänzer for the tip.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70913
Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-30 16:40:06 -07:00
Tom Stellard
193594a1b8 clover: Don't install headers when using the icd
The ICD loader should be responsible for installing headers.

Reviewed and Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2013-10-30 16:40:06 -07:00
Tom Stellard
6f3465f340 radeon/llvm: Specify the DataLayout when running optimizations
Without DataLayout, a lot of optimization passes aren't run and the ones
that are don't work as well.
2013-10-30 16:40:06 -07:00
Eric Anholt
20dbeadd83 i965/fs: Prefer more-critical instructions of the same age in LIFO scheduling.
When faced with a million instructions that all became candidates at the
same time (none of which individually reduce register pressure), the ones
on the critical path are more likely to be the ones that will free up some
candidates soon.

shader-db:
total instructions in shared programs: 1681070 -> 1681070 (0.00%)
instructions in affected programs:     0 -> 0
GAINED:                                40
LOST:                                  74

Fixes indistinguishable-from-hanging behavior in GLES3conform's
uniform_buffer_object_max_uniform_block_size test, regressed by
c3c9a8c857.  Given that 93bd627d5a was unlocked by that commit, the
net effect on 16-wide program count is still quite positive, and I think
this should give us more stable scheduling (less dependency on original
instruction emit order).

v2: Comment suggestions by Paul

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70943
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 15:46:54 -07:00
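The tie-break described in 20dbeadd83 can be sketched roughly as below. This is a minimal illustration, not the i965 scheduler code: the `Candidate` class, its `ready_generation` and `delay` fields, and `choose_lifo` are hypothetical names, and `delay` is assumed to be the length of the critical path from the instruction to the end of the block.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    ready_generation: int  # hypothetical: larger means it became a candidate more recently
    delay: int             # hypothetical: critical-path length to the end of the block


def choose_lifo(candidates: List[Candidate]) -> Candidate:
    """Pick the newest candidate; among equally new ones, prefer the most critical."""
    return max(candidates, key=lambda c: (c.ready_generation, c.delay))
```

With many candidates of the same age, the secondary key makes the choice depend on the dependency graph rather than on the order in which instructions were emitted.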
Eric Anholt
017361dd37 i965: Compute the node's delay time for scheduling.
This is a step in doing scheduling as described in Muchnick (p538).  A
difference is that our latency function is only specific to one
instruction (it doesn't describe, for example, the different latency
between WAR of a send's arguments and RAW of a send's destination), but
that's changeable later.  We also don't separately compute the postorder
traversal of the graph, since we can use the setting of the delay field as
the "visited" flag.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 15:46:48 -07:00
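The delay computation described in 017361dd37 might look something like the sketch below. It is an assumption-laden illustration rather than the driver's code: `Node`, `latency`, `children`, and `compute_delay` are hypothetical names, the latency depends only on the instruction itself (as the commit message notes), and an unset `delay` doubles as the "not visited" marker so no separate postorder list is built.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    latency: int                                  # assumed per-instruction latency
    children: List["Node"] = field(default_factory=list)
    delay: Optional[int] = None                   # None means "not visited yet"


def compute_delay(node: Node) -> int:
    """Return the critical-path delay from this node to the end of the graph."""
    if node.delay is not None:                    # already visited: reuse the result
        return node.delay
    if not node.children:
        node.delay = node.latency                 # leaf: only its own latency
    else:
        node.delay = node.latency + max(compute_delay(c) for c in node.children)
    return node.delay
```

For a chain load (4) -> mul (2) -> store (1), the load's delay is 7; an add (1) feeding the same store has a delay of 2, so the load sits on the critical path.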
Emil Velikov
9eb3de1ce7 automake: handle expat version pre 2.1
Commit aec20d66d9
(automake: properly handle non-default expat installation),
assumed that up-to-date distributions use a recent version
of expat that handles security vulnerabilities CVE-2012-1147
and CVE-2012-1148. Seems like this is not always the case
and they prefer to backport only the fix, rather than use
the updated library.

This commit adds a default case -lexpat whenever expat is
not found, while properly handling expat.pc if present.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71022
Reported-By: Bryce Harrington <b.harrington@samsung.com>
Reported-By: Vinson Lee <vlee@freedesktop.org>
Tested-by: Bryce Harrington <b.harrington@samsung.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-30 22:05:42 +00:00
Ian Romanick
5cb80f0314 glsl: Move layout(location) checks to AST-to-HIR conversion
This will simplify the addition of layout(location) qualifiers for
separate shader objects.  This was validated with new piglit tests
arb_explicit_attrib_location/1.30/compiler/not-enabled-01.vert and
arb_explicit_attrib_location/1.30/compiler/not-enabled-02.vert.

v2: Refactor error checking to check_explicit_attrib_location_allowed
and eliminate the gotos.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
Ian Romanick
9d6294f5a2 glsl: Slightly restructure error generation in validate_explicit_location
Use mode_string to get the name of the variable mode.  Slightly change
the control flow.  Both of these changes make it easier to support
separate shader object location layouts.

The format of the message changed because mode_string can return a
string like "shader output".  This would result in an awkward message
like "vertex shader shader output..."

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
Ian Romanick
f8c579dc0f glsl: Make mode_string function globally available
I made this a function (instead of a method of ir_variable) because it
made the change set smaller, and I expect that there will be an overload
that takes an ir_var_mode enum.  Having both functions used the same way
seemed better.

v2: Add missing case for ir_var_system_value.

v3: Change the ir_var_mode_count case to just break.  Move the assertion
and the return outside the switch-statement.  In the unlikely event that
var->mode is an invalid value other than ir_var_mode_count, the
assertion will still fire, and in release builds we won't wind up
returning a garbage pointer.  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:30 -07:00
Ian Romanick
2cb760d994 glsl: Eliminate the global check in validate_explicit_location
Since the separation of ir_var_function_in and ir_var_shader_in (similar
for out), this check is no longer necessary.  Previously, global_scope
was the only way to tell which was which.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:29 -07:00
Ian Romanick
8f00a77fbc glsl: Extract explicit location code from apply_type_qualifier_to_variable
Future patches will add some extra code to this path, and some of that
code will want to exit from the explicit location code early.

v2: Change a geometry shader "break" to a "return" so that we don't try
to apply a bogus geometry shader location qualifier (which could cause
cascading errors).  Suggested by Paul.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-30 13:49:29 -07:00
Gregory Hainaut
0059d1948e mesa: Drop unused return value from use_shader_program
The return value has been unused since commit d348b0c.  This was
originally included in another patch, but it was split out by Ian
Romanick.

v2: Drop unnecessary final return.  Suggested by Paul.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: Eric Anholt <eric@anholt.net>
2013-10-30 13:49:29 -07:00
Fabio Pedretti
103824dc24 wayland: silence unused var warning
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-10-30 12:50:09 -07:00
Johannes Obermayr
5e162566db ilo: Fix out-of-tree build.
[olv: use $(srcdir) instead of $(top_srcdir)]
2013-10-30 21:17:10 +08:00
José Fonseca
26a8f76ba1 scons: Add missing dependencies to src/mapi/glapi/gen/*.xml
Incremental builds were failing because some generated source files
were missing dependencies on src/mapi/glapi/gen/*.xml.

Hopefully this change will be the end of these incremental build
failures.
2013-10-30 12:21:54 +00:00
Marek Olšák
e929e27737 glsl: fix crash introduced by the previous commit
2013-10-30 00:14:35 +01:00
Marek Olšák
7e414b5864 glsl: break the gl_FragData array into separate gl_FragData[i] variables
This avoids a defect in lower_output_reads.

The problem is lower_output_reads treats the gl_FragData array as a single
variable. It first redirects all output writes to a temporary variable (array)
and then writes the whole temporary variable to the output, generating
assignments to all elements of gl_FragData.

BTW this pass can be modified to lower all arrays, not just inputs and outputs.
The question is whether it is worth it.

Reviewed-by: Paul Berry <stereotype441@gmail.com>

v2: addressed Paul Berry's comments
2013-10-29 23:50:01 +01:00
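The defect that 7e414b5864 works around can be modelled with a toy sketch. This is not the lower_output_reads pass itself: `lowered_output_writes` and its arguments are hypothetical, and the model only assumes what the commit message states, namely that every written output variable is copied back from its temporary in full.

```python
from typing import Dict, List, Set, Tuple


def lowered_output_writes(
    variables: Dict[str, int], written: Set[Tuple[str, int]]
) -> List[Tuple[str, int]]:
    """variables maps an output name to its element count; written holds the
    (name, index) pairs the shader actually assigns. Returns the element
    writes emitted when each touched variable is copied back whole."""
    touched = {name for name, _ in written}
    writes: List[Tuple[str, int]] = []
    for name, count in variables.items():
        if name in touched:
            # The whole temporary is written back, element by element.
            writes.extend((name, i) for i in range(count))
    return writes
```

With one four-element `gl_FragData` array, a shader that writes only element 0 ends up with four output assignments; with four separate one-element variables, the same shader yields a single assignment.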
Emil Velikov
aec20d66d9 automake: properly handle non-default expat installation
Use PKG_CHECK_MODULES instead of requiring the user to set up the
option at configure time. Drop unused EXPAT_INCLUDE and
update all targets.

NOTE: This commit removes the --with-expat configure
option. One should ensure that the expat they wish to use
has an expat.pc file accessible by pkg-config.

v2:
* Add note about the removal of --with-expat
(per Tom Stellard)
* Drop EXPAT_CFLAGS for targets that do not build DRI_COMMON
(spotted by Matt Turner)
v3:
* Rebase on top of megadrivers (drop EXPAT_CFLAGS from swrast)

Acked-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v2)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>

Conflicts:
	configure.ac
	src/mesa/drivers/dri/common/Makefile.am
2013-10-29 21:14:41 +00:00
Emil Velikov
0828ad4e63 configure: use PKG_CONFIG variable over hardcoded pkg-config
Already available and used in other places of configure.ac.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
2a87647c6a targets/xorg-nouveau: drop usage of dri1 function DRICreatePCIBusID
It should never have been used in the first place, as it was
a leftover from the DRI1 days of the nouveau ddx. While we're at it,
check if KMS is supported before opening the nouveau device, and
add support for Fermi & Kepler cards.

Compile tested only due to the lack of a Fermi/Kepler card.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
c9e6e6382f gallium/targets/xorg: drop set but unused variable entity
The function xf86GetEntityInfo() retrieves the entity rather than
doing any changes. Remove this no-op code.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
ba3efd6b42 st/xorg: drop set but unused variables dxo, dyo
Commit a9f8baf00b removed the first and only use of the variables
but forgot to remove them.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:38 +00:00
Emil Velikov
2b7ffde8bd st/xorg: add sanity checks after malloc
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Emil Velikov
5c398e243c st/xorg: remove unnecessary headers
v2: Remove xf86PciInfo.h, all drivers provide their own PCI ID list

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 21:04:37 +00:00
Rob Clark
2bc1fc2fb6 freedreno: emulate unsupported primitive types
Use u_primconvert to convert unsupported primitives into supported
primitive plus index buffer.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Rob Clark
b881917088 gallium/auxiliary/indices: add u_primconvert
A convenient front end to indices generate/translate code, for emulating
primitives which are not supported natively by the driver.

This handles saving/restoring index buffer state, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
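As an illustration of emulating a primitive type with a supported one plus an index buffer, the sketch below rewrites a triangle fan as an indexed triangle list. It does not use or reproduce the u_primconvert API: `fan_to_triangle_indices` is a hypothetical helper, the `start` offset is an assumption, and provoking-vertex conventions are ignored.

```python
from typing import List


def fan_to_triangle_indices(count: int, start: int = 0) -> List[int]:
    """Build triangle-list indices for a fan of `count` vertices whose first
    vertex sits at index `start` in the vertex buffer."""
    indices: List[int] = []
    # Each triangle shares the fan's first vertex and two consecutive rim vertices.
    for i in range(1, count - 1):
        indices += [start, start + i, start + i + 1]
    return indices
```

A five-vertex fan becomes three triangles, `[0, 1, 2, 0, 2, 3, 0, 3, 4]`, which a driver without native fan support could then draw as an ordinary indexed triangle list.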
Rob Clark
28f3f8d413 gallium/auxiliary/indices: add start param
Add 'start' parameter to generator/translator.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-29 16:49:43 -04:00
Rob Clark
5127436a4a freedreno: update generated headers
pull in some fixes to draw-initiator/prim-type.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2013-10-29 16:49:43 -04:00
Eric Anholt
774b787d6b i965/fs: Drop our dead push constants before overflowing to pull constants.
The idea of the original order was that you'd dead code eliminate accesses
to push constants.  But I've never seen a case of that (nor has
shader-db), while we frequently see sparse accesses of large constant
arrays that would overflow into pull constants.

Cuts pull constant use on csgo, serious sam, planeshift, and the cave:

total instructions in shared programs: 1695103 -> 1688795 (-0.37%)
instructions in affected programs:     92024 -> 85716 (-6.85%)
GAINED:                                339
LOST:                                  0

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-10-29 13:43:01 -07:00
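The reordering in 774b787d6b can be pictured with the sketch below, which removes dead uniforms before deciding what overflows. It is a simplified model under stated assumptions: `assign_constants`, `live`, and `max_push` are hypothetical names, each uniform occupies one slot, and the real pass works on the driver's own data structures.

```python
from typing import Dict, Set, Tuple


def assign_constants(
    num_uniforms: int, live: Set[int], max_push: int
) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Map each live uniform index to a push slot, or to a pull slot once the
    push space is full. Dead uniforms get no slot at all."""
    push: Dict[int, int] = {}
    pull: Dict[int, int] = {}
    for u in range(num_uniforms):
        if u not in live:
            continue                      # dropped before any overflow decision
        if len(push) < max_push:
            push[u] = len(push)
        else:
            pull[u] = len(pull)
    return push, pull
```

For a sparsely accessed array of eight uniforms where only indices 0, 5, and 7 are live and two push slots exist, indices 0 and 5 are pushed and only index 7 falls back to a pull constant.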
Alexander von Gluck IV
9a9fb94ca9 haiku-softpipe: Minor cleanup and color space fixes
* Use more consistent data sources
* Fix improper color space assignments
* Remove unnecessary comments and code
* Drop unnecessary round_up function (this was leftover
  from moving winsys code out of renderer)

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:43 -05:00
Alexander von Gluck IV
439dd0e20a winsys: Correct Haiku winsys display target code
* Instead of assuming the displaytarget is the same
  stride / colorspace as the destination, let's
  actually check the source bitmap.
* Fixes random stride issues in rendering

Acked-by: Brian Paul <brianp@vmware.com>
2013-10-29 15:27:40 -05:00
Francisco Jerez
b8f89fc5cb clover: Use context device list for error checking in clGetProgramBuildInfo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70891.

Reported-by: Bruno Jiménez <brunojimen@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
e515dcbf96 i965: Simplify the shader time code by using atomic counter helpers.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
d58bd75263 i965: Add brw_reg constructors taking a dynamically determined vector width.
The MRF variant is going to be used extensively by the atomic counter
intrinsics to assemble untyped atomic and surface read messages
easily.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5e621cb9fe i965/gen7: Implement code generation for untyped surface read instructions.
2013-10-29 12:40:56 -07:00
Francisco Jerez
cfaaa9bbb7 i965/gen7: Implement code generation for untyped atomic instructions.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
5809512b17 i965: Implement ABO surface state emission.
The maximum number of atomic buffer objects is somewhat arbitrary; we
can easily change it in the future if it turns out not to be enough.

v2: Add comments with the relevant mesa dirty bits.  Fix usage of
    BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
v3: Update binding table layout diagrams.
v4: Resolve conflicts with the recent dynamic surface index assignment changes.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:56 -07:00
Francisco Jerez
c4e730e218 i965: Define vtbl method that initializes an untyped R/W surface.
And add Gen7 implementation.

v2: Fix off by one error in buffer size calculation.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
7a54db9ce5 glsl: Fix the function inlining pass to deal with general opaque arguments.
Almost a trivial change, it boils down to renaming a few identifiers
so their names still make sense for opaque types other than sampler.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00
Francisco Jerez
bbded5b5fe glsl: Add built-in functions and constants required for ARB_shader_atomic_counters.
v2: Represent atomics as GLSL intrinsics.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-10-29 12:40:55 -07:00