Commit graph

38860 commits

Author SHA1 Message Date
Luca Barbieri
2cdbced10d loop_unroll: unroll loops with (lowered) breaks
If the loop ends with an if-statement containing one break, or with a
single break, unroll it.  Loops that end with a continue will have that
continue removed by the redundant jump optimizer.  Likewise, loops that
end with an if-statement with a break at the end of both branches will
have the break pulled out after the if-statement.

Loops of the form

   for (...) {
      do_something1();
      if (cond) {
         do_something2();
         break;
      } else {
         do_something3();
      }
   }

will be unrolled as

   do_something1();
   if (cond) {
      do_something2();
   } else {
      do_something3();
      do_something1();
      if (cond) {
         do_something2();
      } else {
         do_something3();
         /* Repeat inserting iterations here. */
      }
   }

ir_lower_jumps can guarantee that all loops are put in this form
and thus all loops are now potentially unrollable if an upper bound
on the number of iterations can be found.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-09-13 16:20:40 -07:00
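The transformation described in 2cdbced10d above can be checked by hand on a small case. The following is a minimal sketch in plain C, not the pass itself: the function names, the bound of 3 iterations, and the arithmetic standing in for do_something1/2/3 are all assumptions made for illustration.

```c
/* Hypothetical example: a loop in the canonical "if (cond) { ...; break; }
 * else { ... }" form, with a known upper bound of 3 iterations. */
static int run_loop(const int v[3])
{
   int acc = 0;
   for (int i = 0; i < 3; i++) {
      acc += v[i];            /* stands in for do_something1() */
      if (acc > 10) {         /* stands in for cond */
         acc = 10;            /* stands in for do_something2() */
         break;
      } else {
         acc *= 2;            /* stands in for do_something3() */
      }
   }
   return acc;
}

/* The same computation unrolled by hand: each further iteration is
 * nested inside the else-branch of the previous one, and the breaks
 * disappear because nothing follows the taken branch. */
static int run_unrolled(const int v[3])
{
   int acc = 0;
   acc += v[0];
   if (acc > 10) {
      acc = 10;
   } else {
      acc *= 2;
      acc += v[1];
      if (acc > 10) {
         acc = 10;
      } else {
         acc *= 2;
         acc += v[2];
         if (acc > 10) {
            acc = 10;
         } else {
            acc *= 2;
         }
      }
   }
   return acc;
}
```

Both functions should return the same value for any input, whether the break is taken in the first iteration, a later one, or never.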
Ian Romanick
8f2214f489 glsl2: Add pass to remove redundant jumps 2010-09-13 14:25:26 -07:00
Ian Romanick
e79a1bb02a glsl: Explain file naming convention 2010-09-13 14:06:32 -07:00
Luca Barbieri
710d41131b loop_controls: fix analysis of already analyzed loops
The loop_controls pass didn't look at the counter values it put in ir_loop
on previous iterations, so while the first iteration worked, subsequent
ones couldn't determine max_iterations.
2010-09-13 13:03:10 -07:00
Ian Romanick
4de7a3b76a i965: Request that returns be lowered in shader main
Fixes piglit tests glsl-vs-main-return and glsl-fs-main-return.
2010-09-13 13:03:10 -07:00
Luca Barbieri
87708e8c90 glsl: call ir_lower_jumps according to compiler options 2010-09-13 13:03:09 -07:00
Luca Barbieri
3361cbac2a glsl: add continue/break/return unification/elimination pass (v2)
Changes in v2:
- Base class renamed to ir_control_flow_visitor
- Tried to comply with coding style

This is a new pass that supersedes ir_if_return and "lowers" jumps
to if/else structures.

Currently it causes no regressions on softpipe and nv40, but I'm not sure
whether the piglit glsl tests are thorough enough, so consider this
experimental.

It can be asked to:
1. Pull jumps out of ifs where possible
2. Remove all "continue"s, replacing them with an "execute flag"
3. Replace all "break" with a single conditional one at the end of the loop
4. Replace all "return"s with a single return at the end of the function,
   for the main function and/or other functions

This gives several great benefits:
1. All functions can be inlined after this pass
2. nv40 and other pre-DX10 chips without "continue" can be supported
3. nv30 and other pre-DX10 chips with no control flow at all are better supported

Note that for full effect we should also teach the unroller to unroll
loops with a fixed maximum number of iterations but with the canonical
conditional "break" that this pass will insert if asked to.

Continues are lowered by adding a per-loop "execute flag", initialized to
TRUE, that when cleared inhibits all execution until the end of the loop.

Breaks are lowered to continues, plus setting a "break flag" that is checked
at the end of the loop and triggers the unique "break".

Returns are lowered to breaks/continues, plus adding a "return flag" that
causes loops to break again out of their enclosing loops until all the
loops are exited: then the "execute flag" logic will ignore everything
until the end of the function.

Note that "continue" and "return" can also be implemented by adding
a dummy loop and using break.
However, this is bad for hardware with limited nesting depth, and
prevents further optimization, and thus is not currently performed.
2010-09-13 13:03:09 -07:00
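The flag-based lowering of "continue" and "break" described in 3361cbac2a above can be pictured with a hand-written before/after pair. This is only a sketch in plain C under stated assumptions: the function names and the example loop are hypothetical, and resetting the execute flag at the top of every iteration is an assumption of this sketch, not something the commit message spells out.

```c
#include <stdbool.h>

/* Hypothetical loop using both jumps: skip negative entries,
 * stop at the first zero, sum everything else. */
static int sum_original(const int *v, int n)
{
   int sum = 0;
   for (int i = 0; i < n; i++) {
      if (v[i] < 0)
         continue;
      if (v[i] == 0)
         break;
      sum += v[i];
   }
   return sum;
}

/* The same loop with the jumps lowered by hand. */
static int sum_lowered(const int *v, int n)
{
   int sum = 0;
   for (int i = 0; i < n; i++) {
      bool execute_flag = true;   /* cleared to inhibit the rest of the body */
      bool break_flag = false;    /* checked once, at the end of the body */

      if (v[i] < 0)
         execute_flag = false;    /* was: continue */

      if (execute_flag && v[i] == 0) {
         break_flag = true;       /* was: break ... */
         execute_flag = false;    /* ... lowered to a continue plus a flag */
      }

      if (execute_flag)
         sum += v[i];

      if (break_flag)
         break;                   /* the single conditional break */
   }
   return sum;
}
```

After lowering, the only jump left is one conditional break at the end of the loop body, which is the shape the message says the unroller would need to learn to handle.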
Luca Barbieri
55adbebc62 glsl: add ir_control_flow_visitor
This is just a subclass of ir_visitor with empty implementations of all
the visit methods for non-control flow nodes.

Used to avoid duplicating that in ir_visitor subclasses.

ir_hierarchical_visitor is another way to solve this, but is less natural
for some applications.
2010-09-13 13:03:09 -07:00
José Fonseca
6b5575baaa llvmpipe: Fix non SSE2 builds.
Should fix fdo 30168.
2010-09-13 20:43:36 +01:00
Marek Olšák
428dc6d7d2 r300g/swtcl: unlock VBO after draw_flush
https://bugs.freedesktop.org/show_bug.cgi?id=29901
https://bugs.freedesktop.org/show_bug.cgi?id=30132
2010-09-13 21:12:07 +02:00
Witold Baryluk
c40858fa0d llvmpipe: Change asm to __asm__.
According to the gcc documentation both are equivalent; the second is
preferred because the first can conflict with existing symbols.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2010-09-13 18:58:50 +01:00
Jesse Barnes
e7eff0cfce EGL DRI2: 0xa011 is Pineview not Ironlake
Point about needing a better way to do this validated.
2010-09-13 10:55:56 -07:00
Alex Deucher
9532eea509 r600c: const buffer sizes must be a multiple of 16 consts
This applies to r6xx/r7xx/evergreen
2010-09-13 13:41:46 -04:00
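One common way to satisfy a "multiple of 16" constraint like the one in 9532eea509 above is to round the constant count up to the next multiple. The commit message does not say how the driver enforces the rule, so the helper below is purely an assumption for illustration; its name and the macro are hypothetical.

```c
/* Hypothetical granularity taken from the commit subject:
 * const buffer sizes must be a multiple of 16 constants. */
#define CONSTS_PER_BLOCK 16u

/* Round num_consts up to the next multiple of 16.  The mask trick
 * only works because 16 is a power of two. */
static unsigned align_const_count(unsigned num_consts)
{
   return (num_consts + CONSTS_PER_BLOCK - 1u) & ~(CONSTS_PER_BLOCK - 1u);
}
```

With this helper, a count of 17 constants would be padded to 32, while counts that are already multiples of 16 are left unchanged.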
Jesse Barnes
c121608b6e EGL DRI2: add PCI ID for Ironlake mobile
Allows KMS EGL driver to load.  We need a better way of doing this.
2010-09-13 10:36:46 -07:00
Alex Deucher
6f839eb631 r600c/eg: remove obsolete comment 2010-09-13 12:16:00 -04:00
Alex Deucher
2ef5bc3976 r600c/eg: remove unused emit timestamp function 2010-09-13 12:14:24 -04:00
Alex Deucher
07d95cdbfb r600c/eg: emit CB_BLEND_ALPHA with the other blend values
saves a few dwords
2010-09-13 12:11:29 -04:00
Alex Deucher
105ef5eb5e r600c: remove redundant state emit on evergreen
r700start3d already emits the context control packets
2010-09-13 12:06:34 -04:00
Kristian Høgsberg
7dcb305000 mesa: Revert accidentally committed vertex code chunk 2010-09-13 10:32:15 -04:00
Andre Maasikas
cdfe02d3fc r600c: eg: fix typo
probably copy/paste error
2010-09-13 16:55:58 +03:00
Andre Maasikas
629842b44c r600c: eg: 256 float4 constants may need more than 256 bytes 2010-09-13 16:29:44 +03:00
Andre Maasikas
c82beb436b r600c: eg - fix uninitialized variable 2010-09-13 16:19:18 +03:00
Kristian Høgsberg
4ebf07a426 glx: Don't destroy DRI2 drawables for legacy glx drawables
For GLX 1.3 drawables, we can destroy the DRI2 drawable when the GLX
drawable is destroyed.  However, for legacy drawables, there is no
good way of knowing when the application is done with it, so we just
let the DRI2 drawable linger on the server.  The server will destroy
the DRI2 drawable when it destroys the X drawable or the client exits
anyway.

https://bugs.freedesktop.org/show_bug.cgi?id=30109
2010-09-13 08:42:22 -04:00
Marek Olšák
0392e48867 r300g: fix SWTCL
https://bugs.freedesktop.org/show_bug.cgi?id=29901
2010-09-13 13:26:35 +02:00
José Fonseca
501d43028e llvmpipe: Unbreak rasterization on 64bit. 2010-09-13 12:03:35 +01:00
José Fonseca
91a9325761 gallium: Change the resource_copy_region semantics to allow copies between different yet compatible formats 2010-09-13 11:33:44 +01:00
Dave Airlie
61c2861b4e r600g: evergreen fixup dsa state for running query.
evergreen is always the same as r700 here.
2010-09-13 19:57:29 +10:00
Andre Maasikas
2471d0d6c5 r600c: remove stray unmap call
no idea how/why it got there
2010-09-13 12:42:25 +03:00
José Fonseca
b97c75e6a3 llvmpipe: use gcc asm only with gcc 2010-09-13 09:24:09 +01:00
Marek Olšák
6990148b12 r300g: print unassigned FS inputs for DBG_RS 2010-09-13 09:55:14 +02:00
Marek Olšák
ae1aa14965 r300g: fix map_buffer
https://bugs.freedesktop.org/show_bug.cgi?id=30145
2010-09-13 07:52:13 +02:00
Marek Olšák
185434fbe8 r300/compiler: fix warnings 2010-09-13 07:52:13 +02:00
Marek Olšák
ab7cc44580 r300g: add new debug options for dumping scissor regs and disabling CBZB clear 2010-09-13 07:49:43 +02:00
Marek Olšák
c3c5646b93 r300g: skip rendering if CS space validation fails
radeon_cs_space_check flushes the pipe context on failure, retries
the validation, and returns -1 if it fails again. At that point, there is
nothing we can do, so let's skip draw operations instead of getting stuck
in an infinite loop.

This code path ideally should never be hit.
2010-09-13 07:49:43 +02:00
Marek Olšák
317680c6fb r300g: remove u_upload_flush from r300_draw_arrays
This is probably a leftover and is unnecessary, since we flush u_upload_mgr
in r300_flush.
2010-09-13 07:49:43 +02:00
Vinson Lee
b4f7f059c7 nvfx: Remove unused variables. 2010-09-12 21:48:40 -07:00
Vinson Lee
89e138b1c4 nvfx: Move declaration before code.
Fixes SCons build.
2010-09-12 21:39:21 -07:00
Keith Whitwell
c4046d4fda llvmpipe: introduce tri_3_4 for tiny triangles 2010-09-12 15:03:50 +01:00
Keith Whitwell
4b56e86e67 llvmpipe: allow tri_3_16 at any 4-aligned location within a tile
Doesn't require 16-alignment, so catch more cases.
2010-09-12 15:03:49 +01:00
Keith Whitwell
26b663c2aa llvmpipe: refactor tri_3_16
Keep step array as a set of four m128i's and reuse throughout the
rasterization.
2010-09-12 15:03:49 +01:00
Keith Whitwell
67b957781d llvmpipe: pass linear masks to fragment shader
Fragment shader can extract the correct bits for each quad.
2010-09-12 15:03:49 +01:00
Keith Whitwell
4b99b9f5ff llvmpipe: fix warnings on both 32 and 64 bit builds 2010-09-12 15:01:41 +01:00
Keith Whitwell
51b1d4f03c llvmpipe: fix weird performance regression in isosurf
I really don't understand the mechanism behind this, but it
seems like the way data blocks for a scene are malloced, and in
particular whether we treat them as a stack or a queue, and whether
we retain the most recently allocated or least recently allocated,
has a real effect (~5%) on isosurf framerates...

This is probably specific to my distro or even just my machine,
but nonetheless, it's nicer not to see the framerates go in the
wrong direction.
2010-09-12 14:58:43 +01:00
José Fonseca
67763488b1 pb: Fix the build, and add notes. 2010-09-12 10:37:06 +01:00
José Fonseca
853953dc3c llvmpipe: Only generate the whole shader specialization for opaque shaders.
If not opaque, then the color buffer will have to be read anyway,
therefore the specialization is pointless.
2010-09-12 10:15:48 +01:00
Dave Airlie
b5fcf0c8e0 pb: add void * for flush ctx to mapping functions
If the buffer we are attempting to map is referenced by the unsubmitted
command stream for this context, we need to flush the command stream.
However, to do that we need to be able to access the context at the lowest
level map function.  Currently we set the buffer in the toplevel map, but this
is racy between contexts. (We probably have a lot more issues than that.)

I'll look into a proper solution as suggested by jrfonseca when I get some time.
2010-09-12 13:32:43 +10:00
Luca Barbieri
95555ed03e nv30: fix breakage due to 10 texcoord support on nv40 2010-09-11 21:11:03 +02:00
Chia-I Wu
c34225974b Add missing files to the tarball file lists. 2010-09-12 02:31:33 +08:00
Chia-I Wu
19b2cfd6f6 mesa: Fix depend.es[12] generation when LLVM is enabled.
"llvm-config --cflags" outputs -f options, which conflict with makedepend.
Clean up compiler flags and append LLVM_CFLAGS to the new xxx_CFLAGS
instead of xxx_CPPFLAGS, where xxx may be MESA, ES1, or ES2.
2010-09-12 02:31:33 +08:00
Tilman Sauerbeck
33b1d14913 r600g: Undo bo placement change.
This reverts a part of e795ca8f31
that causes artefacts and a performance drop.

Signed-off-by: Tilman Sauerbeck <tilman@code-monkey.de>
2010-09-11 18:40:45 +02:00