Commit graph

35462 commits

Author SHA1 Message Date
Marek Olšák
f558bcb397 r300g: optimize emission of fragment shader constants 2010-06-13 17:43:39 +02:00
Marek Olšák
3da6487115 r300g: turn fragment shader into a CB 2010-06-13 17:43:39 +02:00
Marek Olšák
0a44efaeb9 r300g: turn depth stencil state into a CB 2010-06-13 17:43:39 +02:00
Marek Olšák
f803211629 r300g: turn clip state into a CB 2010-06-13 17:43:39 +02:00
Marek Olšák
9dd50993c6 r300g: turn blend color into a CB 2010-06-13 17:43:38 +02:00
Marek Olšák
cd891648d4 r300g: turn blend state into a CB 2010-06-13 17:43:38 +02:00
Marek Olšák
a062156bb2 r300g: add API for building command buffers
The idea is to build a hardware command buffer for every CSO and memcpy
the buffer to a command stream at bind time (or dirty-state-emission time,
to be precise).
2010-06-13 17:43:38 +02:00
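
The idea in the message above (build the commands once per state object, then memcpy them into the command stream at emission time) can be sketched roughly as follows. This is an illustration only, not the r300g API: the struct names, function names, and buffer sizes are all hypothetical, and the register offsets a caller passes in are arbitrary numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define CB_MAX_DWORDS 16
#define CS_MAX_DWORDS 256

/* Pre-built hardware commands owned by one state object (CSO). */
struct cmd_buf {
    uint32_t dw[CB_MAX_DWORDS];
    unsigned count;
};

/* The command stream that is eventually submitted to the hardware. */
struct cmd_stream {
    uint32_t dw[CS_MAX_DWORDS];
    unsigned count;
};

/* Called when the state object is created: append one register write
 * (register offset followed by its value) to the object's buffer. */
static void cb_out_reg(struct cmd_buf *cb, uint32_t reg, uint32_t value)
{
    assert(cb->count + 2 <= CB_MAX_DWORDS);
    cb->dw[cb->count++] = reg;
    cb->dw[cb->count++] = value;
}

/* Called when dirty state is emitted: the whole object goes out with a
 * single memcpy. Returns 1 on success, 0 if the stream has no room. */
static int cs_emit_cb(struct cmd_stream *cs, const struct cmd_buf *cb)
{
    if (cs->count + cb->count > CS_MAX_DWORDS)
        return 0;
    memcpy(&cs->dw[cs->count], cb->dw, cb->count * sizeof(uint32_t));
    cs->count += cb->count;
    return 1;
}
```

The per-register work happens once in cb_out_reg, so emitting the same state object many times costs only the copy.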
Marek Olšák
7ca24dfa6d r300g: inline FLUSH_CS
The fewer macros, the better.
2010-06-13 17:43:38 +02:00
Marek Olšák
98f67a6bbd r300g: reorder CS macros and document them a little 2010-06-13 17:43:38 +02:00
Marek Olšák
8f13e2bda1 r300g: drop DBG_CS
I'd like the CS macros to be as lightweight as possible for performance
reasons.
2010-06-13 17:43:38 +02:00
Marek Olšák
7005feabcd r300g: inline CHECK_CS 2010-06-13 17:43:38 +02:00
Marek Olšák
ae182296ce r300g: replace r300_cs_info with simpler get_cs_free_dwords 2010-06-13 17:43:38 +02:00
Marek Olšák
7d5230ce90 r300g: fix multiple render targets
This fixes tests/drawbuffers.
2010-06-13 17:43:38 +02:00
Marek Olšák
ea0ec0b48e r300g: remove r300_state.h 2010-06-13 17:43:38 +02:00
Marek Olšák
cb17f5ee75 r300g: move two-sided stencilref fallback to its own file 2010-06-13 17:43:37 +02:00
Marek Olšák
aa5422327d r300g: move index buffer translate functions to their new home 2010-06-13 17:43:37 +02:00
Marek Olšák
028459b0bf r300g: add fallback for unaligned/unsupported vertex stride/offset/format
There is a problem though, the translate module cannot emit half float
vertices.
2010-06-13 17:43:37 +02:00
Marek Olšák
1384a7bcca r300g: upload only vertex buffers referenced by vertex elements 2010-06-13 17:43:37 +02:00
Eric Anholt
1dc573a881 i965: Fix gen6 front cull mode. 2010-06-12 21:47:32 -07:00
Zhenyu Wang
5dbbb48f46 i965: Use the new message header format for FF_SYNC on gen6. 2010-06-12 21:47:32 -07:00
Zhenyu Wang
881ec3a814 i965: Add support for math instructions in the gen6 WM. 2010-06-12 21:47:32 -07:00
Zhenyu Wang
7ba2ecb32b i965: Set the correct WM GRF start reg on gen6. 2010-06-12 21:47:31 -07:00
Eric Anholt
0f59b9a95d i965: Update gen6 paths for the streaming rework. 2010-06-12 21:47:31 -07:00
Eric Anholt
7ad26b0030 i965: Stream out CC unit state.
before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]       gl            firefox-talos-gfx   31.791   32.287   1.11%    6/6
after:
[  0]       gl            firefox-talos-gfx   31.198   31.675   0.96%    6/6
2010-06-12 21:47:31 -07:00
Zack Rusin
db05972807 draw/gs: copy the outputs only if we emitted something 2010-06-12 10:45:42 -04:00
Zack Rusin
1551b4da8c softpipe: small cleanup 2010-06-12 10:45:42 -04:00
Joakim Sindholt
60cfed6c70 r300/compiler: fix scons build 2010-06-12 15:40:14 +02:00
Vinson Lee
b6cfca42e3 i965: Remove unnecessary header. 2010-06-12 01:44:43 -07:00
Vinson Lee
e0b211d07c scons: Disable i965g build if using MSVC.
i965g uses C99 constructs that are not supported by MSVC.
2010-06-11 18:43:58 -07:00
Vinson Lee
de51485000 scons: Disable i915g build if using MSVC.
i915g uses C99 constructs that are not supported by MSVC.
2010-06-11 18:42:57 -07:00
Tom Stellard
3eca311b72 r300/compiler: Handle more complex conditionals in loops. 2010-06-11 22:06:59 +02:00
Tom Stellard
bde34a76b5 r300/compiler: Fix warning. 2010-06-11 22:06:59 +02:00
Tom Stellard
f7269cf26a r300/compiler: Handle SGT and SLE at the beginning of loops. 2010-06-11 22:06:59 +02:00
Tom Stellard
0125f5270b r300/compiler: Verify assumptions about opcode types. 2010-06-11 22:06:59 +02:00
Tom Stellard
6f1b6814bc r300/compiler: Unroll loops that decrement the counter.
e.g. for(i=10; i>0; i--)
2010-06-11 22:06:58 +02:00
Tom Stellard
0f1109ce36 r300/compiler: Unroll loops that have a constant number of iterations.
This only works with for loops that increment the counter.
e.g. for(i=0; i<10; i++)
2010-06-11 22:06:58 +02:00
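
Unrolling a loop with a constant number of iterations first requires that number. A minimal sketch of the calculation for the two loop shapes mentioned in these commits (an incrementing counter compared with <, a decrementing counter compared with >) might look like the following. It is not the r300 compiler's code: the function name, the enum, and the -1 convention are assumptions, and integer overflow is ignored.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the r300 compiler. */
enum loop_cmp { LOOP_LT, LOOP_GT };

/* Number of iterations of
 *     for (i = init; i CMP limit; i += step)
 * when init, limit and step are all constants. Returns -1 when the step
 * moves the counter away from the limit, since such a loop cannot be
 * unrolled by simple counting. */
static int const_trip_count(int init, int limit, int step, enum loop_cmp cmp)
{
    int dist, stride;

    if (cmp == LOOP_LT) {
        if (step <= 0)
            return -1;
        if (init >= limit)
            return 0;           /* condition is false on entry */
        dist = limit - init;
        stride = step;
    } else {
        if (step >= 0)
            return -1;
        if (init <= limit)
            return 0;           /* condition is false on entry */
        dist = init - limit;
        stride = -step;
    }
    assert(dist > 0 && stride > 0);
    return (dist + stride - 1) / stride;    /* ceil(dist / stride) */
}
```

With this sketch, both for(i=0; i<10; i++) and for(i=10; i>0; i--) yield 10 iterations.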
Tom Stellard
622fd4d061 r300/compiler: Implement simple loop emulation
The loop emulation unrolls loops as many times as possible while still
keeping the shader program below the maximum instruction limit.  At this
point, there are no checks for constant conditionals.  This is only enabled
for fragment shaders.
2010-06-11 22:06:58 +02:00
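
To illustrate "as many times as possible" under an instruction limit, here is a small arithmetic sketch. It assumes a deliberately simple cost model that the commit does not state: the program already contains one copy of the loop body, and each extra copy adds exactly body_insts instructions. The function name and parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper. total_insts is the current program length,
 * including one copy of the loop body; body_insts is the length of the
 * body; max_insts is the hardware instruction limit.
 * Returns how many copies of the body fit in total, or 0 when the
 * program is already over the limit. */
static unsigned max_unroll(unsigned total_insts, unsigned body_insts,
                           unsigned max_insts)
{
    assert(body_insts > 0);
    if (total_insts > max_insts)
        return 0;
    return 1 + (max_insts - total_insts) / body_insts;
}
```

For example, a 20-instruction program with a 5-instruction body and a 64-instruction limit gives 9 copies, for a final length of 60 instructions.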
Eric Anholt
108264e859 i965: Remove the surface key used to generate constant surfaces.
We had to fill out all that junk when using the cache, but no more.
2010-06-11 12:21:23 -07:00
Eric Anholt
34c82804ed i965: Warning fixes from the i965-streaming merge. 2010-06-11 12:09:26 -07:00
Zack Rusin
53bd9796a1 gallium/softpipe/draw: support samplers in geometry shaders 2010-06-11 13:31:52 -04:00
Zack Rusin
2396967038 tgsi: support 2d indirect addressing 2010-06-11 10:35:24 -04:00
Eric Anholt
27bc2de546 i965: Use the state base address to avoid relocations.
This makes the binding table code simpler, and is required for gen6,
which requires binding table addresses to be under 64k offset from the
surface state base addr.

No significant change in performance on firefox-talos-gfx.
2010-06-11 00:16:15 -07:00
Eric Anholt
8ad3fdc967 i965: GC the last two arguments to brw_cache_data.
Now that the binding table is streamed indirect state, they were
always NULL/0.
2010-06-11 00:16:15 -07:00
Eric Anholt
309c011641 i965: Remove brw_state_cache_bo_delete now that it's unused again. 2010-06-11 00:16:09 -07:00
Eric Anholt
178414eba4 i965: Remove caching of surface state objects.
It turns out that computing a 56 byte key to look up a 20-byte object
out of a hash table was some sort of a bad idea.  Whoops.

before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]       gl            firefox-talos-gfx   37.799   38.203   0.39%    6/6
after:
[  0]       gl            firefox-talos-gfx   34.761   34.784   0.17%    5/6
2010-06-11 00:15:59 -07:00
Eric Anholt
73de09f265 i965: Convert the binding table to streamed indirect state.
This slightly reduces cairo-gl firefox-talos-gfx runtime on my
Ironlake:
before:
[ # ]  backend                         test   min(s) median(s) stddev. count
[  0]       gl            firefox-talos-gfx   38.236   38.383   0.43%    5/6
after:
[  0]       gl            firefox-talos-gfx   37.799   38.203   0.39%    6/6

It turns out the cost of caching these objects and looking them up in
the cache again is greater than the cost of just computing the object
again, particularly when the overhead of having a separate BO to pin
is removed.

(Those that are paying close attention will note that this is a
reversal of the path I was moving the driver in a couple of years ago.
The major thing that has changed is that back then all state was
recomputed when we wrapped the streaming state buffer, including
recompiling our precious programs.  Now, we're uncaching just the
objects that are cheap to compute, and retaining caching of expensive
objects)
2010-06-11 00:15:56 -07:00
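
The trade-off described above (recompute cheap state and append it to a buffer instead of hashing a key and looking the object up in a cache) can be pictured with a simple bump allocator. This is a sketch under assumptions, not the i965 implementation: the struct, the function names, the buffer size, and the "return -1 when full" convention are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size; a real driver would carve this out of a batch or
 * state buffer object. */
#define STREAM_SIZE 4096

struct state_stream {
    uint8_t buf[STREAM_SIZE];
    size_t used;
};

/* Reserve size bytes at the next offset aligned to align (a power of
 * two). Returns the offset, or -1 when the buffer is full and the caller
 * has to wrap: flush, start a new buffer, and re-emit state. */
static long stream_alloc(struct state_stream *s, size_t size, size_t align)
{
    size_t start;

    assert(align != 0 && (align & (align - 1)) == 0);
    start = (s->used + align - 1) & ~(align - 1);
    if (start + size > STREAM_SIZE)
        return -1;
    s->used = start + size;
    return (long)start;
}

/* Copy freshly computed state into the stream. There is no key and no
 * hash lookup; the cost is one aligned bump and one memcpy. */
static long stream_emit(struct state_stream *s, const void *data,
                        size_t size, size_t align)
{
    long off = stream_alloc(s, size, align);

    if (off >= 0)
        memcpy(s->buf + off, data, size);
    return off;
}
```

The returned offset is what other state would point at. Identical objects are written more than once, which is the price paid for skipping the cache lookup.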
Eric Anholt
118a47623a i965: Split constant buffer setup from its surface state/binding state.
This was bothering me when redoing the binding tables.
2010-06-11 00:15:56 -07:00
Eric Anholt
321014156b i965: Add support for streaming indirect state rather than caching objects. 2010-06-11 00:15:56 -07:00
Eric Anholt
f5bb775fd1 i965: Set the CC VP state immediately on state change.
The cache lookup of these two little floats was .12% of total CPU time
on firefox-talos-gfx because we did it any time commonly-changed state
changed.  On the other hand, updating the CC VP bo immediately whenever
CC VP state changes is a .07% overhead due to putting a driver hook
in glEnable().
2010-06-11 00:15:56 -07:00
Eric Anholt
315ef0312a i965: Update old comment about state cache sizing. 2010-06-11 00:15:56 -07:00