Faster, simpler and more flexible.
Also, we set those flags properly on nv3x so that we don't allocate buffers in GART.
Since on AGP GART is uncached, OpenGL doesn't distinguish between vertex and index buffers, and we don't support hardware index buffers for now, this caused uncached reads.
Also check bind and not usage for PIPE_BIND_* flags, got broken in the gallium-resources transition.
texdepth stopped working when npot went in, this brings it back
to life.
< MostAwesomeDude> That looks like what I was going to do.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Currently we used a single buffer for each fragment programs, leading to
rendering synchronization. This patch uses a doubly linked list of BOs,
which is dynamically resized if all the BOs are busy.
Note that inline image transfers could be an alternative option: this
will be explored later.
This removes one of the big performance limitations of the current
driver.
We also stop using pipe_resource internally in favor of using nouveau_bo
directly.
[airlied -
Convert sprite coord index to a per-coord enable bit
set the rasteriser block up correctly for point sprites.
The inputs to the RS hw block change for sprite coords, so fix them up
properly - this fixes piglit point-sprite test.
]
Signed-off-by: Dave Airlie <airlied@redhat.com>
Currently on nv30/nv40 an assert will be triggered once 32 queries are
outstanding.
This violates the OpenGL/Gallium interface, which requires support for
an unlimited number of fences.
This patch fixes the problem by putting queries in a linked list and
waiting on the oldest one if allocation fails.
nVidia seems to use a similar strategy, but with 1024 instead of 32 fences.
The next patch will improve this.
This patch adds support for two-sided vertex color to nv30/nv40.
When set, the COLOR0/1 fs inputs on back faces will be wired to vs outputs BCOLOR0/1.
This makes OpenGL two sided lighting work, which can be tested with progs/demos/projtex.
This is implemented in nvfx_state_fb and fragtex but was missing
in nvfx_screen.
This allows to avoid glCopyTexSubImage CPU fallbacks and makes Doom 3
much faster as a result.