The blitter only does up 32bpp at a time, so we handle it by mangling
coordinates and calling the surface 32bpp.
Fixes ARB_texture_rg/fbo-generatemipmap-formats-float with ARB_texture_float.
Reviewed-by: Brian Paul <brianp@vmware.com>
Of these, intel will be using I and L initially, and A once we rewrite
fragment shaders and the CC for rendering to it as R.
Reviewed-by: Brian Paul <brianp@vmware.com>
Keep track of when the caches are dirty, and only flush them when
the framebuffer state is set and when the context is flushed.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes piglit's draw-instanced-divisor test for softpipe on both
the generic and SSE paths. This is temporary until we have the
correct per-array max_index information.
this needs revisiting, we really don't want to be flushing all 32 of these,
but currently we don't flush any of them, and it seems to have caused a regression
as reported on irc with doom3 on evergreen.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Writes within ELSE blocks were being ignored which prevented us from
discovering all possible writers for some register values.
Fixes piglit glsl-fs-raytrace-bug27060
This gets me from 2200 to 1978 dwords for a gears frame.
This is due to us having some 32-dwords blocks in the SPI, that we only
modify the first dwords off.
v2: fix dirty reg count from Bas Nieuwenhuizen
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is a first step to decreasing the CPU usage, by decreasing how much
stuff we pass to the GPU and hence to the kernel CS checker.
This adds a check to see if the values we need to write are actually dirty,
and avoids writing if they are. However certain register need to always
be written so we add a new flag to say which ones should be always written
if used. (Note this could probably be done cleaner with a larger refactoring,
since I think the CONST_BUFFER_SIZE_PS/VS and CONST_CACHE_PS/VS might
be better off as a special state).
It also moves the need_bo to be a flags on the register now.
With this, a frame of gears goes from emitting 3k dwords to emitting 2k dwords,
and I'm sure it could get a lot smaller.
v2: fix some evergreen dirty bits.
Original patch from: Bas Nieuwenhuizen, I NIHed nearly the same thing
before seeing his patch on the list, oops.
Reviewed-by: Bas Nieuwenhuizen
Signed-off-by: Dave Airlie <airlied@redhat.com>
Most of the newer portions of the code use OUT_BATCH style. I prefer
this style because it offers a clear distinction between a) hardware
messages/structures with a mandatory format, and b) data structures for
our own internal use that we can format however we want.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Since we never enable the GS on Sandybridge, there's no need to allocate
it any URB space.
Furthermore, the previous calculation was incorrect: it neglected to
multiply by nr_vs_entries, instead comparing whether twice the size of
a single VS URB entry was bigger than the entire URB space. It also
neglected to take into account that vs_size is in units of 128 byte
blocks, while urb_size is in bytes.
Despite the above problems, the calculations resulted in an acceptable
programming of the URB in most cases, at least on GT2.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The expression
x = y, 5, 3;
will generate
0:7(9): warning: left-hand operand of comma expression has no effect
The warning is only emitted for the left-hand operands, becuase the
right-most operand is the result of the expression. This could be
used in an assignment, etc.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts what remains of commit
28bab24e16. It was garbage, trying to
use a MESA_FORMAT enum as a preprocessor token, and I don't know how I
thought it was even tested.
Reviewed-by: Brian Paul <brianp@vmware.com>
The GL_RED and GL_RG were tricking this code into executing, but it's
totally unprepared for a 16-bit channel and just rescaled the values
down to 0. We don't have anything with <8bit channels alongside >8bit
channels, so disabling it should be safe.
Reviewed-by: Brian Paul <brianp@vmware.com>
This will replace the current (broken by trying to use an enum in the
preprocessor) spantmp2.h support I wrote for the intel driver.
Reviewed-by: Brian Paul <brianp@vmware.com>
Since we're using GTT mappings now (no manual detiling), there's
really nothing special to accessing these buffers, other than needing
the new RowStride field of gl_renderbuffer to accomodate padding.
Reduces the driver size by 2.7kb, and improves glean depthStencil
performance 3-10x (!)
Reviewed-by: Brian Paul <brianp@vmware.com>
This will allow some drivers to reuse the core renderbuffer.c get/put
row functions in place of using the spantmp.h macros. Note that
unlike textures, we use a signed integer here to allow for handling
FBO orientation.
Reviewed-by: Brian Paul <brianp@vmware.com>
Everything appears to already be in place for this. Fixes aborts in:
ARB_texture_rg/fbo-alphatest-formats-float
ARB_texture_rg/fbo-blending-formats-float.
Reviewed-by: Brian Paul <brianp@vmware.com>
The _mesa_base_fbo_format variant doesn't handle some texture
internalformats, such as "3".
Fixes:
fbo-blending-formats.
fbo-alphatest-formats
EXT_texture_sRGB/fbo-alphatest-formats
Reviewed-by: Brian Paul <brianp@vmware.com>