It's not optimal, but it's better than the register pressure scheduler
that was previously being used. The VLIW scheduler currently ignores
all the complicated instruction groups restrictions and just tries to
fill the instruction groups with as many instructions as possible.
Though, it does know enough not to put two trans only instructions in
the same group.
We are able to ignore the instruction group restrictions in the LLVM
backend, because the finalizer in r600_asm.c will fix any illegal
instruction groups the backend generates.
Enabling the VLIW scheduler improved the run time for a sha1 compute
shader by about 50%. I'm not sure what the impact will be for graphics
shaders. I tested Lightsmark with the VLIW scheduler enabled and the
framerate was about the same, but it might help apps that use really
big shaders.
The rest of the TFB implementation remains in transformfeedback.c, and
this will be shared with UBOs.
v2: Move the size/offset checks shared with UBOs to common code as
well. (Kenneth's review)
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Fix a typo spotted by Eric Anholt.
v3: Fix missing "GL" on types, fix style, fix Studly_Caps extension name,
drop commented code duplicated with GL3x.xml [anholt]
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Our intention is still that it's not abi stable, so make the package
version number get included in the library name. Now you can parallel
install dricore-using drivers from multiple mesa versions. We can put
it into lib now that we're following library versioning rules
(assuming that ABIs don't change within a single Mesa point release).
LD_LIBRARY_PATH still doesn't work with a non-/, non-/usr prefix
because libtool uses rpath instead of runpath for nonstandard
prefixes.
The weird versioning of the libGL where the package version was sort
of expressed as a big integer is dropped. libtool didn't like the 0
prefix, and it didn't really make sense anyway -- if you interpret it
as an integer version number, old Mesa 071200 was bigger than current
Mesa 08100. Instead, just bump the minor version and drop the
patchlevel.
Except for the deleted linux-cell target, these were just the target
cc/cflags. The only usage was for gen_matypes, which wants the
target's structure packing, not the host, anyway.
Every place that uses ASM_FLAGS already uses DEFINES. Not including
it in DEFINES is just a way to screw up potential users, as I've done
several times while working on the build system.
Even pre-automake, we rely on gmake features for pattern
substitutions, and replacing those with reams more make code is not
interesting. This will let us turn the old Makefiles using pattern
substitutions into automake without spewing warnings.
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
1) We need to insert a barrier between consecutive transform feedback calls.
2) VBO cache needs to be flushed when TFB output is used as VBO draw input.
Fixes Piglit test EXT_transform_feedback/immediate-reuse.
Thanks to Christoph Bumiller for pointing out bugs in previous versions
of this patch.
gl_ClipDistance needs special treatment in form of lowering pass
which transforms gl_ClipDistance representation from float[] to
vec4[]. There are 2 implementations - at glsl linker level (enabled
by LowerClipDistance option) and at glsl_to_tgsi level (enabled
unconditionally for gallium drivers). Second implementation is
incomplete - it does not take into account transform feedback (see
commit 642e5b413e "mesa: Fix transform
feedback of unsubscripted gl_ClipDistance array" for details).
There are 2 possible fixes:
- adding transform feedback support into glsl_to_tgsi version
- ripping gl_ClipDistance support from glsl_to_tgsi and enabling
gl_ClipDistance lowering on glsl linker side
This patch implements 2nd option. All it does is:
- reverts most of the commit 59be691638
"st/mesa: add support for gl_ClipDistance"
- changes LowerClipDistance to true
Fixes Piglit tests "EXT_transform_feedback/builtin-varyings
gl_ClipDistance[{2,3,4,5,6,7,8}]-no-subscript" at least on nv50
and evergreen cards.
From the GL 3.0 spec (p.116):
"Multisample rasterization is enabled or disabled by calling
Enable or Disable with the symbolic constant MULTISAMPLE."
Elsewhere in the spec, where multisample rasterization is described
(sections 3.4.3, 3.5.4, and 3.6.6), the following text is consistently
used:
"If MULTISAMPLE is enabled, and the value of SAMPLE_BUFFERS is
one, then..."
So, in other words, disabling GL_MULTISAMPLE should prevent
multisample rasterization from occurring, even if the draw framebuffer
is multisampled. This patch implements that behaviour by setting the
WM and SF stage's "multisample rasterization mode" to
MSRAST_ON_PATTERN only when the draw framebuffer is multisampled *and*
GL_MULTISAMPLE is enabled.
Fixes piglit test spec/EXT_framebuffer_multisample/enable-flag.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Due to hardware limitations, MSAA is unsupported on Gen6 for formats
containing >64 bits of data per pixel. From the Sandy Bridge PRM,
vol4 part1, p72 ("Surface Format"):
If Number of Multisamples is set to a value other than
MULTISAMPLECOUNT_1, this field cannot be set to the following
formats:
- any format with greater than 64 bits per element
- any compressed texture format (BC*)
- any YCRCB* format
Gen7 has a similar, but less stringent limitation: formats with >64
bits of data per pixel only support 4x MSAA.
This patch causes the unsupported formats to report
GL_FRAMEBUFFER_UNSUPPORTED.
Fixes piglit "multisample-formats" tests on Gen6.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Sandy Bridge and later don't use this field, so there's no point in
setting it. It can only cause harmful state-based recompiles.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The system array values concept doesn't really because it expects the
system values to be fixed per call, which is wrong for gl_VertexID and
iffy for gl_SampleID. So this patch does two things:
- kill the array, have emit_fetch_system_value directly pick the
values it needs (only gl_InstanceID for now, as the previous code)
- correctly handle the expected type in emit_fetch_system_value
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
This includes:
- picking up correctly which attributes are flatshaded and which are
noperspective
- copying the flatshaded attributes when needed, including the
non-built-in ones
- correctly interpolating the noperspective attributes in screen-space
instead than in a 3d-correct fashion.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
z or stencil texture should not be created with the z/stencil
flags for surface creation as they are intended to be bound
as texture.
v2: remove broken code
Signed-off-by: Jerome Glisse <jglisse@redhat.com>