On drivers that set gl_shader_compiler_options::LowerClipDistance (for
example i965), references to gl_ClipDistance (a float[8] array) will
be converted to references to gl_ClipDistanceMESA (a vec4[2] array).
This patch modifies the linker so that requests for transform feedback
of gl_ClipDistance are similarly converted.
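The index arithmetic behind this lowering can be sketched as follows. This is only an illustration, not the linker code: the struct and function names are hypothetical, and it assumes the float[8] elements are packed into the two vec4s in order.

```c
#include <assert.h>

/* Hypothetical helper: where element `i` of the float[8] gl_ClipDistance
 * array lands after lowering to the vec4[2] gl_ClipDistanceMESA array.
 * Assumes the eight floats are packed into the two vec4s in order. */
struct lowered_clip_ref {
   unsigned vec4_index;  /* which vec4 of gl_ClipDistanceMESA (0 or 1) */
   unsigned component;   /* 0 = x, 1 = y, 2 = z, 3 = w */
};

static struct lowered_clip_ref
lower_clip_distance_index(unsigned i)
{
   struct lowered_clip_ref ref;

   assert(i < 8);
   ref.vec4_index = i / 4;
   ref.component = i % 4;
   return ref;
}
```

Under that assumption, a transform feedback request for gl_ClipDistance[5] would become a request for component y of gl_ClipDistanceMESA[1].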
Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_ClipDistance".
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When using transform feedback, there are three circumstances in which
it is useful for Mesa to instruct a driver to stream out just a
portion of a varying slot (rather than the whole vec4):
(a) When a varying is smaller than a vec4, Mesa needs to instruct the
driver to stream out just the first one, two, or three components of
the varying slot.
(b) In the future, when we implement varying packing, some varyings
will be offset within the vec4, so Mesa will have to instruct the
driver to stream out an arbitrary contiguous subset of the components
of the varying slot (e.g. .yzw or .yz).
(c) On drivers that set gl_shader_compiler_options::LowerClipDistance,
if the client requests that an element of gl_ClipDistance be streamed
out using transform feedback, Mesa will have to instruct the driver to
stream out a single component of one of the gl_ClipDistance varying
slots.
Prior to this patch, only (a) was possible, since
gl_transform_feedback_info specified only the number of components of
the varying slot to stream out. This patch adds
gl_transform_feedback_info::ComponentOffset, which indicates which
components should be streamed out.
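As a rough sketch of what an offset plus a component count lets a driver express, the following copies a contiguous subset of one vec4 slot to an output buffer. The struct and function here are hypothetical stand-ins, not the real gl_transform_feedback_info layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one transform feedback output; the field
 * names are illustrative, not the actual Mesa struct members. */
struct tfb_output_sketch {
   unsigned component_offset;  /* first component of the slot to stream out */
   unsigned num_components;    /* how many contiguous components follow */
};

/* Copy the selected components of a vec4 slot into dst and return the
 * number of floats written. */
static size_t
stream_out_slot(const struct tfb_output_sketch *out,
                const float slot[4], float *dst)
{
   unsigned i;

   assert(out->component_offset + out->num_components <= 4);
   for (i = 0; i < out->num_components; i++)
      dst[i] = slot[out->component_offset + i];
   return out->num_components;
}
```

With an offset of 0 this covers case (a); an offset of 1 and a count of 2 selects .yz as in case (b); a count of 1 with any offset covers case (c).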
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, on i965 Gen6 and above, we weren't allocating space for
gl_ClipVertex in the VUE, since the VS was automatically converting it
to clip distances. This prevented transform feedback from being able
to capture gl_ClipVertex.
This patch goes ahead and allocates space for gl_ClipVertex in the
VUE on Gen6 and above. The old behavior is retained on Gen5 and
below, since (a) transform feedback is not yet supported on those
platforms, and (b) those platforms don't currently support
gl_ClipVertex anyhow.
Note: this constitutes a slight waste of VUE space for shaders that
use gl_ClipVertex and don't use transform feedback to capture it.
However, that seems preferable to making the VUE map (and all of the
state that depends on it) dependent on transform feedback settings.
Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_ClipVertex".
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
On i965 Gen6 and above, gl_PointSize is stored in component W of the
first VUE slot (which corresponds to VERT_RESULT_PSIZ in the VUE map).
Normally we store varying floats in component X of a VUE slot, so we
need special case logic for gl_PointSize.
For Gen6, we do this with a ".wwww" swizzle in the GS. For Gen7, we
shift the component mask by 3 to select the W component.
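The Gen7 mask shift can be illustrated with a small helper. This is a sketch under the assumption that bit 0 of the component mask selects X and bit 3 selects W, which matches "shift by 3 to select W" but is not spelled out above; the function name is hypothetical.

```c
#include <assert.h>

/* Build a mask of `num_components` contiguous components starting at
 * `first_component`. Assumes bit 0 = X, bit 1 = Y, bit 2 = Z, bit 3 = W. */
static unsigned
component_write_mask(unsigned num_components, unsigned first_component)
{
   assert(num_components >= 1);
   assert(first_component + num_components <= 4);
   return ((1u << num_components) - 1u) << first_component;
}
```

An ordinary varying float gives component_write_mask(1, 0), which selects X only; gl_PointSize would use component_write_mask(1, 3), which selects W only.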
Fixes Piglit test "EXT_transform_feedback/builtin-varyings
gl_PointSize".
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 9d36c96d6e (mesa: Fix
glGetTransformFeedbackVarying()) accidentally added an extra memset()
call to the store_tfeedback_info() function, causing
prog->LinkedTransformFeedback.NumBuffers to be erased.
This patch removes the extra memset and rearranges the other
operations in store_tfeedback_info() to be in the correct order.
Fixes Piglit tests "EXT_transform_feedback/api-errors *unbound*".
Reviewed-by: Eric Anholt <eric@anholt.net>
The src/dst arrays would overlap but dst was less than src so a simple
version of memcpy() would do the right thing. But this isn't guaranteed
when memcpy() is optimized.
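A minimal sketch of the safe pattern, assuming the fix amounts to using memmove() for the overlapping copy (the helper below is hypothetical and not the actual Mesa function):

```c
#include <assert.h>
#include <string.h>

/* Shift `count` bytes within a single row from index `src` to index `dst`.
 * The two ranges may overlap, so memmove() is used: memcpy() has undefined
 * behavior for overlapping regions, even when dst < src. */
static void
shift_row(unsigned char *row, size_t dst, size_t src, size_t count)
{
   assert(row != NULL);
   memmove(row + dst, row + src, count);
}
```

memmove() behaves as if the source were first copied to a temporary buffer, so the result does not depend on the copy direction the library happens to pick.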
Fixes demos/copypix when the dest region was clipped by the left side of
the window.
Reviewed-by: Adam Jackson <ajax@redhat.com>
This is useful for apps which don't print FPS.
Only enabled in SwapBuffers.
v2: track state per drawable, use libGL prefix
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Do it after we check whether inst_end != -1.
Also move the code structure at the beginning of r300_fragment_shader_code
to detect underflows easily with valgrind.
Improves performance from about 1 fps to 23 fps in Cogs.
This new codepath is not always used; instead, a heuristic determines
whether to use it. Using translate for uploads is generally slower than
what we already had; it's a win only in a few cases.
This is for GL_ARB_vertex_type_2_10_10_10_rev.
I just took the code from u_format_table.c. It's based on pack_rgba_float.
I had no other choice. The u_format hooks are not exactly compatible
with translate. The cleanup of it is left for future work.
Reviewed-by: Dave Airlie <airlied@redhat.com>
The conversion is limited to only a few cases, because converting to any other
type shouldn't happen in any driver.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fetching int as float and vice versa is not allowed.
Fetching unsigned int as signed int and vice versa is not allowed either.
Doing conversions like that isn't allowed for samplers in OpenGL.
The three hooks could be consolidated into one fetch hook, which would fetch
uint as uint32, sint as sint32, and everything else as float. The receiving
parameter would be void*. This would be useful for implementing vertex fetches
for shader model 4.0, which has untyped registers.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Please see the diff for further info.
This paves the way for moving user buffer uploads out of drivers and should
allow us to clean up the mess in u_upload_mgr in the meantime.
For now only allowed for buffers on r300 and r600.
Acked-by: Christian König <deathsimple@vodafone.de>
We don't want to convert per-instance or constant (zero-stride) attribs into
ordinary vertex attribs.
More importantly, the translation of instance attribs now finally works.
To match what transfer_map returns. Really, subtracting the offset leads
to bugs if someone expects it to work exactly like transfer_map.
Reviewed-by: Brian Paul <brianp@vmware.com>
The current implementation was totally broken -- it was looking in an
unpopulated structure for varyings, and trying to do so using the
current list of varying names, not the list used at link time.
v2: Fix leaking of memory into the program per re-link.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
There was some duplication between the tgsi_dump.c and tgsi_text.c
files. Also use some static assertions to help catch errors when
adding new TGSI values.
v2: put strings in tgsi_strings.c file instead of the .h file.
Reviewed-by: Dave Airlie <airlied@redhat.com>
We were wastefully mapping the whole source/dest buffers before.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Adds two missing '|| srcFormat == GL_RG_INTEGER' in assertions and a
bunch of missing pixel conversion cases.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 0ed11e3331 fixed a "use after free"
bug by getting the next pointer before deleting the current node.
Unfortunately, it also made "next" never get updated if i->need != need.
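The intended loop shape can be sketched as follows. The list type and helpers are hypothetical (only the `need` field and the `next` variable are named above); the point is that `next` is read on every iteration, before any possible free, not only on the delete path.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list node, for illustration only. */
struct node {
   int need;
   struct node *next;
};

/* Prepend a node to the list and return the new head. */
static struct node *
push(struct node *head, int need)
{
   struct node *n = malloc(sizeof *n);

   assert(n != NULL);
   n->need = need;
   n->next = head;
   return n;
}

/* Free every node whose `need` matches; return how many were removed. */
static unsigned
remove_matching(struct node **head, int need)
{
   struct node **link = head;
   struct node *i, *next;
   unsigned removed = 0;

   for (i = *head; i != NULL; i = next) {
      next = i->next;      /* always updated, and before a possible free(i) */
      if (i->need == need) {
         *link = next;     /* unlink the node, then release it */
         free(i);
         removed++;
      } else {
         link = &i->next;  /* node is kept; later unlinks patch this field */
      }
   }
   return removed;
}
```

If `next` were only assigned inside the matching branch, the first non-matching node would leave it stale and the loop would never advance, which is the kind of infinite loop described above.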
Fixes infinite loops in piglit tests fbo-depth-array and fbo-depthtex.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
textureSize() returns an int, ivec2, or ivec3, but never an ivec4.
Creating the destination register as an ivec4 triggered later failures,
even though the register did hold the proper values.
For example, piglit test vs-textureSize-compare calls textureSize on a
2D texture and compares the result to an expected value. Unfortunately,
our generated code also tried to compare the third and fourth components
which were undefined, and failed.
Fixes piglit test vs-textureSize-compare as well as 19 subcases of
oglconform's glsl-bif-tex-size test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44339
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Commit d45814c925 totally added a data
dependency on _NEW_TEXTURE, even including the comment, but didn't
actually add the dirty bit.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
From the EXT_transform_feedback spec:
The error INVALID_OPERATION is also generated by BeginTransformFeedbackEXT
if no binding points would be used, either because no program object is
active or because the active program object has specified no varying
variables to record.
...
The error INVALID_VALUE is generated by BindBufferRangeEXT or
BindBufferOffsetEXT if <offset> is not word-aligned.
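The alignment rule can be sketched as a simple predicate. This assumes a "word" means 4 bytes, which is consistent with the offsets 1, 2, 3 and 5 exercised by the tests listed below; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if `offset` is a multiple of 4 bytes (assumed word size). */
static bool
offset_is_word_aligned(uint64_t offset)
{
   return (offset & 0x3u) == 0;
}
```

A caller would report INVALID_VALUE whenever this predicate returns false.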
Fixes Piglit tests:
- EXT_transform_feedback/api-errors no_prog_active
- EXT_transform_feedback/api-errors interleaved_no_varyings
- EXT_transform_feedback/api-errors separate_no_varyings
- EXT_transform_feedback/api-errors bind_offset_offset_1
- EXT_transform_feedback/api-errors bind_offset_offset_2
- EXT_transform_feedback/api-errors bind_offset_offset_3
- EXT_transform_feedback/api-errors bind_offset_offset_5
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
From the EXT_transform_feedback spec:
The error INVALID_OPERATION is generated by
BeginTransformFeedbackEXT if any transform feedback buffer object
binding point used in transform feedback mode does not have a
buffer object bound.
This required adding a new NumBuffers field to the
gl_transform_feedback_info struct, to keep track of how many transform
feedback buffers are required by the current program.
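A rough sketch of the check that a buffer count makes possible, with hypothetical names and a made-up binding-point limit (the real state lives in Mesa's transform feedback object and is not shown here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_TFB_BUFFERS 4   /* hypothetical limit, for illustration */

/* Return true if each of the first `num_buffers` binding points has a
 * buffer bound. A NULL entry stands for "nothing bound". */
static bool
required_buffers_are_bound(const void *const bindings[SKETCH_MAX_TFB_BUFFERS],
                           unsigned num_buffers)
{
   unsigned i;

   assert(num_buffers <= SKETCH_MAX_TFB_BUFFERS);
   for (i = 0; i < num_buffers; i++) {
      if (bindings[i] == NULL)
         return false;
   }
   return true;
}
```

BeginTransformFeedbackEXT would raise INVALID_OPERATION when a check like this fails for the program's required buffer count.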
Fixes Piglit tests:
- EXT_transform_feedback/api-errors interleaved_unbound
- EXT_transform_feedback/api-errors separate_unbound_0_1
- EXT_transform_feedback/api-errors separate_unbound_0_2
- EXT_transform_feedback/api-errors separate_unbound_1_2
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>