Commit graph

54158 commits

Author SHA1 Message Date
Paul Berry
25ed3bef9b glsl/linker: Defer recording transform feedback locations.
This patch subdivides the loop that assigns varying locations into two
phases: one phase to match up varyings between shader stages (and
assign them varying locations), and a second phase to record the
varying assignments for use by transform feedback.

This paves the way for varying packing, which will require us to
further subdivide the first phase.

In addition, it lets us avoid a clumsy O(n^2) algorithm, since we can
now record the locations of all transform feedback varyings in a
single pass through the tfeedback_decls array, rather than have to
iterate through the array after assigning each varying.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-14 10:49:05 -08:00
Paul Berry
3e81c666db glsl: Create a field to store fractional varying locations.
Currently, the location of each varying is recorded in ir_variable as
a multiple of the size of a vec4.  In order to pack varyings, we need
to be able to record, e.g. that a vec2 is stored in the second half of
a varying slot rather than the first half.

This patch introduces a field ir_variable::location_frac, which
represents the offset within a vec4 where a varying's value is stored.
Varyings that are not subject to packing will always have a
location_frac value of zero.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:52 -08:00
Paul Berry
3c9c17db4a glsl/linker: Make separate ir_variable field to mean "unmatched".
Previously, the linker used a value of -1 in ir_variable::location to
denote a generic input or output of the shader that had not yet been
matched up to a variable in another pipeline stage.

This patch introduces a new ir_variable field,
is_unmatched_generic_inout, for that purpose.

In future patches, this will allow us to separate the process of
matching varyings between shader stages from the processes of
assigning locations to those varying.  That will in turn pave the way
for packing varyings.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:38 -08:00
Paul Berry
50895d443a glsl/linker: Always invalidate shader ins/outs, even in corner cases.
Previously, link_invalidate_variable_locations() was only called
during assign_attribute_or_color_locations() and
assign_varying_locations().  This meant that in the corner case when
there was only a vertex shader, and varyings were being captured by
transform feedback, link_invalidate_variable_locations() wasn't being
called for the varyings.

This patch migrates the calls to link_invalidate_variable_locations()
to link_shaders(), so that they will be called in all circumstances.
In addition, it modifies the call semantics so that
link_invalidate_variable_locations() need only be called once per
shader stage (rather than once for inputs and once for outputs).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:35 -08:00
Paul Berry
18392443d4 glsl/lower_clip_distance: Update symbol table.
This patch modifies the clip distance lowering pass so that the new
symbol it generates (glClipDistanceMESA) is added to the shader's
symbol table.

This will allow a later patch to modify the linker so that it finds
transform feedback varyings using the symbol table rather than having
to iterate through all the declarations in the shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-14 10:48:28 -08:00
Tapani Pälli
d249159fe6 android: build fix for libmesa_glsl_utils
hash_table.c compilation requires ralloc.h include path

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-14 10:01:45 -08:00
Brian Paul
a12a8c910f mesa: minor indentation fixes in texcompress_etc.c 2012-12-14 06:33:08 -07:00
Brian Paul
b29f2d5ff5 mesa: remove old swrast-based compressed texel fetch code 2012-12-14 06:33:08 -07:00
Brian Paul
7dc36a50de swrast: use new core Mesa compressed texel fetch functions 2012-12-14 06:33:08 -07:00
Brian Paul
faa95fd7fa mesa: reimplement _mesa_decompress_image() using new tex fetch code 2012-12-14 06:33:08 -07:00
Brian Paul
ccbe7db1e6 mesa: added _mesa_get_compressed_fetch_func() 2012-12-14 06:33:08 -07:00
Brian Paul
ad3e39bb6d mesa: add new texel fetch code for etc formats 2012-12-14 06:33:07 -07:00
Brian Paul
cd7baf5bf4 mesa: add new texel fetch code for rgtc formats 2012-12-14 06:33:07 -07:00
Brian Paul
141d299965 mesa: add new texel fetch code for fxt formats 2012-12-14 06:33:07 -07:00
Brian Paul
a774eaa57e mesa: add new texel fetch code for dxt formats 2012-12-14 06:33:07 -07:00
Brian Paul
2037a06da9 mesa: add compressed_fetch_func typedef
This is a first step in removing the swrast-related code in core
Mesa's texture compression files.
2012-12-14 06:33:07 -07:00
Brian Paul
90b7797a1d swrast: merge get_texel_fetch_func() and set_fetch_functions()
No real need for separate functions anymore.
2012-12-14 06:33:07 -07:00
Brian Paul
f4896cea04 swrast: make _mesa_get_texel_fetch_func() static
Not called from any other file.
2012-12-14 06:33:07 -07:00
Dave Airlie
9e41b0badb draw/llvmpipe: fix transform feedback position + enable other extensions
This builds on the previous draw/softpipe patch.

So llvmpipe does streamout calls after clip/viewport stages,
but we have the pre-clip position stored for later use, so
when we are doing transform feedback, and its the position vertex
grab the vertex from the stored pre clip position.

The perfect fix is too probably add a codegen transform feedback
stage in between shader and clip stages, but this is good enough
for now.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-14 11:34:40 +10:00
Dave Airlie
55d37eb40e draw: add support for later transform feedback extensions
This adds support to draw for the new features of transform feedback.

a) fix count_from_stream_output, using max_index+1 for now but it looks
like it should be valid as its derived from the vertex elements/vbo.

b) fix striding and dst offsets in output buffers - was just wrong before.

c) fix crash if tfb is suspended (so.num_targets == 0)

This also enables the new features on softpipe. It should be possible
to enable them on llvmpipe as well after this commit, but would need
to schedule piglit runs.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-14 11:34:15 +10:00
Tom Stellard
4330cfec8b clover: Fix build since removal of pipe_surface::usage
by commit 25409c6da8
2012-12-13 20:04:34 +00:00
Maxence Le Dore
6d7d821e3d r600g/radeonsi: Silence warnings
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-12-13 19:40:28 +00:00
Tom Stellard
c68babfc3c clover: Add support for compiler flags
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-12-13 19:22:44 +00:00
Tom Stellard
7f71efcf7a clover: Don't erase build info of devices not being built
Every call to _cl_program::build() was erasing the binaries and logs for
every device associated with the program.  This is incorrect because
it is possible to build a program for only a subset of devices and so
any device not being build should not have this information erased.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2012-12-13 19:22:35 +00:00
Vincent Lejeune
c7f9fb37ea r600g: use load_ar checks with llvm output.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-12-13 19:22:10 +00:00
Thierry Reding
60e05d7388 build: Fix AX_PROG_{CC,CXX}_FOR_BUILD macros
Override the cross_compiling and ac_tool_prefix variables by reassigning
to them instead of redefining the macros. Redefining them will actually
cause the variable names to be replaced instead of their content.

Furthermore push the definition of CPPFLAGS before running the checks
for the build tools to avoid the host CPPFLAGS from leaking into the
build CPPFLAGS.

While at it drop the redefinition of AC_TRY_COMPILER which hasn't been
used since autoconf 2.50 and make sure that all definitions are properly
popped when done (LDFLAGS, ac_cv_prog_CPP, ac_cv_prog_CXXCPP).

Acked-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2012-12-13 10:58:11 -08:00
Roland Scheidegger
a460aea3f1 gallivm: fix texel fetch for array textures
Since we don't call lp_build_sample_common() in the texel fetch path we missed
the layer fixup code. If someone would have tried to do texelFetch with array
textures it would have crashed for sure.
Not really tested (can't run the piglit test being able to use texelFetch with
array samplers for now with llvmpipe).

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-13 19:17:09 +01:00
Paul Berry
6267853055 mesa: Fix computation of default vertex attrib stride for 2_10_10_10 formats.
Previously, if the client program didn't specify a stride when setting
up a vertex attribute, we used _mesa_sizeof_type() to compute the size
of the type, and multiplied it by the number of components.

This didn't work for the 2_10_10_10 formats, since _mesa_sizeof_type()
returns -1 for those types, resulting in all kinds of havoc, since it
was causing the hardware to be programmed with a negative stride
value.

This patch adds a new function _mesa_bytes_per_vertex_attrib(), which
is similar to the existing function _mesa_bytes_per_pixel(), but which
computes the size of a vertex attribute based on the type and the
number of formats.  For packed formats (currently only the 2_10_10_10
formats), it verifies that the number of components is correct and
returns the size of the packed format.  For unpacked formats, it
returns the size of the type times the number of components.

In addition, this patch adds an assertion so that if we ever forget to
update _mesa_bytes_per_vertex_attrib() when adding a new vertex
format, we'll see the problem quickly rather than having to debug a
subtle conformance test failure.

Fixes GLES3 conformance tests
vertex_type_2_10_10_10_rev_{conversion,divisor,stride_pointer}.test.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-13 10:09:03 -08:00
Matt Turner
11cea47246 mesa/uniform_query: Don't write to *params if there is an error
The GL 3.1 and ES 3.0 specs say of glGetActiveUniformsiv:
   "If an error occurs, nothing will be written to params."

So, make a pass through the indices and check that they're valid before
the pass that actually writes to params. Checking pname happens on the
first iteration of the second loop.

Fixes es3conform's getactiveuniformsiv_for_nonexistent_uniform_indices
test.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-13 09:53:28 -08:00
Matt Turner
6acabe33a3 mesa: print unsigned values with %u
Otherwise messages say silly things like
   glGetActiveUniformBlockiv(block index -1 >= 0)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-13 09:53:11 -08:00
Kenneth Graunke
200bb36778 i965: Fix disassembly of jump targets on Gen7.
Gen7 stores the JIP/UIP bits in different places.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 22:19:08 -08:00
Kenneth Graunke
c2eb9d3a0a i965: Make try_rewrite_rhs_to_dst compare VGRF size to regs written.
try_rewrite_rhs_to_dst is a quick optimization to avoid generating new
temporaries (and MOVs from those temporaries to the dest) for every
expression tree we visit.  By generating better code in simple cases, we
reduce the burden on later optimization passes like register coalescing.

Previously, we compared inst->regs_written() to lhs->vector_elements
to make sure the instruction generating our value wrote the same number
of components as our destination register.

However, this fails in some cases.  One example is texturing (which
produces a vec4) into gl_FragData[i].  Technically, gl_FragData[i] is
also a vec4.  However, the destination VGRF actually has size 4n (where
n is the size of the array).

split_virtual_grfs() can't split VGRFs that are used by SEND messages
which require contiguous destination registers (like texturing), and
register allocation needs all VGRFs to have sizes between 1 and 4.

Amnesia: The Dark Descent hits this case: a texturing instruction
(4 components) gets rewritten to the gl_FragData output register
(which was 4*3 = 12 components), causing the register allocator to
hit the "we rely on split_virtual_grfs" assertion.

This makes it possible to play Amnesia.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 14:44:37 -08:00
Emil Velikov
1223458764 configure.ac: Disable compiler optimizations when --enable-debug is set
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-12 14:48:06 -06:00
Brian Paul
e721a76e68 softpipe: remove unused corner0 variable 2012-12-12 08:51:19 -07:00
Brian Paul
8ef27e8fa9 llvmpipe: remove unneeded draw_flush() call
This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.

v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6d

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-12 08:45:45 -07:00
Marek Olšák
d225d076a9 r600g: suballocate memory for fetch shaders from a large buffer
Fetch shaders are usually destroyed at the context destruction by the state
tracker, so we can put them all in a large buffer without wasting memory.

This reduces the number of relocations sent to the kernel a little bit.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:31 +01:00
Marek Olšák
8df3855eed r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register
Instead of having a 4-byte buffer for each streamout target, we suballocate
each dword from a 4K buffer.

This further reduces the overall number of relocations.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:28 +01:00
Marek Olšák
cc2d908572 gallium/util: add a simple allocator for suballocating from a large buffer
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:24 +01:00
Marek Olšák
2478fcd87c r600g: use u_upload_mgr for allocating staging transfer buffers
u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsychronized), which is perfect for short-lived staging buffers.

This reduces the number of relocations sent to the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:11:52 +01:00
Marek Olšák
448cd5ea60 winsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead 2012-12-12 13:09:54 +01:00
Marek Olšák
1d0bf69f83 st/dri: add a way to force MSAA on with an environment variable
There are 2 ways. I prefer the former:
  GALLIUM_MSAA=n
  __GL_FSAA_MODE=n

Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
afa902a705 mesa: don't advertise ARB_texture_buffer_object in legacy contexts
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
0ac83a2001 mesa: disallow creation of GL 3.1 compatibility contexts
Death to driver-specific hacks!

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
25409c6da8 gallium: remove pipe_surface::usage
Not really used by anybody now.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák
c1f704073b svga: stop using pipe_surface::usage
There are only 2 possible usages: render target and depth stencil.
Both can be derived from the surface format, so the flag is redundant.

And it's going away...

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
21b1ec69fc gallium/util: move util_try_blit_via_copy_region to u_surface.c
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
3a555637b2 gallium/cso: don't use the pipe_error return type where it's not needed
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
eae9674f18 gallium: manage render condition in cso_context and fix postprocessing w/ it
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák
9ec6ffd85d st/mesa: remove a weird msaa hack
It doesn't work and it's not clear how it's supposed to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Dave Airlie
621259b3de softpipe: implement seamless cubemap support. (v1.1)
This adds seamless sampling for cubemap boundaries if requested.

The corner case averaging is messy but seems like it should be spec
compliant.

The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!

v1.1: update comments, drop unneeded seamless calls for nearest, fix
if statement layout.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 10:35:05 +10:00