we weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
We already hold the variable, just weren't providing access
to it.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
We weren't taking the buffer offset, destination offset or the
stride into consideration so we were frequently writing into
an overflown buffer.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputing untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Before, we'd incorrectly generate an error if we we tried to
replace a non-4x4 block near the edge of a NPOT compressed texture.
For example, if the dest image was 15 texels wide and xoffset=12
and width=3 we'd incorrectly generate GL_INVALID_OPERATION.
Verified with new tests added to piglit s3tc-errors test.
Note: This is a candidate for the stable branches.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
That is just not supported by the hardware.
v2: fix compare
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
patch fixes a crash that happens if glTexSubImage2D is called with a
negative xoffset.
NOTE: This is a candidate for stable branches.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
After more thought/discussion, it seems it is better to handle this sort
of stuff in the state tracker.
So this reverts commit 12096f334b, except the
variant->key -> key shorthands.
This is a simple shader compiler that performs almost zero optimizations. The
generated code is usually much larger comparing to that generated by i965.
The generated code also requires many more registers.
Function-wise, it lacks register spilling and does not support most TGSI
indirections. Other than those, it works alright.
Courtesy of clang:
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
Only 13 affected programs in shader-db, but they were all helped.
total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs: 1576 -> 1550 (-1.65%)
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.
With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether its better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
This move the tracing timeout and printing into winsys and add
an debug environement variable for it (R600_DEBUG=trace_cs).
Lot of file touched because of winsys API changes.
v2: Do not write lockup file if ib uniq id does not match last one
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
v2:
- Fix usage of set_constant_buffer()
- Fix typo in comment
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed. Just remove it.
Note: this is a candidate for the stable branches.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
New textures or vertex buffers don't always require patching and
re-emitting the shaders. So do a better job of figuring out when we
actually have to patch the shader.
Signed-off-by: Rob Clark <robclark@freedesktop.org>