This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Previously instruction scheduling tracked dependencies on a per-register
basis. This meant that there was an artificial dependency between
interpolation instructions writing into the same virtual register.
Instruction scheduling would insert a number of instructions between the
two instructions in this example, when they are actually independent.
linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F
linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F
This lead to cases where the first texture coordinate is interpolated at
the beginning of the shader, but the second is done immediately before
the texture operation that uses it as a source.
After this change, the artificial dependency is removed and the
interpolation instructions are scheduled together.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Revert "build: Build on Cygwin with gnu99 instead of c99." and define
_XOPEN_SOURCE appropriately.
This reverts commit 53e36d333c.
Since Cygwin 1.7.18 (April 2013), it's headers correctly prototype strtoll()
when using -std=c99, and correctly prototype strdup() when _XOPEN_SOURCE is
defined appropriately, so this workaround is no longer needed.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Cc: Vinson Lee <vlee@freedesktop.org>
Apparently TXD wants its offset differently than TEX, accepting it in
the upper bits of the layer index. Unclear what happens when this is
combined with indirect sampler indexing.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.
This is the minimal fix appropriate for backporting.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Unfortunately there's no good way to do this on the nv50 shader isa.
Dropping the bias seems preferable to doing the compare post-filtering.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Although the HSW PRM shows it, the BSpec lists this workaround as being
for Ivybridge only.
total instructions in shared programs: 1994951 -> 1993675 (-0.06%)
instructions in affected programs: 27325 -> 26049 (-4.67%)
Port of commit b16b3c87 to the vec4 code.
No shader-db improvements, but might as well. The fs backend saw an
improvement because it's scalar and multiple identical CMP instructions
were generated by the SEL peepholes.
[mattst88]: Modified to perform CSE on instructions with
the same writemask. Offered no improvement before.
total instructions in shared programs: 1995633 -> 1995185 (-0.02%)
instructions in affected programs: 14410 -> 13962 (-3.11%)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
With a hack to place an exec_node in the struct in C to be at the same
location as the inherited exec_node in C++.
Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
If another layout qualifier appeared to the left of `invocations` in the
GS input layout declaration, the invocation count would be dropped on
the floor.
Fixes the piglit tests:
spec/ARB_transform_feedback3/arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
spec/ARB_gpu_shader5/arb_gpu_shader5-invocation-id
spec/ARB_gpu_shader5/compiler/correct-multiple-layout-qualifier-invocations.geom
spec/ARB_gpu_shader5/execution/invocations-conflicting
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The hardware allows multiple simultaneous renders with the same
memory-backed constbufs but with each invocation having different
values. However in order for that to work, the data has to be streamed
in via the right constbuf slot. We weren't doing that for UBOs.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2 10.1" <mesa-stable@lists.freedesktop.org>
Now that this cap is used to determine the availability of both, adjust
its name to reflect the new reality.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The assumption is that any driver capable of emitting layer from the
vertex shader and supporting viewports should be able to also handle
emitting viewport index from the vertex shader.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tobias Droste <tdroste@gmx.de>
The Linux winsys can no longer relocate shader code, so avoid
reemitting BindGBShader commands. They are costly.
v2: Correctly handle errors from SVGA3D_BindGBShader()
Reported-by: Michael Banack <banackm@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Previously, we were assuming that kernel metadata nodes only had 1 operand.
Kernels which have attributes can have more than 1, e.g.:
!0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1}
!1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1}
Attempting to get the kernel without the correct number of attributes led
to memory corruption and luxrays crashing out.
Fixes the cl/program/execute/attributes.cl piglit test.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223
CC: "10.2" <mesa-stable@lists.freedesktop.org>
This patch fixes the duplicated layout qualifier detection
for geometry shader's layout qualifiers.
Also it makes the detection code more legible by defining
allowed_duplicates_mask variable.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80778
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
If XA fails to initialize with pipe_loader enabled, the pipe_loader's
cleanup function will close the drm file descriptor. That's pretty bad
because the file descriptor will probably be the X server driver's only
connection to drm. Temporarily solve this by dup()'ing the file descriptor
before handing it over to the pipe loader.
This fixes freedesktop.org bugzilla bug #80645.
v2: Fix CC addresses.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>