Nowadays -1 for slots means that the semantic is not present, so
we need to store it in a signed variables, otherwise <0 comparisons
are pointless. Fixes
http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least
with softpipe, edgeflags don't work wit llvmpipe)
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Since llvm -3.4svn r187618, TargetOptions doesn't provide
RealignStack, so only enable it with llvm<3.4
This option must now be specified using function attributes, see LLVM
commit r187618
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
And as a side effect fix a crash in the following piglit test:
general/attribs GL3
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "9.2 and 9.1" mesa-stable@lists.freedesktop.org
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using
all 104 SGPRs.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
If gs is null, then freeing state->shader.tokens would result in a null
dereference.
Fixes "Dereference after null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Fixes "Uninitialized pointer read" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
unfilled_stage::face_slot is of type int.
Fixes "Unsigned compared against 0" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
pp_free_bos dereferences ppq without a null check.
Fixes "Dereference before null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
the issue is that stream output is run before the pipeline, which
means that unless we decompose the primitives before the so
then things crash. we could convert the entire stream output
code into a pipeline stage but it will take a bit, so for now
fix the crashes by simply re-adding the old input assembler
which is run before the SO.
Signed-off-by: Zack Rusin <zackr@vmware.com>
we used to have a face primitive assembler that we ran after if
the gs was missing but we had adjacency primitives in the pipeline,
lets convert it to a pipeline stage, which allows us to use it
to inject outputs (primitive id) into the vertices. it's also
a lot cleaner because the decomposition is already handled for us.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Inject front face only if the fragment shader uses it and
propagate through all channels because otherwise we'll
need to figure out the exact swizzle that the fs expects and
it's just simpler to make sure all the components within
the front face register are correctly set.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Replace "fulldecl->Semantic.Name/Index" with semName/semIndex.
Simplify if/else logic for TGSI_FILE_OUTPUT code.
Remove old comment.
Fix indentation.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Some videos specify mb_adaptive_frame_field_flag instead of
field_pic_flag. This implies that the pic height needs to be halved, and
this field needs to be passed to the VP engine.
Cc: "9.2" mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
The loop was iterating over all the fs inputs and setting them
to perspective interpolation, then after the loop we were
creating extra output slots with the correct interpolation. Instead
of injecting bogus extra outputs, just set the interpolation
on front face and prim id correctly when doing the initial scan
of fs inputs.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
clipping would drop the extra outputs because it always
used the number of standard vertex shader outputs, without
geometry shader or extra outputs. The commit makes sure
that clipping with geometry shaders which have more outputs
than the current vertex shader and with extra outputs correctly
propagates the entire vertex.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines', unfortunately that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it allow draw module to inject a fake face semantic
into outputs from which the backends can figure out the original
frontfacing info of the primitives.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Draw sometimes injects extra shader outputs (aa points, lines or
front face), unfortunately most of the pipeline and llvm code
didn't handle them at all. It only worked if number of inputs
happened to be bigger or equal to the number of shader outputs
plus the extra injected outputs. In particular when running
the pipeline which depends on the vertex_id in the vertex_header
things were completely broken. The patch adjust the code to
correctly use the total number of shader outputs (the standard
ones plus the injected ones) to make it all stop crashing and
work.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Instead of using the magical 4 use the above computed
vertex size. Doesn't change the behavior, just makes the code
a bit cleaner.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
when dumping shader outputs it's nice to have the integer
values of the outputs, in particular because some values
are integers.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Adding code to detect the usage of prim id and front face
semantics in fragment shaders.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
we forgot to add ucmp to the list of opcodes, so it was never
generated for ureg.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The spec says that front-face is true if the value is >0 and false
if it's <0. To make sure that we follow the spec, lets just
subtract 0.5 from our value (llvmpipe did 1 for frontface and 0
otherwise), which will get us a positive num for frontface and
negative for backface.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
llvm shifts are undefined for shift counts exceeding (or matching) bit width,
so need to apply a mask for the tgsi shift instructions.
v2: only use mask for the tgsi shift instructions, not for the build shift
helpers. None of the internal callers need this behavior, and while llvm can
optimize away the masking for constants there are legitimate cases where it
might not be able to do so even if we know that shift count must be smaller
than type width (currently all such callers do not use the build shift
helpers).
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
c shifts are undefined for shift counts exceeding (or matching) bit width,
so need to apply a mask (on x86 it actually would usually probably work as
shifts do masking on int domain shifts - unless some auto-vectorizer would
come along at last as simd domain does not mask the shift count).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Previously, nothing was said what happens with shift counts exceeding
bit width of the values to shift. In theory 3 behaviors are possible:
1) undefined (classic c definition)
2) just shift out all bits (so result is zero, or -1 potentially for ashr)
3) mask the shift count to bit width - 1
API's either require 3) or are ok with 1). In particular, GLSL (as well as a
couple uninteresting legacy GL extensions) is happy with undefined, whereas
both OpenCL and d3d10 require 3). Consequently, most hw also implements 3).
So, for simplicity we just specify that 3) is required rather than saying
undefined and then needing state trackers to work around it.
Also while here specify shift count as a vector, not scalar. As far as I
can tell this was a doc bug, neither state trackers nor drivers used scalar
shift count.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
MP_TEMP_SIZE must be aligned to 0x8000, while TEMP_SIZE on NVE4_3D
must be aligned to 0x20000, so perform both alignments to be sure
we allocate enough space (actually the bo will most likely use 128
KiB pages and not aligning to that would be a waste anyway).
Cc: "9.2" mesa-stable@lists.freedesktop.org
Previously we were using truncation, which gives the correct result
only for numbers in [0.5-1.0] range (because there's no mantissa bits
to do any rounding there).
This is frequently hit (and probably only used there) when converting
fragment depth to depth format (d24s8 etc.) or otherwise dealing with
depth format.
v2: as spotted by Jose, get rid of extra type (src_type is already unsigned).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
The variable 'usage' was being used for two different things.
Sometimes for PIPE_USAGE_x and other times for PIPE_TRANSFER_x.
This renames usage to access when we're talking about PIPE_TRANSFER_x
flags. Plus, add a bunch of comments to remind us what's going on.
Also, use unsigned for PIPE_TRANSFER_x bitmask to be consistent with
other places. And add a missing const qualifier.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
We should probably be using map()/unmap() when accessing resource
data, but this is a little better.
v2: assert that the resource is not a display target, per Jose.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
This was never a problem since the Mesa state tracker always gives
us a user-space constant buffer with buffer_offset=0. But if another
state tracker ever gave us a "HW" constant buffer with non-zero
buffer_offset we'd mis-render.
Also, use the correct buffer size. And move an assertion to the
top of the function.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
The cap means _can_ accept user-space constant buffers; it doesn't
mean _only_ accepts user-space constant buffers.
v2: also update the PIPE_CAP_USER_VERTEX_BUFFERS and
PIPE_CAP_USER_INDEX_BUFFERS descriptions as well. Per Jose.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
See documentation in mtypes.h.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
vblank_mode is read by dri_util.c and falls under the "dri2" driver name,
which is not connected to the actual Mesa/Gallium driver in any way.
Reviewed-by: Brian Paul <brianp@vmware.com>