Commit graph

73336 commits

Author SHA1 Message Date
Ian Romanick
e32a6590a4 i915: Don't override NewFramebuffer just to call _mesa_new_framebuffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-10-06 11:28:00 -07:00
Ian Romanick
ed7f00f564 i965: Don't override NewFramebuffer just to call _mesa_new_framebuffer
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-10-06 11:27:45 -07:00
Ville Syrjälä
021f15816e i830: Fix culling with user fbos on gen2
Flip the cull bits when rendering to a user fbo on gen2. This
was already done on gen3 (since before git history starts)
but was missing from the gen2 code.

Fixes rendering of the driver+kart model in supertuxkart kart
selection screen.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
3e2c7ca773 i915: Adjust line size limits
The hardware can draw lines 0.5 to 7.5 pixels wide. Adjust the limits
to 1.0-7.0. The old limits seems to be from the era when i915 and i965
were sharing this code.

Not really sure if 1.0-7.0 is correct. Maybe it could be 0.5.7.5 as
those are the hw limits, or maybe some combination of the two?

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
00ee403883 i915: Enable intel_render path for points
The sub-pixel adjustment for points was killed off in
 commit 60d762aa62
 Author: Xiang, Haihao <haihao.xiang@intel.com>
 Date:   Wed Jan 2 11:38:51 2008 +0800

    i915: Needn't adjust pixel centers. fix #12944

so if we don't need it in intel_tris.c we don't need it in
intel_render.c either, which means we can allow intel_render.c to render
points.

No apparent regressions on PNV in ES1 or ES2 conformance.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
0febd0ecfd i915: Use COPY_DWORDS for points
The sub-pixel adjustment for points was killed off in
 commit 60d762aa62
 Author: Xiang, Haihao <haihao.xiang@intel.com>
 Date:   Wed Jan 2 11:38:51 2008 +0800

    i915: Needn't adjust pixel centers. fix #12944

so we can just as well use COPY_DWORDS().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
bcf650496f i915: Use _tnl_RenderClippedPolygon and _tnl_RenderClippedLine
_tnl_RenderClippedPolygon and _tnl_RenderClippedLine already do most of
what we want so use them.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
303895655c i915: Handle provoking vertex in intelFastRenderClippedPoly()
intelFastRenderClippedPoly() renders the polygon using triangles. For
polygons the provoking vertex is always the first one, and currently
this function assumes that the provoking vertex for triangles is the
last one. In case the user changed the provoking vertex convention,
the hardware may be configured to treat the first vertex of triangles
as the provoking vertex. So check the convention and emit the triangles
in the appropriate order to avoid having to change the hardware
provoking vertex convention for rendering polygons.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
0886426503 t_dd_dmatmp: Check provoking vertex convention when rendering quads
When drawing quads using triangles we need to be careful to make
the provoking vertices match when flat shading.

v2: Major rebase on top of Ian's other t_dd_dmatmp.h work.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
83d511e190 t_dd_dmatmp: Disallow flat shading when rendering quad strips via tri strips
When rendering quad strips via tri strips we can't get the provoking
vertex right, so disallow flat shading.

v2: Major rebase on top of Ian's other t_dd_dmatmp.h work.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ville Syrjälä
b15b4581d1 t_dd_dmatmp: Allow flat shaded polygons with tri fans
We can allow rendering flat shaded polygons using tri fans if we check
the provoking vertex convention.

v2 (idr): Remove _EXT suffixes from GL_FIRST_VERTEX_CONVENTION.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-10-06 11:16:19 -07:00
Ian Romanick
5ca00e0b8d t_dd_dmatmp: Replace fprintf with unreachable
From http://lists.freedesktop.org/archives/mesa-dev/2015-May/084883.html:

    "There are no real error cases here, just dead code.
    validate_render() is supposed to make sure we never call these
    functions if the code can't actually render the primitives. The
    fprintf()+return branches should really just contain assert(0) or
    equivalent."

I also rearranged the if-else-block in render_quad_strip_verts to look
more like the other functions.  A future patch is going to change a
bunch of that code anyway.

v2: Make "unreachable" message more descriptive.  Suggested by Iago.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-06 10:44:00 -07:00
Ian Romanick
46b13666d8 radeon: Use C99 initializers for primitive arrays
Using C99 initializers for the primitive arrays makes things more
readable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-06 10:41:56 -07:00
Ian Romanick
68976a5a00 i965: Use C99 initializers for primitive arrays
Using C99 initializers for the primitive arrays makes things more
readable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-06 10:41:56 -07:00
Ville Syrjälä
fad5fd3a25 i915: Use C99 initializers for primitive arrays
Using C99 initializers for the primitive arrays makes things more
readable.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-06 10:41:56 -07:00
Brian Paul
3801fa65c1 tgsi: add const qualifier to silence warning
Trivial.
2015-10-06 08:51:33 -06:00
Brian Paul
b7766a95e1 glsl: whitespace/formatting/typo fixes in link_uniforms.cpp 2015-10-06 08:51:33 -06:00
Samuel Iglesias Gonsalvez
50d5a36f35 main: array stride for unsized arrays of arrays are calculated like records
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-10-06 14:28:26 +02:00
Samuel Iglesias Gonsalvez
82db642042 glsl: add std430 layout support for AoA
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-10-06 14:02:13 +02:00
Timothy Arceri
6483183279 docs: Mark GL_ARB_enhanced_layouts as in progress 2015-10-06 14:04:23 +11:00
Ilia Mirkin
dbae576f7f i965: add EXT_polygon_offset_clamp support to gen4/gen5
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-05 14:39:38 -07:00
Matt Turner
833fa9a8cd meta: Update comment about unsupported texture types.
Ken added support for 2DArray (commit ec23d5197e) and 1DArray (commit
14ca61125) last year.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-10-05 14:35:13 -07:00
Matt Turner
d4ff638504 glx: Drop CRAY support.
It couldn't have worked anyway. There were calls to undefined functions.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-10-05 14:34:16 -07:00
Matt Turner
617eb5e6c3 glsl: Remove CSE pass.
With NIR, it actually hurts things.

total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
instructions in affected programs:     14833 -> 14392 (-2.97%)
helped:                                299
HURT:                                  1

In all affected programs I inspected (including the single hurt one) the
pass CSE'd some multiplies and caused some reassociation (e.g., caused
(A * B) * C to be A * (B * C)) when the original intermediate result was
reused elsewhere.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-05 14:31:26 -07:00
Matt Turner
5a360dcad1 i965: Generalize predicated break pass for use in vec4 backend.
instructions in affected programs:     44204 -> 43762 (-1.00%)
helped:                                221

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-05 13:42:58 -07:00
Matt Turner
4098a756b5 i965/fs: Use backend_instruction in predicated break peephole.
We're not using any fs_inst fields, and the next commit will make the
peephole used by the vec4 backend.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-05 13:42:58 -07:00
Matt Turner
5964419921 i965/fs: Remove SNB embedded-comparison support from optimizations.
We never emit IF instructions with an embedded comparison (lost in the
switch to NIR), so this code is not used. If we want to readd support,
we should have a pass that merges a CMP instruction with an IF or a
WHILE instruction after other optimizations have run.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-05 13:42:58 -07:00
Matt Turner
36ea9922ad mesa: Add missing _mm_mfence() before streaming loads.
According to the Intel Software Development Manual (Volume 1: Basic
Architecture, 12.10.3 Streaming Load Hint Instruction):

   Streaming loads may be weakly ordered and may appear to software to
   execute out of order with respect to other memory operations.
   Software must explicitly use fences (e.g. MFENCE) if it needs to
   preserve order among streaming loads or between streaming loads and
   other memory operations.

That is, a memory fence is needed to preserve the order between the GPU
writing the buffer and the streaming loads reading it back.

Reported-by: Joseph Nuzman <joseph.nuzman@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-10-05 12:06:33 -07:00
Chad Versace
93161be9e7 i965: Fix intel_miptree_is_fast_clear_capable()
There are three types of fast clears:
  a. fast depth clears
  b. fast singlesample color clears
  c. fast multisample color clears
Function intel_miptree_is_fast_clear_capable() checks if a miptree
supports fast clears of type (b).

Rename the function to disambiguate what it does:
  old: intel_miptree_is_fast_clear_capable
  new: intel_miptree_supports_non_msrt_fast_clear

The functionally accidentally rejected multisampled color surfaces
because it thought they were singlesample array surfaces. Fix that by
explicitly rejecting surfaces with samples > 1.

This fix would have been needed before we enabled layered fast
singlesample color clears (introduced in gen8), which we want to do
eventually. For now, though, this patch changes no behavior; it just
fixes how the driver chooses its behavior.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-10-05 11:14:04 -07:00
Chad Versace
125a04b474 i965/mt: Declare some functions as static
intel_tiling_supports_non_msrt_mcs() and
intel_miptree_is_fast_clear_capable() are not used outside of
intel_mipmap_tree.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-10-05 11:10:11 -07:00
Iago Toral Quiroga
73e0dfbaca i965: Make vec4_visitor's destructor virtual
We need a virtual destructor when at least one of the class' methods is virtual.
Failure to do so might lead to undefined behavior when destructing derived classes.
Fixes the following warning:

brw_vec4_gs_visitor.cpp: In function 'const unsigned int* brw::brw_gs_emit(brw_context*, gl_shader_program*, brw_gs_compile*, void*, unsigned int*)':
brw_vec4_gs_visitor.cpp:703:11: warning: deleting object of polymorphic class type 'brw::vec4_gs_visitor' which has non-virtual destructor might cause undefined behaviour [-Wdelete-non-virtual-dtor]
    delete gs;

Curro: This shouldn't be causing any actual bugs at the moment because
gen6_gs_visitor is the only subclass of vec4_visitor destroyed through
a pointer of a base class (vec4_gs_visitor *) and its destructor is
basically the same as its parent's. Anyway it seems sensible to change
this so it doesn't bite us in the future.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-10-05 13:50:15 +02:00
Tapani Pälli
a90feb581a glsl: set glsl error if binding qualifier used on global scope
Fixes following Piglit test:
	global-scope-binding-qualifier.frag

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-10-05 14:44:24 +03:00
Iago Toral Quiroga
102f6c446b i965: Assert on the number of combined UBO and SSBO binding table entries
In theory we can't break this assertion since the compiler frontend checks
that we don't exceed any of the individual limits, but it does not hurt to
be extra safe.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-05 08:19:34 +02:00
Iago Toral Quiroga
20cbe3688a i965: Reserve binding table space for SSBO surfaces
These share the space with UBO surfaces but we need to make sure we
allocate enough space for both sets (12 of each)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-05 08:12:17 +02:00
Iago Toral Quiroga
41c4d45e08 i965: Define BRW_MAX_SSBO
Instead of using hard-coded values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-05 08:12:17 +02:00
Iago Toral Quiroga
440f9348c1 i965: Define BRW_MAX_UBO
Instead of using hard-coded values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-05 08:12:17 +02:00
Matt Turner
4caa10193f i965/vec4: Remove more dead visitor/vertex program code.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-10-04 23:03:59 -07:00
Matt Turner
cd7fa1034a i965: Don't print line numbers with INTEL_DEBUG=optimizer.
The thing you want to do with the output files is diff them, which is
made more difficult by line numbers changing.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2015-10-04 23:03:59 -07:00
Ilia Mirkin
78ec9e28ec nv30: always go through translate module on big-endian
It seems like things are either coming in slighly wrong, or perhaps
uploaded incorrectly, but either way passing them through the translate
module seems to fix everything. Eventually we should figure out what's
going wrong and fix it "for real", but this should do for now.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-10-04 21:50:41 -04:00
Ilia Mirkin
1fec05d114 nv30: pretend to have packed texture/surface formats
This puts us in line with what the DDX/DRI2 st are expecting. It also
happens to work... no idea why, but seems better to have it work than to
ask lots of questions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-10-04 21:50:41 -04:00
Michel Dänzer
87c3c9acd2 st/dri: Use packed RGB formats
Fixes Gallium based DRI drivers failing to load on big endian hosts
because they can't find any matching fbconfigs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-04 21:50:31 -04:00
Timothy Arceri
763cd8c080 glsl: reduce memory footprint of uniform_storage struct
The uniform will only be of a single type so store the data for
opaque types in a single array.

Cc: Francisco Jerez <currojerez@riseup.net>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
2015-10-05 10:53:24 +11:00
Kenneth Graunke
b85757bc72 i965: Remove shader_prog from vec4_gs_visitor.
Unfortunately it has to stay in gen6_gs_visitor.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-04 14:00:01 -07:00
Kenneth Graunke
21585048a2 i965: Use nir->has_transform_feedback_varyings to avoid shader_prog.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-04 14:00:01 -07:00
Kenneth Graunke
7768b802e5 nir: Add a nir_shader_info::has_transform_feedback_varyings flag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-10-04 14:00:01 -07:00
Kenneth Graunke
5d7f8cb5a5 nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number.  While GLSL makes
this look like a normal array, it can be very different behind the
scenes.

On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct.  This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.

NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset.  We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.

This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case.  They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.

v2: Rebase on earlier refactors.
v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-04 14:00:01 -07:00
Kenneth Graunke
f2a4b40cf1 nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects.
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time via deref_has_indirect().  This means moving
the call a bit earlier.

By returning a nir_ssa_def *, we can pass back both an existence flag
(via NULL checking the pointer) and the value in one parameter.  It
also simplifies the code somewhat.  nir_lower_samplers works in a
similar fashion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-10-04 14:00:01 -07:00
Timothy Arceri
6994ca20aa glsl: fix whitespace
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-10-04 17:42:41 +11:00
Marek Olšák
814b7d1ab9 radeonsi: enable PIPE_CAP_FORCE_PERSAMPLE_INTERP
Now st/mesa won't generate 2 variants for this state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00
Marek Olšák
b3c55fc669 radeonsi: do force_persample_interp in shaders for non-trivial cases
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:09 +02:00