Commit graph

83716 commits

Author SHA1 Message Date
Nicolai Hähnle
287822ee33 radeonsi: drop unnecessary u_pstipple.h include
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:51:29 +02:00
Nicolai Hähnle
3e4c5693a1 radeonsi: do not pass the return type to buffer_load_const
Overriding it is not allowed anyway, and actually lead to a crash when polygon
stippling was used with monolithic shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-08-08 12:51:26 +02:00
Kenneth Graunke
bd1bd03268 glsl: Combine GS and TES array resizing visitors.
These are largely identical, except that the GS version has a few
extra error conditions.  We can just pass in the stage and skip these.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:59 -07:00
Kenneth Graunke
398428f406 glsl: Fix location bias for patch variables.
We need to subtract VARYING_SLOT_PATCH0, not VARYING_SLOT_VAR0.

Since "patch" only applies to inputs and outputs, we can just handle
this once outside the switch statement, rather than replicating the
check twice and complicating the earlier conditions.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:42 -07:00
Kenneth Graunke
1556f16e46 glsl: Fix the program resource names of gl_TessLevelOuter/Inner[].
These are lowered to gl_TessLevel{Outer,Inner}MESA.  We need them to
appear in the program resource list with their original names and types.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:53:28 -07:00
Kenneth Graunke
4a49851da1 glsl: Delete bogus ir_set_program_inouts assert.
This assertion is bogus.  Varying structs, and arrays of structs, are
allowed by GLSL, and we can see them here.  While we currently don't
have any partial-variable support for those, simply returning false
and marking the entire thing as used is certainly legitimate.

I believe this is often swept under the rug by varying packing,
but that's disabled in certain tessellation situations.

Hit by 20 dEQP-GLES31.functional.tessellation.user_defined_io.* tests.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:51:21 -07:00
Kenneth Graunke
86915b495b glsl: Simplify interface qualifier parsing.
This better matches the grammar in section 4.3.9 of the GLSL 4.5 spec,
and also removes some redundant code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:48:48 -07:00
Kenneth Graunke
d0642c52fc glsl: Add a has_tessellation_shader() helper.
Similar to has_geometry_shader(), has_compute_shader(), and so on.
This will make it easier to add more conditions here later.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2016-08-07 23:47:55 -07:00
Marek Olšák
3fb4a9b3b3 Revert "gallium/radeon: count contexts"
This reverts commit b403eb3385.

Not needed.
2016-08-06 17:29:23 +02:00
Marek Olšák
11b1d064a3 radeonsi: add GLSL lit tests
They can only be run manually as described in HOW_TO_RUN.
It should help catch suboptimal code generation.

Some of the tests already fail.

v2: rename the tests to *.glsl,
    fix lit.cfg to find FileCheck

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
2016-08-06 16:11:43 +02:00
Marek Olšák
35942ee8a8 radeonsi: add a standalone compiler amdgcn_glslc
This will be used by GLSL lit tests.

For developers only. It shouldn't be distributable and it doesn't use
the Mesa build system.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 16:11:39 +02:00
Marek Olšák
ad8af99c86 radeonsi: add environment variable SI_FORCE_FAMILY
This will be used by: amdgcn_glslc -mcpu=[family]

It can also be used for shader-db if you want stats for a different family.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 16:11:35 +02:00
Marek Olšák
d0646cc745 winsys/radeon: implement cs_get_next_fence
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:31 +02:00
Marek Olšák
63b99590db winsys/amdgpu: implement cs_get_next_fence
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
04a6cb63aa gallium/radeon: add cs_get_next_fence winsys callback
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
b403eb3385 gallium/radeon: count contexts
We don't wanna use unflushed fences when we have multiple contexts.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
16d568d911 gallium/radeon: count gfx IB flushes
This will be used as a counter for whether fence_finish needs to flush
the IB.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 14:29:30 +02:00
Marek Olšák
c5ff0d3e65 gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
076db67217 gallium/radeon: inline radeon_winsys::query_memory_usage
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
9646ae7799 gallium/radeon/winsyses: expose per-IB used_vram and used_gart to drivers
The following patches will use this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
1c8f17599e gallium/radeon/winsyses: print CS submission error number
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
0edc2e433e radeonsi: flush if constant, shader, and streamout buffers use too much memory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
c3efdeb8dd radeonsi: flush if sampler views and images use too much memory
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
d82cfab84c radeonsi: deal with high vertex buffer memory usage correctly
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
e62caf576e radeonsi: take compute shader and dispatch indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
c56ecb68e7 radeonsi: take scratch buffer and draw indirect memory usage into account
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
ed2254d157 radeonsi: check IB memory usage of CP DMA operations
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Marek Olšák
f4b977bf3d gallium/radeon: add r600_resource::vram_usage and gart_usage
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-08-06 13:56:14 +02:00
Mathias Fröhlich
62d41162bb mesa: Copy bitmask of VBOs in the VAO on gl{Push,Pop}Attrib.
On gl{Push,Pop}Attrib(GL_CLIENT_VERTEX_ARRAY_BIT) take
care that gl_vertex_array_object::VertexAttribBufferMask
matches the bound buffer object in the
gl_vertex_array_object::VertexBinding array.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
2016-08-06 06:27:37 +02:00
Nanley Chery
c495c18b24 anv/gen7_pipeline: Set PixelShaderKillPixel for discards
According to the IVB PRM Vol2 P1, this bit must be set if a pixel shader
contains a discard instruction.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97207
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-08-05 09:53:52 -07:00
Jason Ekstrand
21f357b66e util/r11g11b10f: Whitespace cleanups
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:07:06 -07:00
Jason Ekstrand
ffcf8e1049 util/format: Use explicitly sized types
Both the rgb9e5 and r11g11b10 formats are defined based on how they are
packed into a 32-bit integer.  It makes sense that the functions that
manipulate them take an explicitly sized type.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:07:04 -07:00
Jason Ekstrand
c7eb9a7565 util/rgb9e5: Get rid of the float754 union
There are a number of reasons for this refactor.  First, format_rgb9e5.h is
not something that a user would expect to define such a generic union.
Second, defining it requires checking for endianness which is ugly.  Third,
90% of what we were doing with the union was float <-> uint32_t bitcasts
and the remaining 10% can be done with a sinmple left-shift by 23.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:07:01 -07:00
Jason Ekstrand
cda8d95660 util/format_rgb9e5: Get rid of the rgb9e5 union
The rgb9e5 format is a packed format defined in terms of slicing up a
single 32-bit value.  The bitfields are far more confusing than simple
shifts and require that we check the endianness.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:59 -07:00
Jason Ekstrand
f29fd7897a util: Move format_r11g11b10f.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:57 -07:00
Jason Ekstrand
6c665cdfc5 util: Move format_rgb9e5.h to src/util
It's used from both mesa main and gallium.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-08-05 09:06:31 -07:00
Andres Gomez
591869e921 glsl: fix indentation, comments and line lengths in ast_function.cpp
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:11 +03:00
Andres Gomez
8f98a120f3 glsl: apply_implicit_conversion is static again
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:11 +03:00
Andres Gomez
1443c10d74 glsl: struct constructors/initializers only allow implicit conversions
When an argument for a structure constructor or initializer doesn't
match the expected type, only Section 4.1.10 “Implicit Conversions”
are allowed to try to match that expected type.

From page 32 (page 38 of the PDF) of the GLSL 1.20 spec:

  " The arguments to the constructor will be used to set the structure's
    fields, in order, using one argument per field. Each argument must
    be the same type as the field it sets, or be a type that can be
    converted to the field's type according to Section 4.1.10 “Implicit
    Conversions.”"

From page 35 (page 41 of the PDF) of the GLSL 4.20 spec:

  " In all cases, the innermost initializer (i.e., not a list of
    initializers enclosed in curly braces) applied to an object must
    have the same type as the object being initialized or be a type that
    can be converted to the object's type according to section 4.1.10
    "Implicit Conversions". In the latter case, an implicit conversion
    will be done on the initializer before the assignment is done."

v2: Remove also the now redundant constant conversion, the
    constant_record_constructor helper and the replacement code
    (Timothy).

Fixes GL44-CTS.shading_language_420pack.initializer_list_negative

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Andres Gomez
de60d549b9 glsl: Refactor implicit conversion into its own helper
v2: Refactor also the conversion to constant and replacement code
    (Timothy).

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Andres Gomez
af796d756e glsl/types: disallow implicit conversions before GLSL 1.20
Implicit conversions were added in the GLSL 1.20 spec version.

v2: Join the checks for GLSL 1.10 and ESSL (Timothy).

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
2016-08-05 14:27:03 +03:00
Kenneth Graunke
875341c69b i965: Rework the unlit centroid workaround.
Previously, for every input, we moved the dispatch mask to the flag
register, then emitted two predicated PLN instructions, one with
centroid barycentric coordinates (for normal pixels), and one with
pixel barycentric coordinates (for unlit helper pixels).

Instead, we can simply emit a set of predicated MOVs at the top of
the program which copy the pixel barycentric coordinates over the
centroid ones for unlit helper pixel channels.  Then, we can just
use normal PLNs.

On Sandybridge:

total instructions in shared programs: 7538470 -> 7534500 (-0.05%)
instructions in affected programs: 101268 -> 97298 (-3.92%)
helped: 705
HURT: 9 (all of which are SIMD16 programs)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-08-05 01:43:52 -07:00
Tim Rowley
b521083ffb swr: [rasterizer core] static analysis fixes for conservative rast
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:35 -05:00
Tim Rowley
68dc544879 swr: [rasterizer core] implement InnerConservative input coverage
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:35 -05:00
Tim Rowley
4034f48833 swr: [rasterizer core] remove CanEarlyZ function
Test is now in SetupPipeline.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
b365989875 swr: [rasterizer core] use 32x32 macrotile for openswr
Significant performance increase (up to 2x) on high geometry workloads.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
5f4bc9e85b swr: [rasterizer fetch] add support for 24bit format fetch
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
527d45c8fe swr: [rasterizer fetch] additional fetch format support
Add support for 0 pitch in fetch.

Add support for USCALE/SSCALE for 32bit integer fetches.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
f438b7ba81 swr: [rasterizer jitter] fix potential jit exit crash
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00
Tim Rowley
57b07498d2 swr: [rasterizer core] update sync handling
Sync now uses a callback to ensure that it's called by the last
thread moving past a DC.  This will help with the new counter
handling.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-08-04 14:38:34 -05:00