Commit graph

82384 commits

Author SHA1 Message Date
Ilia Mirkin
d69e557f2a freedreno: add support for conditional rendering, required for GL3.0
A smarter implementation would make it possible to attach this to emit
state for the BY_REGION versions to avoid breaking the tiling. But this
is a start.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Ilia Mirkin
059da344ec freedreno/a3xx: add fake RGTC support (required for GL3)
Also throw in LATC while we're at it (same exact format). This could be
made more efficient by keeping a shadow compressed texture to use for
returning at map time. However... it's not worth it for now...
presumably compressed textures are not updated often.

Lastly fix up Z32S8 transfers to non-0 layers.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Ilia Mirkin
84d087aea2 freedreno/a3xx: add missing formats to enable ARB_vertex_type_2_10_10_10_rev
The previously RE'd formats were from an ES driver implementing
OES_vertex_type_10_10_10_2 and thus backwards. A future change could add
the 2_10_10_10 support.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Rob Clark
8106fec74c freedreno/a3xx+a4xx: fix for stk binning pass hang
We'd end up in a state where shader uses no inputs, yet num_elements is
greater than zero.  Triggered by a TF vertex shader which did:

  gl_Position = vec4(0.0, 0.0, 0.0, 0.0);

resulting in a binning pass variant with no inputs.

Includes equiv fix in a4xx, even though we don't have binning-pass
enabled yet on a4xx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Rob Clark
b24c9a8aee freedreno/a3xx+a4xx: fix GL_POINTS lockup w/ GLES
point_size_per_vertex is always TRUE for GLES, causing us to configure
the hw as if gl_PointSize was written, even if it was not.  Which makes
for grumpy hw.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Ilia Mirkin
b40e144a66 nir: fix typo in idiv lowering, causing large-udiv-udiv failures
In nv50, and in the python script that Rob circulated, we do:

   bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);

Do the same in the nir div lowering pass. This fixes the large-udiv-udiv
piglit tests on freedreno.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-11-18 14:31:13 -05:00
Oded Gabbay
4581f8428e llvmpipe: disable VSX in ppc due to LLVM PPC bug
This patch disables the use of VSX instructions, as they cause some
piglit tests to fail

For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7

With this patch, ppc64le reaches parity with x86-64 as far as piglit test
suite is concerned.

v2:
- Added check that we have at least LLVM 3.4
- Added the LLVM bug URL as a comment in the code

v3:

- Only disable VSX if Altivec is supported, because if Altivec support
is missing, then VSX support doesn't exist anyway.

- Change original patch description.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-11-18 21:27:29 +02:00
Ilia Mirkin
8e68113c1a nvc0/ir: actually emit AFETCH on kepler
Looks like this was forgotten in the commit which added the AFETCH
logic.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
2015-11-18 14:26:16 -05:00
Kenneth Graunke
2631bfd62c nir: Store the size of the TCS output patch in nir_shader_info.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-18 10:49:18 -08:00
Kenneth Graunke
b196f1fff3 i965: Add enums for 3DSTATE_TE field values.
3DSTATE_TE has partitioning, output topology, and domain fields,
each of which has several enumerated values.  We'll also need to
switch on the domain, so enums (rather than #defines) seem like a
natural fit.

I chose to put these in brw_compiler.h because they'll be stored
in struct brw_tes_prog_data, which will live there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-11-18 10:49:18 -08:00
Ian Romanick
72e232374e meta/generate_mipmap: Don't leak the framebuffer object
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
2015-11-18 09:38:21 -08:00
Brian Paul
1a48326a84 svga: use more VGPU10 formats
We always want to prefer the VGPU10 formats over the VGPU9 ones when
we have VGPU10 support.

Original patch by Jose and updated by Brian.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-18 09:16:12 -07:00
Brian Paul
1a90e3e1e3 svga: add/use new svga_sampler_format() function
This is important for the case of sampling from a depth texture.  In
that case, we need to sample the texture as if it were a single-channel
color texture.  For other/color formats, we can use the format as-is.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-18 09:15:54 -07:00
Nicolai Hähnle
27ce75ed12 radeon: count cs dwords separately for query begin and end
This will be important for perfcounter queries.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-18 12:27:13 +01:00
Nicolai Hähnle
ffd01b7781 radeon: expose r600_query_hw functions for reuse
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
2015-11-18 12:27:13 +01:00
Nicolai Hähnle
50f0f938e3 radeon: implement r600_query_hw_get_result via function pointers
We will need the clear_result override for the batch query implementation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-18 12:27:13 +01:00
Nicolai Hähnle
c207c55fc0 radeon: split hw query buffer handling from cs emit
The idea here is that driver queries implemented outside of common code
will use the same query buffer handling with different logic for starting
and stopping the corresponding counters.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
2015-11-18 12:27:13 +01:00
Nicolai Hähnle
1d10b3d01e radeon: convert hardware queries to the new style
Move r600_query and r600_query_hw into the header because we will want to
reuse the buffer handling and suspend/resume logic outside of the common
radeon code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
2015-11-18 12:27:12 +01:00
Nicolai Hähnle
019106760d radeon: convert software queries to the new style
Software queries are all queries that do not require suspend/resume
and explicit handling of result buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
2015-11-18 12:27:12 +01:00
Nicolai Hähnle
829a9808a9 radeon: add query handler function pointers
The goal here is to be able to move the implementation details of hardware-
specific queries (in particular, performance counters) out of the common code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
[Fixed a rebase conflict and re-tested before pushing.]
2015-11-18 12:27:12 +01:00
Nicolai Hähnle
50cab4788d radeon: move R600_QUERY_* constants into a new query header file
More query-related structures will have to be moved into their own
header file to support hardware-specific performance counters.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-18 12:27:12 +01:00
Nicolai Hähnle
c56e83e518 radeon: cleanup driver query list
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-18 12:27:12 +01:00
Nicolai Hähnle
e117e74baf radeon: move get_driver_query_info to r600_query.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-18 12:27:11 +01:00
Neil Roberts
5dfb4dbc05 i965: Prevent fast clears for MSRTs on SKL
There are currently a bunch of formats that behave strangely when
sampling the cleared color from the MCS buffer on SKL. They seem to
mostly be formats that don't have an alpha component, although it's
not all of them, and we haven't yet found anything in the specs which
would explain this. For now to be on the safe side this patch just
prevents fast clears for MSRTs on SKL altogether so that when fast
clears are eventually enabled it will only be for single-sampled
surfaces. The assumption is that clears are probably more likely to be
used in single-sampled applications anyway so we can at least get them
working and we can enable MSRTs later once we understand the problem
better.

This patch should have no functional effect other than perhaps
receiving fewer perf_debug messages on SKL+.

v2: Improve the commit message to avoid saying the patch disables fast
    clears because it will be merged before fast clears are enabled
    for any surfaces so it doesn't actually disable anything.
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-18 10:29:07 +01:00
Jason Ekstrand
e9d634f4ad gen7/pipeline: Re-arrange stencil parameters to match gen8 2015-11-17 19:10:31 -08:00
Eric Anholt
dd05ffebfc vc4: Don't bother lowering uniforms when the same value is used twice.
DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get
the absolute value would get lowered.  The lowering doesn't bother to try
to restrict the lifetime of the lowered uniforms, so we'd end up register
allocation failng due to this on 5 of the tests (More tests still fail in
RA, which look like we'll need to reduce lowered uniform lifetimes to
fix).

No changes on shader-db, though fewer extra MOVs are generated on even
glxgears (MOVs pair well enough that it ends up being the same instruction
count).
2015-11-17 17:45:23 -08:00
Eric Anholt
dffe7260cd vc4: Fix uniform reordering to support reading the same uniform twice.
This does actually happen in the wild (particularly fabs of a uniform), so
we'd like to support it.
2015-11-17 17:45:23 -08:00
Eric Anholt
d18d1ba587 vc4: Fix documentation on vc4_qir_lower_uniforms.c. 2015-11-17 17:45:23 -08:00
Eric Anholt
a4bf28178f vc4: Add support for nir_op_uge, using the carry bit on QPU_A_SUB.
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-17 17:45:23 -08:00
Jason Ekstrand
9e39bdabad anv/gen7: Implement CmdPipelineBarrier 2015-11-17 17:09:27 -08:00
Jason Ekstrand
b707e90b6e anv/gen7: Don't use the upper bound on dynamic state base address
It doesn't do much for us and, if we have to resize the dynamic state block
pool for any reason, it becomes out-of-date.
2015-11-17 17:08:44 -08:00
Kenneth Graunke
27b1d34438 i965: Fix PIPE_CONTOL typo.
PIPE_CONTOL!!!
2015-11-17 16:33:48 -08:00
Ben Widawsky
c531d40927 i965: Add assertion for src_stencil payload size
This helps address a coverity warning and prevents future questions about this
code.

Reported-by: Coverity (via Ilia)
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-17 14:47:33 -08:00
Kenneth Graunke
2bec154b47 i965: Implement ARB_pipeline_statistics_query tessellation counters.
We basically just need to uncomment Ben's code.

v2: Fix obvious bugs caught by Ben.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
2015-11-17 14:23:26 -08:00
Timothy Arceri
d4fbf11b58 glsl: rename location layout helper
Change name from validate -> apply to more accurately describe what
the function does.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2015-11-18 07:30:23 +11:00
Timothy Arceri
03bbddd139 glsl: don't validate binding when its not needed
Checking that the flag has been set is all the validation thats
needed here.

Also not calling the binding validation function will make things
much simpler when adding compile time constant support as we
won't need to resolve the binding value.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:30:19 +11:00
Timothy Arceri
4f4ca6b90a glsl: remove temp variable to make code easier to read
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:30:07 +11:00
Timothy Arceri
a01b8c7e77 glsl: cleanup and fix validate matrix function for arrays
Previously if the member was an array of matrices then a
warning message would be incorrectly given.

Also the struct case could never be met so it has been removed.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:58 +11:00
Timothy Arceri
f8b5cc827e glsl: use better location in struct and block error messages
Previously we only gave the location for some members and never
gave the variable location. In those cases we were just giving
the location of the struct/block.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:53 +11:00
Timothy Arceri
c54865db78 glsl: only do type and qualifier validation once per declaration
For struct and block members previously we were doing it for
every variable declaration.

So for example

struct S {
  atomic_uint x, y, z;
};

Would previously generate three error messages when one is sufficient.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:47 +11:00
Timothy Arceri
14d343b024 glsl: rename function that processes struct and iface members
As of the previous commit this function handles only struct/iface
members.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:37 +11:00
Timothy Arceri
8cf795dc7c glsl: move block validation outside function that validates members
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:32 +11:00
Timothy Arceri
649803742d glsl: move ast layout qualifier handling code into its own function
We now also only apply these rules to variables rather than also
trying to apply them to function params.

V2: move code for handling stream layout qualifier

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2015-11-18 07:29:25 +11:00
Jason Ekstrand
f0390bcad6 anv: Add initial Haswell support 2015-11-17 12:14:24 -08:00
Kenneth Graunke
5b596f3878 i965: Add INTEL_DEBUG=shader_time support for tessellation shaders.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-11-17 10:34:04 -08:00
Kenneth Graunke
df87cb837f i965: Add INTEL_DEBUG=tcs,tes and hs,ds flags for tessellation shaders.
Even though both tessellation shader stages must be used together, I
still think it makes sense to add separate debug flags for each stage.
It makes it possible to read the TCS/HS, rule out problems, then read
the TES/DS separately, without sifting through as much printed text.

I decided to add both the GL names (tcs/tes) and hardware names (hs/ds)
so they can be used interchangeably.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-11-17 10:33:54 -08:00
Kenneth Graunke
e9b0fa496c i965: Add more MAX_*_URB_ENTRY_SIZE_BYTES #defines.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
2015-11-17 10:18:08 -08:00
Kenneth Graunke
874a1ed813 i965: Add missing stdio.h include to brw_compiler.h.
This is needed for the FILE * type in brw_print_vue_map().

Apparently, all files that include brw_compiler.h already pick this up
via some include chain, so this isn't actually a build fix.  However,
I have patches which introduce new consumers of brw_compiler.h that
fail to build because of the missing #include.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-17 10:18:08 -08:00
Jason Ekstrand
45320f677b anv: Add macros for doing per-gen compilation 2015-11-17 08:27:51 -08:00
Jason Ekstrand
92d164b1c3 anv/entrypoints: Add dispatch support for haswell 2015-11-17 08:27:51 -08:00