Commit graph

79944 commits

Author SHA1 Message Date
Roland Scheidegger
df33f11b39 glsl: fix ldexp lowering if bitfield insert lowering is also requested
Trivial, this just resurrects the code which was there once upon a time
(the code can't lower instructions generated in the lowering pass there,
and even if it could it would probably be suboptimal).
This fixes piglit mesa_shader_integer_functions fs-ldexp.shader_test and
vs-ldexp.shader_test with llvmpipe.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-12-06 04:10:43 +01:00
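
For readers unfamiliar with why ldexp lowering and bitfield-insert lowering interact, the following is a rough sketch of the general technique only: ldexp(x, exp) can be computed by rewriting the exponent field of the float's bit pattern. It is not Mesa's lowering code; the function name is hypothetical, and the sketch assumes float32 and rejects zero, denormal, infinite, and NaN inputs as well as results that leave the normal range.

```python
import math
import struct


def ldexp_via_exponent_bits(x: float, exp: int) -> float:
    """Scale a normal float32 by 2**exp by rewriting its exponent field.

    Hypothetical illustration; real lowering passes must also handle
    zero, denormals, overflow, and underflow.
    """
    # Reinterpret the float32 value as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Extract the 8-bit biased exponent stored in bits 23..30.
    biased = (bits >> 23) & 0xFF
    new_biased = biased + exp
    if biased in (0, 0xFF) or not 1 <= new_biased <= 254:
        raise ValueError("sketch only handles normal inputs and results")
    # "Bitfield insert": clear the old exponent field, write the new one.
    cleared = bits & ~(0xFF << 23) & 0xFFFFFFFF
    new_bits = cleared | (new_biased << 23)
    return struct.unpack("<f", struct.pack("<I", new_bits))[0]


if __name__ == "__main__":
    # Compare against the standard library for a value exact in float32.
    print(ldexp_via_exponent_bits(1.5, 4), math.ldexp(1.5, 4))
```

The final step is the bitfield insert. If a driver also asks for bitfield insert itself to be lowered, the instruction produced here needs lowering in turn, which is the situation the commit message describes.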
Nayan Deshmukh
3015a23fe0 radv: fix resource leak in radv_amdgpu_ctx_create
CovID: 1396387

V2. Fixup bad whitespace.

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folkore1984.net>
2016-12-06 11:49:01 +11:00
Andy Furniss
5338fb34d6 st/omx/enc Raise default encode level
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281

Signed-off-by: Andy Furniss <adf.lists@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-12-05 19:39:47 -05:00
Andy Furniss
2a38a5b2b2 radeon/vce Handle H.264 level 5.2
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=91281
v2: explicitly add case 52

Signed-off-by: Andy Furniss <adf.lists@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
2016-12-05 19:39:47 -05:00
Jason Ekstrand
7db009b59e nir: Remove some unused fields from nir_variable
All of these are happily set from glsl_to_nir or spirv_to_nir but their
values are never used for anything.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:10 -08:00
Jason Ekstrand
50e0b0bee3 nir: Delete most of the constant_initializer support
Constant initializers have been a constant (ha!) pain for quite some time.
While they're useful from a language perspective, people writing passes or
backends really don't want to deal with them most of the time.  This commit
removes most of the constant initializer support from NIR.  It is expected
that you call nir_lower_constant_initializers VERY EARLY to ensure that
they're gone before you do anything interesting.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
2f19c19b5d nir: Simplify nir_lower_gs_intrinsics
It's only ever called on single-function shaders.  At this point, there are
a lot of helpers that can make it all much simpler.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
257aa5a1c4 nir/lower_returns: Stop using constant initializers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
507626304c glsl/nir: Call nir_lower_constant_initializers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
c5d664f9dc anv/pipeline: Call nir_lower_constant_initializers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
f5232db9e5 nir: Add a pass for lowering away constant initializers
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-05 15:40:09 -08:00
Jason Ekstrand
0291bf4db2 Revert "i965: use nir_lower_indirect_derefs() for GLSL"
This reverts commit 9404439a75.  I didn't
intend to push it and it breaks clip and cull distance.
2016-12-05 15:21:20 -08:00
Jason Ekstrand
5f0e4c7c79 i965: Delete the meta-base CopyImageSubData implementation
When I originally implemented the ARB_copy_image extension, the fast-path
was written in meta using texture views.  This path only worked if both
images were uncompressed color images.  All of the other cases fell back to
the blitter or, in the worst case, mapping and memcpy on the CPU.  Now that
we have the blorp path, it handles all copies ever and the old meta,
blitter, and CPU paths are only used on gen5 and below.  The primary reason
why we needed the meta path (apart from having a slow blitter on later
hardware) was to handle multisampling which gen5 and earlier don't support
anyway.  Since the blitter is reasonably fast on gen5, we can just delete
the meta path and get rid of all that terrible code.

If we decide that we're ok with just disabling ARB_copy_image on gen5 and
earlier (I personally am), then we could get rid of another 300 lines or so
of semi-hairy code.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2016-12-05 14:00:35 -08:00
Jason Ekstrand
06d864921e i965/copy_image: Re-implement the blitter path with emit_miptree_blit
By using emit_miptree_blit which does chunking, this fixes the blitter path
for the case where the image is too tall to blit normally.  We also pull it
into intel_blit as intel_miptree_copy.  This matches the naming of the
blorp blit and copy functions brw_blorp_blit and brw_blorp_copy.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
2016-12-05 14:00:35 -08:00
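
The chunking idea in the commit above can be sketched as follows, as an assumption-laden illustration rather than the actual emit_miptree_blit logic: a copy taller than the hardware allows is split into strips that each fit. The function name and the row limit used here are hypothetical, and only the vertical direction is shown.

```python
from typing import Iterator, Tuple


def split_rows(height: int, max_rows: int) -> Iterator[Tuple[int, int]]:
    """Yield (y_offset, rows) strips that together cover `height` rows.

    `max_rows` stands in for a hypothetical per-blit height limit.
    """
    if height < 0 or max_rows <= 0:
        raise ValueError("height must be >= 0 and max_rows must be > 0")
    y = 0
    while y < height:
        # The last strip may be shorter than the limit.
        rows = min(max_rows, height - y)
        yield y, rows
        y += rows


if __name__ == "__main__":
    # Each strip would be issued as its own blit at the given row offset.
    for y_offset, rows in split_rows(10, 4):
        print(y_offset, rows)
```

Each yielded strip stays within the limit, so a copy that is too tall for a single blit becomes a sequence of smaller ones.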
Jason Ekstrand
6c74e7f492 i965/blit: Break the guts of intel_miptree_blit into a helper
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0" <mesa-dev@lists.freedesktop.org>
2016-12-05 14:00:35 -08:00
Timothy Arceri
9404439a75 i965: use nir_lower_indirect_derefs() for GLSL
This moves the nir_lower_indirect_derefs() call into
brw_preprocess_nir() so that it is called by both OpenGL and Vulkan,
and removes the call to the old GLSL IR pass
lower_variable_index_to_cond_assign().

We want to do this pass in nir to be able to move loop unrolling
to nir.

There is an increase of 1-3 instructions in a small number of shaders,
and 2 Kerbal Space Program shaders increase by 32 instructions.

Shader-db results BDW:

total instructions in shared programs: 8705873 -> 8706194 (0.00%)
instructions in affected programs: 32515 -> 32836 (0.99%)
helped: 3
HURT: 79

total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
cycles in affected programs: 528104 -> 493460 (-6.56%)
helped: 47
HURT: 37

LOST:   2
GAINED: 0
2016-12-05 14:00:35 -08:00
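
The percentages in the shader-db results above are plain relative changes, (after - before) / before, rounded to two decimals. The following minimal sketch recomputes them from the counts quoted in the commit message. The helper name is hypothetical and this is not shader-db's own reporting code.

```python
def percent_change(before: int, after: int) -> str:
    """Format the relative change from `before` to `after` as a percentage."""
    change = (after - before) / before * 100.0
    return f"{change:.2f}%"


# Counts copied from the BDW results in the commit message above.
RESULTS = {
    "total instructions": (8705873, 8706194),
    "affected instructions": (32515, 32836),
    "total cycles": (74618120, 74583476),
    "affected cycles": (528104, 493460),
}

if __name__ == "__main__":
    for label, (before, after) in RESULTS.items():
        print(f"{label}: {before} -> {after} ({percent_change(before, after)})")
```

The same absolute difference (321 instructions, 34644 cycles) appears in both the "total" and "affected" lines. Only the denominator changes, which is why the total figures round to nearly zero while the affected figures do not.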
Tim Rowley
0c70b26a2d swr: mark PIPE_CAP_NATIVE_FENCE_FD unsupported
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-12-05 13:42:39 -06:00
Tim Rowley
efc3ca64ba swr: include llvm version and vector width in renderer string
Uses llvmpipe's string formatting.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-12-05 13:42:39 -06:00
Tim Rowley
b035d9cab5 gallivm: use getHostCPUFeatures on x86/llvm-4.0+.
Use the LLVM-provided API based on cpuid rather than our own
manually maintained list of mattr enabling/disabling.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-12-05 13:42:39 -06:00
Juan A. Suarez Romero
48416b6f4d st/va: declare vlVaBuffer before vlVaContext
And declare coded_buf in vlVaContext as "vlVaBuffer *" instead of
"struct vlVaBuffer *".

This fixes several warnings later about assignment from incompatible
pointer type.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 17:03:57 +00:00
Juan A. Suarez Romero
5a585d019e st/va: remove unused variable pbuff
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
2016-12-05 17:03:56 +00:00
Emil Velikov
510722d146 st/va: automake: cleanup C{PP,}FLAGS
Remove some transitional left overs from the gallium pipe-loader rework
and kill off unneeded AM_CPPFLAGS.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-12-05 17:03:56 +00:00
Tobias Droste
9d14a25bee configure.ac: Move llvm_set_environment_variables higher.
This moves the function to get the LLVM environment variables higher
in the file. It still needs to be below the "--enable-opencl" because
it uses $enable_opencl.
It can be called without condition now as it only throws errors if
OpenCL is enabled.

v5:
HAVE_MESA_LLVM is only used for gallium. Rename it to HAVE_GALLIUM_LLVM.
In order to only link LLVM when it is needed, HAVE_GALLIUM_LLVM is only
set if "$enable-gallium-llvm" is yes.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2016-12-05 14:43:46 +00:00
Boyuan Zhang
3949d7c6ea st/va: fix gop size for rate control
The gop_size in rate control is the budget window for internal rate
control calculation, and shouldn't always be equal to the idr period. Define
a coefficient to let the budget window contain a number of idr periods for
proper rate control calculation. Adjust the number of i/p frames remaining
accordingly.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-12-05 09:23:38 -05:00
Boyuan Zhang
8206882392 st/va: force to submit two consecutive single jobs
The gop_size in rate control is the budget window for internal rate
control calculation, and shouldn't always be equal to the idr period. Define
a coefficient to let the budget window contain a number of idr periods for
proper rate control calculation. Adjust the number of i/p frames remaining
accordingly.

v2: fixed regression issues introduced by previous version

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2016-12-05 09:23:38 -05:00
Nayan Deshmukh
7b811c362a st/vdpau: fix compiler warning in vlVdpVideoMixerRender
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-12-05 11:20:55 +01:00
Topi Pohjolainen
5b27405eff i965: Release aux buffer when disabling ccs
Otherwise subsequent render cycles keep on using compression
and/or fast clear.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-12-05 09:20:05 +02:00
Bas Nieuwenhuizen
92d7563fba ac/nir: Only use the first component for SSBO atomics.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-12-05 01:40:54 +01:00
Dave Airlie
8033f78f94 radv: fix another regression since shadow fixes.
This fixes:
dEQP-VK.glsl.texture_gather.basic.2d.depth32f.*

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-05 10:14:37 +10:00
Iago Toral Quiroga
66e7effc85 spirv: Builtin Layer is an input for fragment shaders
This change makes it so we emit a load_input intrinsic when Layer
is read in a fragment shader.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-12-03 20:50:57 +01:00
Bruce Cherniak
a7b510f656 swr: Fix active_queries count
The active_query count was incorrect for query types that don't require
a begin_query.  Removed the unnecessary assert.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-12-02 14:36:28 -06:00
George Kyriazis
2085088033 swr: Fix type to match parameters of std::max()
Include propagation of comparisons further down.

Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
2016-12-02 14:36:28 -06:00
Tim Rowley
f1ca377ab1 swr: [rasterizer jitter] include cstdarg in builder_misc.cpp
Fixes build problem with llvm-svn.

v2: use cstdarg instead of stdarg.h

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2016-12-02 14:36:28 -06:00
Jason Ekstrand
19a541f496 nir: Get rid of nir_constant_data
This has bothered me for about as long as NIR has been around.  Why do we
have two different unions for constants?  No good reason other than one of
them is a direct port from GLSL IR.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-12-02 10:53:32 -08:00
Timothy Arceri
c45d84ad83 Revert "st/mesa: get Version from gl_program rather than gl_shader_program"
This reverts commit 6bf63b0119.

A patch that adds a reference to gl_shader_program_data to gl_program
needs to land before this one.
2016-12-02 16:44:44 +11:00
Timothy Arceri
6bf63b0119 st/mesa: get Version from gl_program rather than gl_shader_program
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-02 13:54:54 +11:00
Timothy Arceri
ab8c01386a st/mesa/glsl: move Version to gl_shader_program_data
This is mostly just used during linking however the st uses it
when updating textures.

In order to store gl_program in the CurrentProgram array
rather than gl_shader_program we need to move this field to
the shared gl_shader_program_data struct.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-12-02 13:54:47 +11:00
Rob Clark
534917495d freedreno: no-op render when we need a fence
If app tries to create a fence but there is no rendering to submit, we
need a dummy/no-op submit.  Use a string-marker for the purpose.. mostly
since it avoids needing to realize that the packet format changes in
later gen's (so one less place to fixup for a5xx).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:24:59 -05:00
Rob Clark
0b98e84e9b freedreno: native fence fd support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:24:46 -05:00
Rob Clark
16f6ceaca9 freedreno: some fence cleanup
Prep-work for next patch, mostly move to tracking last_fence as a
pipe_fence_handle (created now only in fd_gmem_render_tiles()), and a
bit of superficial renaming.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-01 20:16:31 -05:00
Rob Clark
026a7223a6 gallium: support for native fence fd's
This enables gallium support for EGL_ANDROID_native_fence_sync, for
drivers which support PIPE_CAP_NATIVE_FENCE_FD.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-12-01 20:16:31 -05:00
Rob Clark
72cc1ca58d gallium: wire up server_wait_sync
This will be needed for explicit synchronization with devices outside
the gpu, ie. EGL_ANDROID_native_fence_sync.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-01 20:16:31 -05:00
Rob Clark
0201f01dc4 egl: add EGL_ANDROID_native_fence_sync
With fixes from Chad squashed in, plus fixes for issues that Rafael
found while writing piglit tests.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:57:35 -08:00
Rob Clark
2ba4c7e154 egl: un-fallthrough sync attr parsing
Doesn't work so well when you start having more than one possible
attrib.  Prep-work for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:57:24 -08:00
Rob Clark
cce04a4630 egl: initialize SyncCondition after attr parsing
Reduce the noise in the next patch.  For EGL_SYNC_NATIVE_FENCE_ANDROID
the sync condition is conditional on EGL_SYNC_NATIVE_FENCE_FD_ANDROID
attribute.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Tested-by: Chad Versace <chadversary@chromium.org>
2016-12-01 10:52:55 -08:00
Tim Rowley
05f35a868c tgsi: store writes_primid when scanning tgsi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-12-01 11:33:01 -06:00
Ilia Mirkin
7c16552f8d mesa: only verify that enabled arrays have backing buffers
We were previously also verifying that no backing buffers were available
when an array wasn't enabled. This has no basis in the spec, and it
causes GLupeN64 to fail as a result.

Fixes: c2e146f487 ("mesa: error out in indirect draw when vertex bindings mismatch")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2016-12-01 06:35:13 -05:00
Eric Anholt
51244859e3 vc4: Avoid false scheduling dependencies for LOAD_IMMs.
Noticed in shaders with branching, where we ended up scheduling delay
slots near the start of a block for the uniforms reset setup.

total instructions in shared programs: 93970 -> 93951 (-0.02%)
instructions in affected programs:     3117 -> 3098 (-0.61%)

3DMMES performance +0.423087% +/- 0.133521% (n=9,10)
2016-11-30 19:58:09 -08:00
Eric Anholt
6c34084d8e vc4: Try to schedule QIR instructions between writing to and reading math.
This helps us get the delay slots between SFU writes and reads filled.

total instructions in shared programs: 94494 -> 93970 (-0.55%)
instructions in affected programs:     59206 -> 58682 (-0.89%)

3DMMES performance +1.89967% +/- 0.157611% (n=10,9)
2016-11-30 19:58:09 -08:00
Eric Anholt
d182740ac8 vc4: Improve interleaving of texture coordinates vs results.
The latency_between was trying to handle the delay between the coordinate
write ("before") and the corresponding sample read ("after"), but we were
handing in the two instructions swapped.

This meant that we tried to fit things between a tex_s and its *preceding*
tex_result.  This made us only interleave normal texture coordinates by
accident, and pessimized UBO reads by pushing the tex_result collection
earlier until there was nothing but it (and then its preceding coordinate
setup) left.

In addition to latency reduction, things end up packing better (probably
due to reduced live ranges of the texture results):

total instructions in shared programs: 98121 -> 94775 (-3.41%)
instructions in affected programs:     91196 -> 87850 (-3.67%)

3DMMES performance +1.15569% +/- 0.124714% (n=8,10)
2016-11-30 19:58:09 -08:00