Commit graph

74815 commits

Author SHA1 Message Date
Leo Liu
37aed85969 st/omx/dec/h264: fix corruption when scaling matrix present flag set
The scaling list should be filled out with zig zag scan

v2: integrate zig zag scan for list 4x4 to vl(Christian)
v3: move list determination out from the loop(Ilia)

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 6ad2e55a14)
2016-02-04 10:06:27 +00:00
Leo Liu
3adf111821 vl: add zig zag scan for list 4x4
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
(cherry picked from commit 4f598f2173)
2016-02-04 10:06:27 +00:00
Ilia Mirkin
f5f021ecc5 st/mesa: treat a write as a read for range purposes
We use this logic to detect live ranges and then do plain renaming
across the whole codebase. As such, to prevent WaW hazards, we have to
treat a write as if it were also a read.

For example, the following sequence was observed before this patch:

 13: UIF TEMP[6].xxxx :0
 14:   ADD TEMP[6].x, CONST[6].xxxx, -IN[3].yyyy
 15:   RCP TEMP[7].x, TEMP[3].xxxx
 16:   MUL TEMP[3].x, TEMP[6].xxxx, TEMP[7].xxxx
 17:   ADD TEMP[6].x, CONST[7].xxxx, -IN[3].yyyy
 18:   RCP TEMP[7].x, TEMP[3].xxxx
 19:   MUL TEMP[4].x, TEMP[6].xxxx, TEMP[7].xxxx

While after this patch it becomes:

 13: UIF TEMP[7].xxxx :0
 14:   ADD TEMP[7].x, CONST[6].xxxx, -IN[3].yyyy
 15:   RCP TEMP[8].x, TEMP[3].xxxx
 16:   MUL TEMP[4].x, TEMP[7].xxxx, TEMP[8].xxxx
 17:   ADD TEMP[7].x, CONST[7].xxxx, -IN[3].yyyy
 18:   RCP TEMP[8].x, TEMP[3].xxxx
 19:   MUL TEMP[5].x, TEMP[7].xxxx, TEMP[8].xxxx

Most importantly note that in the first example, the second RCP is done
on the result of the MUL while in the second, the second RCP should have
the same value as the first. Looking at the GLSL source, it is apparent
that both of the RCP's should have had the same source.

Looking at what's going on, the GLSL looks something like

  float tmin_8;
  float tmin_10;
  tmin_10 = tmin_8;
... lots of code ...
  tmin_8 = tmpvar_17;
... more code that never looks at tmin_8 ...

And so we end up with a last_read somewhere at the beginning, and a
first_write somewhere at the bottom. For some reason DCE doesn't remove
it, but even if that were fixed, DCE doesn't handle 100% of cases, esp
including loops.

With the last_read somewhere high up, we overwrite the previously
correct (and large) last_read with a low one, and then proceed to decide
to merge all kinds of junk onto this temp. Even if that weren't the
case, and there were just some writes after the last read, then we might
still overwrite a merged value with one of those.

As a result, we should treat a write as a last_read for the purpose of
determining the live range.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 047b917718)
2016-02-04 10:06:27 +00:00
François Tigeot
3ef2a4bb2e gallium: Add DragonFly support
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit a48afb92ff)
2016-02-04 10:06:27 +00:00
Ilia Mirkin
12888ad942 nv50/ir: fix false global CSE on instructions with multiple defs
If an instruction has multiple defs, we have to do a lot more checks to
make sure that we can move it forward. Among other things, various code
likes to do

    a, b = tex()
    if () c = a
    else c = b

which means that a single phi node will have results pointing at the
same instruction. We obviously can't propagate the tex in this case, but
properly accounting for this situation is tricky. Just don't try for
instructions with multiple defs.

This fixes about 20 shaders in shader-db, including the dolphin efb2ram
shader.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3ca941d60e)
2016-02-04 10:06:27 +00:00
Ilia Mirkin
0f7d3d661d nv50,nvc0: fix buffer clearing to respect engine alignment requirements
It appears that the nvidia render engine is quite picky when it comes to
linear surfaces. It doesn't like non-256-byte aligned offsets, and
apparently doesn't even do non-256-byte strides.

This makes arb_clear_buffer_object-unaligned pass on both nv50 and nvc0.

As a side-effect this also allows RGB32 clears to work via GPU data
upload instead of synchronizing the buffer to the CPU (nvc0 only).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> # tested on GF108, GT215
Tested-by: Nick Sarnie <commendsarnex@gmail.com> # GK208
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 3ca2001b53)
2016-02-04 10:06:27 +00:00
Ilia Mirkin
7cb93f77f2 nvc0: avoid crashing when there are holes in vertex array bindings
When using the "shared" vertex array configuration strategy, we bind
each of the buffers as a separate array. However there can be holes in
such vertex buffer lists, so just emit a disable for those.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 438d421f8b)
2016-02-04 10:06:27 +00:00
Eric Anholt
03eb1716dd vc4: Throttle outstanding rendering after submission.
Just make sure that after we've submitted, we get to at least 5
(global) submits ago before we go on to do more.  Prevents up to
seconds of lag with window movement in X with xcompmgr -c.  There may
be useful tuning to do in the future, but for now this gets us
usability.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3fba517bdd)
2016-02-04 10:06:27 +00:00
Eric Anholt
4865ab2b43 vc4: Don't record the seqno of a failed job submit.
On an error return, the returned seqno will probably be unset, so we'd
lose track of what we've submitted so far for waiting on in the
future.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 2a449ce7c9)
2016-02-04 10:06:27 +00:00
Karol Herbst
045e269a92 nv50/ir: fix memory corruption when spilling and redoing RA
When RA fails, and we spill, we have to clean everything up before doing
RA again. We were forgetting to reset the hi/lo linked lists - at
least the hi list is guaranteed to still have pointers to now-deleted
RIG nodes.

Signed-off-by: Karol Herbst <nouveau@karolherbst.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 19ae5de981)
2016-02-04 10:06:26 +00:00
Ben Widawsky
761e3fda6e i965/bxt: Fix conservative wm thread counts.
When setting the conservative thread counts, I halved everything. That isn't
correct for the wm, which has nothing to do with actual thread counts. I suck.

BXT only has 1 slice, and there is some ambiguity about subslices, so just
reserve the max possible for now. It looks like this might fix:
piglit.spec.glsl-1_50.execution.variable-indexing.gs-output-array-vec4-index-wr.bxtm64.
I kind of question why that is, but it is what Jenkins says.

Mark is current running some of the other blacklisted tests on this patch. (it
effects anything requiring scratch space).

Cc: mesa-stable <mesa-stable@lists.freedesktop.org>
Cc: Neil Roberts <neil@linux.intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
(cherry picked from commit a443b5b732)
2016-02-04 10:06:26 +00:00
Ian Romanick
7e1bd26d54 meta: Use internal functions to set texture parameters
_mesa_texture_parameteriv is used because (the more obvious)
_mesa_texture_parameteri just stuffs the parameter in an array and calls
_mesa_texture_parameteriv.  This just cuts out the middleman.

As a side bonus we no longer need check that ARB_stencil_texturing is
supported.  The test doesn't allow non-supporting implementations to
avoid any work, and it's redundant with the value-changed test.

Fix bug #93717 because the state restore commands at the bottom of
_mesa_meta_GenerateMipmap no longer depend on the bound state.

Fixes  piglit   arb_direct_state_access-generatetexturemipmap  with  the
changes  recently sent  to the  piglit mailing  list.  See  the bugzilla
entry for more info.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93717
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 2542871387)
2016-02-04 10:06:26 +00:00
Ian Romanick
e5d37f16e0 meta/blit: Restore GL_DEPTH_STENCIL_TEXTURE_MODE state for GL_TEXTURE_RECTANGLE
Commit c246828c added the code to save and restore the stencil
texturing mode.  The restore, however, was erroneously inside the
'target != GL_TEXTURE_RECTANGLE' block.

Fixes piglit test 'arb_stencil_texturing-blit_corrupts_state
GL_TEXTURE_RECTANGLE'.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(cherry picked from commit 18b0ba340b)
2016-02-04 10:06:26 +00:00
Nicolai Hähnle
e22133f1bf radeonsi: add DCC buffer for sampler views on new CS
This fixes a VM fault and possible lockup in high memory pressure situations.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
(cherry picked from commit 1067e6eb55)
2016-02-04 10:06:26 +00:00
Nicolai Hähnle
6dbb144b3d radeonsi: ensure that VGT_GS_MODE is sent when necessary
Specifically, when the API switches from using a GS to not using a GS and then
back to using the same GS again, we do not have to re-send all the GS state,
but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part
of the VS state.

This fixes a rendering bug in Dolphin, but surely other applications are
affected as well.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 004fcd4230)
2016-02-04 10:06:26 +00:00
Nicolai Hähnle
4ba352dc80 radeonsi: extract the VGT_GS_MODE calculation into its own function
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 9f89bd69df)
2016-02-04 10:06:26 +00:00
Ilia Mirkin
e101a005b1 glsl: always compute proper varying type, irrespective of varying packing
Normally there's a producer and consumer, and the producer var gets
picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed
version.

In the SSO case, however, there is no producer. So we picked the arrayed
GS variable, and as a result, used more slots than we should. More
critically, these slots would also no longer line up with the producer's
calculation. To fix this, we need to fix up the type of the variable
based on stage no matter what.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit dac2964f3e)
2016-02-04 10:06:26 +00:00
Timothy Arceri
65dfe8ba5f glsl: create helper to remove outer vertex index array used by some stages
This will be used in the following patch for calculating array sizes correctly
when reserving explicit varying locations.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
(cherry picked from commit 5907a02ab6)
2016-02-04 10:06:26 +00:00
Emil Velikov
ae0733e2a1 egl/dri2: expose srgb configs when KHR_gl_colorspace is available
Otherwise the user has no way of using it, and we'll try to access the
linear one.

v2:
 - Bail out when KHR_gl_colorspace is missing and srgb is set (Marek)

Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Fixes: c2c2e9ab604(egl: implement EGL_KHR_gl_colorspace (v2))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91596
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
(cherry picked from commit 54702c2fa1)
2016-02-03 14:51:22 +00:00
Emil Velikov
e8ae0e75fe targets/dri: android: use WHOLE static libraries
By using whole static libraries the android buildsystem provides
whole-archive (alike) solution. This means that we don't need to worry
about the order of the static libraries and any reverse, recursive or
circular dependencies that they have between one another.

Without this the linker will discard any unused hunks of one library
and we'll end up with unresolved symbols as those are required by
another static library. This issue has become more prominent with the
introduction of pipe-loader.

Whole static libraries has been used in i915/i965 for a very long
time, so we might do the same.

v2:
 - Better commit message (Ilia)
 - Keep external dependencies as [normal] static libs (Mauro)

Cc: mesa-stable@lists.freedesktop.org
Cc: Mauro Rossi <issor.oruam@gmail.com>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit f29a772a7e)
2016-02-03 14:50:38 +00:00
Emil Velikov
6e86513db9 i915: correctly parse/set the context flags
With an earlier commit we've spit the flags parsing to a separate
function, but forgot to update all the dri modules to use it.

Noticed when we've enabled KHR_debug for every dri module - fdo#93048

Fixes: 38366c0c6e "dri_util: Don't assume __DRIcontext->driverPrivate
is a gl_context"
Cc: Mark Janes <mark.a.janes@intel.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Mark Janes <mark.a.janes@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>

(cherry picked from commit 72fda2b710)
2016-02-03 14:34:26 +00:00
Jose Fonseca
ee5d530d33 pipe-loader: Fix PATH_MAX define on MSVC.
(cherry picked from commit 4befd82a64)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93628
Nominted-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-22 16:02:18 +00:00
Jose Fonseca
27d2dbd447 scons: Conditionally use DRM module on pipe-loader.
Fixes non Linux builds.

Trivial.

(cherry picked from commit 02afbd2476)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93628
Nominted-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-22 15:59:22 +00:00
Grazvydas Ignotas
4b88e1f758 r600g: don't leak driver const buffers
The buffers are referenced from r600_update_driver_const_buffers()
 -> r600_set_constant_buffer() -> u_upload_data(), but nothing
ever releases the reference. Similar case with driver_consts.
Found using valgrind.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
[Emil Velikov: resolve trivial conflicts]
(cherry picked from commit 0153ff8379)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-22 11:57:47 +00:00
Nicolai Hähnle
ebd01a0910 util/u_pstipple.c: copy immediates during transformation
Apparently, nobody has combined stippling with a fragment shader
containing immediates in almost five years...

Fixes a bug in Kodi with radeonsi reported by Christian König.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Tested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit e6281a2850)
2016-01-22 11:44:50 +00:00
Timothy Arceri
55c88cf8d1 glsl: fix interface block error message
Print the stream value not the pointer to the expression,
also use the unsigned format specifier.

Cc: 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
(cherry picked from commit d018619d7f)
2016-01-22 11:44:50 +00:00
Dave Airlie
c251d62ccb glsl: fix subroutine lowering reusing actual parmaters
One of the oglconform tests was crashing here, and it was
due to not cloning the actual parameters before creating the
new call. This makes a call clone function that does the right
things to make sure we clone all the needed info, and points
the callee at it. (It differs from ->clone due to this).

this may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722, I had this
patch in my cts fixes tree, but hadn't had time to make sure I liked it.

Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
(cherry picked from commit 119bef9543)
2016-01-22 11:44:50 +00:00
Timothy Arceri
c4746503ca mesa: fix segfault in glUniformSubroutinesuiv()
From Section 7.9 (SUBROUTINE UNIFORM VARIABLES) of the OpenGL
4.5 Core spec:

   "The command

       void UniformSubroutinesuiv(enum shadertype, sizei count,
                                  const uint *indices);

   will load all active subroutine uniforms for shader stage
   shadertype with subroutine indices from indices, storing
   indices[i] into the uniform at location i. The indices for
   any locations between zero and the value of
   ACTIVE_SUBROUTINE_UNIFORM_LOCATIONS minus one which are not
   used will be ignored."

V2: simplify NULL check suggested by Jason.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org
https://bugs.freedesktop.org/show_bug.cgi?id=93731
(cherry picked from commit 86677f1016)
2016-01-22 11:44:50 +00:00
Timothy Arceri
a15cf952dc glsl: fix segfault linking subroutine uniform with explicit location
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: "11.0 11.1" mesa-stable@lists.freedesktop.org
(cherry picked from commit 50376e0c0e)
2016-01-22 11:44:49 +00:00
Kenneth Graunke
f9763f24ad glsl: Allow implicit int -> uint conversions for bitwise operators (&, ^, |).
The ARB has decided that implicit conversions should be performed for
bitwise operators in future language revisions.  Implementations of
current language revisions may or may not perform them.

This patch makes Mesa apply implicti conversions even on current
language versions.  Applications appear to expect this behavior,
and there's really no downside to doing so.

Fixes shader compilation in Shadow of Mordor.

Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit d54a70aa18)
2016-01-22 11:44:49 +00:00
Jason Ekstrand
288e760597 i965/fs: Always set channel 2 of texture headers in some stages
In the vertex and fragment stages, the hardware is nice to us and leaves
g0.2 zerod out for us so we can use it for headers.  However, in compute,
geometry, and tessellation stages, the hardware is not so nice.  In
particular, for compute shaders on BDW, the hardware places some debug bits
in 23:15.  As it happens, bit 15 is interpreted by the sampler as the alpha
channel mask.  This means that if you use a texturing instruction with a
header in a compute shader, you may randomly get the alpha channel
disabled.  Since channel masks affect the return length of the sampler
message, this can lead the GPU to expect a different mlen to the one you
specified in the shader and this, in turn, hangs your GPU.

Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit 61b0cfd84e)
2016-01-22 11:42:45 +00:00
Jason Ekstrand
4f760d768d i965/fs/generator: Take an actual shader stage rather than a string
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
[Emil Velikov: drop not applicable TES changes]
(cherry picked from commit 9870f798be)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/drivers/dri/i965/brw_fs_generator.cpp
	src/mesa/drivers/dri/i965/brw_shader.cpp
2016-01-22 11:37:43 +00:00
Jason Ekstrand
615c5a7fc8 i965/vec4: Use UW type for multiply into accumulator on GEN8+
BDW adds the following restriction: "When multiplying DW x DW, the dst
cannot be accumulator."

Cc: "11.1,11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 0a6811207f)
2016-01-22 11:36:43 +00:00
Ilia Mirkin
681415e7f0 st/mesa: use surface format to generate mipmaps when available
This fixes the recently posted mipmap + texture views piglit test.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
[Emil Velikov: resolve trivial conflicts]
(cherry picked from commit e94ef885bb)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/mesa/state_tracker/st_gen_mipmap.c
2016-01-22 11:36:43 +00:00
Marek Olšák
ce09b50e7a radeonsi: don't miss changes to SPI_TMPRING_SIZE
I'm not sure about the consequences of this bug, but it's definitely
dangerous.

This applies to SI, CIK, VI.

Cc: 11.0 11.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
(cherry picked from commit dc96a18d24)
2016-01-22 11:36:43 +00:00
Emil Velikov
ca6440ac33 cherry-ignore: drop the i965/kbl .num_slices patch
The variable was introduced after the branch point.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-22 11:35:10 +00:00
Kenneth Graunke
dac0229791 glsl: Make bitfield_insert/extract and bfi/bfm non-vectorizable.
Currently, opt_vectorize() tries to combine:

    result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
    result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
    result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
    result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);

into a single ir_quadop_bitfield_insert opcode, which operates on
ivec4s.  However, GLSL IR's opcodes currently require the bits and
offset parameters to be scalar integers.  So, this breaks.

We want to be able to vectorize this eventually, but for now, just
chicken out and make opt_vectorize() bail by marking all the bitfield
insert/extract related opcodes as horizontal.  This is a relatively
uncommon case today, so we'll do the simple fix for stable branches,
and fix it properly on master.

Fixes assertion failures when compiling Shadow of Mordor vertex shaders
on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
[Emil Velikov: resolve trivial conflicts]
(cherry picked from commit 5e3edd4b28)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>

Conflicts:
	src/glsl/ir.h
2016-01-22 11:34:10 +00:00
Nicolai Hähnle
c7beaa4b4e i965: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 051603efd5)
2016-01-21 21:37:54 +02:00
Nicolai Hähnle
cd33a4981d i915: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 1b74c02e83)
2016-01-21 21:37:20 +02:00
Nicolai Hähnle
96736d3345 radeon: use _mesa_delete_buffer_object
This is more future-proof, plugs the memory leak of Label and properly
destroys the buffer mutex.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 8882b46226)
2016-01-21 21:36:44 +02:00
Nicolai Hähnle
19c8a5bf7e st/mesa: use _mesa_delete_buffer_object
This is more future-proof than the current code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1c2187b1c2)
2016-01-21 21:36:10 +02:00
Nicolai Hähnle
feea2407e4 mesa/bufferobj: make _mesa_delete_buffer_object externally accessible
gl_buffer_object has grown more complicated and requires cleanup. Using this
function from drivers will be more future-proof.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 6aed083b93)
2016-01-21 21:22:50 +02:00
Emil Velikov
4b2d9f29e9 docs: add sha256 checksums for 11.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-13 15:23:53 +02:00
Emil Velikov
330aa44a0d docs: add release notes for 11.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-13 12:21:22 +02:00
Emil Velikov
e429500dd1 Update version to 11.1.1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2016-01-13 12:21:21 +02:00
Sarah Sharp
bf00b36c9d mesa: Add KBL PCI IDs and platform information.
Add PCI IDs for the Intel Kabylake platforms.  The IDs are taken
directly from the Linux kernel patches, which are under review:

http://lists.freedesktop.org/archives/intel-gfx/2015-October/078967.html
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=kbl-upstream-v2

The Kabylake PCI IDs taken from the kernel are rearranged to be in order
of GT type, then PCI ID.

Please note that if this patch is backported, the following fixes will
need to be added before this patch:

commit 28ed1e08e8 "i965/skl: Remove early platform support"
commit c1e38ad370 "i965/skl: Use larger URB size where available."

Thanks to Ben for fixing a bug around setting urb.size, and being
patient with my questions about what the various fields mean.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Suggested-by: Ben Widawsky <benjamin.widawsky@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (KBL-GT2)
Cc: "11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 39c41be50d)
2016-01-08 12:05:27 +02:00
Brian Paul
fab2039588 st/mesa: check state->mesa in early return check in st_validate_state()
We were checking the dirty->st flags but not the dirty->mesa flags.
When we took the early return, we didn't clear the dirty->mesa flags
so the next time we called st_validate_state() we'd often flush the
glBitmap cache.  And since st_validate_state() is called from
st_Bitmap(), it meant we flushed the bitmap cache for every glBitmap()
call.

This change seems to recover most of the performance loss observed
with the ipers demo on llvmpipe since commit commit 36c93a6fae.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: José Fonseca <jfonseca@vmware.com>
(cherry picked from commit c28d72a347)
2016-01-08 12:05:27 +02:00
Kenneth Graunke
536c8cbcd3 nir: Add a lower_fdiv option, turn fdiv into fmul/frcp.
The nir_opt_algebraic rule

(('fadd', ('flog2', a), ('fneg', ('flog2', b))), ('flog2', ('fdiv', a, b))),

can produce new fdiv operations, which need to be lowered on i965,
as we don't actually implement fdiv.  (Normally, we handle this in
GLSL IR's lower_instructions pass, but in the above case we introduce
an fdiv after that point.  So, make NIR do it for us.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 7295f4fcc2)
2016-01-08 12:05:27 +02:00
Ilia Mirkin
978480d69f nvc0: scale up inter_bo size so that it's 16M for a 4K video
Experimentally, 4M causes corruption and slowness, try to ramp it up
with size instead.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b16c9be4a5)
2016-01-08 12:05:27 +02:00
Ilia Mirkin
c2be35d309 nv50,nvc0: fix crash when increasing bsp bo size for h264
H264 doesn't have a bitplane bo. We just need a device reference, so use
the one from the client.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b5f2f7073f)
2016-01-08 12:05:27 +02:00