Commit graph

82318 commits

Author SHA1 Message Date
Emil Velikov
5e3e292502 mesa: automake: add directory prefix for git_sha1.h
Otherwise the build will assume that we've talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 359d9dfec3)
2016-06-07 12:35:46 +01:00
Emil Velikov
3be5c6a9ec egl: android: don't add the image loader extension for !render_node
With earlier commit we introduced support for render_node devices, which
was couples with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")

Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1816c837c1)
2016-06-07 12:35:40 +01:00
Emil Velikov
a26ca04fe3 anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards
The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.

v2: Reword commit message.

Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
(cherry picked from commit b8e1f59d62)
2016-06-03 01:44:56 +01:00
Mauro Rossi
1a5d6a232f isl: add support for Android libmesa_isl static library
isl library is needed to build i965, libmesa_isl static library is added
to fix related Android building errors.

Any attempt to build libmesa_genxml as phony package module failed to deliver
gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9}

Due to constraints in Android Build System, libmesa_genxml is built as static,
at least one source is needed, so dummy.c is autogenerated for this scope,
libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES,
to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 278c2212ac)
2016-06-02 22:35:29 +01:00
Mauro Rossi
702a1121c9 android: libmesa_glsl: add a dependency on libmesa_nir static
Fixes the following building error:

target  C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp
In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0,
                 from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
compilation terminated.
build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4143245c23)
2016-06-02 22:35:29 +01:00
Emil Velikov
9a21315ea9 isl: automake: don't include isl_format_layout.c in two lists.
Including the file in both ISL_FILES and ISL_GENERATED_FILES makes
the actual dependency list less obvious.

v2: Drop unrelated vulkan hunk (Jason).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit af1a0ae8ce)
2016-06-02 22:35:29 +01:00
Emil Velikov
94630ce0c7 automake: bring back the .PHONY git_sha1.h.tmp rule
With earlier commit 3689ef32af ("automake: rework the git_sha1.h rule,
include in tarball") we/I erroneously removed the PHONY rule and the
temporary file.

The former is used to ensure that the header is regenerated when on each
make invocation, while the latter helps us avoid the unneeded rebuild(s)
when the SHA1 hasn't changed.

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit af2637aa32)
2016-06-02 22:35:29 +01:00
Christian König
6ad61d90ea radeon/uvd: fix the H264 level for Tonga v2
We support 5.2 for a while now.

v2: we even support 5.2 for H264, 5.1 is for HEVC.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3e75c3997)
2016-06-02 14:04:14 +01:00
Jordan Justen
a136b8bfe2 i965: Remove old CS local ID handling
The old method pushed data for each channels uvec3 data of
gl_LocalInvocationID.

The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some calculations.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0a3acff5b5)
2016-06-02 14:02:05 +01:00
Jordan Justen
52ba7abe1e i965: Enable cross-thread constants and compact local IDs for hsw+
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

One complication is that cross-thread constants are loaded into
registers before per-thread constants. Previously, our local IDs were
loaded before the uniform data and treated as 'payload' data, even
though they were actually pushed into the registers like the other
uniform data.

Therefore, in this patch we simultaneously enable a newer layout where
each thread now uses a single uniform slot for a unique local ID for
the thread. This uniform is handled specially to make sure it is added
last into the uniform push constant registers. This minimizes our
usage of push constant registers, and maximizes our ability to use
cross-thread constants for registers.

To swap from the old to the new layout, we also need to flip some
lowering pass switches to let our driver handle the lowering instead.
We also no longer force thread_local_id_index to -1.

v4:
 * Minimize size of patch that switches from the old local ID layout
   to the new layout (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b1f22c6317)
2016-06-02 14:01:31 +01:00
Jordan Justen
28ecf2b90e anv: Support new local ID generation & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ba9594f32)
2016-06-02 14:01:04 +01:00
Jordan Justen
ead833a395 i965: Support new local ID push constant & cross-thread constants
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.

We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.

v4:
 * Support the old local ID push constant layout as well (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 30685392e0)
2016-06-02 13:59:04 +01:00
Jordan Justen
ee77c4a099 i965: Add CS push constant info to brw_cs_prog_data
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.

When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even more complicated. To simplify
things, we add push constant info into the cs prog_data struct.

The cross-thread constant support is added as of Haswell. To support
it we need to make sure all push constants with uniform values are
added to earlier registers. The register that varies per thread and
holds the thread invocation's unique local ID needs to be added last.

For now we add the code that would calculate cross-thread constatn
information for hsw+, but we force it (cross_thread_supported) off
until the other parts of the driver support it.

v4:
 * Support older local ID push constant layout as well. (Jason)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d437798ace)
2016-06-02 13:56:54 +01:00
Jordan Justen
a94be40ecc i965: Store number of threads in brw_cs_prog_data
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1b79e7ebbd)
2016-06-02 13:54:44 +01:00
Jordan Justen
632d7ef148 i965: Add nir based intrinsic lowering and thread ID uniform
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.

We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.

We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.

v2:
 * Create variable during lowering pass. (Ken)

v3:
 * Don't create a variable, but instead just insert an intrisic call
   to load a uniform from the allocated location. (Jason)

v4:
 * Don't run this pass if thread_local_id_index < 0

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ef0957dac)
2016-06-02 13:53:58 +01:00
Jordan Justen
5513300f59 i965: Put CS local thread ID uniform in last push register
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.

It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.

The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)

v2:
 * Add variable in intrinsics lowering pass
 * Make sure the ID is pushed last in assign_constant_locations, and
   that we save a spot for the ID in the push constants

v3:
 * Simplify code based with Jason's suggestions.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 04fc72501a)
2016-06-02 13:53:24 +01:00
Jordan Justen
33d0016836 i965: Add uniform for a CS thread local base ID
v4:
 * Force thread_local_id_index to -1 for now, and have
   fs_visitor::setup_cs_payload look at thread_local_id_index. This
   enables us to more easily cut over from the old local ID layout to
   the new layout, as suggested by Jason.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fa279dfbf0)
2016-06-02 13:51:11 +01:00
Jordan Justen
169b700dfd i965: Add nir channel_num system value
v2:
 * simd16/32 fixes (curro)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 8f48d23e0f)
2016-06-02 13:48:20 +01:00
Jordan Justen
33e985f8b9 nir: Make lowering gl_LocalInvocationIndex optional
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 6f316c9d86)
2016-06-02 13:45:29 +01:00
Jordan Justen
c9de6190a0 glsl: Add glsl LowerCsDerivedVariables option
v2:
 * Move lower flag to context constants. (Ken)

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7b9def3583)
2016-06-02 13:38:06 +01:00
Jason Ekstrand
05d88165d9 i965/fs: Copy the offset when lowering logical pull constant sends
This fixes 64 Vulkan CTS tests per gen

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 1205999c22)
2016-06-02 13:37:30 +01:00
Dave Airlie
d1cf18497a glsl/distance: make sure we use clip dist varying slot for lowered var.
When lowering, we always want to use the clip dist varying.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d4f4adfbd)
2016-06-02 13:36:45 +01:00
Kenneth Graunke
5a44d36b46 i965: Fix isoline reads in scalar TES.
Isolines aren't reversed.  commit 5b2d8c2273 fixed this for the vec4
TES backend, but not the scalar one.

Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 25e1b8d366)
2016-06-02 13:36:14 +01:00
Ian Romanick
0e54eebeed glsl: Use Geom.VerticesOut == -1 to specify unset
Because apparently layout(max_vertices=0) is a thing.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a428c955ce)
2016-06-02 13:35:18 +01:00
Ian Romanick
0ab1a3957a i965: If control_data_header_size_bits is zero, don't do EndPrimitive
This can occur when max_vertices=0 is explicitly specified.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b27dfa5403)
2016-06-02 13:34:43 +01:00
Ian Romanick
1398a9510f mesa: Fix bogus strncmp
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned.  That string has length 3.  strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.

v2: Not the same as strncmp(..., 3).  Noticed by Ilia.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 049bb94d2e)
2016-06-02 13:33:53 +01:00
Ilia Mirkin
b265796c79 nir: allow sat on all float destination types
With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. A F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca135a2612)
2016-06-02 13:32:52 +01:00
Alex Deucher
4a00da1662 radeonsi: fix the raster config setup for 1 RB iceland chips
I didn't realize there were 1 and 2 RB variants when this code
was originally added.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd85e4a041)
2016-06-02 13:32:05 +01:00
Dave Airlie
e817522728 mesa/sampler: fix error codes for sampler parameters.
The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.

Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6400144041)
2016-06-02 13:31:18 +01:00
Dave Airlie
915cc490d7 glsl: define some GLES3 constants in GLSL 4.1
The GLSL 4.1 spec adds:
gl_MaxVertexUniformVectors
gl_MaxFragmentUniformVectors
gl_MaxVaryingVectors

This fixes:
GL45-CTS.gtf31.GL3Tests.uniform_buffer_object.uniform_buffer_object_build_in_constants

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 0ebf4257a3)
2016-06-02 13:30:24 +01:00
Topi Pohjolainen
683c6940d8 i965: Add norbc debug option
This INTEL_DEBUG option disables lossless compression (also known
as render buffer compression).

v2: (Matt) Use likely(!lossless_compression_disabled) instead of
           !likely(lossless_compression_disabled)
    (Grazvydas) Update docs/envvars.html

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 6ca118d2f4)
2016-06-02 13:28:22 +01:00
Topi Pohjolainen
2d483256d5 i965/gen9: Configure rbc buffers as plain for non-rbc tex views
Fixes rendering in Shadow of Mordor with rbc. Application writes
RGBA_UNORM texture filling it with values the application wants to
later on treat as SRGB_ALPHA.
Intel driver enables lossless compression for the buffer by the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on and unfortunately there is
restriction in the hardware for using lossless compression for srgb
formats which looks to extend itself to the sampling engine also.
Requesting srgb to linear conversion on top of compressed buffer
results the color values to be pretty much garbage.

Fortunately none of tracked benchmarks showed a regression with
this.

v2 (Matt): Add missing space

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 30e9e6bd07)
2016-06-02 13:27:53 +01:00
Kenneth Graunke
8c627af1f0 i965: Fix the passthrough TCS for isolines.
We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants.  We at least
need to initialize them to zero.  We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).

Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.

(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a3dc99f3d4)
2016-06-02 13:27:23 +01:00
Dave Airlie
86e367a572 i965/xfb: skip components in correct buffer.
The driver was adding the skip components but always for buffer 0.

This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ebb81cd683)
2016-06-02 13:26:50 +01:00
Dave Airlie
64015c03bb glsl/linker: fix multiple streams transform feedback.
e2791b38b4
mesa/program_interface_query: fix transform feedback varyings.

caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.

The problem was it was using the skip components varying to set
the stream id, when it should wait until a varying was written,
this just adds the varying checks in the right place.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1fe7bbb911)
2016-06-02 13:25:59 +01:00
Dave Airlie
99fcfd985e mesa/bufferobj: use mapping range in BufferSubData.
According to GL4.5 spec:
An INVALID_OPERATION error is generated if any part of the speci-
fied buffer range is mapped with MapBufferRange or MapBuffer (see sec-
tion 6.3), unless it was mapped with MAP_PERSISTENT_BIT set in the Map-
BufferRange access flags.

So we should use the if range is mapped path.

This fixes:
GL45-CTS.buffer_storage.map_persistent_buffer_sub_data

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: "12.0, 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit e891f7cf55)
2016-06-02 13:25:08 +01:00
Ilia Mirkin
7bc29c784a nv50/ir: fix error finding free element in bitset in some situations
This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).

Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 18d11c9989)
2016-06-02 13:24:08 +01:00
Timothy Arceri
b2b7f05da6 Revert "glsl: fix xfb_offset unsized array validation"
This reverts commit aac90ba292.

The commit caused a regression in:
piglit.spec.glsl-1_50.compiler.gs-input-nonarray-named-block.geom

Also the CTS test it was meant to fix seems like it may be bogus.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 98d40b4d11)
2016-06-02 13:21:36 +01:00
Francisco Jerez
eb56a2f250 i965/fs: Allow scalar source regions on SNB math instructions.
I haven't found any evidence that this isn't supported by the
hardware, in fact according to the SNB hardware spec:

 "The supported regioning modes for math instructions are align16,
  align1 with the following restrictions:
   - Scalar source is supported.
  [...]
   - Source and destination offset must be the same, except the case of
     scalar source."

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit c1107cec44)
2016-06-02 13:20:45 +01:00
Francisco Jerez
c1269825cf i965/fs: Fix constant combining for instructions that cannot accept source mods.
This is the case for SNB math instructions so we need to be careful
and insert the literal value of the immediate into the table (rather
than its absolute value) if the instruction is unable to invert the
sign of the constant on the fly.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 06d8765bc0)
2016-06-02 13:20:15 +01:00
Francisco Jerez
f651a4bb2e i965/fs: Extend remove_duplicate_mrf_writes() to handle non-VGRF to MRF copies.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 303ec22ed6)
2016-06-02 13:19:41 +01:00
Francisco Jerez
44029d4237 i965/fs: Fix compute_to_mrf() to coalesce VGRFs initialized by multiple single-GRF writes.
Which requires using a bitset instead of a boolean flag to keep track
of the GRFs we've seen a generating instruction for already.  The
search loop continues until all instructions initializing the value of
the source VGRF have been found, or it is determined that coalescing
is not possible.

Fixes a few piglit test cases on Gen4-6 which were regressed by
6956015aa5 due to the different (yet
perfectly valid) ordering in which copy instructions are emitted now
by the simd lowering pass, which had the side effect of causing this
optimization pass to start corrupting the program in cases where a
VGRF-to-MRF copy instruction would be eliminated but only the last
instruction writing to the source VGRF region would be rewritten to
point to the target MRF.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4fe4f6e8a7)
2016-06-02 13:19:07 +01:00
Francisco Jerez
910fa7a824 i965/fs: Teach compute_to_mrf() about the COMPR4 address transformation.
This will be required to correctly transform the destination of 8-wide
instructions that write a single GRF of a VGRF to MRF copy marked
COMPR4.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 1898673f58)
2016-06-02 13:18:33 +01:00
Francisco Jerez
3b78304025 i965/fs: Refactor compute_to_mrf() to split search and rewrite into separate loops.
This will allow compute_to_mrf to handle cases where the source of the
VGRF-to-MRF copy is initialized by more than one instruction.  In such
cases we cannot rewrite the destination of any of the generating
instructions until it's known whether the whole VGRF source region can
be coalesced into the destination MRF, which will imply continuing the
search until all generating instructions have been found or it has
been determined that the VGRF and MRF registers cannot be coalesced.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 485fbaff03)
2016-06-02 13:18:00 +01:00
Francisco Jerez
dd96daa55e i965/fs: Fix compute-to-mrf VGRF region coverage condition.
Compute-to-mrf was checking whether the destination of scan_inst is
more than one component (making assumptions about the instruction data
type) in order to find out whether the result is being fully copied
into the MRF destination, which is rather inaccurate in cases where a
single-component instruction is only partially contained in the source
region, or when the execution size of the copy and scan_inst
instructions differ.  Instead check whether the destination region of
the instruction is really contained within the bounds of the source
region of the copy.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 4b0ec9f475)
2016-06-02 13:17:26 +01:00
Francisco Jerez
a6011c6fc6 i965/fs: Simplify and improve accuracy of compute_to_mrf() by using regions_overlap().
Compute-to-mrf was being rather heavy-handed about checking whether
instruction source or destination regions interfere with the copy
instruction, which could conceivably lead to program miscompilation.
Fix it by using regions_overlap() instead of the open-coded and
dubiously correct overlap checks.

Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit bb61e24787)
2016-06-02 13:16:52 +01:00
Francisco Jerez
2d83aad693 i965/fs: Teach regions_overlap() about COMPR4 MRF regions.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 88f380a2dd)
2016-06-02 13:16:04 +01:00
Dylan Baker
665f57c513 Don't use python 3
Now there are not files that require python 3, so for now just remove
the python 3 dependency and use python 2. I think the right plan is to
just get all of the python ready for python 3, and then use whatever
python is available.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 604010a7ed)
2016-06-02 13:15:38 +01:00
Dylan Baker
7e62585ee8 genxml: change chbang to python 2
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit ab31817fed)
2016-06-02 13:15:08 +01:00
Dylan Baker
4dd70617a1 genxml: use the isalpha method rather than str.isalpha.
This fixes gen_pack_header to work on python 2, where name[0] is unicode
not str.

Signed-off-by: Dylan Bake <dylanx.c.baker@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 12c1a01c72)
2016-06-02 13:14:38 +01:00