This reverts commit c1107cec44.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB, the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode. Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):
es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
es2-cts.gtf.gl.cos.cos_float_frag_xvary
es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary
Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
(cherry picked from commit 7244dc1e06)
The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction. This prevents
cmod propagation from (mis)optimizing:
cmp.z.f0 tmp, ...
mov.z.f0 null, tmp
into:
cmp.z.f0 tmp, ...
which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit a2135c6fd9)
Otherwise we will fail to find the headers in some scenarios.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
(cherry picked from commit 7a3a0d9212)
This fixes:
GL45-CTS.geometry_shader.layered_rendering.layered_rendering
on Skylake.
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit d10ae20b96)
Loosely based on radeonsi (Thanks to Nicolai).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 28590eb949)
Skipping the temporary allocation and copy instructions is easy (just
return dst), but the conditions used to find out whether the copy can
be optimized out safely without breaking the program are rather
complex: The destination must be exactly one component of at most the
execution width of the lowered instruction, and all source regions of
the instruction must be either fully disjoint from the destination or
be aligned with it group by group.
v2: Don't handle partial source-destination overlap for simplicity
(Jason). No instruction count regressions with respect to v1 in
either shader-db or the few FP64 shader_runner test-cases with
partial overlap I've checked manually.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 7aa76d66a1)
Piglit test ext_framebuffer_multisample_blit_scaled-blit-scaled
(with added 16x sample support) now passes with this patch.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 75da9c9933)
The specs says INVALID_VALUE for exceeding dimensions,
which is really what is happening here.
This fixes:
GL45-CTS.copy_image.non_existent_mipmap
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit af7bf610cf)
This test was only happening for textures, but there is
nothing in the spec to say this, so test it for all cases.
This fixes:
GL45-CTS.copy_image.invalid_target
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit c0856eacf1)
These two lines have been here since the file was created.
I'm guessing the second one was just for testing during dev, so it's the
one that's going away.
CoverityID: 1296205
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Brian Paul <brianp@vmware.com>
(cherry picked from commit 17f4c723eb)
In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.
We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.
Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b7f7ec7843)
... when copied from git_sha1.h.
As the latter file can we lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 2c424e00c3)
Otherwise the build will assume that we've talking about builddir, which
is not the case in the else statement.
Here the file is already generated and is part of the tarball.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 359d9dfec3)
With earlier commit we introduced support for render_node devices, which
was couples with the use of the image loader extension.
As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.
That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.
Fixes: 34ddef39ce ("egl: android: add dma-buf fd support")
Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 1816c837c1)
The generated sources should follow the example set by the vulkan
headers and our non-generated code. Namely: the code for all supported
platforms should be available, each one guarded by its respective
VK_USE_PLATFORM_*_KHR macro.
v2: Reword commit message.
Cc: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96285
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1 over IRC)
(cherry picked from commit b8e1f59d62)
isl library is needed to build i965, libmesa_isl static library is added
to fix related Android building errors.
Any attempt to build libmesa_genxml as phony package module failed to deliver
gen{7,75,8,9}_pack.h generated headers, needed for libmesa_isl_gen{7,75,8,9}
Due to constraints in Android Build System, libmesa_genxml is built as static,
at least one source is needed, so dummy.c is autogenerated for this scope,
libmesa_genxml dependency is declared using LOCAL_WHOLE_STATIC_LIBRARIES,
to avoid building errors due to missing genxml/gen{7,75,8,9}_pack.h headers.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 278c2212ac)
Fixes the following building error:
target C++: libmesa_glsl <= external/mesa/src/compiler/glsl/glsl_to_nir.cpp
In file included from external/mesa/src/compiler/glsl/glsl_to_nir.h:28:0,
from external/mesa/src/compiler/glsl/glsl_to_nir.cpp:28:
external/mesa/src/compiler/nir/nir.h:42:25: fatal error: nir_opcodes.h: No such file or directory
compilation terminated.
build/core/binary.mk:432: recipe for target 'out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o' failed
make: *** [out/target/product/x86/obj/STATIC_LIBRARIES/libmesa_glsl_intermediates/glsl/glsl_to_nir.o] Error 1
make: *** Waiting for unfinished jobs....
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4143245c23)
Including the file in both ISL_FILES and ISL_GENERATED_FILES makes
the actual dependency list less obvious.
v2: Drop unrelated vulkan hunk (Jason).
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit af1a0ae8ce)
With earlier commit 3689ef32af ("automake: rework the git_sha1.h rule,
include in tarball") we/I erroneously removed the PHONY rule and the
temporary file.
The former is used to ensure that the header is regenerated when on each
make invocation, while the latter helps us avoid the unneeded rebuild(s)
when the SHA1 hasn't changed.
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit af2637aa32)
We support 5.2 for a while now.
v2: we even support 5.2 for H264, 5.1 is for HEVC.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b3e75c3997)
The old method pushed data for each channels uvec3 data of
gl_LocalInvocationID.
The new method pushes 1 dword of data that is a 'thread local ID'
value. Based on that value, we can generate gl_LocalInvocationIndex
and gl_LocalInvocationID with some calculations.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 0a3acff5b5)
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.
One complication is that cross-thread constants are loaded into
registers before per-thread constants. Previously, our local IDs were
loaded before the uniform data and treated as 'payload' data, even
though they were actually pushed into the registers like the other
uniform data.
Therefore, in this patch we simultaneously enable a newer layout where
each thread now uses a single uniform slot for a unique local ID for
the thread. This uniform is handled specially to make sure it is added
last into the uniform push constant registers. This minimizes our
usage of push constant registers, and maximizes our ability to use
cross-thread constants for registers.
To swap from the old to the new layout, we also need to flip some
lowering pass switches to let our driver handle the lowering instead.
We also no longer force thread_local_id_index to -1.
v4:
* Minimize size of patch that switches from the old local ID layout
to the new layout (Jason)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit b1f22c6317)
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.
We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.
v4:
* Support the old local ID push constant layout as well (Jason)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ba9594f32)
The cross thread constant support appears on Haswell. It allows us to
upload a set of uniform data for all threads without duplicating it
per thread.
We also support per-thread data which allows us to store a per-thread
ID in one of the uniforms that can be used to calculate the
gl_LocalInvocationIndex and gl_LocalInvocationID variables.
v4:
* Support the old local ID push constant layout as well (Jason)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 30685392e0)
We need information about push constants in a few places for the GL
driver, and another couple places for the vulkan driver.
When we add support for uploading both a common (cross-thread) set of
push constants, combined with the previous per-thread push constant
data, things are going to get even more complicated. To simplify
things, we add push constant info into the cs prog_data struct.
The cross-thread constant support is added as of Haswell. To support
it we need to make sure all push constants with uniform values are
added to earlier registers. The register that varies per thread and
holds the thread invocation's unique local ID needs to be added last.
For now we add the code that would calculate cross-thread constatn
information for hsw+, but we force it (cross_thread_supported) off
until the other parts of the driver support it.
v4:
* Support older local ID push constant layout as well. (Jason)
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit d437798ace)
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.
We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.
We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.
v2:
* Create variable during lowering pass. (Ken)
v3:
* Don't create a variable, but instead just insert an intrisic call
to load a uniform from the allocated location. (Jason)
v4:
* Don't run this pass if thread_local_id_index < 0
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 3ef0957dac)
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.
It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.
The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)
v2:
* Add variable in intrinsics lowering pass
* Make sure the ID is pushed last in assign_constant_locations, and
that we save a spot for the ID in the push constants
v3:
* Simplify code based with Jason's suggestions.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit 04fc72501a)
v4:
* Force thread_local_id_index to -1 for now, and have
fs_visitor::setup_cs_payload look at thread_local_id_index. This
enables us to more easily cut over from the old local ID layout to
the new layout, as suggested by Jason.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit fa279dfbf0)
When lowering, we always want to use the clip dist varying.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 8d4f4adfbd)
Isolines aren't reversed. commit 5b2d8c2273 fixed this for the vec4
TES backend, but not the scalar one.
Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit 25e1b8d366)
Because apparently layout(max_vertices=0) is a thing.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit a428c955ce)
This can occur when max_vertices=0 is explicitly specified.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit b27dfa5403)
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned. That string has length 3. strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.
v2: Not the same as strncmp(..., 3). Noticed by Ilia.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
(cherry picked from commit 049bb94d2e)
With the introduction of fp64 and fp16 to nir, there are now a bunch of
float types running around. A F1 2015 shader ends up with an i2f.sat
operation, which has a nir_type_float32 destination. Allow sat on all
the float destination types.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit ca135a2612)
I didn't realize there were 1 and 2 RB variants when this code
was originally added.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
(cherry picked from commit bd85e4a041)
The initial ARB_sampler_objects spec had GL_INVALID_VALUE in it,
however version 8 of it fixed this, and the GL specs also have
the fixed value in them.
Fixes:
GL45-CTS.texture_border_clamp.samplerparameteri_non_gen_sampler_error
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 6400144041)
Fixes rendering in Shadow of Mordor with rbc. Application writes
RGBA_UNORM texture filling it with values the application wants to
later on treat as SRGB_ALPHA.
Intel driver enables lossless compression for the buffer by the time
of writing. However, the driver fails to make sure the buffer can be
sampled as something else later on and unfortunately there is
restriction in the hardware for using lossless compression for srgb
formats which looks to extend itself to the sampling engine also.
Requesting srgb to linear conversion on top of compressed buffer
results the color values to be pretty much garbage.
Fortunately none of tracked benchmarks showed a regression with
this.
v2 (Matt): Add missing space
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 30e9e6bd07)
We weren't setting up several of the uniform values for the patch
header, so we'd crash when uploading push constants. We at least
need to initialize them to zero. We also had the isoline parameters
reversed, so it would also render incorrectly (if it didn't crash).
Fixes a new Piglit test(*) (isoline-no-tcs), as well as crashes in
GL44-CTS.tessellation_shader.single.max_patch_vertices.
(*) https://lists.freedesktop.org/archives/piglit/2016-May/019866.html
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
(cherry picked from commit a3dc99f3d4)
The driver was adding the skip components but always for buffer 0.
This fixes:
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_skip_multiple_buffers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit ebb81cd683)
e2791b38b4
mesa/program_interface_query: fix transform feedback varyings.
caused a regression in
GL45-CTS.gtf40.GL3Tests.transform_feedback3.transform_feedback3_multiple_streams
on radeonsi.
The problem was it was using the skip components varying to set
the stream id, when it should wait until a varying was written,
this just adds the varying checks in the right place.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
(cherry picked from commit 1fe7bbb911)