Commit graph

69158 commits

Author SHA1 Message Date
Laura Ekstrand
44ecf0793d Revert "main: _mesa_cube_level_complete checks NumLayers."
This reverts commit 1ee000a0b6.
Failures with the GLES3 conformance suite and Synmark2 OGLHdrBloom revealed
that this commit was in error.

Extensive testing with Piglit prior to patch review and upstreaming did not
reveal this problem because, in the few Piglit tests that test for cube
completeness, NumLayers = 6.  This is because all of the existing tests use
TextureStorage to initialize the texture, which sets NumLayers.

A new Piglit test has been sent to the mailing list that reproduces the bug
related to this patch ("texturing: Testing
glGenerateMipmap(GL_TEXTURE_CUBE_MAP) without glTexStorage2D").

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-03-17 10:04:10 -07:00
Neil Roberts
5a06ee7384 i965/skl: Send a message header when doing constant loads SIMD4x2
Commit 0ac4c27275 made it add a header for the send message when
using SIMD4x2 on Skylake because without this it will end up using
SIMD8D. However the patch missed the case when a sampler is being used
to implement constant loads from a buffer surface in a SIMD4x2 vertex
shader.

This fixes 29 Piglit tests, mostly related to the ARL instruction in
vertex programs.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
2015-03-17 16:32:11 +00:00
Tapani Pälli
627c683086 i965/fs: in MAD optimizations, switch last argument to be immediate
Commit bb33a31 introduced optimizations that transform cases of MAD
in to simpler forms but it did not take in to account that src[0]
can not be immediate and did not report progress. Patch switches
src[0] and src[1] if src[0] is immediate and adds progress
reporting. If both sources are immediates, this is taken care of by
the same opt_algebraic pass on later run.

v2: Fix for all cases, use temporary fs_reg (Matt, Kenneth)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89569
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-03-17 07:59:30 +02:00
Vinson Lee
60f77b22b1 common.py: Fix PEP 8 issues.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 22:55:08 -07:00
Roland Scheidegger
2372275d2f gallivm: abort properly when running out of buffer space in lp_disassembly
Before this actually ran into an infinite loop printing out "invalid"...

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-03-17 00:46:48 +01:00
Marek Olšák
9d1682d619 docs/GL3: also mark GLES3/GS5 for radeonsi as done 2015-03-16 23:27:25 +01:00
Emil Velikov
c066669b8d st/dri: remove unused include from the automake/scons build
st/dri/common hasn't been around for a while.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:52 +00:00
Emil Velikov
55f0c0a29f auxiliary/os: fix the android build - s/drm_munmap/os_munmap/
Squash this silly typo introduced with commit c63eb5dd5ec(auxiliary/os: get
the mmap/munmap wrappers working with android)

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:36 +00:00
Emil Velikov
5664f57df3 gallium/sw/kms: trivial cleanups
Remove the forward declaration and make use of the DEBUG_PRINT macro for
debug builds.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-03-16 20:59:22 +00:00
Emil Velikov
771cd266b9 loader: include <sys/stat.h> for non-sysfs builds
Required by fstat(), otherwise we'll error out due to implicit function
declaration.

Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
2015-03-16 20:48:07 +00:00
Felix Janda
aead7fe2e2 c11/threads: Use PTHREAD_MUTEX_RECURSIVE by default
Previously PTHREAD_MUTEX_RECURSIVE_NP had been used on linux for
compatibility with old glibc. Since mesa defines __GNU_SOURCE__
on linux PTHREAD_MUTEX_RECURSIVE is also available since at least
1998. So we can unconditionally use the portable version
PTHREAD_MUTEX_RECURSIVE.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88534
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-16 20:41:39 +00:00
Marek Olšák
b5f19db976 radeonsi: implement TGSI_OPCODE_BFI (v2)
v2: Don't use the intrinsics, the shader backend can recognize these
    patterns and generates optimal code automatically.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Marek Olšák
d3723c614f radeonsi: add a helper for extracting bitfields from parameters (v2)
This will be used a lot (especially by tessellation).

v2: don't use the bfe intrinsic

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 14:58:19 +01:00
Antia Puentes
9735a62a2c i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7
IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, but IMHO it is safer to follow
the specification thoroughly.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
2015-03-16 12:56:17 +01:00
Marek Olšák
dc39413640 radeonsi: move scratch reloc state setup
- move it to its own function
- do it after all states are emitted
- bump SI_MAX_DRAW_CS_DWORDS

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
567c8d7300 radeonsi: don't emit PA_SC_LINE_STIPPLE if not rendering lines
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
1f4bb38264 radeonsi: don't emit PA_SC_LINE_STIPPLE after every rasterizer state change
Do it only when the line stipple state is changed.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
f5832f3f9d radeonsi: move PA_SU_SC_MODE_CNTL to rasterizer state
This requires enabling the optional GL provoking vertex behavior for quads.

+ some cosmetic changes, so that the register is set exactly the same as
on r600.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
98a2398222 radeonsi: implement line and polygon smoothing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
303d23e10d radeonsi: add shader code for smoothing
The fragment shader multiplies the alpha channel with gl_SampleMaskIn.
If blending is enabled, it looks like MSAA.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:19 +01:00
Marek Olšák
4f20a8f278 radeonsi: split sample locations into its own state atom
Sample locations are not updated as often as framebuffers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
f7796a966d radeonsi: add basic code for overrasterization
This will be used for line and polygon smoothing.
This is GCN-only even though it's in shared code.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
1921fa4304 radeonsi: small cleanup in si_shader_selector_key
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
52ff1edc51 radeonsi: simplify accessing alpha pointer in si_llvm_emit_fs_epilogue
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
955ebf2890 radeonsi: add support for easy opcodes from ARB_gpu_shader5
I have to use the BFE instrinsics, because BFE is one of the most complex
instructions that can't be matched easily. BFE has 3 conditional branches
and one of them is quite big.

In the isel DAG, lowered BFE has 27 nodes (including leafs).
2015-03-16 12:54:18 +01:00
Marek Olšák
755a2907a3 radeonsi: implement bit-finding opcodes from ARB_gpu_shader5
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
ca90cde81e radeonsi: implement gl_SampleMaskIn
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
f9fd0c4a55 radeonsi: add support for SQRT
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
d73c1c1304 radeonsi: add support for FMA
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
dfea35666e gallium/radeon: don't use LLVMReadOnlyAttribute for ALU
None of the instructions use a pointer argument.
(+ small cosmetic changes)

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2015-03-16 12:54:18 +01:00
Marek Olšák
9da9c8e3f4 tgsi: handle bitwise opcodes in tgsi_opcode_infer_type (v2)
v2: set the same types as the destination type in tgsi_exec

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-16 12:54:18 +01:00
Marek Olšák
216543ea54 gallium: add FMA and DFMA opcodes (v3)
Needed by ARB_gpu_shader5.

v2: select DMAD for FMA with double precision
v3: add and select DFMA

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-03-16 12:54:18 +01:00
Rob Clark
e92bc6b38e freedreno: update generated headers
Fix a3xx texture layer-size.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-15 18:00:19 -04:00
Rob Clark
d3fb949c03 freedreno/ir3: remove old compiler
Now that piglit is no longer falling back to old compiler for any tests,
we can remove it.  Hurray \o/

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:27:03 -04:00
Rob Clark
feb858b788 freedreno/ir3: avoid scheduler deadlock
Deadlock can occur if we schedule an address register write, yet some
instructions which depend on that address register value also depend on
other unscheduled instructions that depend on a different address
register value.  To solve this, before scheduling an address register
write, ensure that all the other dependencies of the instructions which
consume this address register are already scheduled.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:26:56 -04:00
Rob Clark
7208e96bb8 freedreno/ir3: bit of cleanup
Add an array_insert() macro to simplify inserting into dynamically sized
arrays, add a comment, and remove unused prototype inherited from the
original freedreno.git/fdre-a3xx test code, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-03-15 13:26:44 -04:00
Kenneth Graunke
db095eb43b i965: De-duplicate is_expression_commutative() functions.
Create a backend_inst::is_commutative() method to replace two static
functions that did the exact same thing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2015-03-15 03:14:53 -07:00
Chris Forbes
f68a973dfb i965/gen4-5: Cope with immutable-format texture revalidation
This is unfortunately sometimes necessary due to rebasing levels when
rendering into them.

16 piglits crash -> pass, when building mesa with debug enabled.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-03-14 15:55:17 +13:00
Emil Velikov
8ed1b65b62 docs: add news item and link release notes for mesa 10.5.1
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-03-13 23:36:33 +00:00
Emil Velikov
5f72847a88 docs: Add sha256 sums for the 10.5.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 2abba086ca)
2015-03-13 23:35:02 +00:00
Emil Velikov
6c96608937 Add release notes for the 10.5.1 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 11c0ff60ef)
2015-03-13 23:35:00 +00:00
Ilia Mirkin
620e29b748 freedreno: fix slice pitch calculations
For example if width were 65, the first slice would get 96 while the
second would get 32. However the hardware appears to expect the second
pitch to be 64, based on halving the 96 (and aligning up to 32).

This fixes texelFetch piglit tests on a3xx below a certain size. Going
higher they break again, but most likely due to unrelated reasons.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-03-13 16:05:16 -04:00
Ilia Mirkin
89b26d5a36 freedreno/a3xx: use the same layer size for all slices
We only program in one layer size per texture, so that means that all
levels must share one size. This makes the piglit test

bin/texelFetch fs sampler2DArray

have the same breakage as its non-array version instead of being
completely off, and makes

bin/ext_texture_array-gen-mipmap

start passing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robclark@freedesktop.org>
2015-03-13 16:05:16 -04:00
Ian Romanick
e76a8dc8ed i965/vs: Add missing resolve_bool_comparison calls on GEN4 and GEN5
The ir_unop_any problem was discovered by some later optimization passes
that generate ir_triop_csel.  I was also able to reproduce it by
modifying the gl-2.0-vertexattribpointer vertex shader to generate its
result using

   color = mix(vec4(0, 1, 0, 0),
               vec4(1, 0, 0, 0),
               bvec4(any(greaterThan(diff, vec4(tolerance)))));

instead of an if-statement.  This also required using #version 130 and
MESA_GLSL_VERSION_OVERRIDE=130.

I have not nominated this for stable releases because I don't think
there's any way to trigger the problem without GLSL 1.30 or
optimizations that don't exist in stable.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@intel.com>
2015-03-13 12:57:32 -07:00
Chris Forbes
21ff9bfe1c i965/disasm: Fix format strings
Most of the brw_inst_* api returns 64bit values. This fixes disassembly
of sampler messages, etc.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-14 07:51:18 +13:00
Chris Forbes
7c3095d6b7 i965/disasm: Mark format() as being printf-style.
This allows us to get warnings from GCC when we mess up the format
strings.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-03-14 07:50:48 +13:00
Matt Turner
97399fc751 docs: List ARB_shading_language_packing/EXT_shader_integer_mix.
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-13 10:42:38 -07:00
Matt Turner
8d3aa5926b glsl: Expose built-in packing functions under GLSL 4.2.
ARB_shading_language_packing is part of GLSL 4.2, not 4.0 as I
mistakenly believed. The following functions are available only with
ARB_shading_language_packing, GLSL 4.2 (not GLSL 4.0), or ES 3.0:

   - packSnorm2x16
   - unpackSnorm2x16
   - packHalf2x16
   - unpackHalf2x16

Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-03-13 10:42:38 -07:00
Matt Turner
dac2e7deaa egl: Create queryable strings in eglInitialize().
Creating/recreating the strings in eglQueryString() is extra work and
isn't thread-safe, as exhibited by shader-db's run.c using libepoxy.

Multiple threads in run.c call eglReleaseThread() around the same time.
libepoxy calls eglQueryString() to determine whether eglReleaseThread()
exists, and our EGL implementation passes a pointer to the version
string to libepoxy while simultaneously overwriting the string, leading
to a failure in libepoxy.

Moreover, the EGL spec says (emphasis mine):

"eglQueryString returns a pointer to a *static*, zero-terminated string"

This patch moves some auxiliary functions from eglmisc.c to eglapi.c so
that they may be used to create the extension, API, and version strings
once during eglInitialize(). The auxiliary functions are renamed from
_eglUpdate* to _eglCreate*, and some checks made unnecessary by calling
the functions from eglInitialize() are removed.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-03-13 10:42:38 -07:00
Samuel Iglesias Gonsalvez
b43bbfa90a glsl: optimize (0 cmp x + y) into (-x cmp y).
The optimization done by commit 34ec1a24d did not take it into account.

Fixes:

dEQP-GLES3.functional.shaders.random.all_features.fragment.20

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
2015-03-13 16:40:20 +01:00