Commit graph

51 commits

Author SHA1 Message Date
Ben Crocker
1359af930e gallivm/ppc64le: allow environmental control of Altivec code generation
In check_os_altivec_support(), allow control of Altivec (first PPC vector
instruction set) code generation via a new environmental control,
GALLIVM_ALTIVEC, which is expected to take on a value of 1 or 0.
The default is to enable Altivec code generation.

This environmental control of Altivec code generation is initially
available only #ifdef DEBUG.

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ben Crocker <bcrocker@redhat.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
2017-10-05 02:14:14 +02:00
Tim Rowley
131b9f644c gallium/util: fix nondeterministic avx512 detection
cpuid.7 requires cx=0 to select the extended feature leaf.

avx512 detection was using the non-indexed cpuid resulting
in random non-detection of avx512.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-07-19 15:12:07 -05:00
Tomasz Figa
107b9c70d0 gallium: auxiliary: Fix standalone Android build of u_cpu_detect (v2)
Commit 463b7d0332c5("gallium: Enable ARM NEON CPU detection.")
introduced CPU feature detection based Android cpufeatures library.
Unfortunately it also added an assumption that if PIPE_OS_ANDROID is
defined, the library is also available, which is not true for the
standalone build without using Android build system.

Fix it by defining HAS_ANDROID_CPUFEATURES in Android.mk and replacing
respective #ifdefs to use it instead.

v2:
 - Add a comment explaining why the separate flag is needed (Emil).

Signed-off-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-07-19 13:28:23 +01:00
Eric Anholt
463b7d0332 gallium: Enable ARM NEON CPU detection.
I wrote this code with reference to pixman, though I've only decided to
cover Linux (what I'm testing) and Android (seems obvious enough).  Linux
has getauxval() as a cleaner interface to the /proc entry, but it's more
glibc-specific and I didn't want to add detection for that.

This will be used to enable NEON at runtime on ARMv6 builds of vc4.

v2: Actually initialize the temp vars in the Android path (noticed by
    daniels)
v3: Actually pull in the cpufeatures library (change by robher).
    Use O_CLOEXEC.  Break out of the loop when we find our feature.
v4: Drop VFP code, which was confused about what it was detecting and not
    actually used yet.

Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
2017-05-02 13:35:23 -07:00
Tim Rowley
b9578b683d gallium: detect avx512 cpu features
v3: fix check for xmm/ymm test
v2: style code, add avx512 to cpu dump

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-11-10 15:03:21 -06:00
Jose Fonseca
9e8edfa190 util,gallivm: Explicitly enable/disable fma attribute.
As suggested by Roland Scheidegger.

Use the same logic as f16c, since fma requires VEX encoding.

But disable FMA on LLVM 3.3 without MCJIT.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-10 13:47:35 +01:00
Jan Vesely
ebbe31d57c gallium,utils: Fix trivial sign compare warnings
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-05-03 12:00:09 -04:00
Jose Fonseca
0d4898ae80 include,gallium: Remove pre-MSVC 2013 compatibility.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-11 21:36:00 +00:00
François Tigeot
a48afb92ff gallium: Add DragonFly support
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2016-01-31 11:56:09 +00:00
Jose Fonseca
56aff6bb4e Remove Sun CC specific code.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Alan Coopersmith <alan.coopersmith@oracle.com>
2015-12-02 07:51:04 +00:00
Ilia Mirkin
a2a1a5805f gallium: replace INLINE with inline
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile

and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability

to remove mentions of the inline define.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-21 17:52:16 -04:00
Leonid Shatz
5fea39ace3 gallium/util: make sure cache line size is not zero
The "normal" detection (querying clflush size) already made sure it is
non-zero, however another method did not. This lead to crashes if this
value happened to be zero (apparently can happen in virtualized environments
at least).
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87913

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:39 +01:00
Roland Scheidegger
b59c7ed0ab gallium/util: fix crash with daz detection on x86
The code used PIPE_ALIGN_VAR for the variable used by fxsave, however this
does not work if the stack isn't aligned. Hence use PIPE_ALIGN_STACK function
decoration to fix the segfault which can happen if stack alignment is only
4 bytes.
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=87658.

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-05 17:58:38 +01:00
Roland Scheidegger
f2c39dd0e1 util: don't try to emit half-float intrinsics if avx isn't available
These instructions only have vex encodings, thus they can't be used without
avx. (Technically, one can still use avx-128 if avx isn't available because
the environment doesn't store the ymm registers, however I don't think llvm
can.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-09-19 16:58:28 +02:00
Vinson Lee
d93e23ba25 util: Fix unmatched parenthesis.
Fixes MSVC build error introduced with commit
923d346714.

src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '('

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2013-09-10 10:33:47 -07:00
Brian Paul
923d346714 util: don't use _fxsave() with MSVC 2010 or older
And update _MSC_VER comments in p_config.h

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-10 11:01:37 -06:00
Roland Scheidegger
4b45b61fef util: add avx2 and xop detection to cpu detection code
Going to need this soon (not going to bother with avx2 intrinsics at this time
but don't want to do workarounds for true vector shifts if llvm itself can use
them just fine and won't need the gazillion instruction emulation).
Not really tested other than my cpu returns 0 for these features...
(I have no idea if llvm actually would emit avx2/xop instructions neither...)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-20 23:00:24 +02:00
Roland Scheidegger
836098f6b2 util: (trivial) fix asm input/output list for fxsave
Otherwise gcc might do very unsafe optimizations, spotted by Uros Bizjak.
Hopefully this time it's finally right?
2013-08-09 17:30:13 +02:00
Dieter Nützel
8f40fa0e7f util: (trivial) fix more compile errors in u_cpu_detect (gcc/x86 this time).
Oops. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=67921
2013-08-09 01:25:54 +02:00
Roland Scheidegger
43076a55c2 util: (trivial) fix compile error with MSVC on x86 2013-08-08 19:08:57 +02:00
Roland Scheidegger
883987503f util: try much harder to set DAZ flag
While so far this only causes some harmless test failures, there's lots more
cpus with DAZ. All 64bit capable ones can do it (particularly relevant for
AMD cpus as they supported sse3 very very late) but if really necessary we
can check support for that for real with some more magic.
(In fact just about ANY cpu with sse2 can support DAZ, I believe the only
exception are first gen P4 (Willamette) and from those only early steppings
which can't do it it's almost like intel forgot to add it... - a real pity
though docs say you can't just try to set it as they will throw a GPF.)
While this was meant to address https://bugs.freedesktop.org/show_bug.cgi?id=67672
it does not fix it. Most likely the tests need fixing as I don't think
there's any guarantee about denorm handling in the reference math library
functions if the flags aren't set to standard values. Nevertheless enabling
DAZ on all cpus which can do it should be the right thing to do.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-08-08 18:55:57 +02:00
Andre Heider
0acf3a8407 gallium/util: Fix detection of AVX cpu caps
For AVX it's not sufficient to only rely on the cpuid flags. If the CPU
supports these extensions, but the OS doesn't, issuing these insns will
trigger an undefined opcode exception.

In addition to the AVX cpuid bit we also need to:
* test cpuid for OSXSAVE support
* XGETBV to check if the OS saves/restores AVX regs on context switches

See "Detecting Availability and Support" at
http://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions

Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-07-23 23:12:58 +01:00
Roland Scheidegger
ffebefa114 util: (trivial) add has_popcnt field
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficent if there's no cpu popcnt
instruction).
2013-06-19 23:47:36 +02:00
Richard Sandiford
337f21bc35 util: Use sizeof(void *) rather than 0 as the fallback cache line size
Without this, llvmpipe ends up giving a zero size to all uncompressed textures
on non-x86 systems, since align() cannot handle a 0 alignment.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Richard Sandiford <rsandifo@linux.vnet.ibm.com>
2013-06-10 16:26:09 -04:00
Roland Scheidegger
067a0ae420 gallivm: use f16c hw support for float->half and half->float conversion
Should be way faster of course on cpus supporting this (includes AMD
Bulldozer and Jaguar cores, Intel Ivy Bridge and up (except budget models)).
Passes piglit fbo-blending-formats GL_ARB_texture_float -auto on Ivy Bridge.

Reviewed-by: Brian Paul <brianp@vmware.com>
2013-04-04 01:03:42 +02:00
Maxence Le Doré
ba588dd45d gallium/util: Correct shift value for TSC feature detection.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-03-08 21:21:53 -08:00
José Fonseca
7eb5040197 gallivm,llvmpipe: Use 4-wide vectors on AMD Bulldozer.
8-wide vectors is slower.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-09-04 08:49:00 +01:00
Vinson Lee
19b3910bd5 util: Add cpuid for Solaris Studio.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-03 12:28:07 -07:00
Brian Paul
ef2c80f506 util: add cpu detection for sse 4.2 and avx 2011-04-07 13:41:52 -06:00
Brian Paul
1e105741f1 util: simplify bit shifting in util_cpu_detect() 2011-04-07 13:41:52 -06:00
Vinson Lee
3bdbccef2a util: Use #ifdef instead of #if.
This is a typo fix of earlier commit 0f3b3751b8.
2010-08-21 23:36:30 -07:00
Vinson Lee
0f3b3751b8 util: Define dump_cpu only for DEBUG builds.
dump_cpu is used only when DEBUG is defined.

Fixes the following GCC warning on builds without DEBUG defined.
util/u_cpu_detect.c:76: warning: 'debug_get_option_dump_cpu' defined but not used
2010-08-21 23:28:52 -07:00
José Fonseca
7a40d15e6c util: Remove the x86 exception handlers.
Unused now that check_os_katmai_support was removed.
2010-08-21 10:07:12 +01:00
Vinson Lee
15a3b42e13 util: Remove check_os_katmai_support.
check_os_katmai_support checks that the operating system running on a
SSE-capable processor supports SSE. This is necessary for unpatched
2.2.x and earlier kernels. 2.4.x and later kernels support SSE.

check_os_katmai_support will disable SSE capabilities for 32-bit x86
operating systems for which there is no code path. Currently, this
function handles Linux, Windows, and several BSDs. Mac OS, Cygwin, and
Solaris are several operating systems with no code paths.

Rather than add code for the unhandled operating systems, remove this
function altogether. This will fix SSE detection on all recent 32-bit
x86 operating systems. This completely breaks functionality on unpatched
2.2.x and earlier kernels, although there are likely no Gallium3D users
on such operating systems.
2010-08-16 18:52:37 -07:00
Jakob Bornecrantz
7f5202be63 gallium: Make printing info on debug builds default off
This commit silences the printing off most of the debug information
when running debug builds. The big culprits are: the tgsi sanity checker
that gets run on all shaders on debug; all the options; and
finaly the cpu caps printer.
2010-08-15 01:03:23 +01:00
Luca Barbieri
9232566269 u_cpu_detect: remove arch and little_endian
This logic duplicates the one in p_config.h, so remove it and adjust
the only two places that were using it.
2010-08-14 19:23:52 +02:00
Brian Paul
14e9fbee1c gallium: remove stray semicolons 2010-08-06 15:09:41 -06:00
Jakob Bornecrantz
4d65055b1f util: Add option to not dump cpu caps 2010-08-05 17:25:13 -07:00
Brian Paul
b17ee335e3 util: fix unused function warning on non-x86 2010-07-26 20:48:29 -06:00
nobled
3cef6c42bc util: fix CPU detection on OS X
s/PIPE_OS_DARWIN/PIPE_OS_APPLE, since there is no PIPE_OS_DARWIN.

Acked-by: Vinson Lee <vlee@vmware.com>
2010-07-26 12:27:01 -07:00
Brian Paul
c4c11eb456 gallium/util: replace //-style comments 2009-11-17 16:16:30 -07:00
José Fonseca
5eba607db6 util: Drop return value from cpuid(). 2009-10-28 11:26:26 +00:00
José Fonseca
0426227b68 util: Fix cpuid on MSVC. 2009-10-28 11:26:26 +00:00
José Fonseca
4797ce0d19 util: Set cpu endianness too. 2009-10-22 19:12:13 +01:00
Marc Dietrich
b2b239691d gallium/util: fix cpu detection on ppc
As we are compiling with -D_BSD_SOURCE, sigjmp_buf and siglongjmp
should be replaced by the non-sig functions (see man 3 setjmp).
Tested on linux/cell.
2009-10-21 15:15:03 -06:00
José Fonseca
3ce3c03257 util: Fix cpu detection on Windows. Cleanup. 2009-10-14 17:27:06 +01:00
José Fonseca
c595dea23c util: Force ESI register for cpuid's ebx result.
Fixes a segfault and better code. Unfortunately using an arbitrary
register ("=r") causes the gcc to abort when the code is optimized saying
it can't satisfy the constraint. Setting seems to do the trick.
2009-10-09 13:22:42 +01:00
José Fonseca
6971be783b util: Improve the cpuid assembly.
No need to save ebx on 64bit. Use just xchgl. Refer to gcc's cpuid.h header.

Thanks to Uros Bizjak for pointing this out.
2009-10-05 16:49:21 +01:00
José Fonseca
7a7dfb09aa util: Fix cpuid invocation for x86_64. 2009-10-04 22:03:15 +01:00
José Fonseca
a81fb2a0d2 util: Cleanup u_cpu_detect, build. Support X86_64 and detect SSE4.1 too.
I was waiting for the need to use this code to arise, and it finally came.

I've tested building this on Linux and Windows, both x86 and x64_64. But
it might break other platforms. Please bear with me and help me fix it.

Many thanks to Dennis Smit who submitted this, and Eric Anholt whose
work this was based on.
2009-09-29 13:59:16 +01:00