All Gen platforms had pretty similar results. (Skylake shown)
total instructions in shared programs: 14277220 -> 14277216 (<.01%)
instructions in affected programs: 422 -> 418 (-0.95%)
helped: 2
HURT: 0
total cycles in shared programs: 532577908 -> 532577848 (<.01%)
cycles in affected programs: 2800 -> 2740 (-2.14%)
helped: 2
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Unlike the much older -abs(a) >= 0.0 transformation, this is not
precise. The behavior changes if the source is NaN.
No shader-db changes on any platform.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
v2: need to do MAX{start+count} instead of MAX{count}
added piglit tests
v3: use malloc
Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
A problem was reported with arm,arm64 targets build due to missing
libLLVM shared library dependency with AOSP; to avoid this issue vulkan.radv
is built conditionally only when radeonsi is in BOARD_GPU_DRIVERS
Fixes: 0ca153f869 ("android: radv: enable build of vulkan.radv HAL module")
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
d3d10 requires NaNs to get converted to 0 for float->unorm conversions
(and float->int etc.). GL spec probably doesn't care in general, but it
would make sense to have reasonable behavior in any case imho - the
old code was converting negative NaNs to 0, and positive NaNs to 255.
(Note that using float comparison isn't actually all that much more
effort in any case, at least with sse2 it's just float comparison
(ucommiss) instead of int one - I converted the second comparison
to float too simply because it saves the probably somewhat expensive
transfer of the float from simd to int domain (with sse2 via stack),
so the generated code actually has 2 less instructions, although float
comparisons are more expensive than int ones.)
Reviewed-by: Brian Paul <brianp@vmware.com>
This is a forward port of a patch from the AOSP/master tree:
bd633f11de%5E%21/
Which replaces HOST_OUT_release with HOST_OUT
As per Dan's explanation, the current code was incorrect to use
$(HOST_OUT_release) as $(HOST_OUT) will be set properly for
whether the current build that's being cleaned during
incrementals is using host debug or release builds.
Additionally Dan noted it was incredibly uncommon to use a debug
host build, as there was never a shortcut and one had to set an
environment variable manually. Thus it was rarely if ever tested.
Change-Id: I7972c0a50fa3520dcfa962d6dd7e602bfe22368d
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
This is a forward port of a patch from the AOSP/master tree:
bd30b663f5%5E%21/
Since https://android-review.googlesource.com/c/718518 added
timespec_get() to bionic, mesa3d doesn't build due to redefinition
of timespec_get().
Avoid redefinition by defining HAVE_TIMESPEC_GET flag.
Test: build and boot tested db820c to UI.
Change-Id: I3dcc8034b48785e45cd3fa50e4d9cf2c684694a0
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
This is a partial cherry-pick from AOSP's mesa3d tree:
a88dcf769e%5E%21/
"We're deprecating make implicit rules, preferring static pattern
rules, or just regular rules."
Without this patch, the freedesktop/master branch won't build in
the AOSP environment, and this patch corrects that, as tested
on the Dragonboard 820c.
The i965 portion of the patch this is based on collided badly,
and I'm not sure how to best forward port it. However, so far
we don't see build issues without that portion.
Comments or feedback would be appreciated!
Change-Id: Id6dfd0d018cbd665fa19d80c14abd5f75fa10b8a
Cc: Rob Herring <rob.herring@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Marissa Wall <marissaw@google.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Previously, I added a switch case for GL 2.1 (ed7a0770b881791dd697f3).
I don't know of any driver which only supports GL 2.0, but adding
this switch case avoids a failure if the app queries
GL_SHADING_LANGUAGE_VERSION.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tarball distribution is done through "make distcheck". We include the
meson targets also into autotools so they won't fail when building
from the tarball.
Fixes: 6a60beba40 ("intel/tools: Add an error state to aub translator")
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dylan Baker <dylan.c.baker@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
This leaves us with a series of little anv_pipeline_compile_* functions
which each take a compiler object, a mem_ctx, the stage to compile, and
the previous stage for VUE linking purposes. Some of them do
interesting things but most are little more than wrappers around
brw_compile_*.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This breaks compilation up a bit into "link" and "compile". In the
"link" stage, new anv_pipeline_link_* helpers are called which are
responsible for setting up the binding table and doing anything needed
to properly link with the next stage in the pipeline if one exists.
They are called in reverse order starting with the fragment shader so
you can assume linking in later stages is already done.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
We can set active_stages much more directly and then it's just candy
around setting pipeline->stages[stage].
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Instead of hashing each stage separately (and TES and TCS together), we
hash the entire pipeline. This means we'll get fewer cache hits if
they, for instance, re-use the same VS over and over again but it also
means we can now safely do cross-stage optimizations.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Instead of having each anv_pipeline_compile_* function populate the
shader key, make it part of the anv_pipeline_stage struct and fill it
out up-front.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
With a sufficently recent meson, the following warning is produced:
WARNING: Passed invalid keyword argument "extra_args".
WARNING: This will become a hard error in the future.
It seems that compiler.links(args:) is meant here.
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Zeroing memory after calloc is not necessary. This also allows to avoid
possible crash when allocation fails, because memset is called before
checking screen for NULL.
Fixes: a29d63ecf7 "swr: refactor swr_create_screen to allow
for proper cleanup on error"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>