Kenneth Graunke
85ec7abc3f
intel/decoder: Fix control / evaluation label mixup.
...
Trivial. DS is TES, HS is TCS.
2018-02-01 09:44:15 -08:00
Kenneth Graunke
c3cd2aac27
i965: Bump official kernel requirement to Linux v3.9.
...
In commit 3f353342a6 (present in 17.3.0)
we started unconditionally using I915_EXEC_NO_RELOC, which was
introduced in Linux v3.9. ChromeOS kernel 3.8 has backported this,
so it should work too.
Running on older kernels would likely result in every single batch
being rejected by the kernel, which is pretty catastrophic. Yet, it
appears that nobody noticed. So, let's just bump the official
requirement and move forward ever so slowly.
Fixes: 3f353342a6 ("i965: Use I915_EXEC_NO_RELOC")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 07:58:58 -08:00
Marc Dietrich
4c5f0b4fd4
meson: don't install windows headers on non-windows platforms
...
Only dive into the windows subdir if windows platform is selected.
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
Fixes: 5ef75cb02b "meson: build src/glx/windows"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2018-02-01 15:33:02 +00:00
Marek Olšák
71c6f64e54
radeonsi: use ac_build_buffer_load_format for image buffer loads
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
b0a6053a99
ac/nir: use ac_build_buffer_load_format for image buffer loads
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
bac9fa9f17
ac: add glc parameter to ac_build_buffer_load_format
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
be973ed21f
radeonsi: load the right number of components for VS inputs and TBOs
...
The supported counts are 1, 2, 4. (3=4)
The following snippet loads float, vec2, vec3, and vec4:
Before:
buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904
buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507
After:
buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04
buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805
buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
s_waitcnt vmcnt(0) ; BF8C0F70
buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Marek Olšák
472361dd7e
radeonsi: remove unused si_shader_context members
...
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-02-01 16:20:19 +01:00
Jon Turney
d3540b405b
glx/apple: locate dispatch table functions to wrap by name
...
Avoid reaching into the dispatch table internals (and thus having to deal
with the complexities of remap etc.) by identifying functions to wrap by
name.
See:
https://lists.freedesktop.org/archives/mesa-dev/2015-June/086721.html et seq.
https://bugs.freedesktop.org/show_bug.cgi?id=90311
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:08 +00:00
Jon Turney
b37b7b42dc
glx/apple: include util/debug.h for env_var_as_boolean prototype
...
mesa/src/glx/glxcmds.c:1295:21: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
mesa/src/glx/apple/apple_visual.c:85:28: error: implicit declaration of function 'env_var_as_boolean' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:14:02 +00:00
Jon Turney
f8ed9f24d5
osx: ld doesn't support --build-id
...
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:56 +00:00
Jon Turney
7ad7a07c88
configure: Default to gbm=no on osx
...
Signed-off-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-02-01 15:13:00 +00:00
Andres Rodriguez
bbd00844a2
mesa: remove usage of alloca in externalobjects.c v4
...
Don't want an overly large numBufferBarriers/numTextureBarriers to blow
up the stack.
v2: handle malloc errors
v3: fix patch
v4: initialize texObjs/bufObjs
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-02-01 09:48:04 -05:00
Samuel Pitoiset
2ef5ce1198
radv: do not insert shaders in cache when it's disabled
...
When the application doesn't provide its own pipeline cache,
the driver uses a in-memory cache but it shouldn't insert any
entries when the cache is explicitely disabled by the user.
Found while running my experimental pipeline-db tool with a
ton of shaders, the memory footprint was just huge, and sometimes
the process was even killed...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:40:11 +01:00
Samuel Pitoiset
4922e7f25c
radv: use separate bindings for graphics and compute descriptors
...
The Vulkan spec says:
"pipelineBindPoint is a VkPipelineBindPoint indicating whether
the descriptors will be used by graphics pipelines or compute
pipelines. There is a separate set of bind points for each of
graphics and compute, so binding one does not disturb the other."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:09 +01:00
Samuel Pitoiset
cf224014dd
radv: store the bind point when creating descriptors with templates
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-02-01 09:37:07 +01:00
Dave Airlie
7ea15a36fb
r600/eg: make sure we allow vpm bit on other CF ops.
...
the vpm bit wasn't being applied to the push/pop instructions.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 13:41:32 +10:00
Timothy Arceri
4d982ae2c7
gallium/st/clover: remove unused PIPE_SHADER_IR_LLVM
...
This has been unused since 100796c15c .
Acked-by: Marek Olšák <marek.olsak@amd.com>
2018-02-01 13:56:34 +11:00
Dave Airlie
0491d5425f
r600/sb: just add some missing debug bits
...
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 12:06:40 +10:00
Dave Airlie
df155a73f4
r600: fix buffer resinfo opcode translation.
...
The vtx operations never got translated, so things worked by
0 being equal to 0, translate them so we can use the proper buffer
resinfo code.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-02-01 11:59:55 +10:00
Timothy Arceri
679e4e7a46
st/glsl_to_nir: add more nir opts to st_nir_opts()
...
All of the current gallium nir driver use these optimisations but
they do so in their backends. Having these called in the backend
only can cause a number of problems:
- Shader compile times are greater because the opts need to do
significant passes over all shader variants.
- The shader cache is partially defeated due to the significant
optimisation passes over variants.
- We might miss out on nir linking optimisation opportunities.
Adding these passes to st_nir_opts() alleviates these problems.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2018-02-01 09:42:57 +11:00
Andres Gomez
5a7aba2e0a
i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
...
The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the <size> passed
to the VertexAttribL* commands was <= 2.
In 61a8a55f55 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.
Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 256-bits through 2 uploads.
Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.
Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1
v2: Added more inline comments to explain why we are using
ISL_FORMAT_R32_FLOAT and its consequences, as requested by
Alejandro and Antía.
Fixes: 75968a668e ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Antia Puentes <apuentes@igalia.com>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Antia Puentes <apuentes@igalia.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-01-31 22:50:06 +02:00
Kenneth Graunke
ab1f2e6bc4
i965: Make texture validation code use texture objects, not units.
...
This requires moving the _MaxLevel handling up to the callers. Another
user of intel_finalize_mipmap_tree will be added later that depends on
_MaxLevel not being modified.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
0a2e878c69
i965: Pass tObj into intel_update_max_level instead of intel_obj.
...
We want both anyway, but this will simplify things a tiny bit in an
upcoming patch.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2018-01-31 11:33:52 -08:00
Kenneth Graunke
876f1537e9
i965: Delete more misleading comments.
...
brw_bo_wait_rendering used to take a brw_context pointer for perf_debug
messages about stalls. Chris eliminated that in 833108ac14 .
This message about passing NULL to avoid those warnings is no longer
relevant, and just adds confusion. So, drop it.
2018-01-31 11:33:52 -08:00
Andres Rodriguez
8996610acb
docs/features: mark EXT_semaphore(_fd) as DONE v2
...
Support for these extensions is available in radeonsi.
v2: also updated relnotes
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
2018-01-31 12:31:40 -05:00
Brian Paul
d32c22a13f
st/mesa: whitespace, formatting fixes in st_glsl_to_tgsi.cpp
...
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
3b3d8275d8
st/mesa: s/int/GLenum/ in st_glsl_to_tgsi.cpp
...
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
1882ec4ff7
svga: use opcode local var to simplify some code
...
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Brian Paul
338c35c427
svga: s/unsigned/VGPU10_OPCODE_TYPE/
...
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2018-01-31 08:17:25 -07:00
Samuel Pitoiset
a097a6f519
radv: do not dump meta shader stats
...
That's quite useless and that pollutes the output.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:26 +01:00
Samuel Pitoiset
26cc3e74b9
ac/nir: fix emission of ffract for 64-bit
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-01-31 14:10:24 +01:00
Eric Engestrom
2f0db33527
meson: dedup gallium-xa logic
...
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
fa5d616bf9
meson: dedup gallium-va logic
...
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
86168ed31c
meson: dedup gallium-omx logic
...
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
724916c8a8
meson: dedup gallium-xvmc logic
...
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Eric Engestrom
992af0a4b8
meson: dedup gallium-vdpau logic
...
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
2018-01-31 11:17:03 +00:00
Antia Puentes
0da434fb47
Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
...
This reverts commit 513c2263cb .
_mesa_base_fbo_format_ is used to validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said:
"An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined
in section 9.4."
RGB9_E5 format is not renderable, as stated in the same specification
(Bug 9338).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
2018-01-31 12:06:00 +01:00
Michel Dänzer
1cf1bf32ef
winsys/radeon: Compute is_displayable in surf_drm_to_winsys
...
It was always 0, breaking (at least) DRI3 with Xwayland.
Bugzilla: https://bugs.freedesktop.org/104306
Fixes: 5f2073be32 ("ac/surface: add ac_surface::is_displayable")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:53:58 +01:00
Matthew Nicholls
ef272b161e
radv: remove predication on cache flushes
...
This can lead to a situation where cache flushes could get conditionally
disabled while still clearing the flush_bits, and thus flushes due to
application pipeline barriers may never get executed.
Fixes: a6c2001ace (radv: add support for cmd predication.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 13:37:18 +10:00
Brian Paul
1ea9efd2f8
mesa: fix broken glGet*(GL_POLYGON_MODE) query
...
This reverts part of the patch which introduced the GLenum16 change.
Fixes a conform regression found by Roland.
Fixes: f96a69f916 ("mesa: replace GLenum with GLenum16 in
common structures (v4)")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-30 20:32:37 -07:00
Dave Airlie
49c61d8b84
virgl: also remove dimension on indirect.
...
This fixes some dEQP tests that generated bad shaders.
Fixes: b6f6ead19 (virgl: drop const dimensions on first block.)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tested-by: Gurchetan Singh <gurchetansingh@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-31 12:24:11 +10:00
Marek Olšák
fdf01d0244
radeonsi: remove DBG_PRECOMPILE
...
it's useless and shader-db stats only report the main shader part.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
148b48646b
radeonsi: print shader-db stats for main parts, not final binaries
...
This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
c02c9ee550
radeonsi: move max_simd_waves computation into a separate function
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-31 03:21:20 +01:00
Marek Olšák
a7311cd7ee
mesa: fix glGet MAX_VERTEX_ATTRIB queries
...
Broken by f96a69f916
Reviewed-by: Brian Paul <brianp@vmware.com>
2018-01-31 03:21:20 +01:00
Jason Ekstrand
97938dac36
anv/cmd_buffer: Re-emit the pipeline at every subpass
...
If we ever hit this edge-case, it can theoretically cause problem for
CNL because we could end up changing render targets without re-emitting
3DSTATE_MULTISAMPLE which is part of the pipeline. Just get rid of the
edge case.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-30 17:16:33 -08:00
Ian Romanick
ee63933a73
nir: Distribute binary operations with constants into bcsel
...
This was specifically designed to simplify 1+mix(0, a-1, condition) to
mix(1, a, condition) by pushing the 1+ inside.
Skylake, Broadwell, and Haswell had similar results. Skylake shown.
total instructions in shared programs: 14521753 -> 14521716 (<.01%)
instructions in affected programs: 10619 -> 10582 (-0.35%)
helped: 51
HURT: 14
helped stats (abs) min: 1 max: 12 x̄: 1.43 x̃: 1
helped stats (rel) min: 0.20% max: 3.58% x̄: 1.01% x̃: 0.95%
HURT stats (abs) min: 1 max: 11 x̄: 2.57 x̃: 1
HURT stats (rel) min: 0.22% max: 1.75% x̄: 1.20% x̃: 1.32%
95% mean confidence interval for instructions value: -1.31 0.17
95% mean confidence interval for instructions %-change: -0.80% -0.27%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 533000205 -> 533003533 (<.01%)
cycles in affected programs: 110610 -> 113938 (3.01%)
helped: 43
HURT: 28
helped stats (abs) min: 6 max: 440 x̄: 27.12 x̃: 16
helped stats (rel) min: 0.39% max: 4.84% x̄: 1.60% x̃: 1.67%
HURT stats (abs) min: 2 max: 3066 x̄: 160.50 x̃: 14
HURT stats (rel) min: 0.08% max: 77.78% x̄: 5.16% x̃: 0.62%
95% mean confidence interval for cycles value: -43.81 137.56
95% mean confidence interval for cycles %-change: -1.47% 3.60%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 10018840 -> 10018713 (<.01%)
instructions in affected programs: 9431 -> 9304 (-1.35%)
helped: 51
HURT: 3
helped stats (abs) min: 1 max: 80 x̄: 2.76 x̃: 1
helped stats (rel) min: 0.20% max: 16.43% x̄: 1.16% x̃: 0.81%
HURT stats (abs) min: 1 max: 12 x̄: 4.67 x̃: 1
HURT stats (rel) min: 0.22% max: 1.33% x̄: 0.59% x̃: 0.22%
95% mean confidence interval for instructions value: -5.36 0.66
95% mean confidence interval for instructions %-change: -1.66% -0.46%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 87571944 -> 87572785 (<.01%)
cycles in affected programs: 117234 -> 118075 (0.72%)
helped: 42
HURT: 23
helped stats (abs) min: 2 max: 114 x̄: 51.90 x̃: 30
helped stats (rel) min: 0.11% max: 11.01% x̄: 4.45% x̃: 2.74%
HURT stats (abs) min: 1 max: 2341 x̄: 131.35 x̃: 10
HURT stats (rel) min: 0.06% max: 37.11% x̄: 2.75% x̃: 0.61%
95% mean confidence interval for cycles value: -61.05 86.93
95% mean confidence interval for cycles %-change: -3.47% -0.33%
Inconclusive result (value mean confidence interval includes 0).
Sandy Bridge
total instructions in shared programs: 10542933 -> 10542844 (<.01%)
instructions in affected programs: 11487 -> 11398 (-0.77%)
helped: 52
HURT: 3
helped stats (abs) min: 1 max: 40 x̄: 1.96 x̃: 1
helped stats (rel) min: 0.08% max: 8.16% x̄: 0.90% x̃: 0.72%
HURT stats (abs) min: 1 max: 11 x̄: 4.33 x̃: 1
HURT stats (rel) min: 0.22% max: 1.22% x̄: 0.55% x̃: 0.22%
95% mean confidence interval for instructions value: -3.17 -0.07
95% mean confidence interval for instructions %-change: -1.13% -0.52%
Instructions are helped.
total cycles in shared programs: 146098397 -> 146097094 (<.01%)
cycles in affected programs: 128140 -> 126837 (-1.02%)
helped: 47
HURT: 8
helped stats (abs) min: 2 max: 333 x̄: 29.21 x̃: 18
helped stats (rel) min: 0.13% max: 5.04% x̄: 1.18% x̃: 0.95%
HURT stats (abs) min: 1 max: 16 x̄: 8.75 x̃: 9
HURT stats (rel) min: 0.08% max: 0.43% x̄: 0.30% x̃: 0.34%
95% mean confidence interval for cycles value: -37.49 -9.90
95% mean confidence interval for cycles %-change: -1.22% -0.71%
Cycles are helped.
Iron Lake
total instructions in shared programs: 7886711 -> 7886509 (<.01%)
instructions in affected programs: 10425 -> 10223 (-1.94%)
helped: 50
HURT: 2
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 15.38% x̄: 1.12% x̃: 0.54%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.86% max: 0.91% x̄: 0.89% x̃: 0.89%
95% mean confidence interval for instructions value: -8.05 0.28
95% mean confidence interval for instructions %-change: -1.83% -0.26%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 178115324 -> 178114612 (<.01%)
cycles in affected programs: 765726 -> 765014 (-0.09%)
helped: 39
HURT: 1
helped stats (abs) min: 2 max: 276 x̄: 18.31 x̃: 8
helped stats (rel) min: <.01% max: 8.47% x̄: 0.39% x̃: 0.04%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -32.07 -3.53
95% mean confidence interval for cycles %-change: -0.86% 0.10%
Inconclusive result (%-change mean confidence interval includes 0).
GM45
total instructions in shared programs: 4857762 -> 4857661 (<.01%)
instructions in affected programs: 5523 -> 5422 (-1.83%)
helped: 25
HURT: 1
helped stats (abs) min: 1 max: 78 x̄: 4.08 x̃: 1
helped stats (rel) min: 0.34% max: 13.61% x̄: 1.04% x̃: 0.52%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.86% max: 0.86% x̄: 0.86% x̃: 0.86%
95% mean confidence interval for instructions value: -9.99 2.22
95% mean confidence interval for instructions %-change: -2.01% 0.08%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 122179674 -> 122179194 (<.01%)
cycles in affected programs: 530162 -> 529682 (-0.09%)
helped: 22
HURT: 1
helped stats (abs) min: 2 max: 292 x̄: 21.91 x̃: 7
helped stats (rel) min: <.01% max: 8.65% x̄: 0.44% x̃: 0.04%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -46.56 4.82
95% mean confidence interval for cycles %-change: -1.20% 0.36%
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:15 -08:00
Ian Romanick
03fb13f646
nir: Rearrange logic op-compounded integer compares
...
Skylake and Broadwell had similar results. Skylake shown.
total instructions in shared programs: 14521769 -> 14521753 (<.01%)
instructions in affected programs: 8782 -> 8766 (-0.18%)
helped: 16
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.40% x̄: 0.20% x̃: 0.18%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.23% -0.16%
Instructions are helped.
total cycles in shared programs: 533000376 -> 533000205 (<.01%)
cycles in affected programs: 447035 -> 446864 (-0.04%)
helped: 9
HURT: 9
helped stats (abs) min: 2 max: 40 x̄: 35.78 x̃: 40
helped stats (rel) min: 0.02% max: 0.18% x̄: 0.10% x̃: 0.09%
HURT stats (abs) min: 1 max: 52 x̄: 16.78 x̃: 10
HURT stats (rel) min: <.01% max: 1.11% x̄: 0.29% x̃: 0.12%
95% mean confidence interval for cycles value: -25.07 6.07
95% mean confidence interval for cycles %-change: -0.08% 0.27%
Inconclusive result (value mean confidence interval includes 0).
No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00
Ian Romanick
053be9f020
nir: Rearrange and-compounded float compares
...
If both comparisons are used as sources for instructions other than the
iand, this transformation is detrimental. If the non-identical value in
both compares is constant, the fmin or fmax will be constant-folded
away, so the transformation is always a win.
It is interesting to me that on Iron Lake only 81 shaders have
instruction counts changed, but 726 shaders have cycle counts changed.
shader-db results:
Skylake
total instructions in shared programs: 14525728 -> 14521017 (-0.03%)
instructions in affected programs: 1164726 -> 1160015 (-0.40%)
helped: 1692
HURT: 5
helped stats (abs) min: 1 max: 637 x̄: 2.79 x̃: 2
helped stats (rel) min: 0.07% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs) min: 1 max: 12 x̄: 3.20 x̃: 1
HURT stats (rel) min: 0.38% max: 2.86% x̄: 2.36% x̃: 2.86%
95% mean confidence interval for instructions value: -3.52 -2.03
95% mean confidence interval for instructions %-change: -0.86% -0.74%
Instructions are helped.
total cycles in shared programs: 533115449 -> 532991404 (-0.02%)
cycles in affected programs: 119401803 -> 119277758 (-0.10%)
helped: 1145
HURT: 467
helped stats (abs) min: 1 max: 34644 x̄: 145.92 x̃: 18
helped stats (rel) min: <.01% max: 45.33% x̄: 1.58% x̃: 0.42%
HURT stats (abs) min: 1 max: 1590 x̄: 92.15 x̃: 15
HURT stats (rel) min: <.01% max: 13.48% x̄: 1.26% x̃: 0.39%
95% mean confidence interval for cycles value: -122.16 -31.74
95% mean confidence interval for cycles %-change: -0.94% -0.57%
Cycles are helped.
total spills in shared programs: 9597 -> 9534 (-0.66%)
spills in affected programs: 403 -> 340 (-15.63%)
helped: 1
HURT: 1
total fills in shared programs: 13904 -> 13790 (-0.82%)
fills in affected programs: 1627 -> 1513 (-7.01%)
helped: 2
HURT: 1
LOST: 0
GAINED: 2
Broadwell
total instructions in shared programs: 14816966 -> 14812590 (-0.03%)
instructions in affected programs: 1499885 -> 1495509 (-0.29%)
helped: 1672
HURT: 15
helped stats (abs) min: 1 max: 455 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.05% max: 16.36% x̄: 0.81% x̃: 0.33%
HURT stats (abs) min: 1 max: 21 x̄: 9.20 x̃: 8
HURT stats (rel) min: 0.08% max: 2.86% x̄: 1.06% x̃: 0.53%
95% mean confidence interval for instructions value: -3.14 -2.05
95% mean confidence interval for instructions %-change: -0.85% -0.73%
Instructions are helped.
total cycles in shared programs: 559353622 -> 559345595 (<.01%)
cycles in affected programs: 139893703 -> 139885676 (<.01%)
helped: 921
HURT: 697
helped stats (abs) min: 1 max: 42424 x̄: 143.45 x̃: 18
helped stats (rel) min: <.01% max: 36.23% x̄: 2.02% x̃: 0.87%
HURT stats (abs) min: 1 max: 2370 x̄: 178.03 x̃: 38
HURT stats (rel) min: <.01% max: 17.35% x̄: 0.71% x̃: 0.14%
95% mean confidence interval for cycles value: -59.64 49.72
95% mean confidence interval for cycles %-change: -1.02% -0.66%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 78902 -> 78861 (-0.05%)
spills in affected programs: 2418 -> 2377 (-1.70%)
helped: 1
HURT: 11
total fills in shared programs: 83782 -> 83678 (-0.12%)
fills in affected programs: 3515 -> 3411 (-2.96%)
helped: 2
HURT: 11
LOST: 0
GAINED: 5
Haswell and Ivy Bridge had similar results. Haswell shown.
total instructions in shared programs: 9033898 -> 9032010 (-0.02%)
instructions in affected programs: 308064 -> 306176 (-0.61%)
helped: 921
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.05 x̃: 1
helped stats (rel) min: 0.17% max: 17.54% x̄: 0.80% x̃: 0.35%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for instructions value: -2.21 -1.87
95% mean confidence interval for instructions %-change: -0.88% -0.68%
Instructions are helped.
total cycles in shared programs: 84628949 -> 84620520 (<.01%)
cycles in affected programs: 2164913 -> 2156484 (-0.39%)
helped: 518
HURT: 359
helped stats (abs) min: 1 max: 440 x̄: 41.52 x̃: 20
helped stats (rel) min: <.01% max: 17.17% x̄: 1.95% x̃: 1.01%
HURT stats (abs) min: 1 max: 586 x̄: 36.43 x̃: 8
HURT stats (rel) min: 0.04% max: 18.65% x̄: 1.47% x̃: 0.40%
95% mean confidence interval for cycles value: -15.17 -4.05
95% mean confidence interval for cycles %-change: -0.77% -0.32%
Cycles are helped.
LOST: 0
GAINED: 4
Sandy Bridge
total instructions in shared programs: 10544860 -> 10542933 (-0.02%)
instructions in affected programs: 360019 -> 358092 (-0.54%)
helped: 931
HURT: 4
helped stats (abs) min: 1 max: 20 x̄: 2.07 x̃: 1
helped stats (rel) min: 0.11% max: 15.52% x̄: 0.68% x̃: 0.30%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 3.33% max: 3.33% x̄: 3.33% x̃: 3.33%
95% mean confidence interval for instructions value: -2.23 -1.89
95% mean confidence interval for instructions %-change: -0.76% -0.58%
Instructions are helped.
total cycles in shared programs: 146106820 -> 146098397 (<.01%)
cycles in affected programs: 3435047 -> 3426624 (-0.25%)
helped: 572
HURT: 329
helped stats (abs) min: 1 max: 1289 x̄: 32.52 x̃: 15
helped stats (rel) min: <.01% max: 26.29% x̄: 0.97% x̃: 0.33%
HURT stats (abs) min: 1 max: 1714 x̄: 30.93 x̃: 6
HURT stats (rel) min: 0.02% max: 41.31% x̄: 1.13% x̃: 0.19%
95% mean confidence interval for cycles value: -16.85 -1.85
95% mean confidence interval for cycles %-change: -0.39% -0.01%
Cycles are helped.
LOST: 1
GAINED: 0
Iron Lake
total instructions in shared programs: 7886925 -> 7886711 (<.01%)
instructions in affected programs: 25763 -> 25549 (-0.83%)
helped: 75
HURT: 6
helped stats (abs) min: 1 max: 13 x̄: 3.33 x̃: 1
helped stats (rel) min: 0.35% max: 17.57% x̄: 1.96% x̃: 0.53%
HURT stats (abs) min: 1 max: 16 x̄: 6.00 x̃: 1
HURT stats (rel) min: 2.86% max: 4.79% x̄: 3.49% x̃: 2.86%
95% mean confidence interval for instructions value: -3.69 -1.60
95% mean confidence interval for instructions %-change: -2.54% -0.57%
Instructions are helped.
total cycles in shared programs: 178116888 -> 178115324 (<.01%)
cycles in affected programs: 5858790 -> 5857226 (-0.03%)
helped: 484
HURT: 242
helped stats (abs) min: 2 max: 76 x̄: 5.27 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs) min: 2 max: 76 x̄: 4.07 x̃: 2
HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.19% x̃: 0.03%
95% mean confidence interval for cycles value: -2.76 -1.55
95% mean confidence interval for cycles %-change: -0.12% 0.01%
Inconclusive result (%-change mean confidence interval includes 0).
GM45
total instructions in shared programs: 4857870 -> 4857762 (<.01%)
instructions in affected programs: 13994 -> 13886 (-0.77%)
helped: 39
HURT: 5
helped stats (abs) min: 1 max: 13 x̄: 3.28 x̃: 2
helped stats (rel) min: 0.33% max: 17.11% x̄: 1.86% x̃: 0.48%
HURT stats (abs) min: 1 max: 16 x̄: 4.00 x̃: 1
HURT stats (rel) min: 2.86% max: 4.71% x̄: 3.23% x̃: 2.86%
95% mean confidence interval for instructions value: -3.86 -1.05
95% mean confidence interval for instructions %-change: -2.61% 0.04%
Inconclusive result (%-change mean confidence interval includes 0).
total cycles in shared programs: 122180744 -> 122179674 (<.01%)
cycles in affected programs: 3686646 -> 3685576 (-0.03%)
helped: 273
HURT: 141
helped stats (abs) min: 2 max: 76 x̄: 5.81 x̃: 6
helped stats (rel) min: 0.01% max: 10.70% x̄: 0.18% x̃: 0.06%
HURT stats (abs) min: 2 max: 76 x̄: 3.66 x̃: 2
HURT stats (rel) min: 0.01% max: 3.99% x̄: 0.16% x̃: 0.02%
95% mean confidence interval for cycles value: -3.42 -1.75
95% mean confidence interval for cycles %-change: -0.15% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
2018-01-30 15:40:14 -08:00