Commit graph

2524 commits

Author SHA1 Message Date
Jason Ekstrand
8babaa84e8 intel/eu: Add support for the SENDS[C] messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d6a6e10390 intel/inst: Indent some code
We're about to add some more if cases so let's have the giant re-indent
in it's own patch to make review easier.

Acked-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d96969120d intel/inst: Fix the ia16_addr_imm helpers
These have clearly never seen any use.... On gen8, the bottom 4 bits are
missing so we need to shift them off before we call set_bits and shift
again when we get the bits.  Found by inspection.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
e46fb33143 intel/disasm: Rework SEND decoding to use descriptors
Instead of fetching the information out of the instruction directly,
fetch the descriptor and then pluck the information out of the
descriptor.  The current scheme works ok for SEND but with SENDS, it all
falls to pieces because the descriptor is completely shuffled around.

This commit doesn't actually convert everything.  One notable exception
is URB messages which don't even use descriptors in emit_urb_WRITE yet.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
13a6fabc62 intel/eu: Add more message descriptor helpers
We want to be able to extract data from descriptors as well as unify a
bit of the descriptor construction.

One of the unifications we do is to unify the read/write and dataport
descriptors.  On gen4-5, read/write are substantially different and the
read descriptors change between gen4 and gen4.x.  On gen6, they unified
layouts between read, write, and dataport.  Then, on gen8, they added
one bit to the message type field but left it reserved MBZ for
read/write messages.  This commit chooses to treat that as if they
expanded the field everywhere and just didn't have enough enum values
for read/write to bother with the extra bit.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
c3aa436bfe intel/eu/validate: SEND restrictions also apply to SENDC
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
fee6bd8d8e intel/eu: Use GET_BITS in brw_inst_set_send_ex_desc
It's a bit more readable

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
b284d222db intel/fs: Use SHADER_OPCODE_SEND for varying UBO pulls on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
8514eba693 intel/fs: Use SHADER_OPCODE_SEND for texturing on gen7+
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
f547cebbe0 intel/fs: Use a logical opcode for IMAGE_SIZE
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
d2d3e04501 intel/fs: Use SHADER_OPCODE_SEND for surface messages
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
7f1cf046cd intel/fs: Add a generic SEND opcode
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
ba3c5300f9 intel/eu: Rework surface descriptor helpers
This commit pulls the surface descriptor helpers out into brw_eu.h and
makes them no longer depend on the codegen infrastructure.  This should
allow us to use them directly from the IR code instead of the generator.
This change is unfortunately less mechanical than perhaps one would like
but it should be fairly straightforward.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
5b17379631 intel/eu: Add has_simd4x2 bools to surface_write functions
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
2ce93b88c0 intel/fs: Take an explicit exec size in brw_surface_payload_size()
Instead of magically falling back to SIMD8 for atomics and typed
messages on Ivy Bridge, explicitly figure out the exec size and pass
that into brw_surface_payload_size.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
cf42b0f9e2 intel/fs: Handle IMAGE_SIZE in size_read() and is_send_from_grf()
Like all the other sends, it's just mlen * REG_SIZE.

Fixes: 3cbc02e469 "intel: Use TXS for image_size when we have..."
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
009c0bd840 intel/defines: Explicitly cast to uint32_t in SET_FIELD and SET_BITS
If you pass a bool in as the value to set, the C standard says that it
gets converted to an int prior to shifting.  If you try to set a bool to
bit 31, this lands you in undefined behavior.  It's better just to add
the explicit cast and let the compiler delete it for us.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Jason Ekstrand
077b9557a4 intel/fs: Get rid of fs_inst::equals
There are piles of fields that it doesn't check so using it is a lie.
The only reason why it's not causing problem is because it has exactly
one user which only uses it for MOV instructions (which aren't very
interesting) and only on Sandy Bridge and earlier hardware.  Just get
rid of it and inline it in the one place that it's actually used.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-29 18:43:55 +00:00
Matt Turner
18b467c066 intel/compiler: Add a file-level description of brw_eu_validate.c
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-26 10:33:22 -08:00
Matt Turner
e166003cb7 intel/compiler: Reset default flag register in brw_find_live_channel()
emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its
flag_subreg set, so that the IR knows which flag is accessed. However
the flag is only used on Gen7 in Align1 mode.

To avoid setting unnecessary bits in the instruction words, get the
information we need and reset the default flag register. This allows
round-tripping through the assembler/disassembler.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-23 22:48:29 -08:00
Karol Herbst
b9fec2b38c nir: replace more nir_load_system_value calls with builder functions
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 00:16:51 +01:00
Karol Herbst
9b24028426 nir: rename nir_var_function to nir_var_function_temp
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-19 20:01:41 +01:00
Francisco Jerez
c84ec70b3a intel/fs: Promote execution type to 32-bit when any half-float conversion is needed.
The docs are fairly incomplete and inconsistent about it, but this
seems to be the reason why half-float destinations are required to be
DWORD-aligned on BDW+ projects.  This way the regioning lowering pass
will make sure that the destination components of W to HF and HF to W
conversions are aligned like the corresponding conversion operation
with 32-bit execution data type.

Tested-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-18 16:09:39 -08:00
Jason Ekstrand
eb32dad07c intel/fs: Don't touch accumulator destination while applying regioning alignment rule
In some shaders, you can end up with a stride in the source of a
SHADER_OPCODE_MULH.  One way this can happen is if the MULH is acting on
the top bits of a 64-bit value due to 64-bit integer lowering.  In this
case, the compiler will produce something like this:

mul(8)    acc0<1>UD   g5<8,4,2>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

The new region fixup pass looks at the MUL and sees a strided source and
unstrided destination and determines that the sequence is illegal.  It
then attempts to fix the illegal stride by replacing the destination of
the MUL with a temporary and emitting a MOV into the accumulator:

mul(8)    g9<2>UD     g5<8,4,2>UD   0x0004UW      { align1 1Q };
mov(8)    acc0<1>UD   g9<8,4,2>UD                 { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Unfortunately, this new sequence isn't correct because MOV accesses the
accumulator with a different precision to MUL and, instead of filling
the bottom 32 bits with the source and zeroing the top 32 bits, it
leaves the top 32 (or maybe 31) bits alone and full of garbage.  When
the MACH comes along and tries to complete the multiplication, the
result is correct in the bottom 32 bits (which we throw away) and
garbage in the top 32 bits which are actually returned by MACH.

This commit does two things:  First, it adds an assert to ensure that we
don't try to rewrite accumulator destinations of MUL instructions so we
can avoid this precision issue.  Second, it modifies
required_dst_byte_stride to require a tightly packed stride so that we
fix up the sources instead and the actual code which gets emitted is
this:

mov(8)    g9<1>UD     g5<8,4,2>UD                 { align1 1Q };
mul(8)    acc0<1>UD   g9<8,8,1>UD   0x0004UW      { align1 1Q };
mach(8)   g6<1>UD     g5<8,4,2>UD   0x00000004UD  { align1 1Q AccWrEnable };

Fixes: efa4e4bc5f "intel/fs: Introduce regioning lowering pass"
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2019-01-18 10:18:52 -06:00
Jason Ekstrand
0a7ac6d543 intel/eu: Stop overriding exec sizes in send_indirect_message
For a long time, we based exec sizes on destination register widths.
We've not been doing that since 1ca3a94427 but a few remnants
accidentally remained.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2019-01-18 10:18:52 -06:00
Kenneth Graunke
04c2f12ab2 i965: Drop mark_surface_used mechanism.
The original idea was that the backend compiler could eliminate
surfaces, so we would have it mark which ones are actually used,
then shrink the binding table accordingly.  Unfortunately, it's a
pretty blunt mechanism - it can only prune things from the end,
not the middle - since we decide the layout before we even start
the backend compiler, and only limit the size.  It also basically
gives up if it sees indirect array access.

Besides, we do the vast majority of our surface elimination in NIR
anyway, not the backend - and I don't see that trend changing any
time soon.  Vulkan abandoned this plan a long time ago, and I don't
use it in Iris, but it's still been kicking around in i965.

I hacked shader-db to print the binding table size in bytes, and
observed no changes with this patch.  So, this code appears to do
nothing useful.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-13 09:35:32 -08:00
Jason Ekstrand
24c8108ea6 intel/nir: Call nir_opt_deref in brw_nir_optimize
It's an optimization so we should probably be calling it in the
optimization loop.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-01-12 17:55:49 -06:00
Jason Ekstrand
1ede463b6e intel/peephole_ffma: Fix swizzle propagation
The num_components value passed into get_mul_for_src is used to only
compose the parts of the swizzle that we know will be used so we don't
compose invalid swizzle components.  However, we had a bug where we
passed the number of components of the add all the way through.  For the
given source, we need the number of components read from that source.
In the case where we have a narrow add, say 2 components, that is
sourced from a chain of wider instructions, we may not compose all the
swizzles.  All we really need to do is pass through the right number of
components at each level.

Fixes: 2231cf0ba3 "nir: Fix output swizzle in get_mul_for_src"
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-01-11 10:44:08 -06:00
Matt Turner
613ac3aaa2 i965: Compile fp64 software routines and lower double-ops
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
18b4e87370 intel/compiler: Heap-allocate temporary storage
Shaders containing software implementations of double-precision
operations can be very large such that we cannot stack-allocate
an array of grf_count*16.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
622d429128 intel/compiler: Expand size of the 'nr' field
Shaders containing software implementations of double-precision
operations can be very large such that we have more the 2^16 virtual
registers during optimization.

Move the 'nr' field to the union containing the immediate storage and
expand it to 32-bits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
7e4e9da90d intel/compiler: Prevent warnings in the following patch
The next patch replaces an unsigned bitfield with a plain unsigned,
which triggers gcc to begin warning on signed/unsigned comparisons.

Keeping this patch separate from the actual move allows bisectablity and
generates no additional warnings temporarily.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
2b801b6668 intel/compiler: Rearrange code to avoid future problems
A follow on commit will move nr to the same union as the immediate
data, so we should assert these invariants before we overwrite the nr
field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
3b967e1724 intel/compiler: Avoid false positive assertions
A follow on patch will move the 'nr' field to the union containing the
immediate field, so prepare by checking that we're only testing these
assertions if the .file is correct.

The assertions with != ARF were kind of silly to begin with because the
<128 check is specifically only for things in the GRF.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:41 -08:00
Matt Turner
8534742404 intel/compiler: Split 64-bit MOV-indirects if needed
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-01-09 16:42:40 -08:00
Matt Turner
e76772af6c intel/compiler: Lower 64-bit MOV/SEL operations 2019-01-09 16:42:40 -08:00
Francisco Jerez
230a8a541d intel/fs: Remove FS_OPCODE_UNPACK_HALF_2x16_SPLIT opcodes.
These are broken on a future platform, but it turns out we don't need
to fix them, since they're just type-converting moves with strided
source.  Kill them.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
cbea91eb57 intel/fs: Remove nasty open-coded CHV/BXT 64-bit workarounds.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
2c99c7a56c intel/fs: Remove existing lower_conversions pass.
It's redundant with the functionality provided by lower_regioning now.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
efa4e4bc5f intel/fs: Introduce regioning lowering pass.
This legalization pass is meant to handle situations where the source
or destination regioning controls of an instruction are unsupported by
the hardware and need to be lowered away into separate instructions.
This should be more reliable and future-proof than the current
approach of handling CHV/BXT restrictions manually all over the
visitor.  The same mechanism is leveraged to lower unsupported type
conversions easily, which obsoletes the lower_conversions pass.

v2: Give conditional modifiers the same treatment as predicates for
    SEL instructions in lower_dst_modifiers() (Iago).  Special-case a
    couple of other instructions with inconsistent conditional mod
    semantics in lower_dst_modifiers() (Curro).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
b94519971a intel/fs: Constify fs_inst::can_do_source_mods().
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:09 -08:00
Francisco Jerez
c301f447ea intel/fs: Respect CHV/BXT regioning restrictions in copy propagation pass.
Currently the visitor attempts to enforce the regioning restrictions
that apply to double-precision instructions on CHV/BXT at NIR-to-i965
translation time.  It is possible though for the copy propagation pass
to violate this restriction if a strided move is propagated into one
of the affected instructions.  I've only reproduced this issue on a
future platform but it could affect CHV/BXT too under the right
conditions.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
464e79144f intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.
I triggered this bug while prototyping code for a future platform on
IVB.  Could be a problem today though if a strided move is
copy-propagated into a type-converting move with DF destination.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
bc781a0323 intel/fs: Fix bug in lower_simd_width while splitting an instruction which was already split.
This seems to be a problem in combination with the lower_regioning
pass introduced by a future commit, which can modify a SIMD-split
instruction causing its execution size to become illegal again.  A
subsequent call to lower_simd_width() would hit this bug on a future
platform.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
812ede088f intel/fs: Implement quad swizzles on ICL+.
Align16 is no longer a thing, so a new implementation is provided
using Align1 instead.  Not all possible swizzles can be represented as
a single Align1 region, but some fast paths are provided for
frequently used swizzles that can be represented efficiently in Align1
mode.

Fixes ~90 subgroup quad swap Vulkan CTS tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Francisco Jerez
c5f9c0009d intel/fs: Handle source modifiers in lower_integer_multiplication().
lower_integer_multiplication() implements 32x32-bit multiplication on
some platforms by bit-casting one of the 32-bit sources into two
16-bit unsigned integer portions.  This can give incorrect results if
the original instruction specified a source modifier.  Fix it by
emitting an additional MOV instruction implementing the source
modifiers where necessary.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-01-09 12:03:08 -08:00
Karol Herbst
d0c6ef2793 nir: rename global/local to private/function memory
the naming is a bit confusing no matter how you look at it. Within SPIR-V
"global" memory is memory accessible from all threads. glsl "global" memory
normally refers to shader thread private memory declared at global scope. As
we already use "shared" for memory shared across all thrads of a work group
the solution where everybody could be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system
accessible memory (be it VRAM or system RAM if keeping SVM in mind).
glsl "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.

v2: rename local to function as well
v3: rename vtn_variable_mode_local as well

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-01-08 18:51:46 +01:00
Timothy Arceri
50de3f80a8 nir: rename nir_link_constant_varyings() nir_link_opt_varyings()
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-01-02 12:19:17 +11:00
Iago Toral Quiroga
d6110d4d54 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
The former expects to see SSA-only things, but the latter injects registers.

The assertions in the lowering where not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.

Fixes: 11dc130779 "nir: Add a bool to int32 lowering pass"
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-20 08:02:44 +01:00
Ian Romanick
af07141b33 intel/compiler: More peephole_select for pre-Gen6
No shader-db changes on any Gen6+ platform.

All of the shaders with cycles hurt by more than ~2% are from Master of
Orion.  All of the shaders have instructions helped.  It looks like the
pass enables some control flow to be converted to bcsels, then the
scheduler does dumb things.  These are new shaders (just added before
doing this shader-db run), so there's probably some low-hanging fruit.

Iron Lake
total instructions in shared programs: 8214327 -> 8213684 (<.01%)
instructions in affected programs: 84469 -> 83826 (-0.76%)
helped: 114
HURT: 26
helped stats (abs) min: 2 max: 18 x̄: 7.75 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.52% x̃: 1.05%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.70% max: 2.48% x̄: 1.66% x̃: 1.61%
95% mean confidence interval for instructions value: -5.87 -3.32
95% mean confidence interval for instructions %-change: -2.32% -1.17%
Instructions are helped.

total cycles in shared programs: 187736850 -> 187749314 (<.01%)
cycles in affected programs: 506750 -> 519214 (2.46%)
helped: 104
HURT: 36
helped stats (abs) min: 2 max: 72 x̄: 21.96 x̃: 16
helped stats (rel) min: 0.02% max: 6.16% x̄: 0.97% x̃: 0.63%
HURT stats (abs)   min: 4 max: 1402 x̄: 409.67 x̃: 40
HURT stats (rel)   min: 0.33% max: 23.12% x̄: 5.79% x̃: 1.39%
95% mean confidence interval for cycles value: 28.32 149.74
95% mean confidence interval for cycles %-change: -0.07% 1.61%
Inconclusive result (%-change mean confidence interval includes 0).

GM45
total instructions in shared programs: 5044014 -> 5043652 (<.01%)
instructions in affected programs: 46751 -> 46389 (-0.77%)
helped: 63
HURT: 13
helped stats (abs) min: 2 max: 29 x̄: 7.65 x̃: 9
helped stats (rel) min: 0.17% max: 13.73% x̄: 2.93% x̃: 1.04%
HURT stats (abs)   min: 2 max: 20 x̄: 9.23 x̃: 8
HURT stats (rel)   min: 0.66% max: 2.35% x̄: 1.58% x̃: 1.52%
95% mean confidence interval for instructions value: -6.54 -2.99
95% mean confidence interval for instructions %-change: -3.04% -1.28%
Instructions are helped.

total cycles in shared programs: 128143042 -> 128150188 (<.01%)
cycles in affected programs: 324564 -> 331710 (2.20%)
helped: 57
HURT: 19
helped stats (abs) min: 6 max: 74 x̄: 30.70 x̃: 32
helped stats (rel) min: 0.08% max: 4.74% x̄: 1.22% x̃: 0.81%
HURT stats (abs)   min: 10 max: 1400 x̄: 468.21 x̃: 60
HURT stats (rel)   min: 0.56% max: 19.94% x̄: 5.80% x̃: 1.70%
95% mean confidence interval for cycles value: 6.90 181.15
95% mean confidence interval for cycles %-change: -0.52% 1.59%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2018-12-17 13:47:06 -08:00