Commit graph

114780 commits

Author SHA1 Message Date
Danylo Piliaiev
97792279e4 nir/loop_analyze: Treat do{}while(false) loops as 0 iterations
Loops like:

block block_0:
vec1 32 ssa_2 = load_const (0x00000020)
vec1 32 ssa_3 = load_const (0x00000001)
loop {
    vec1 32 ssa_7 = phi block_0: ssa_3, block_4: ssa_9
    vec1 1 ssa_8 = ige ssa_2, ssa_7
    if ssa_8 {
        break
    } else {
    }
    vec1 32 ssa_9 = iadd ssa_7, ssa_1
}

Were treated as having more than 1 iteration and after unrolling
produced wrong results, however such loop will exit during
the first iteration if not unrolled.

So we check if loop will actually loop.

Fixes tests/shaders/glsl-fs-loop-while-false-02.shader_test

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
(cherry picked from commit e71fc7f238)
2019-09-18 09:09:54 -07:00
Dylan Baker
3771534b2f Bump version for rc3 2019-09-11 09:05:32 -07:00
Marek Olšák
481d82b65b radeonsi/gfx10: fix wave occupancy computations
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit d95afd8b9e)
2019-09-10 09:51:57 -07:00
Marek Olšák
732950bf36 radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contexts
This fixes a crash.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
(cherry picked from commit 28adf0d00c)
2019-09-10 09:51:57 -07:00
Lepton Wu
637e02c1b1 virgl: Fix pipe_resource leaks under multi-sample.
Fixes: 900a80f9e4 ("virgl: virgl_transfer should own its virgl_resource")

Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
(cherry picked from commit 263136fb5d)
2019-09-10 09:51:57 -07:00
Timur Kristóf
e4bb2ceb78 st/nine: Properly initialize GLSL types for NIR shaders.
NIR shaders use GLSL types (note: these live outside libglsl), and
nine needs to properly initialize these just like the other state
trackers. This fixes an assertion failure when TTN is used.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
2019-09-09 18:01:13 +00:00
Bas Nieuwenhuizen
d858388bfc Revert "ac/nir: Lower large indirect variables to scratch"
This reverts commit 74470baebb.

This change introduces some significant performance regressions. We
are fixing those on master, but the follow up work is large enough
not to backport to 19.2 .

Fixes: 74470baebb "ac/nir: Lower large indirect variables to scratch"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111576
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-09 17:32:38 +00:00
Jason Ekstrand
7635b74acf nir/dead_cf: Repair SSA if the pass makes progress
The dead_cf pass calls into the CF manipulation helpers which attempt to
keep NIR's SSA form sane.  However, when the only break is removed from
a loop, dominance gets messed up anyway because the CF SSA clean-up code
only looks at phis and doesn't consider the case of code becoming
unreachable.  One solution to this would be to put the loop into LCSSA
form before we modify any of its contents.  Another (and the approach
taken by this pass) is to just run the repair_ssa pass afterwards
because the CF manipulation helpers are smart enough to keep all the
use/def stuff sane; they just don't always preserve dominance
properties.

While we're here, we clean up some bogus indentation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit c832820ce9)
2019-09-09 09:14:48 -07:00
Jason Ekstrand
2012c4a75c nir/repair_ssa: Insert deref casts when needed
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 1005272a2b)
2019-09-09 09:14:43 -07:00
Jason Ekstrand
3f56fa8fe9 nir/repair_ssa: Repair dominance for unreachable blocks
NIR currently assumes that unreachable blocks are trivially dominated by
everything.  However, when considering well-formed SSA, there is no path
from any block to an unreachable block.  Therefore, we can break any
use-def chains where the use is in an unreachable block.  This removes
any dependencies on code created by uses in unreachable blocks and lets
DCE do a better job of cleaning it up.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit a3268599f3)
2019-09-09 09:14:38 -07:00
Jason Ekstrand
2c6e34ac93 nir: Add a block_is_unreachable helper
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit f81a2623d8)
2019-09-09 09:14:34 -07:00
Jason Ekstrand
23bc3a401d nir: Don't infinitely recurse in lower_ssa_defs_to_regs_block
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 517142252f)
2019-09-09 09:13:30 -07:00
Jason Ekstrand
8889cc1241 nir: Handle complex derefs in nir_split_array_vars
We already bail and don't split the vars but we were passing a NULL to
_mesa_hash_table_search which is not allowed.

Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..."
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit 37cdb7fc44)
2019-09-09 09:13:23 -07:00
Eric Engestrom
89776bb48c drirc: override minImageCount=2 for gfxbench
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110765
Fixes: 4689e98fe8 ("vulkan/wsi: Set X11 minImageCount to 3.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 27339fe9a7)
2019-09-09 09:13:19 -07:00
Eric Engestrom
55a10da818 radv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 5eb7d48b58)
2019-09-09 09:13:14 -07:00
Eric Engestrom
c9b1f07e55 amd: move adaptive sync to performance section, as it is defined in xmlpool
Fixes: 3844ed8d44 ("radv: Add adaptive_sync driconfig option and enable it by default.")
Fixes: e260493f2a ("radeonsi: Enable adaptive_sync by default for radeon")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4ad99ee961)
2019-09-09 09:13:09 -07:00
Eric Engestrom
9d22b0dc90 anv: add support for vk_x11_override_min_image_count
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 037b5b567f)
2019-09-09 09:13:04 -07:00
Eric Engestrom
cb548d566d wsi: add minImageCount override
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit a72cdd00ab)
2019-09-09 09:12:46 -07:00
Eric Engestrom
172cff6357 anv: add support for driconf
No option is supported yet, this is just the boilerplate.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 4dcb1fff19)
2019-09-09 09:12:41 -07:00
Jason Ekstrand
c1114994f3 anv: Bump maxComputeWorkgroupSize
Fixes: 9a129510f5 "anv: Bump maxComputeWorkgroupInvocations"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111552
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 3b1a7e5333)
2019-09-09 09:12:36 -07:00
Caio Marcelo de Oliveira Filho
1a99cfef28 nir/lower_explicit_io: Handle 1 bit loads and stores
Load a 32-bit value then convert to 1-bit.  Convert 1-bit to 32-bit
value, then Store it.

These cases started to appear when we changed Anvil to use derefs for
shared memory.

v2: Use `bit_size` in a couple of places we were missing.  (Jason)
    Reassign `value` instead of `src[0]`.  (Jason)

Fixes: 024a46a407 ("anv: use derefs for shared memory access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
(cherry picked from commit c0c55bd84f)
2019-09-06 08:33:07 -07:00
Jonathan Marek
f42a4300aa freedreno/a2xx: ir2: fix lowering of instructions after float lowering
Some instructions generated by int/bool float lowering need to be lowered
by opt_algebraic.

Fixes: 43dbd7d6

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3516a90ab4)
2019-09-06 08:30:40 -07:00
Sergii Romantsov
0e6ccd180c intel/dri: finish proper glthread
KWin was able to get NULL-context in the call
intelUnbindContext. But a call _mesa_glthread_finish
is not resistent to such case.
Case can be catched with steps:
	1. Create both glx and egl contexts
	2. Make glx as current
	3. Make egl as current
	4. Reset glx context
	5. Make egl as current

Solution adds proper finishing of glthread-context
(context will be taken from the requested dri-context
for unbinding, but not from the saved current context).

Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271
Fixes: dca36d5516 (i965: Implement threaded GL support)
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
(cherry picked from commit 1dce75c183)
2019-09-06 08:30:35 -07:00
Dave Stevenson
5e72987777 broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride.
Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with
arbitrary offset/stride." for v3d.

Allows YUV buffers with a single buffer and plane offsets to be
passed in.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 873b092e91)
2019-09-05 15:16:51 -07:00
Connor Abbott
95145376d1 radv: Call nir_propagate_invariant()
Without this, invariant qualifiers don't do anything. Together with a
fix to the game, this fixes flickering in No Man's Sky.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
(cherry picked from commit 3f5b541fc8)
2019-09-05 08:44:57 -07:00
Samuel Pitoiset
3ef013f0e9 nir: do not assume that the result of fexp2(a) is always an integral
It's only correct when 'a' is an integral greater or equal to 0.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
Fixes: 5544b2cbbd ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 966a455bb9)
(conflicts resolved by Dylan Baker)
2019-09-04 16:25:49 -07:00
Dylan Baker
b84cbdfd4d nir: Add is_not_negative helper function
This was taken from 636da12433, and is
needed by the next patch.
2019-09-04 16:25:49 -07:00
Hal Gentz
f180b04d65 glx: Fix SEGV due to dereferencing a NULL ptr from XCB-GLX.
When run in optirun, applications that linked to `libGLX.so` and then
proceeded to querying Mesa for extension strings caused a SEGV in Mesa.

`glXQueryExtensionsString` was calling a chain of functions that
eventually led to `__glXQueryServerString`. This function would call
`xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
The latter for some unknown reason returned `NULL`. Passing this `NULL`
to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
function tried to dereference it.

The reason behind the function returning `NULL` is yet to be determined,
however, simply checking that the ptr is not `NULL` resolves this. A
similar check has been added to `__glXGetString` for completeness sake,
although not immediately necessary.

In addition to that, we stumbled into a similar problem in
`AllocAndFetchScreenConfigs` which tries to access the configs to free
them if `__glXQueryServerString` fails. This, of course, SEGVs, because the
configs are yet to have been allocated. Simply continuing past the configs
if their config ptrs are `NULL` resolves this. We also switch to `calloc`
to make sure that the config ptrs are `NULL` by default, and not some
uninitialized value.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 24b8a8cfe8 "glx: implement __glXGetString, hide __glXGetStringFromServer"
Fixes: cb3610e37c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
Reviewed-by: Adam Jackson <ajax@redhat.com>
Signed-off-by: Hal Gentz <zegentzy@protonmail.com>
(cherry picked from commit 1591d1fee5)
2019-09-04 16:15:49 -07:00
Kenneth Graunke
45b22fb873 iris: Report correct number of planes for planar images
We were only handling the modifiers case and not counting the number of
planes in actual planar images.

Fixes Piglit's ext_image_dma_buf_import-export.

Fixes: fc12fd05f5 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
(cherry picked from commit f8887909c6)
2019-09-04 16:15:41 -07:00
Eric Engestrom
ffa1f33ec0 nir: fix memleak in error path
Fixes: 2cf59861a8 ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
(cherry picked from commit 7659c6197f)
2019-09-04 16:15:36 -07:00
Eric Engestrom
0bbf6eab52 freedreno/drm-shim: fix mem leak
Fixes: 494ecef6b4 ("freedreno: Add support for drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c4969b0a25)
2019-09-04 16:15:29 -07:00
Eric Engestrom
ef0ccde381 anv: fix format string in error message
Fixes: 9775894f10 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
(cherry picked from commit 7abf65aedc)
2019-09-04 16:15:23 -07:00
Eric Engestrom
ba255cdd50 util/os_file: fix double-close()
Fixes: 955c63d364 ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 1667360f7d)
2019-09-04 16:14:52 -07:00
Eric Engestrom
1a4e7d7293 egl: fix deadlock in malloc error path
Fixes: cb0980e69a ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 43d470404c)
2019-09-04 16:14:41 -07:00
Eric Engestrom
1cfc906898 ttn: fix 64-bit shift on 32-bit 1
Fixes: 4d0b2c7aaa ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
(cherry picked from commit 3afe9d798a)
2019-09-04 16:14:31 -07:00
Lionel Landwerlin
0a83226dc6 vulkan/overlay: bounce image back to present layout
Once we write the overlay to an image to be presented, we must not
forget to put it back into present layout.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111401
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
(cherry picked from commit 320b0f66c2)
2019-09-04 16:14:26 -07:00
Lionel Landwerlin
26cd407481 egl: fix platform selection
Add missing "device" platform

v2: Add the missing platform (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jean Hertel <jean.hertel@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
Fixes: d6edccee8d ("egl: add EGL_platform_device support")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
(cherry picked from commit 6775a52400)
2019-09-04 16:14:01 -07:00
Erik Faye-Lund
451ddeb429 gallium/auxiliary/indices: consistently apply start only to input
The majority of these only apply the start argument to the input, but a
few of them also does for the output-array. util_primconvert, the only
user of this argument expects this pass a non-zero start-argument does
not expect this to be applied to the output; if it is, it will write
outside of allocated memory, leading to VRAM corruption.

The reason this doesn't seem to have been noticed before, is that no
driver currently use util_primconvert to convert a primitive-type to
itself, which is the cases where this was broken. But for Zink, this
will no longer be true, because we need to eliminate the use of 8-bit
index-buffers.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 28f3f8d413 ("gallium/auxiliary/indices: add start param")
Reviewed-by: Rob Clark <robdclark@chromium.org>
(cherry picked from commit 52af1427c6)
2019-09-04 16:13:57 -07:00
Vinson Lee
6ac1d9b46e travis: Fail build if any command in if statement fails.
Travis is checking the exit code of the entire if statement.

Fixes: 64ffc289be ("travis: add MacOS Scons build")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
(cherry picked from commit 029b07b2ad)
2019-09-04 16:13:52 -07:00
Vinson Lee
529db83e7b swr: Fix build with llvm-9.0 again.
Commit 6f7306c029 ("swr/rast: Refactor memory API between rasterizer
core and swr") unintentionally removed changes for llvm-9.0.

Fixes: 6f7306c029 ("swr/rast: Refactor memory API between rasterizer core and swr")
Fixes: 5dd9ad1570 ("swr/rasterizer: Better implementation of scatter")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
(cherry picked from commit 3664a6600e)
2019-09-04 16:13:45 -07:00
Dylan Baker
7f9b49218f bump version to 19.2-rc2 2019-09-04 14:22:26 -07:00
Pierre-Eric Pelloux-Prayer
efa4aee99d glsl: replace 'x + (-x)' with constant 0
This fixes a hang in shadertoy for radeonsi where a buffer was initialized with:

   value -= value

with value being undefined.
In this case LLVM replace the operation with an assignment to NaN.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111241
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 47cc660d9c)
2019-09-04 11:57:07 -07:00
Thong Thai
ec74c76a1a Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu"
This reverts commit 5a2e65be89.

Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL
still needs to be emitted by the UMD, or else the driver will hang

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
(cherry picked from commit 2a3a560407)
2019-09-04 11:57:03 -07:00
Ian Romanick
6934bc4f08 nir/range-analysis: Handle constants in nir_op_mov just like nir_op_bcsel
I discovered this while looking at a shader that was hurt by some other
work I'm doing.  When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated.  I had to use NIR_PRINT=true to see exactly where things
when wrong.

A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16280659 -> 16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs)   min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel)   min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.

total cycles in shared programs: 367168270 -> 367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs)   min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel)   min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.

(cherry picked from commit 7dba7df5e5)
2019-09-04 11:56:59 -07:00
Ian Romanick
6aa7a10370 nir/range-analysis: Fix incorrect fadd range result for (ne_zero, ne_zero)
Found by inspection.  I tried really, really hard to make a test case
that would trigger this problem, but I was unsuccesful.  It's very hard
to get an instruction to produce a ne_zero result without ne_zero
sources.  The most plausible way is using bcsel.  That proves
problematic because bcsel interprets its sources as integers, so it
cannot currently be used to "clean" values for floating point
instructions.

No shader-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
(cherry picked from commit 0b4782fccd)
2019-09-04 11:56:54 -07:00
Ian Romanick
97e44d6817 nir/range-analysis: Adjust result range of multiplication to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-fma-compare-zero.shader_test
    - fs-underflow-mul-compare-zero.shader_test

v2: Add back part of comment accidentally deleted.  Noticed by
Caio. Remove is_not_zero function as it is no longer used.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: fa116ce357 ("nir/range-analysis: Range tracking for ffma and flrp")
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

All Gen7+ platforms** had similar results. (Ice Lake shown)
total instructions in shared programs: 16278465 -> 16279492 (<.01%)
instructions in affected programs: 16765 -> 17792 (6.13%)
helped: 0
HURT: 23
HURT stats (abs)   min: 7 max: 275 x̄: 44.65 x̃: 8
HURT stats (rel)   min: 1.15% max: 17.51% x̄: 4.23% x̃: 1.62%
95% mean confidence interval for instructions value: 9.57 79.74
95% mean confidence interval for instructions %-change: 1.85% 6.61%
Instructions are HURT.

total cycles in shared programs: 367135159 -> 367154270 (<.01%)
cycles in affected programs: 279306 -> 298417 (6.84%)
helped: 0
HURT: 23
HURT stats (abs)   min: 13 max: 6029 x̄: 830.91 x̃: 54
HURT stats (rel)   min: 0.17% max: 45.67% x̄: 7.33% x̃: 0.49%
95% mean confidence interval for cycles value: 100.89 1560.94
95% mean confidence interval for cycles %-change: 0.94% 13.71%
Cycles are HURT.

total spills in shared programs: 8870 -> 8869 (-0.01%)
spills in affected programs: 19 -> 18 (-5.26%)
helped: 1
HURT: 0

total fills in shared programs: 21904 -> 21901 (-0.01%)
fills in affected programs: 81 -> 78 (-3.70%)
helped: 1
HURT: 0

LOST:   0
GAINED: 1

** On Broadwell, a shader was hurt for spills / fills instead of
   helped.

No changes on any earlier platforms.

(cherry picked from commit ef2e235252)
2019-09-04 11:56:43 -07:00
Ian Romanick
a8525d7751 nir/range-analysis: Adjust result range of exp2 to account for flush-to-zero
Fixes piglit tests (new in piglit!110):

    - fs-underflow-exp2-compare-zero.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: 405de7ccb6 ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

Most of the shaders affected are, unsurprisingly, in Unigine Heaven.

All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16278207 -> 16278465 (<.01%)
instructions in affected programs: 11374 -> 11632 (2.27%)
helped: 0
HURT: 58
HURT stats (abs)   min: 2 max: 13 x̄: 4.45 x̃: 4
HURT stats (rel)   min: 0.54% max: 4.11% x̄: 2.42% x̃: 2.82%
95% mean confidence interval for instructions value: 3.77 5.13
95% mean confidence interval for instructions %-change: 2.19% 2.64%
Instructions are HURT.

total cycles in shared programs: 367134284 -> 367135159 (<.01%)
cycles in affected programs: 81207 -> 82082 (1.08%)
helped: 17
HURT: 36
helped stats (abs) min: 6 max: 356 x̄: 90.35 x̃: 6
helped stats (rel) min: 0.69% max: 21.45% x̄: 5.71% x̃: 0.78%
HURT stats (abs)   min: 4 max: 235 x̄: 66.97 x̃: 16
HURT stats (rel)   min: 0.35% max: 27.58% x̄: 5.34% x̃: 1.09%
95% mean confidence interval for cycles value: -20.36 53.38
95% mean confidence interval for cycles %-change: -1.08% 4.67%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier platforms.

(cherry picked from commit 33ad2bab4b)
2019-09-04 11:56:39 -07:00
Ian Romanick
2971f079e1 nir/algebraic: Mark some value range analysis-based optimizations imprecise
This didn't fix bug #111308, but it was found will trying to find the
actual cause of that bug.

Fixes piglit tests (new in piglit!110):

    - fs-fract-of-NaN.shader_test
    - fs-lt-nan-tautology.shader_test
    - fs-ge-nan-tautology.shader_test

No shader-db changes on any Intel platform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111308
Fixes: b77070e293 ("nir/algebraic: Use value range analysis to eliminate tautological compares")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
(cherry picked from commit ccb236d1bc)
2019-09-04 11:56:34 -07:00
Kenneth Graunke
da03ddf677 iris: Fix partial fast clear checks to account for miplevel.
We enabled fast clears at level > 0, but didn't minify the dimensions
when comparing the box size, so we always thought it was a partial
clear and as a result never actually enabled any.

This eliminates some slow clears in Civilization VI, but they are mostly
during initialization and not the main rendering.

Thanks to Dan Walsh for noticing we had too many slow clears.

Fixes: 393f659ed8 ("iris: Enable fast clears on other miplevels and layers than 0.")
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
(cherry picked from commit 30b9ed92ea)
2019-09-04 11:56:30 -07:00
Ian Romanick
753ea83477 intel/compiler: Request bitfield_reverse lowering on pre-Gen7 hardware
See the previous commit for the explanation of the Fixes tag.

Hurts 21 shaders in shader-db.  All of the hurt shaders are in Unreal
Engine 4 tech demos.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: 7afa26d4e3 ("nir: Add lowering for nir_op_bitfield_reverse.")
(cherry picked from commit b418269d7d)
2019-09-04 11:56:12 -07:00