Commit graph

4830 commits

Author SHA1 Message Date
Jason Ekstrand
d011fbde5c nir: Use a switch statement in nir_handle_add_jump
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
2020-05-19 17:21:23 +00:00
Jason Ekstrand
8c87082c94 nir: Validate jump instructions as an instruction type
This has the downside of putting block successor validation in two
places that are a bit further apart.  However, handling them as a
special case makes the code more confusing than needed.  At least two
different people have not noticed that we don't have jump instruction
validation in the last week or two and added it.  Being able to search
for validate_jump_instr is useful.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5101>
2020-05-19 17:21:23 +00:00
Caio Marcelo de Oliveira Filho
c4544f4716 nir: Consider atomic counter intrinsics when setting writes_memory
In i965 these get lowered after gather info, so let's consider them
too.  Fixes

    piglit.spec.arb_framebuffer_no_attachments.arb_framebuffer_no_attachments-atomic

in Gen9, HSW and IVB.

Fixes: 6a6c36e977 ("intel/fs: Use writes_memory from shader_info")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5093>
2020-05-18 19:06:53 -07:00
Caio Marcelo de Oliveira Filho
d89c28d314 nir: Use deref intrinsics to set writes_memory when gathering info
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4815>
2020-05-18 21:09:17 +00:00
Samuel Pitoiset
844d561c58 spirv: handle OpCopyObject correctly with any types
This implements OpCopyObject as a blind copy and propagates the
access mask properly even if the source object type isn't a SSA
value.

This fixes some recent dEQP-VK.descriptor_indexing.* failures
since CTS changed and now apply nonUniformEXT after constructing
a combined image/sampler.

Original patch is from Jason Ekstrand.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4909>
2020-05-15 19:18:53 +00:00
Jason Ekstrand
ea62c23703 nir: Use 8-bit types for most info fields
This shrinks nir_intrinsics.c.o from 73K to 35K and nir_opcodes.c.o from
64K to 31K on a release build.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5045>
2020-05-15 03:49:18 +00:00
Karol Herbst
4aeaef99c0 Revert "nir/validate: validate the stride for deref_ptr_as_array"
This reverts commit 667e14e7bd
2020-05-14 17:36:43 +00:00
Karol Herbst
667e14e7bd nir/validate: validate the stride for deref_ptr_as_array
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
2020-05-14 15:13:13 +00:00
Karol Herbst
7afc9632a6 nir/deref: copy ptr_stride when rematerializing
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4068>
2020-05-14 15:13:13 +00:00
Rob Clark
42d38ad028 nir: add pass to lower disjoint wrmask's
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:49 -07:00
Rob Clark
a506d49fae nir: add helper to copy const_index[]
It seems less brittle to not assume they are in the same order for src
and dst instructions.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:45 -07:00
Rob Clark
3d3cfea78b nir: fix indices for ir3 ssbo_atomic intrinsics
Caught by the sanity checking in nir_intrinsic_copy_const_indices()
(which is introduced by the next patch).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2020-05-13 20:24:42 -07:00
Kristian H. Kristensen
14969aab11 freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics
These intrinsics are supposed to map to the underlying hardware
instructions, which don't have wrmask. We use them when we lower
store_output in the geometry pipeline and since store_output gets
lowered to temps, we always see full wrmasks there.
2020-05-13 20:24:33 -07:00
Jason Ekstrand
4627bfcd69 nir: Add some docs to the metadata types
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5028>
2020-05-14 01:57:12 +00:00
Eric Anholt
d2a0cde390 nir: Include num_ubos in the printed shader (if nonzero).
I keep wanting this number for debugging shaders.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
2020-05-14 00:10:43 +00:00
Daniel Schürmann
1f7d1541df nir: reset ssa-defs as non-divergent during divergence analysis instead of upfront
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Daniel Schürmann
1b881f3d8e nir: simplify phi handling in divergence analysis
This patch adds some control flow information to the
state to keep track whether a loop contains divergent
continue or break statements to not having to
recalculate this property for every phi.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Daniel Schürmann
450b1d87ba nir: rework phi handling in divergence analysis
This patch splits the visit_phi() function into
three different ones according to the kind of phi
(merge-node, loop-header or loop-exit) and calls
them when visiting the cf_nodes.
This allows to revisit loops if the loop header's
phis have changed, only.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Daniel Schürmann
febef22459 nir: refactor divergence analysis state
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Daniel Schürmann
b9ea0ca6ee nir: add nir_intrinsic_elect to divergence analysis
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Jason Ekstrand
ca2d53f451 nir: Make "divergent" a property of an SSA value
v2: fix usage in ACO (by Daniel Schürmann)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
2020-05-13 18:49:22 +00:00
Brian Ho
a43e974064 turnip: Execute ir3_nir_lower_gs pass again
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.

As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
2020-05-12 13:42:55 -07:00
Eric Anholt
4553fc66a5 nir: Fix count when we didn't lower load_uniforms but did shift load_ubos.
The fixed commit was really nice in mostly fixing num_ubos to reflect the
shader after lowering, but for
dEQP-GLES31.functional.compute.basic.ubo_to_ssbo_single_invocation there
are no default uniforms and so we skipped the increment, even though we
shifted the block index up.

Fixes: 4777ee1a62 ("nir: Always create UBO variable when lowering uniforms to ubo")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>
2020-05-12 17:01:55 +00:00
Ian Romanick
412e29c277 nir/algebraic: Eliminate useless extract before unpack
The shader helped for spills and fills is the big compute shader in Dirt
Showdown.  One of the shaders hurt for spills and fills on Broadwell is
the big compute shader in Bioshock Infinite, but combined with the
previous commit, it's still an impovement.

Tiger Lake
total instructions in shared programs: 21833218 -> 21832449 (<.01%)
instructions in affected programs: 66104 -> 65335 (-1.16%)
helped: 106
HURT: 14
helped stats (abs) min: 1 max: 67 x̄: 7.87 x̃: 5
helped stats (rel) min: 0.19% max: 5.76% x̄: 1.27% x̃: 0.95%
HURT stats (abs)   min: 1 max: 14 x̄: 4.64 x̃: 1
HURT stats (rel)   min: 0.19% max: 4.12% x̄: 1.41% x̃: 0.19%
95% mean confidence interval for instructions value: -8.51 -4.30
95% mean confidence interval for instructions %-change: -1.23% -0.69%
Instructions are helped.

total cycles in shared programs: 506180109 -> 506196314 (<.01%)
cycles in affected programs: 1671429 -> 1687634 (0.97%)
helped: 37
HURT: 84
helped stats (abs) min: 1 max: 490 x̄: 73.27 x̃: 24
helped stats (rel) min: 0.02% max: 7.98% x̄: 1.25% x̃: 0.41%
HURT stats (abs)   min: 1 max: 5000 x̄: 225.19 x̃: 8
HURT stats (rel)   min: 0.03% max: 10.22% x̄: 1.22% x̃: 0.42%
95% mean confidence interval for cycles value: 2.85 265.00
95% mean confidence interval for cycles %-change: 0.04% 0.88%
Cycles are HURT.

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 19961317 -> 19960543 (<.01%)
instructions in affected programs: 30268 -> 29494 (-2.56%)
helped: 39
HURT: 0
helped stats (abs) min: 1 max: 142 x̄: 19.85 x̃: 7
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.33% x̃: 2.31%
95% mean confidence interval for instructions value: -29.46 -10.23
95% mean confidence interval for instructions %-change: -2.95% -1.71%
Instructions are helped.

total cycles in shared programs: 498863755 -> 498865843 (<.01%)
cycles in affected programs: 1831136 -> 1833224 (0.11%)
helped: 57
HURT: 65
helped stats (abs) min: 1 max: 1400 x̄: 128.93 x̃: 25
helped stats (rel) min: 0.05% max: 3.49% x̄: 0.89% x̃: 0.71%
HURT stats (abs)   min: 1 max: 1887 x̄: 145.18 x̃: 15
HURT stats (rel)   min: 0.02% max: 9.88% x̄: 1.83% x̃: 0.73%
95% mean confidence interval for cycles value: -58.30 92.53
95% mean confidence interval for cycles %-change: 0.16% 0.97%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8774 -> 8773 (-0.01%)
spills in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total fills in shared programs: 9496 -> 9494 (-0.02%)
fills in affected programs: 40 -> 38 (-5.00%)
helped: 1
HURT: 0

Broadwell
total instructions in shared programs: 17859373 -> 17858548 (<.01%)
instructions in affected programs: 38452 -> 37627 (-2.15%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 143 x̄: 26.61 x̃: 10
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.57% x̃: 2.69%
95% mean confidence interval for instructions value: -39.79 -13.44
95% mean confidence interval for instructions %-change: -3.25% -1.89%
Instructions are helped.

total cycles in shared programs: 525858109 -> 525869236 (<.01%)
cycles in affected programs: 2058597 -> 2069724 (0.54%)
helped: 44
HURT: 75
helped stats (abs) min: 2 max: 1330 x̄: 187.84 x̃: 23
helped stats (rel) min: 0.04% max: 31.31% x̄: 2.13% x̃: 0.85%
HURT stats (abs)   min: 1 max: 3915 x̄: 258.56 x̃: 47
HURT stats (rel)   min: 0.02% max: 10.53% x̄: 2.81% x̃: 2.21%
95% mean confidence interval for cycles value: -26.06 213.07
95% mean confidence interval for cycles %-change: 0.19% 1.78%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 25744 -> 25730 (-0.05%)
spills in affected programs: 1578 -> 1564 (-0.89%)
helped: 4
HURT: 2

total fills in shared programs: 31710 -> 31689 (-0.07%)
fills in affected programs: 4346 -> 4325 (-0.48%)
helped: 3
HURT: 3

Haswell
total instructions in shared programs: 16228399 -> 16227783 (<.01%)
instructions in affected programs: 22201 -> 21585 (-2.77%)
helped: 27
HURT: 0
helped stats (abs) min: 1 max: 68 x̄: 22.81 x̃: 11
helped stats (rel) min: 0.19% max: 7.87% x̄: 2.92% x̃: 2.86%
95% mean confidence interval for instructions value: -31.96 -13.66
95% mean confidence interval for instructions %-change: -3.68% -2.15%
Instructions are helped.

total cycles in shared programs: 538613967 -> 538701354 (0.02%)
cycles in affected programs: 1653044 -> 1740431 (5.29%)
helped: 36
HURT: 81
helped stats (abs) min: 2 max: 708 x̄: 104.50 x̃: 17
helped stats (rel) min: <.01% max: 15.01% x̄: 1.67% x̃: 0.65%
HURT stats (abs)   min: 1 max: 30100 x̄: 1125.30 x̃: 304
HURT stats (rel)   min: 0.02% max: 16.21% x̄: 8.98% x̃: 11.60%
95% mean confidence interval for cycles value: 23.78 1470.01
95% mean confidence interval for cycles %-change: 4.29% 7.12%
Cycles are HURT.

total spills in shared programs: 23418 -> 23409 (-0.04%)
spills in affected programs: 177 -> 168 (-5.08%)
helped: 2
HURT: 0

total fills in shared programs: 25919 -> 25896 (-0.09%)
fills in affected programs: 568 -> 545 (-4.05%)
helped: 3
HURT: 0

Ivy Bridge
total instructions in shared programs: 15265983 -> 15265759 (<.01%)
instructions in affected programs: 8418 -> 8194 (-2.66%)
helped: 5
HURT: 0
helped stats (abs) min: 18 max: 99 x̄: 44.80 x̃: 26
helped stats (rel) min: 1.74% max: 4.26% x̄: 3.12% x̃: 3.00%
95% mean confidence interval for instructions value: -86.29 -3.31
95% mean confidence interval for instructions %-change: -4.43% -1.81%
Instructions are helped.

total cycles in shared programs: 422930336 -> 422929589 (<.01%)
cycles in affected programs: 59347 -> 58600 (-1.26%)
helped: 3
HURT: 2
helped stats (abs) min: 72 max: 1060 x̄: 433.33 x̃: 168
helped stats (rel) min: 1.14% max: 3.48% x̄: 2.23% x̃: 2.06%
HURT stats (abs)   min: 265 max: 288 x̄: 276.50 x̃: 276
HURT stats (rel)   min: 4.79% max: 5.64% x̄: 5.22% x̃: 5.22%
95% mean confidence interval for cycles value: -829.08 530.28
95% mean confidence interval for cycles %-change: -4.43% 5.93%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4953 -> 4946 (-0.14%)
spills in affected programs: 344 -> 337 (-2.03%)
helped: 2
HURT: 0

total fills in shared programs: 5548 -> 5521 (-0.49%)
fills in affected programs: 838 -> 811 (-3.22%)
helped: 2
HURT: 0

No shader-db changes on any earlier Intel platform.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
Ian Romanick
bc0bbb8f0b nir/algebraic: Add some half packing optimizations for pack_half_2x16_split
Like 1f72857739 ("nir/algebraic: add some half packing optimizations"),
but for the pack_half_2x16_split variant.

The shader helped for spills and fills is the big compute shader in
Bioshock Infinite.

Tiger Lake
total instructions in shared programs: 21834539 -> 21833218 (<.01%)
instructions in affected programs: 60119 -> 58798 (-2.20%)
helped: 105
HURT: 0
helped stats (abs) min: 5 max: 50 x̄: 12.58 x̃: 9
helped stats (rel) min: 0.86% max: 26.46% x̄: 2.58% x̃: 1.70%
95% mean confidence interval for instructions value: -14.35 -10.81
95% mean confidence interval for instructions %-change: -3.20% -1.97%
Instructions are helped.

total cycles in shared programs: 506215169 -> 506180109 (<.01%)
cycles in affected programs: 1445088 -> 1410028 (-2.43%)
helped: 97
HURT: 8
helped stats (abs) min: 1 max: 16882 x̄: 387.76 x̃: 26
helped stats (rel) min: 0.05% max: 18.31% x̄: 1.77% x̃: 1.34%
HURT stats (abs)   min: 21 max: 635 x̄: 319.12 x̃: 212
HURT stats (rel)   min: 0.39% max: 20.08% x̄: 8.96% x̃: 4.46%
95% mean confidence interval for cycles value: -782.96 115.15
95% mean confidence interval for cycles %-change: -1.74% -0.16%
Inconclusive result (value mean confidence interval includes 0).

Ice Lake, Skylake, and Broadwell had similar results. (Ice Lake shown)
total instructions in shared programs: 19962974 -> 19961317 (<.01%)
instructions in affected programs: 63471 -> 61814 (-2.61%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 82 x̄: 15.78 x̃: 11
helped stats (rel) min: 1.11% max: 28.65% x̄: 3.17% x̃: 2.16%
95% mean confidence interval for instructions value: -18.38 -13.18
95% mean confidence interval for instructions %-change: -3.86% -2.48%
Instructions are helped.

total cycles in shared programs: 498908953 -> 498863755 (<.01%)
cycles in affected programs: 1566998 -> 1521800 (-2.88%)
helped: 89
HURT: 15
helped stats (abs) min: 2 max: 17502 x̄: 532.19 x̃: 69
helped stats (rel) min: 0.07% max: 18.54% x̄: 4.71% x̃: 3.12%
HURT stats (abs)   min: 3 max: 661 x̄: 144.47 x̃: 16
HURT stats (rel)   min: 0.14% max: 20.57% x̄: 4.29% x̃: 0.30%
95% mean confidence interval for cycles value: -903.93 34.74
95% mean confidence interval for cycles %-change: -4.50% -2.32%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 8776 -> 8774 (-0.02%)
spills in affected programs: 25 -> 23 (-8.00%)
helped: 1
HURT: 0

total fills in shared programs: 9500 -> 9496 (-0.04%)
fills in affected programs: 46 -> 42 (-8.70%)
helped: 1
HURT: 0

Haswell
total instructions in shared programs: 16229912 -> 16228399 (<.01%)
instructions in affected programs: 61257 -> 59744 (-2.47%)
helped: 105
HURT: 0
helped stats (abs) min: 6 max: 51 x̄: 14.41 x̃: 11
helped stats (rel) min: 0.77% max: 28.65% x̄: 3.08% x̃: 2.15%
95% mean confidence interval for instructions value: -16.14 -12.68
95% mean confidence interval for instructions %-change: -3.77% -2.40%
Instructions are helped.

total cycles in shared programs: 538654481 -> 538613967 (<.01%)
cycles in affected programs: 1448966 -> 1408452 (-2.80%)
helped: 58
HURT: 47
helped stats (abs) min: 9 max: 22604 x̄: 957.00 x̃: 74
helped stats (rel) min: 0.40% max: 18.81% x̄: 6.22% x̃: 3.03%
HURT stats (abs)   min: 5 max: 3720 x̄: 318.98 x̃: 49
HURT stats (rel)   min: 0.20% max: 34.50% x̄: 5.05% x̃: 2.12%
95% mean confidence interval for cycles value: -999.84 228.14
95% mean confidence interval for cycles %-change: -2.86% 0.51%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 15266086 -> 15265983 (<.01%)
instructions in affected programs: 7272 -> 7169 (-1.42%)
helped: 3
HURT: 0
helped stats (abs) min: 21 max: 41 x̄: 34.33 x̃: 41
helped stats (rel) min: 0.66% max: 5.43% x̄: 2.44% x̃: 1.23%

total cycles in shared programs: 422930883 -> 422930336 (<.01%)
cycles in affected programs: 49259 -> 48712 (-1.11%)
helped: 3
HURT: 0
helped stats (abs) min: 106 max: 221 x̄: 182.33 x̃: 220
helped stats (rel) min: 0.71% max: 5.95% x̄: 2.46% x̃: 0.72%

No changes on any earilier Intel platforms.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
Ian Romanick
a2bf41ec65 nir/algebraic: Optimize ushr of pack_half, not ishr
When a = -1.0, pack_half_2x16(vec2(0x0000, 0xBC00)) will produce
0xBC000000.  The ishr will produce 0xFFFFBC00.  The replacement
pack_half_2x16(vec2(0xBC00, 0x0000)) will produce 0x0000BC00.

Fixes: 1f72857739 ("nir/algebraic: add some half packing optimizations")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4515>
2020-05-11 12:07:01 -07:00
Samuel Pitoiset
04718a9cd6 nir: do not vectorize load/store if offset can overflow and robustness enabled
This prevents vectorization for loads/stores that can overflow if
the low offset is negative and the range greater or equal than 0.

The caller can pass the list of variable modes that matter for
robust access.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4881>
2020-05-11 07:25:15 +00:00
Ian Romanick
3b6449d453 nir/algebraic: Optimize some bfe patterns
v2: Use -x instead of 32-x in shift counts.

Tiger Lake
total instructions in shared programs: 17597691 -> 17597405 (<.01%)
instructions in affected programs: 224557 -> 224271 (-0.13%)
helped: 74
HURT: 17
helped stats (abs) min: 1 max: 71 x̄: 14.36 x̃: 7
helped stats (rel) min: 0.08% max: 1.80% x̄: 0.50% x̃: 0.37%
HURT stats (abs)   min: 1 max: 141 x̄: 45.71 x̃: 40
HURT stats (rel)   min: 0.03% max: 3.55% x̄: 1.20% x̃: 1.14%
95% mean confidence interval for instructions value: -10.53 4.24
95% mean confidence interval for instructions %-change: -0.38% 0.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 333595656 -> 333180770 (-0.12%)
cycles in affected programs: 70056467 -> 69641581 (-0.59%)
helped: 91
HURT: 4
helped stats (abs) min: 1 max: 25174 x̄: 4571.40 x̃: 400
helped stats (rel) min: <.01% max: 2.23% x̄: 0.40% x̃: 0.21%
HURT stats (abs)   min: 1 max: 370 x̄: 277.75 x̃: 370
HURT stats (rel)   min: 0.01% max: 0.04% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for cycles value: -5981.55 -2752.89
95% mean confidence interval for cycles %-change: -0.48% -0.29%
Cycles are helped.

Ice Lake, Skylake, Broadwell, and Haswell had similar results. (Ice Lake shown)
total instructions in shared programs: 16117204 -> 16116723 (<.01%)
instructions in affected programs: 207109 -> 206628 (-0.23%)
helped: 100
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 4.81 x̃: 7
helped stats (rel) min: 0.10% max: 1.58% x̄: 0.23% x̃: 0.20%
95% mean confidence interval for instructions value: -5.51 -4.11
95% mean confidence interval for instructions %-change: -0.27% -0.19%
Instructions are helped.

total cycles in shared programs: 330487341 -> 330082421 (-0.12%)
cycles in affected programs: 68037050 -> 67632130 (-0.60%)
helped: 89
HURT: 7
helped stats (abs) min: 2 max: 24610 x̄: 4567.07 x̃: 400
helped stats (rel) min: <.01% max: 1.52% x̄: 0.39% x̃: 0.22%
HURT stats (abs)   min: 1 max: 370 x̄: 221.29 x̃: 170
HURT stats (rel)   min: 0.01% max: 1.66% x̄: 0.58% x̃: 0.04%
95% mean confidence interval for cycles value: -5780.79 -2655.05
95% mean confidence interval for cycles %-change: -0.42% -0.22%
Cycles are helped.

Ivy Bridge
total instructions in shared programs: 11873641 -> 11873137 (<.01%)
instructions in affected programs: 147464 -> 146960 (-0.34%)
helped: 54
HURT: 0
helped stats (abs) min: 9 max: 10 x̄: 9.33 x̃: 9
helped stats (rel) min: 0.29% max: 0.41% x̄: 0.34% x̃: 0.34%
95% mean confidence interval for instructions value: -9.46 -9.20
95% mean confidence interval for instructions %-change: -0.35% -0.33%
Instructions are helped.

total cycles in shared programs: 175769085 -> 175549519 (-0.12%)
cycles in affected programs: 60770592 -> 60551026 (-0.36%)
helped: 54
HURT: 0
helped stats (abs) min: 252 max: 13434 x̄: 4066.04 x̃: 1290
helped stats (rel) min: 0.02% max: 0.74% x̄: 0.34% x̃: 0.26%
95% mean confidence interval for cycles value: -5323.59 -2808.48
95% mean confidence interval for cycles %-change: -0.41% -0.27%
Cycles are helped.

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Ian Romanick
f46eabf84e nir/algebraic: Split ibfe and ubfe with two constant sources
I also tried splitting ubfe instructions with one or zero constants,
and zero shaders in shader-db were affected.

The "lost" shader is a compute shader that was promoted from SIMD8 to
SIMD16, so is also counted as the gained shader.

v2: Further restrict bfe splitting.  bfe with multiple constants is
better on at least some Radeon GPUs.  Use -x instead of 32-x in shift
counts.

v3: Fix the outer shift count for ibfe lowering.  Add c=0 optimizations
to prevent bad lowering.  Both suggested by Rhys.  Add shift by -32
optimizations.

Tiger Lake
total instructions in shared programs: 17608764 -> 17596316 (-0.07%)
instructions in affected programs: 303765 -> 291317 (-4.10%)
helped: 113
HURT: 46
helped stats (abs) min: 1 max: 458 x̄: 120.67 x̃: 21
helped stats (rel) min: 0.09% max: 11.23% x̄: 3.47% x̃: 1.39%
HURT stats (abs)   min: 1 max: 201 x̄: 25.83 x̃: 6
HURT stats (rel)   min: 0.23% max: 5.18% x̄: 1.53% x̃: 1.11%
95% mean confidence interval for instructions value: -101.13 -55.45
95% mean confidence interval for instructions %-change: -2.61% -1.44%
Instructions are helped.

total cycles in shared programs: 338390770 -> 333530868 (-1.44%)
cycles in affected programs: 79438330 -> 74578428 (-6.12%)
helped: 112
HURT: 64
helped stats (abs) min: 2 max: 268955 x̄: 44261.93 x̃: 1452
helped stats (rel) min: <.01% max: 29.51% x̄: 4.72% x̃: 2.23%
HURT stats (abs)   min: 2 max: 17618 x̄: 1522.41 x̃: 84
HURT stats (rel)   min: <.01% max: 7.34% x̄: 1.35% x̃: 0.34%
95% mean confidence interval for cycles value: -37232.47 -17993.69
95% mean confidence interval for cycles %-change: -3.37% -1.65%
Cycles are helped.

total spills in shared programs: 8944 -> 8138 (-9.01%)
spills in affected programs: 3240 -> 2434 (-24.88%)
helped: 67
HURT: 0

total fills in shared programs: 9373 -> 7842 (-16.33%)
fills in affected programs: 4736 -> 3205 (-32.33%)
helped: 67
HURT: 0

LOST:   1
GAINED: 2

Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 16123288 -> 16116876 (-0.04%)
instructions in affected programs: 241155 -> 234743 (-2.66%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 209 x̄: 50.90 x̃: 7
helped stats (rel) min: 0.07% max: 5.94% x̄: 1.76% x̃: 0.65%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.24% x̄: 0.15% x̃: 0.15%
95% mean confidence interval for instructions value: -61.29 -38.89
95% mean confidence interval for instructions %-change: -2.05% -1.42%
Instructions are helped.

total cycles in shared programs: 335419163 -> 330438819 (-1.48%)
cycles in affected programs: 77515502 -> 72535158 (-6.42%)
helped: 139
HURT: 37
helped stats (abs) min: 2 max: 269140 x̄: 36374.19 x̃: 597
helped stats (rel) min: <.01% max: 28.60% x̄: 3.67% x̃: 1.31%
HURT stats (abs)   min: 4 max: 17618 x̄: 2045.08 x̃: 174
HURT stats (rel)   min: 0.02% max: 8.32% x̄: 2.61% x̃: 0.62%
95% mean confidence interval for cycles value: -37799.30 -18795.51
95% mean confidence interval for cycles %-change: -3.13% -1.57%
Cycles are helped.

total spills in shared programs: 8065 -> 7306 (-9.41%)
spills in affected programs: 3153 -> 2394 (-24.07%)
helped: 67
HURT: 0

total fills in shared programs: 8710 -> 7412 (-14.90%)
fills in affected programs: 4466 -> 3168 (-29.06%)
helped: 67
HURT: 0

LOST:   1
GAINED: 1

Broadwell
total instructions in shared programs: 14970538 -> 14965967 (-0.03%)
instructions in affected programs: 227040 -> 222469 (-2.01%)
helped: 126
HURT: 2
helped stats (abs) min: 1 max: 136 x̄: 36.29 x̃: 8
helped stats (rel) min: 0.07% max: 6.02% x̄: 1.47% x̃: 0.89%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.05% max: 0.24% x̄: 0.14% x̃: 0.14%
95% mean confidence interval for instructions value: -43.05 -28.37
95% mean confidence interval for instructions %-change: -1.69% -1.19%
Instructions are helped.

total cycles in shared programs: 336237662 -> 333035960 (-0.95%)
cycles in affected programs: 72066394 -> 68864692 (-4.44%)
helped: 134
HURT: 42
helped stats (abs) min: 4 max: 122640 x̄: 24344.54 x̃: 1833
helped stats (rel) min: <.01% max: 26.93% x̄: 4.02% x̃: 2.38%
HURT stats (abs)   min: 1 max: 17205 x̄: 1439.69 x̃: 92
HURT stats (rel)   min: <.01% max: 7.12% x̄: 1.34% x̃: 0.62%
95% mean confidence interval for cycles value: -23753.58 -12629.40
95% mean confidence interval for cycles %-change: -3.50% -1.98%
Cycles are helped.

total spills in shared programs: 21122 -> 20204 (-4.35%)
spills in affected programs: 3644 -> 2726 (-25.19%)
helped: 67
HURT: 0

total fills in shared programs: 24879 -> 23460 (-5.70%)
fills in affected programs: 4883 -> 3464 (-29.06%)
helped: 67
HURT: 0

Haswell
total instructions in shared programs: 13148269 -> 13145444 (-0.02%)
instructions in affected programs: 137046 -> 134221 (-2.06%)
helped: 97
HURT: 3
helped stats (abs) min: 1 max: 137 x̄: 30.58 x̃: 3
helped stats (rel) min: 0.14% max: 4.38% x̄: 1.38% x̃: 0.44%
HURT stats (abs)   min: 1 max: 70 x̄: 47.00 x̃: 70
HURT stats (rel)   min: 0.05% max: 5.82% x̄: 3.90% x̃: 5.82%
95% mean confidence interval for instructions value: -37.15 -19.35
95% mean confidence interval for instructions %-change: -1.56% -0.89%
Instructions are helped.

total cycles in shared programs: 321221834 -> 318333159 (-0.90%)
cycles in affected programs: 54932349 -> 52043674 (-5.26%)
helped: 95
HURT: 53
helped stats (abs) min: 4 max: 123390 x̄: 30648.39 x̃: 702
helped stats (rel) min: <.01% max: 28.87% x̄: 4.27% x̃: 2.87%
HURT stats (abs)   min: 4 max: 2357 x̄: 432.49 x̃: 113
HURT stats (rel)   min: <.01% max: 3.44% x̄: 1.03% x̃: 0.54%
95% mean confidence interval for cycles value: -26154.16 -12881.99
95% mean confidence interval for cycles %-change: -3.20% -1.55%
Cycles are helped.

total spills in shared programs: 19878 -> 19293 (-2.94%)
spills in affected programs: 3020 -> 2435 (-19.37%)
helped: 41
HURT: 2

total fills in shared programs: 20918 -> 19875 (-4.99%)
fills in affected programs: 3968 -> 2925 (-26.29%)
helped: 41
HURT: 2

LOST:   0
GAINED: 1

Ivy Bridge
total instructions in shared programs: 11875585 -> 11873641 (-0.02%)
instructions in affected programs: 78065 -> 76121 (-2.49%)
helped: 27
HURT: 0
helped stats (abs) min: 8 max: 134 x̄: 72.00 x̃: 72
helped stats (rel) min: 0.36% max: 4.23% x̄: 2.42% x̃: 2.42%
95% mean confidence interval for instructions value: -83.68 -60.32
95% mean confidence interval for instructions %-change: -2.78% -2.07%
Instructions are helped.

total cycles in shared programs: 178232734 -> 175769085 (-1.38%)
cycles in affected programs: 50018707 -> 47555058 (-4.93%)
helped: 27
HURT: 0
helped stats (abs) min: 82035 max: 99953 x̄: 91246.26 x̃: 92278
helped stats (rel) min: 4.40% max: 5.69% x̄: 4.93% x̃: 4.95%
95% mean confidence interval for cycles value: -93674.20 -88818.32
95% mean confidence interval for cycles %-change: -5.09% -4.78%
Cycles are helped.

total spills in shared programs: 4182 -> 3739 (-10.59%)
spills in affected programs: 1089 -> 646 (-40.68%)
helped: 27
HURT: 0

total fills in shared programs: 5216 -> 4345 (-16.70%)
fills in affected programs: 1874 -> 1003 (-46.48%)
helped: 27
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Ian Romanick
0d605a8bbf nir/algebraic: Recognize open-coded byte or word extract from bfe
v2: Move word-extract patterns up near the byte-extract patterns.
Suggested by Rhys.

Tiger Lake
total instructions in shared programs: 21369236 -> 21368712 (<.01%)
instructions in affected programs: 913104 -> 912580 (-0.06%)
helped: 209
HURT: 165
helped stats (abs) min: 1 max: 30 x̄: 5.35 x̃: 3
helped stats (rel) min: 0.03% max: 6.92% x̄: 0.28% x̃: 0.12%
HURT stats (abs)   min: 1 max: 18 x̄: 3.61 x̃: 3
HURT stats (rel)   min: 0.04% max: 0.87% x̄: 0.16% x̃: 0.12%
95% mean confidence interval for instructions value: -2.04 -0.76
95% mean confidence interval for instructions %-change: -0.14% -0.04%
Instructions are helped.

total cycles in shared programs: 490161481 -> 490175959 (<.01%)
cycles in affected programs: 72557244 -> 72571722 (0.02%)
helped: 193
HURT: 189
helped stats (abs) min: 1 max: 14240 x̄: 509.16 x̃: 71
helped stats (rel) min: <.01% max: 13.71% x̄: 0.44% x̃: 0.05%
HURT stats (abs)   min: 2 max: 4210 x̄: 596.53 x̃: 173
HURT stats (rel)   min: <.01% max: 5.59% x̄: 0.54% x̃: 0.14%
95% mean confidence interval for cycles value: -96.33 172.13
95% mean confidence interval for cycles %-change: -0.07% 0.16%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 10780 -> 10782 (0.02%)
spills in affected programs: 18 -> 20 (11.11%)
helped: 0
HURT: 1

total fills in shared programs: 10396 -> 10370 (-0.25%)
fills in affected programs: 2292 -> 2266 (-1.13%)
helped: 27
HURT: 1

Ice Lake
total instructions in shared programs: 19556356 -> 19555446 (<.01%)
instructions in affected programs: 833336 -> 832426 (-0.11%)
helped: 400
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2
helped stats (rel) min: 0.07% max: 4.42% x̄: 0.14% x̃: 0.10%
95% mean confidence interval for instructions value: -2.42 -2.13
95% mean confidence interval for instructions %-change: -0.18% -0.11%
Instructions are helped.

total cycles in shared programs: 488026481 -> 488008714 (<.01%)
cycles in affected programs: 81581708 -> 81563941 (-0.02%)
helped: 193
HURT: 206
helped stats (abs) min: 1 max: 3615 x̄: 576.35 x̃: 131
helped stats (rel) min: <.01% max: 4.50% x̄: 0.49% x̃: 0.22%
HURT stats (abs)   min: 1 max: 2244 x̄: 453.73 x̃: 170
HURT stats (rel)   min: <.01% max: 5.71% x̄: 0.36% x̃: 0.14%
95% mean confidence interval for cycles value: -127.23 38.17
95% mean confidence interval for cycles %-change: -0.12% 0.03%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 9935 -> 9908 (-0.27%)
fills in affected programs: 2208 -> 2181 (-1.22%)
helped: 27
HURT: 0

Skylake
total instructions in shared programs: 17766078 -> 17765186 (<.01%)
instructions in affected programs: 822017 -> 821125 (-0.11%)
helped: 399
HURT: 1
helped stats (abs) min: 1 max: 20 x̄: 2.27 x̃: 2
helped stats (rel) min: 0.07% max: 4.46% x̄: 0.15% x̃: 0.10%
HURT stats (abs)   min: 12 max: 12 x̄: 12.00 x̃: 12
HURT stats (rel)   min: 0.50% max: 0.50% x̄: 0.50% x̃: 0.50%
95% mean confidence interval for instructions value: -2.39 -2.07
95% mean confidence interval for instructions %-change: -0.18% -0.11%
Instructions are helped.

total cycles in shared programs: 470905548 -> 470907497 (<.01%)
cycles in affected programs: 78598491 -> 78600440 (<.01%)
helped: 202
HURT: 192
helped stats (abs) min: 1 max: 3690 x̄: 228.98 x̃: 60
helped stats (rel) min: <.01% max: 4.51% x̄: 0.24% x̃: 0.03%
HURT stats (abs)   min: 1 max: 2260 x̄: 251.05 x̃: 77
HURT stats (rel)   min: <.01% max: 5.31% x̄: 0.24% x̃: 0.06%
95% mean confidence interval for cycles value: -45.01 54.90
95% mean confidence interval for cycles %-change: -0.07% 0.05%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 9941 -> 9943 (0.02%)
spills in affected programs: 26 -> 28 (7.69%)
helped: 0
HURT: 1

total fills in shared programs: 10293 -> 10268 (-0.24%)
fills in affected programs: 2391 -> 2366 (-1.05%)
helped: 27
HURT: 1

Broadwell
total instructions in shared programs: 17463211 -> 17462366 (<.01%)
instructions in affected programs: 861444 -> 860599 (-0.10%)
helped: 399
HURT: 1
helped stats (abs) min: 1 max: 20 x̄: 2.14 x̃: 2
helped stats (rel) min: 0.03% max: 4.46% x̄: 0.14% x̃: 0.09%
HURT stats (abs)   min: 7 max: 7 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 0.33% max: 0.33% x̄: 0.33% x̃: 0.33%
95% mean confidence interval for instructions value: -2.26 -1.97
95% mean confidence interval for instructions %-change: -0.17% -0.10%
Instructions are helped.

total cycles in shared programs: 507048912 -> 506898243 (-0.03%)
cycles in affected programs: 79806433 -> 79655764 (-0.19%)
helped: 248
HURT: 136
helped stats (abs) min: 1 max: 8450 x̄: 1124.18 x̃: 64
helped stats (rel) min: <.01% max: 5.91% x̄: 0.83% x̃: 0.05%
HURT stats (abs)   min: 2 max: 7632 x̄: 942.12 x̃: 103
HURT stats (rel)   min: <.01% max: 5.62% x̄: 0.71% x̃: 0.08%
95% mean confidence interval for cycles value: -647.01 -137.73
95% mean confidence interval for cycles %-change: -0.47% -0.10%
Cycles are helped.

total spills in shared programs: 22996 -> 22998 (<.01%)
spills in affected programs: 31 -> 33 (6.45%)
helped: 0
HURT: 1

total fills in shared programs: 25951 -> 25923 (-0.11%)
fills in affected programs: 2444 -> 2416 (-1.15%)
helped: 29
HURT: 1

Haswell
total instructions in shared programs: 15841325 -> 15840554 (<.01%)
instructions in affected programs: 869679 -> 868908 (-0.09%)
helped: 394
HURT: 6
helped stats (abs) min: 1 max: 20 x̄: 2.15 x̃: 2
helped stats (rel) min: 0.06% max: 4.46% x̄: 0.14% x̃: 0.09%
HURT stats (abs)   min: 7 max: 18 x̄: 12.83 x̃: 13
HURT stats (rel)   min: 0.32% max: 0.82% x̄: 0.59% x̃: 0.61%
95% mean confidence interval for instructions value: -2.16 -1.69
95% mean confidence interval for instructions %-change: -0.16% -0.09%
Instructions are helped.

total cycles in shared programs: 520417167 -> 520279766 (-0.03%)
cycles in affected programs: 80949963 -> 80812562 (-0.17%)
helped: 246
HURT: 139
helped stats (abs) min: 1 max: 8152 x̄: 790.08 x̃: 129
helped stats (rel) min: <.01% max: 11.46% x̄: 0.70% x̃: 0.09%
HURT stats (abs)   min: 1 max: 7085 x̄: 409.78 x̃: 80
HURT stats (rel)   min: <.01% max: 5.25% x̄: 0.31% x̃: 0.06%
95% mean confidence interval for cycles value: -526.34 -187.43
95% mean confidence interval for cycles %-change: -0.49% -0.18%
Cycles are helped.

total spills in shared programs: 21714 -> 21729 (0.07%)
spills in affected programs: 174 -> 189 (8.62%)
helped: 0
HURT: 6

total fills in shared programs: 22136 -> 22132 (-0.02%)
fills in affected programs: 2848 -> 2844 (-0.14%)
helped: 31
HURT: 6

Ivy Bridge
total instructions in shared programs: 15177059 -> 15177003 (<.01%)
instructions in affected programs: 79370 -> 79314 (-0.07%)
helped: 29
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.93 x̃: 2
helped stats (rel) min: 0.06% max: 0.16% x̄: 0.08% x̃: 0.07%
95% mean confidence interval for instructions value: -2.03 -1.83
95% mean confidence interval for instructions %-change: -0.09% -0.07%
Instructions are helped.

total cycles in shared programs: 420424359 -> 420417254 (<.01%)
cycles in affected programs: 29562648 -> 29555543 (-0.02%)
helped: 23
HURT: 6
helped stats (abs) min: 2 max: 2741 x̄: 432.57 x̃: 142
helped stats (rel) min: <.01% max: 0.26% x̄: 0.04% x̃: 0.02%
HURT stats (abs)   min: 4 max: 1184 x̄: 474.00 x̃: 226
HURT stats (rel)   min: <.01% max: 0.11% x̄: 0.05% x̃: 0.05%
95% mean confidence interval for cycles value: -553.48 63.48
95% mean confidence interval for cycles %-change: -0.05% <.01%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 6420 -> 6393 (-0.42%)
fills in affected programs: 1901 -> 1874 (-1.42%)
helped: 27
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4156>
2020-05-07 10:55:50 -07:00
Rhys Perry
abc4a82857 nir: make fsat return 0.0 with NaN instead of passing it through
This is how lower_fsat and ACO implements fsat and is a more useful
definition since it can be exactly created from fmin(fmax(a, 0.0), 1.0).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3716>
2020-05-07 10:39:19 +00:00
Rhys Perry
d8a27c0bb3 compiler/spirv: flag nclamp/nmin/nmax as exact
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3716>
2020-05-07 10:39:19 +00:00
Rhys Perry
a46aa3dc2e nir: add missing group_memory_barrier handling
Totals from 2 (0.00% of 127638) affected shaders:
VGPRs: 164 -> 168 (+2.44%)
CodeSize: 18420 -> 18756 (+1.82%)
Instrs: 3658 -> 3700 (+1.15%)
Cycles: 82912 -> 83080 (+0.20%)
VMEM: 70 -> 69 (-1.43%)
PreVGPRs: 155 -> 168 (+8.39%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
CC: <mesa-stable@lists.freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4889>
2020-05-05 18:34:02 +00:00
Pierre-Eric Pelloux-Prayer
ea289d1502 mesa: extend GLSLZeroInit semantics
This commit introduces a new way to zero-init variables but keep the
old one to not break any existing behavior.

With this change GLSLZeroInit becomes an integer, with the following
possible values:
 - 0: no 0 init
 - 1: current behavior
 - 2: new behavior. Similar to 1, except ir_var_function_out type are
      0 initialized but ir_var_shader_out.

The rationale behind 2 is: zero initializing ir_var_shader_out can
prevent some optimization where out variables are completely eliminated
when not written to.

On the other hand, zero initializing "ir_var_function_out" has no
effect on correct shaders but typically helps shadertoy since the main
function is:

   void mainImage(out vec4 fragColor) { ... }

So with this change we're sure that fragColor will always get a value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
2020-05-05 12:26:02 +02:00
Pierre-Eric Pelloux-Prayer
679421628b glsl: add a is_implicit_initializer flag
Shared globals and glsl_zero_init can cause linker errors if the
variable is only initialized in 1 place.

This commit adds a flag to variables that have been implicitely
initialized to be able in this situation to keep the explicit
initialization value.

Without this change the global-single-initializer-2-shaders piglit
test fails when using glsl_zero_init.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
2020-05-05 12:26:02 +02:00
Pierre-Eric Pelloux-Prayer
fa6b22d36a glsl: rework zero initialization
This commit makes zero_init a bitfield of types of variables to zeroinit.

This will allow some flexibility that will be used in the next commit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
2020-05-05 12:26:02 +02:00
Pierre-Eric Pelloux-Prayer
84f58a0863 glsl: init gl_FragColor if zero_init=true
This fixes shaders doing "gl_FragColor += ..." and doesn't hurt correct
shaders, because the zero init is discarded.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4607>
2020-05-05 12:26:02 +02:00
Louis-Francis Ratté-Boulianne
4777ee1a62 nir: Always create UBO variable when lowering uniforms to ubo
Zink needs to know the sizes of UBOs, and for normal UBOs we get this
from the nir_var_mem_ubo variables. This allows us to treat all of these
the same way.

We're about to need the same information for the in-progress D3D12
driver, so let's do this in a central location instead of in the driver.

This version is also a bit more careful than the Zink version. In
particular, for two reasons:
1. We increase the variable bindings when we adjust the pre-existing
   UBOs.
2. We increase shader->info.num_ubos when we insert a new UBO variable.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4734>
2020-05-05 09:17:51 +00:00
Erik Faye-Lund
8471f7a5fa compiler/glsl: explicitly store NumUniformBlocks
It's not great to use shader_info for this information, because it
might have gone through lowering of uniforms to UBOs, which can change
the number of UBOs. So let's make sure we know the size of the
UniformBlocks array from when the shader was linked instead.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4734>
2020-05-05 09:17:51 +00:00
Danylo Piliaiev
8059f206da glsl: rename has_implicit_uint_to_int_conversion to *_int_to_uint_*
There is no uint to int implicit conversion in glsl, this is just
a typo in the name of this function. The correct one would be:
has_implicit_int_to_uint_conversion.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4884>
2020-05-05 08:18:14 +00:00
Erik Faye-Lund
af55bdd05d vtn/opencl: native sqrt support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
337ff9c088 vtn/opencl: native rsqrt support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
2ab6a58c19 vtn/opencl: native recip support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
a698c2eedb vtn/opencl: native powr support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
594c49be08 vtn/opencl: native divide support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
bce8a86b65 vtn/opencl: native variants of sin/cos
These obviously map directly to nir opcodes.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
f76b379a9a vtn/opencl: add native_tan-support
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Erik Faye-Lund
aab1361d59 compiler/nir: move tan-calculation to helper
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4811>
2020-05-04 11:31:29 +00:00
Caio Marcelo de Oliveira Filho
2663759af0 intel/fs: Add and use a new load_simd_width_intel intrinsic
Intrinsic to get the SIMD width, which not always the same as subgroup
size.  Starting with a small scope (Intel), but we can rename it later
to generalize if this turns out useful for other drivers.

Change brw_nir_lower_cs_intrinsics() to use this intrinsic instead of
a width will be passed as argument.  The pass also used to optimized
load_subgroup_id for the case that the workgroup fitted into a single
thread (it will be constant zero).  This optimization moved together
with lowering of the SIMD.

This is a preparation for letting the drivers call it before the
brw_compile_cs() step.

No shader-db changes in BDW, SKL, ICL and TGL.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4794>
2020-05-01 12:50:37 -07:00
Eric Anholt
8c1c218909 freedreno/ir3: Improve shader key normalization.
We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time.  This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).

It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
2020-05-01 16:26:32 +00:00