Commit graph

5840 commits

Author SHA1 Message Date
Mike Blumenkrantz
652e51e1f3 nir/lower_uniforms_to_ubo: set explicit_binding on uniform_0
this variable is always bound to buffer index 0, so the binding info
here is actually useful

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7935>
2021-01-14 17:29:09 +00:00
Mike Blumenkrantz
491e7decad util/set: add the found param to search_or_add
this brings parity with the internal api

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8450>
2021-01-14 13:51:35 +00:00
Rhys Perry
dfe429eb41 nir/loop_unroll: unroll more aggressively if it can improve load scheduling
Significantly improves performance of a Control compute shader. Also seems
to increase FPS at the very start of the game by ~5% (RX 580, 1080p,
medium settings, no MSAA).

fossil-db (Sienna):
Totals from 81 (0.06% of 139391) affected shaders:
SGPRs: 3848 -> 4362 (+13.36%); split: -0.99%, +14.35%
VGPRs: 4132 -> 4648 (+12.49%)
CodeSize: 275532 -> 659188 (+139.24%)
MaxWaves: 986 -> 906 (-8.11%)
Instrs: 54422 -> 126865 (+133.11%)
Cycles: 1057240 -> 750464 (-29.02%); split: -42.61%, +13.60%
VMEM: 26507 -> 61829 (+133.26%); split: +135.56%, -2.30%
SMEM: 4748 -> 5895 (+24.16%); split: +31.47%, -7.31%
VClause: 1933 -> 6802 (+251.89%); split: -0.72%, +252.61%
SClause: 1179 -> 1810 (+53.52%); split: -3.14%, +56.66%
Branches: 1174 -> 1157 (-1.45%); split: -23.94%, +22.49%
PreVGPRs: 3219 -> 3387 (+5.22%); split: -0.96%, +6.18%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6538>
2021-01-13 18:54:18 +00:00
Daniel Schürmann
08fbd5d454 nir/divergence_analysis: mark load_push_constant as uniform
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8439>
2021-01-12 14:46:13 +00:00
Mike Blumenkrantz
f7527f7f65 glcpp: disable 'windows' tests
these timeout a lot

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8321>
2021-01-12 01:51:16 +00:00
Daniel Schürmann
bd8e84eb8d nir: replace .lower_sub with .has_fsub and .has_isub
This allows a more fine-grained control about whether
a backend supports one of these instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Daniel Schürmann
b3ce55b445 nir,vc4: Lower fneg to fmul(x, -1.0)
This patch also replaces lower_negate with lower_ineg / lower_fneg.

The fneg semantics have been clarified as of Version 1.5, Revision 1
of the SPIR-V specification, which means that the previous lowering
to fsub is not a viable solution anymore, and is replaced with
lowering to fmul(x, -1.0).

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Erico Nunes
faaba0d6af nir/lower_vec_to_movs: don't vectorize unsupports ops
If the instruction being coalesced would be vectorized but the target
doesn't support vectorizing that op, skip coalescing.
Reuse the callbacks from alu_to_scalar to describe which ops should not
be vectorized.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506>
2021-01-11 13:13:30 +00:00
Rhys Perry
b634d7f3e2 nir/opt_vectorize: fix srcs_equal() with two different non-const
To match hash_alu_src(), this should return false if both are different
non-const ssa defs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>
2021-01-09 11:14:05 +00:00
Rhys Perry
bdf316ae7b nir/opt_vectorize: fix typo in instr_can_rewrite()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>
2021-01-09 11:14:05 +00:00
Eric Anholt
670944ba04 nir/lower_locals_to_regs: Use the imul_imm helper instead of forcing it.
Cleaned up a bit of addressing math in the shader I just had to debug.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8373>
2021-01-08 21:04:31 +00:00
Rhys Perry
f5adf27fb9 nir,radv: add and use nir_vectorize_tess_levels()
fossil-db (Sienna):
Totals from 1342 (0.97% of 138791) affected shaders:
CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00%
Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04%
Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04%
VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02%
SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01%
VClause: 21831 -> 21812 (-0.09%)
PreVGPRs: 44155 -> 44058 (-0.22%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
f4eb833a12 nir/load_store_vectorize: don't ignore subgroup memory barriers
Not sure why I thought this was correct, but we should consider them for
optimization purposes.

Fixes: ce9205c03b ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
c73c246e05 nir: gather whether a compute shader uses non-quad subgroup intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Rhys Perry
f7a5b8ed35 vtn: support SpvCapabilitySparseResidency
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
7d1d4acbd5 nir/lower_tex: fix lower_tg4_offsets with sparse fetches
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
2d2decc905 nir: add sparse_residency_code_and
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
4cbdf9ec4d nir,spirv: implement SpvOpImageSparseTexelsResident
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
1fd8b46667 nir,spirv: add sparse image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
3a7972f72a nir,spirv: add sparse texture fetches
Like SPIR-V and GL_ARB_sparse_texture2, these return a residency code. It
is placed in the destination after the rest of the result. If it's zero,
then the texel is resident. Otherwise, it's not resident.

Besides the larger destination and the residency code, sparse fetches
work the same as normal fetches.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
95819663b7 nir: allow 5 component vectors
These will be useful for sparse texture instructions and image load
intrinsics.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
ba4a73a502 nir/tests: fix callback for load/store vectorizer tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Daniel Schürmann
22b89d9a52 nir/opt_vectorize: fix call to filter function
Due to the typo, it could happen that instructions
got further vectorized than intended.

Fixes: 8eaf9c61d1 ('nir/opt_vectorize: don't hash filtered instructions')
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8352>
2021-01-06 19:03:07 +00:00
Christian Gmeiner
c0fe111d64 nir: use intrinsic builders
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8295>
2021-01-06 14:34:41 +00:00
Mike Blumenkrantz
b5fb66a5ed nir: preserve explicit_binding in lower_atomics_to_ssbo
it's important to be able to tell whether this is explicitly set by the
user

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7489>
2021-01-06 12:56:09 +00:00
Jesse Natalie
4d83306a9a nir: Update saturated float->int/uint conversion algorithm
The mantissa for a float doesn't contain enough data to accurately represent
the min/max values for some destination types. Instead of clamping before
converting, clamp after converting when coming from floats. This improves
conformance of CL conversions, specifically for float -> long/ulong with
int64 emulation enabled.

Refactors the limit determination from the clamp, so we can determine
limits for the dest type (int/uint) in both the source (float) and dest
type. The limit as a float is used for comparison, while the limit as a
dest type is used for bcsel.

Important note is that the comparison is inverted to fge instead of flt,
so the bcsel chooses the direct int/uint over the converted float in the
case where the comparison comes up equal, but the conversion can't produce
the exact min/max value.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8256>
2021-01-05 19:46:25 +00:00
Alexander von Gluck IV
c7486c996e glsl/builtin_functions: Rename int64 function to int64_avail
* int64 is a core type on Haiku (and potentially other platforms)
* rename to int64_avail matching other similar calls

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2021-01-04 21:18:55 -06:00
Ian Romanick
539c25c2da nir/algebraic: Move the flrp -> bcsel rule earlier
If multiple rules could match, the rule that appears first in the file
is used.

Only Tiger Lake and Ice Lake are affected.  Other platforms either have
a LRP instruction or can't run any shaders from shader-db that would
benefit.

v2: Fix issues created when this commit was rebased on top of
3c8934a644 ("nir/algebraic: add flrp patterns for 16 and 64 bits").
Noticed by Caio.

Tiger Lake and Ice Lake had similar results.
total instructions in shared programs: 20908672 -> 20908661 (<.01%)
instructions in affected programs: 419 -> 408 (-2.63%)
helped: 5
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3
helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65%
95% mean confidence interval for instructions value: -3.56 -0.84
95% mean confidence interval for instructions %-change: -3.24% -1.73%
Instructions are helped.

total cycles in shared programs: 473513940 -> 473513793 (<.01%)
cycles in affected programs: 7176 -> 7029 (-2.05%)
helped: 12
HURT: 0
helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12
helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80%
95% mean confidence interval for cycles value: -15.43 -9.07
95% mean confidence interval for cycles %-change: -2.57% -1.61%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
ec16f935fe nir/algebraic: Mark comparisons generated from lowered fsign precise
This prevents other transformations from converting them to 'a != 0'.
For example, both of these transformations can do this:

   (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
   (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)),

Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero,
but, since 'NaN != 0.0' is true, cascading these transformations could
cause them to generate 1.0 or -1.0 respecively.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
9771af5dde nir/algebraic: Fix broken NaN and -0.0 behavior
No shader-db or fossil-db changes on any Intel platform.

v2: Add a coding line to fix SCons build problems caused by the ±
character.

Fixes: 25bfba3335 ("nir/algebraic: Recognize open-coded copysign(1.0, a)")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
010e663cc3 spir-v: Mark floating point comparisons exact
OpenGL GLSL, OpenGL ARB assembly shaders, and DX9 are pretty loose about
the behavior in the presence of NaNs.  Many GPUs that implement these
specifications do not even have a representation of NaN.  However,
OpenCL and Vulkan SPIR-V are not so lax.  Both actually have some
required behavior in the presence of NaN, and, of the two, OpenCL is the
most strict.

For years we have implemented SPIR-V by using the same comparison
opcodes as we use for OpenGL GLSL and OpenGL assembly shaders.  This has
repeatedly caused problems where an optimization that is valid in the
NaN-relaxed world is not valid in Vulkan or OpenCL.  To fix this, set
the "exact" flag on comparisons instructions generated from SPIR-V.
This will block optimizations that may have different NaN behavior.

v2: Set the exact flag in the nir_builder, not in the vtn_builder.

v3: Add an assertion in vtn_handle_constant that the exact flag wasn't
set (because it's ignored).  Rebase on 80163bbec3 ("nir/vtn: Support
OpOrdered and OpUnordered opcodes").  Mark the NIR generated for those
opcodes as exact as well.

v4: s/unused_exact/exact/ in a couple places, and assert that exact has
the expected value (true in one place, false in the other).  Suggested
by Caio.

Closes: #3345
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Fixes: 8513b12590 ("nir/opt_if: split ALU from Phi more aggressively")

This commit doesn't really fix anything in 8513b12590.  However,
without 8513b12590, a regression is triggered in RADV on No Man's
Sky.  I want to ensure that this change is only applied on top of
8513b12590, and Fixes: seems the safest way to do that.

No shader-db changes on any Intel platform.  This only affects SPIR-V,
and we have no OpenGL SPIR-V shaders in shader-db.

124 shaders in Shadow of the Tomb Raider (Steam "native") were hurt by 1
spill and 1 fill each.

All Intel platforms had similar results. (Tiger Lake shown)
Instructions in all programs: 155668276 -> 155685764 (+0.0%)

SENDs in all programs: 6474570 -> 6474570 (+0.0%)

Loops in all programs: 35271 -> 35271 (+0.0%)

Cycles in all programs: 3198055373 -> 3198628031 (+0.0%)

Spills in all programs: 231522 -> 231646 (+0.1%)

Fills in all programs: 347571 -> 347695 (+0.0%)

Vega
Totals:
SGPRs: 20955712 -> 20956756 (+0.00%); split: -0.02%, +0.03%
VGPRs: 13476920 -> 13473132 (-0.03%); split: -0.07%, +0.04%
CodeSize: 613371940 -> 613339348 (-0.01%); split: -0.06%, +0.05%
MaxWaves: 3111886 -> 3112481 (+0.02%); split: +0.02%, -0.00%
Instrs: 120723785 -> 120746991 (+0.02%); split: -0.04%, +0.06%
Cycles: 626658992 -> 626862708 (+0.03%); split: -0.05%, +0.08%
VMEM: 216330854 -> 216343196 (+0.01%); split: +0.04%, -0.04%
SMEM: 32079391 -> 32081972 (+0.01%); split: +0.05%, -0.04%
VClause: 2688784 -> 2688789 (+0.00%); split: -0.03%, +0.03%
SClause: 6554669 -> 6556251 (+0.02%); split: -0.01%, +0.03%
Copies: 5356667 -> 5353283 (-0.06%); split: -0.36%, +0.29%
Branches: 954466 -> 954716 (+0.03%); split: -0.01%, +0.04%
PreSGPRs: 9078300 -> 9081626 (+0.04%); split: -0.01%, +0.05%
PreVGPRs: 10972090 -> 10966576 (-0.05%); split: -0.06%, +0.01%

Totals from 48239 (12.08% of 399432) affected shaders:
SGPRs: 2713984 -> 2715028 (+0.04%); split: -0.16%, +0.19%
VGPRs: 1997804 -> 1994016 (-0.19%); split: -0.46%, +0.27%
CodeSize: 172094092 -> 172061500 (-0.02%); split: -0.21%, +0.19%
MaxWaves: 337327 -> 337922 (+0.18%); split: +0.20%, -0.02%
Instrs: 33053657 -> 33076863 (+0.07%); split: -0.15%, +0.22%
Cycles: 254961228 -> 255164944 (+0.08%); split: -0.12%, +0.20%
VMEM: 15165226 -> 15177568 (+0.08%); split: +0.59%, -0.51%
SMEM: 3304938 -> 3307519 (+0.08%); split: +0.49%, -0.41%
VClause: 766225 -> 766230 (+0.00%); split: -0.12%, +0.12%
SClause: 1332645 -> 1334227 (+0.12%); split: -0.04%, +0.16%
Copies: 2040651 -> 2037267 (-0.17%); split: -0.94%, +0.77%
Branches: 743668 -> 743918 (+0.03%); split: -0.01%, +0.05%
PreSGPRs: 1697667 -> 1700993 (+0.20%); split: -0.07%, +0.27%
PreVGPRs: 1718424 -> 1712910 (-0.32%); split: -0.39%, +0.07%

Polaris
Totals:
SGPRs: 21349172 -> 21354376 (+0.02%); split: -0.02%, +0.04%
VGPRs: 13690680 -> 13686920 (-0.03%); split: -0.07%, +0.04%
CodeSize: 613745824 -> 613704988 (-0.01%); split: -0.06%, +0.05%
MaxWaves: 2775012 -> 2775189 (+0.01%); split: +0.01%, -0.00%
Instrs: 120735079 -> 120756209 (+0.02%); split: -0.04%, +0.06%
Cycles: 627906100 -> 628076156 (+0.03%); split: -0.05%, +0.08%
VMEM: 216623065 -> 216641838 (+0.01%); split: +0.04%, -0.04%
SMEM: 32295618 -> 32299338 (+0.01%); split: +0.05%, -0.04%
VClause: 2711025 -> 2711141 (+0.00%); split: -0.03%, +0.04%
SClause: 6545185 -> 6546769 (+0.02%); split: -0.01%, +0.03%
Copies: 5387723 -> 5383249 (-0.08%); split: -0.37%, +0.29%
Branches: 953775 -> 953954 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 9148814 -> 9153211 (+0.05%); split: -0.01%, +0.06%
PreVGPRs: 11029429 -> 11023915 (-0.05%); split: -0.06%, +0.01%

Totals from 48239 (12.00% of 402052) affected shaders:

SGPRs: 2682056 -> 2687260 (+0.19%); split: -0.16%, +0.35%
VGPRs: 1994436 -> 1990676 (-0.19%); split: -0.46%, +0.27%
CodeSize: 170857060 -> 170816224 (-0.02%); split: -0.21%, +0.19%
MaxWaves: 295429 -> 295606 (+0.06%); split: +0.07%, -0.01%
Instrs: 32808802 -> 32829932 (+0.06%); split: -0.16%, +0.22%
Cycles: 254633252 -> 254803308 (+0.07%); split: -0.13%, +0.20%
VMEM: 14897934 -> 14916707 (+0.13%); split: +0.65%, -0.52%
SMEM: 3289726 -> 3293446 (+0.11%); split: +0.53%, -0.42%
VClause: 775318 -> 775434 (+0.01%); split: -0.11%, +0.13%
SClause: 1304867 -> 1306451 (+0.12%); split: -0.04%, +0.16%
Copies: 2026334 -> 2021860 (-0.22%); split: -0.99%, +0.77%
Branches: 742554 -> 742733 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 1690887 -> 1695284 (+0.26%); split: -0.07%, +0.33%
PreVGPRs: 1717709 -> 1712195 (-0.32%); split: -0.40%, +0.07%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
55621c6d1c nir/algebraic: Add some compare-with-zero optimizations that are exact
This prevents some fossil-db regressions in "spir-v: Mark floating point
comparisons exact".

v2: Note that the patterns and replacements produce the same value when
isnan(b).  Suggested by Caio.

v3: Use C99 isfinite() instead of (obsolete) BSD finite().  Fixes
various Windows builds.

No fossil-db changes on any Inetl platform, Vega, or Polaris10.

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 20908670 -> 20908672 (<.01%)
instructions in affected programs: 69 -> 71 (2.90%)
helped: 0
HURT: 1

total cycles in shared programs: 473515288 -> 473513940 (<.01%)
cycles in affected programs: 4942 -> 3594 (-27.28%)
helped: 2
HURT: 0

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
9167324a86 nir/algebraic: Mark some logic-joined comparison reductions as exact
This also prevents some fossil-db regressions in "spir-v: Mark floating
point comparisons exact".

v2: Mark the fmin / fmax in the replacement exact to prevent other
optimizations from ruining the NaN-clensing property of the fmin / fmax.
Suggested by Rhys.  Don't assume that constants are not NaN because some
components of a vector might be NaN while others are numbers.  Noticed
by Rhys.  This causes ~8 more shaders in Age of Wonders III (dxvk) to
regress on cycles (not instructions) by less than 1% when "spir-v: Mark
floating point comparisons exact" is applied.  This difference is too
small to care.

All Intel platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 20908668 -> 20908670 (<.01%)
instructions in affected programs: 9196 -> 9198 (0.02%)
helped: 10
HURT: 5
helped stats (abs) min: 1 max: 2 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.02% max: 5.41% x̄: 2.20% x̃: 2.16%
HURT stats (abs)   min: 2 max: 6 x̄: 3.20 x̃: 3
HURT stats (rel)   min: 2.44% max: 16.67% x̄: 9.39% x̃: 12.50%
95% mean confidence interval for instructions value: -1.22 1.49
95% mean confidence interval for instructions %-change: -2.08% 5.41%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 473515330 -> 473515288 (<.01%)
cycles in affected programs: 67146 -> 67104 (-0.06%)
helped: 10
HURT: 7
helped stats (abs) min: 1 max: 36 x̄: 15.90 x̃: 17
helped stats (rel) min: 0.01% max: 1.29% x̄: 0.66% x̃: 0.89%
HURT stats (abs)   min: 1 max: 48 x̄: 16.71 x̃: 4
HURT stats (rel)   min: 0.08% max: 1.94% x̄: 0.87% x̃: 0.19%
95% mean confidence interval for cycles value: -13.88 8.94
95% mean confidence interval for cycles %-change: -0.56% 0.49%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
71961c73a9 nir: Correctly constant fold fsign(NaN) and fsign(-0)
GLSL and SPIR-V GLSL.std.450 don't have any requirements for fsign(NaN),
and both only require that FSign(-0.0) == 0.0.  OpenCL, on the other
hand, requires sign(-0.0) be exactly -0.0.  It also requires that
sign(NaN) be exactly 0.0.

In practice, this change is difficult to test.  Our GLSL frontend
already constant folds sign(NaN) to 0.0 before even getting to NIR.  As
far as I can tell, glslang does the same.  I don't have a good way to
run an OpenCL SPIR-V test.  Maybe SPIR-V GLSL.std.450 assembly?

No shader-db or fossil-db changes on any Intel platform.

Acked-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
fe3c518277 nir/algebraic: Don't add reordered version of patterns for commutative instructions
The reordered are automatically considered by nir_algebraic rules for
commutative instructions.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
314a40c902 Revert "nir: Replace an odd comparison involving fmin of -b2f"
I originally noticed that 3b30814791 ("nir/algebraic: Optimize 1-bit
Booleans") caused this pattern no longer be matched by incorrectly
replacing b@32 with b@1.  Making that correct had no effect on
shader-db.  When this pattern originally was added, it only affected 4
shaders, so it's not worth the effort to debug further.

This reverts commit f50400cc80.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
aec0547838 nir/algebraic: Make some notes about comparison rearrangements versus infinity
The original comment was a little terse and a little incorrect.  The
rearrangements are fine w.r.t. NaN.  However, they produce incorrect
results if one operand is +Inf and the other is -Inf.

A later commit, "nir/algebraic: Add some compare-with-zero optimizations
that are exact", will add some more patterns here.  It may be reasonable
to squash this commit (forward) into that commit.

v2: Fix some incorrect comparisons operators in the comment (<= vs >=).
Add commentary that subtraction works like addition w.r.t. NaN.  Both
noticed / suggested by Caio.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
363efc2823 nir: Make some notes about fsign versus NaN
This commit only documents the current behavior, even if that behavior
is not the behavior preferred by the relevant specs.

In SPIR-V, there are two flavors of the sign instruction, and each lives
in an extended instruction set.  The GLSL.std.450 FSign instruction is
defined as:

    Result is 1.0 if x > 0, 0.0 if x = 0, or -1.0 if x < 0.

This also matches the GLSL 4.60 definition.

However, the OpenCL.ExtendedInstructionSet.100 sign instruction is
defined as:

    Returns 1.0 if x > 0, -0.0 if x = -0.0, +0.0 if x = +0.0, or -1.0 if
    x < 0. Returns 0.0 if x is a NaN.

There are two differences.  Each treats -0.0 differently, and each also
treats NaN differently.  Specifically, GLSL.std.450 FSign does not
define any specific behavior for NaN.

There has been some discussion in Khronos about the NaN behavior of
GLSL.std.450 FSign.  As part of that discussion, I did some research
into how we treat NaN for nir_op_fsign, and this commit just captures
some of those notes.

v2: Document the expected behavior of nir_op_fsign more thoroughly.
Suggested by Rhys.  Note that the current implementation of constant
folding does not produce the expected result for NaN.  Suggested by
Caio.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1]
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Danylo Piliaiev
81132983cd nir: fix missing nir_lower_pntc_ytransform.c in the makefile
Fixes: 33fd9e5d "nir: account for point-coord origin when lowering it"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8308>
2021-01-04 15:37:20 +00:00
Danylo Piliaiev
33fd9e5d8a nir: account for point-coord origin when lowering it
The resulting point-coord origin not only depends on whether
the draw buffer is flipped but also on GL_POINT_SPRITE_COORD_ORIGIN
state. Which makes its transform differ from a transform of wpos.

On freedreno fixes:
 gl-3.2-pointsprite-origin
 gl-3.2-pointsprite-origin -fbo

Fixes: d934d320 "nir: Add flipping of gl_PointCoord.y in nir_lower_wpos_ytransform."
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8200>
2021-01-04 13:41:33 +00:00
Icecream95
1840404783 nir: Handle load_kernel_input in nir_get_io_offset_src
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8264>
2021-01-01 02:58:49 +00:00
Daniel Schürmann
a3785e3481 nir/opt_vectorize: hash whether a swizzle accesses elements beyond the maximum vectorization factor
Swizzles that access components outside of the maximum
vector size cannot be vectorized with each other.
This patch creates different hash bins for this case.

For example accesses to .x and .y are considered different variables
compared to accesses to .z and .w for 16-bit vec2.

This prevents the vectorization of things like
   vec2 16 ssa_3 = iadd ssa_1.xz, ssa_2.xz

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6666>
2020-12-31 16:44:58 +00:00
Daniel Schürmann
46e7428031 nir/opt_vectorize: rehash users of vectorized instructions
This ensures that chains of ALU instructions are vectorized
in a single iteration.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6666>
2020-12-31 16:44:58 +00:00
Daniel Schürmann
8eaf9c61d1 nir/opt_vectorize: don't hash filtered instructions
This patch also changes nir_opt_vectorize_cb to
use only one instruction as parameter.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6666>
2020-12-31 16:44:58 +00:00
Daniel Schürmann
23b2885514 nir/opt_vectorize: don't hash instructions which are already vectorized
This guarantees that the hashset contains exactly the instructions
which can be vectorized.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6666>
2020-12-31 16:44:58 +00:00
Daniel Schürmann
ad37e4df73 nir/opt_vectorize: use a single instruction per hash entry instead of a vector
This drastically simplifies vectorization but may potentially
lead to slightly worse vectorizations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6666>
2020-12-31 16:44:58 +00:00
Samuel Pitoiset
0b503d8de9 nir: fix determining if an addition might overflow for phi sources
nir_addition_might_overflow() expects the parent instruction to be
an alu instr but it might be a phi instr. Fix it by assuming that
the addition might overflow.

This fixes compiler crashes with Horizon Zero Dawn.

No fossils-db changes.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8268>
2020-12-31 16:17:08 +00:00
Timothy Arceri
6c8cc9be12 glsl: default to compat shaders in compat profile
If the shader does not specify "core" or "compatibility" in shaders
above 1.40 we were defaulting these shaders to core shaders when
in a compat profile. Instead default to compat shaders.

This brings us inline with the behaviour of the binary drivers and
fixes a crash on start-up for the game Foundation.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3594

Fixes: c7e3d31b0b ("glsl: fix compat shaders in GLSL 1.40")

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6993>
2020-12-30 11:47:49 +11:00