Commit graph

9507 commits

Author SHA1 Message Date
Connor Abbott
3d803893da nir/lower_bool: Rewrite dest_type for boolean destinations
This happens with nir_texop_samples_identical, and we need to keep
things consistent and (soon) keep the validator happy when expanding
booleans once we switch that to having a dest_type of bool1.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Connor Abbott
acd6616eab nir/lower_tex: Handle sized tex destination types
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:26 +01:00
Mike Blumenkrantz
64fd191d8a spirv: handle NoContraction in GLSL450 alu ops
we were dropping this when it was set, leading to incorrect algebraic
optimizations that broke various types of tests, e.g., running
spec@arb_gpu_shader5@execution@precise@fs-fract-of-nan in zink

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6116>
2021-01-23 01:39:09 +00:00
Jason Ekstrand
178820212b nir/lower_int64: Lower 64-bit vote_ieq
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
731adf1e17 nir/lower_int64: Add lowering for 64-bit iadd shuffle/reduce
Lowering iadd is a bit trickier because we have to deal with potential
overflow but it's still not bad to do in NIR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
bf7a114246 nir/lower_int64: Add lowering for some 64-bit subgroup ops
These are all pretty trivial because we can just split the op into one
subgroup op per half of the value.  There's some question as to whether
these belong in lower_int64 or lower_subgroups but, on Intel, they key
decider of whether or not we need the lowering is based on whether or
not we have hardware int64 support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
da331f814f nir/lower_int64: Fix lowering of f2[ui]64 for 16-bit float
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Jason Ekstrand
70b4524de5 nir/lower_int64: Add a level of wrapper functions
We're about to start lowering a few intrinsics so we need support more
than just ALU.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329>
2021-01-22 18:38:37 +00:00
Marek Olšák
fb73058ad2 mesa: add upper bound to limit program state var iterations
State parameters are sometimes not perfectly sorted.
This optimizes the number of iterations we have to do for fetch_state.

Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>
2021-01-21 21:59:29 +00:00
Marek Olšák
0c77190b31 glsl: split gl_CurrentAttribFragMESA into elements
This reduces the constant buffer size by eliminating unused elements
because it's no longer a uniform array that the compiler can't split.

This looks silly, but there is no other way because all elements must be
globally declared, which means they can't be generated by a loop.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>
2021-01-21 21:59:29 +00:00
Marek Olšák
e3a7acf958 glsl: remove unused internal builtin gl_CurrentAttribVertMESA
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>
2021-01-21 21:59:29 +00:00
Marek Olšák
0eccba1ac0 mesa: flatten STATE_MATERIAL and STATE_LIGHTPROD tokens
Flattening continue to get optimal code in fetch_state.

This merges the "face" field with the "attrib" field using the combined
MAT_ATTRIB_* enums. The outcome is that the inner switch statements can
be flatten because we can use MAT_ATTRIB_* to index into the attrib array
directly.

With LightSource attributes that don't have two sides, more math is
involved to get the correct index but it works out nicely too.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>
2021-01-21 21:59:29 +00:00
Marek Olšák
b4f3497786 mesa: remove STATE_INTERNAL
Let's flatten the tokens to generate optimal code for fetch_state.

There was only one name conflict: STATE_NORMAL_SCALE was used both as
internal and non-internal.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8183>
2021-01-21 21:59:29 +00:00
Rhys Perry
a6d92eaf4f nir/sink,nir/move: sink/move reorderable load_ssbo
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6490>
2021-01-21 18:07:03 +00:00
Rhys Perry
e200ce0996 nir/lower_io: fix array_length lowering if buffer is smaller than offset
Matches SPIR-V -> NIR implementation of OpArrayLength.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8163>
2021-01-21 11:53:12 +00:00
Jesse Natalie
13b21156e4 nir: Work around MSVC x86 internal compiler error
Fixes: 1fd8b466 ("nir,spirv: add sparse image loads")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4108
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8581>
2021-01-20 20:42:48 +00:00
Ilia Mirkin
a0f4affcf6 glsl: only expose int64 atomics when extension is enabled
This limits the exposure of these functions to when the extension is
available. Prevents crashes otherwise, as the rest of the infrastructure
doesn't necessarily expect these functions when the extension is not
available.

Fixes: 40c1f9883e ("mesa,glsl: add support for GL_NV_shader_atomic_int64")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8533>
2021-01-16 18:21:03 +00:00
Mike Blumenkrantz
652e51e1f3 nir/lower_uniforms_to_ubo: set explicit_binding on uniform_0
this variable is always bound to buffer index 0, so the binding info
here is actually useful

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7935>
2021-01-14 17:29:09 +00:00
Mike Blumenkrantz
491e7decad util/set: add the found param to search_or_add
this brings parity with the internal api

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8450>
2021-01-14 13:51:35 +00:00
Rhys Perry
dfe429eb41 nir/loop_unroll: unroll more aggressively if it can improve load scheduling
Significantly improves performance of a Control compute shader. Also seems
to increase FPS at the very start of the game by ~5% (RX 580, 1080p,
medium settings, no MSAA).

fossil-db (Sienna):
Totals from 81 (0.06% of 139391) affected shaders:
SGPRs: 3848 -> 4362 (+13.36%); split: -0.99%, +14.35%
VGPRs: 4132 -> 4648 (+12.49%)
CodeSize: 275532 -> 659188 (+139.24%)
MaxWaves: 986 -> 906 (-8.11%)
Instrs: 54422 -> 126865 (+133.11%)
Cycles: 1057240 -> 750464 (-29.02%); split: -42.61%, +13.60%
VMEM: 26507 -> 61829 (+133.26%); split: +135.56%, -2.30%
SMEM: 4748 -> 5895 (+24.16%); split: +31.47%, -7.31%
VClause: 1933 -> 6802 (+251.89%); split: -0.72%, +252.61%
SClause: 1179 -> 1810 (+53.52%); split: -3.14%, +56.66%
Branches: 1174 -> 1157 (-1.45%); split: -23.94%, +22.49%
PreVGPRs: 3219 -> 3387 (+5.22%); split: -0.96%, +6.18%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6538>
2021-01-13 18:54:18 +00:00
Daniel Schürmann
08fbd5d454 nir/divergence_analysis: mark load_push_constant as uniform
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8439>
2021-01-12 14:46:13 +00:00
Mike Blumenkrantz
f7527f7f65 glcpp: disable 'windows' tests
these timeout a lot

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8321>
2021-01-12 01:51:16 +00:00
Daniel Schürmann
bd8e84eb8d nir: replace .lower_sub with .has_fsub and .has_isub
This allows a more fine-grained control about whether
a backend supports one of these instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Daniel Schürmann
b3ce55b445 nir,vc4: Lower fneg to fmul(x, -1.0)
This patch also replaces lower_negate with lower_ineg / lower_fneg.

The fneg semantics have been clarified as of Version 1.5, Revision 1
of the SPIR-V specification, which means that the previous lowering
to fsub is not a viable solution anymore, and is replaced with
lowering to fmul(x, -1.0).

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>
2021-01-11 19:13:51 +00:00
Erico Nunes
faaba0d6af nir/lower_vec_to_movs: don't vectorize unsupports ops
If the instruction being coalesced would be vectorized but the target
doesn't support vectorizing that op, skip coalescing.
Reuse the callbacks from alu_to_scalar to describe which ops should not
be vectorized.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506>
2021-01-11 13:13:30 +00:00
Rhys Perry
b634d7f3e2 nir/opt_vectorize: fix srcs_equal() with two different non-const
To match hash_alu_src(), this should return false if both are different
non-const ssa defs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>
2021-01-09 11:14:05 +00:00
Rhys Perry
bdf316ae7b nir/opt_vectorize: fix typo in instr_can_rewrite()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8391>
2021-01-09 11:14:05 +00:00
Eric Anholt
670944ba04 nir/lower_locals_to_regs: Use the imul_imm helper instead of forcing it.
Cleaned up a bit of addressing math in the shader I just had to debug.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8373>
2021-01-08 21:04:31 +00:00
Rhys Perry
f5adf27fb9 nir,radv: add and use nir_vectorize_tess_levels()
fossil-db (Sienna):
Totals from 1342 (0.97% of 138791) affected shaders:
CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00%
Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04%
Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04%
VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02%
SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01%
VClause: 21831 -> 21812 (-0.09%)
PreVGPRs: 44155 -> 44058 (-0.22%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
f4eb833a12 nir/load_store_vectorize: don't ignore subgroup memory barriers
Not sure why I thought this was correct, but we should consider them for
optimization purposes.

Fixes: ce9205c03b ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry
c73c246e05 nir: gather whether a compute shader uses non-quad subgroup intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Rhys Perry
f7a5b8ed35 vtn: support SpvCapabilitySparseResidency
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
7d1d4acbd5 nir/lower_tex: fix lower_tg4_offsets with sparse fetches
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
2d2decc905 nir: add sparse_residency_code_and
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
4cbdf9ec4d nir,spirv: implement SpvOpImageSparseTexelsResident
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
1fd8b46667 nir,spirv: add sparse image loads
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
3a7972f72a nir,spirv: add sparse texture fetches
Like SPIR-V and GL_ARB_sparse_texture2, these return a residency code. It
is placed in the destination after the rest of the result. If it's zero,
then the texel is resident. Otherwise, it's not resident.

Besides the larger destination and the residency code, sparse fetches
work the same as normal fetches.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
95819663b7 nir: allow 5 component vectors
These will be useful for sparse texture instructions and image load
intrinsics.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Rhys Perry
ba4a73a502 nir/tests: fix callback for load/store vectorizer tests
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>
2021-01-06 20:36:38 +00:00
Daniel Schürmann
22b89d9a52 nir/opt_vectorize: fix call to filter function
Due to the typo, it could happen that instructions
got further vectorized than intended.

Fixes: 8eaf9c61d1 ('nir/opt_vectorize: don't hash filtered instructions')
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8352>
2021-01-06 19:03:07 +00:00
Christian Gmeiner
c0fe111d64 nir: use intrinsic builders
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8295>
2021-01-06 14:34:41 +00:00
Mike Blumenkrantz
b5fb66a5ed nir: preserve explicit_binding in lower_atomics_to_ssbo
it's important to be able to tell whether this is explicitly set by the
user

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7489>
2021-01-06 12:56:09 +00:00
Jesse Natalie
4d83306a9a nir: Update saturated float->int/uint conversion algorithm
The mantissa for a float doesn't contain enough data to accurately represent
the min/max values for some destination types. Instead of clamping before
converting, clamp after converting when coming from floats. This improves
conformance of CL conversions, specifically for float -> long/ulong with
int64 emulation enabled.

Refactors the limit determination from the clamp, so we can determine
limits for the dest type (int/uint) in both the source (float) and dest
type. The limit as a float is used for comparison, while the limit as a
dest type is used for bcsel.

Important note is that the comparison is inverted to fge instead of flt,
so the bcsel chooses the direct int/uint over the converted float in the
case where the comparison comes up equal, but the conversion can't produce
the exact min/max value.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8256>
2021-01-05 19:46:25 +00:00
Alexander von Gluck IV
c7486c996e glsl/builtin_functions: Rename int64 function to int64_avail
* int64 is a core type on Haiku (and potentially other platforms)
* rename to int64_avail matching other similar calls

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2021-01-04 21:18:55 -06:00
Ian Romanick
539c25c2da nir/algebraic: Move the flrp -> bcsel rule earlier
If multiple rules could match, the rule that appears first in the file
is used.

Only Tiger Lake and Ice Lake are affected.  Other platforms either have
a LRP instruction or can't run any shaders from shader-db that would
benefit.

v2: Fix issues created when this commit was rebased on top of
3c8934a644 ("nir/algebraic: add flrp patterns for 16 and 64 bits").
Noticed by Caio.

Tiger Lake and Ice Lake had similar results.
total instructions in shared programs: 20908672 -> 20908661 (<.01%)
instructions in affected programs: 419 -> 408 (-2.63%)
helped: 5
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3
helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65%
95% mean confidence interval for instructions value: -3.56 -0.84
95% mean confidence interval for instructions %-change: -3.24% -1.73%
Instructions are helped.

total cycles in shared programs: 473513940 -> 473513793 (<.01%)
cycles in affected programs: 7176 -> 7029 (-2.05%)
helped: 12
HURT: 0
helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12
helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80%
95% mean confidence interval for cycles value: -15.43 -9.07
95% mean confidence interval for cycles %-change: -2.57% -1.61%
Cycles are helped.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
ec16f935fe nir/algebraic: Mark comparisons generated from lowered fsign precise
This prevents other transformations from converting them to 'a != 0'.
For example, both of these transformations can do this:

   (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)),
   (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)),

Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero,
but, since 'NaN != 0.0' is true, cascading these transformations could
cause them to generate 1.0 or -1.0 respecively.

No shader-db or fossil-db changes on any Intel platform.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
9771af5dde nir/algebraic: Fix broken NaN and -0.0 behavior
No shader-db or fossil-db changes on any Intel platform.

v2: Add a coding line to fix SCons build problems caused by the ±
character.

Fixes: 25bfba3335 ("nir/algebraic: Recognize open-coded copysign(1.0, a)")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00
Ian Romanick
010e663cc3 spir-v: Mark floating point comparisons exact
OpenGL GLSL, OpenGL ARB assembly shaders, and DX9 are pretty loose about
the behavior in the presence of NaNs.  Many GPUs that implement these
specifications do not even have a representation of NaN.  However,
OpenCL and Vulkan SPIR-V are not so lax.  Both actually have some
required behavior in the presence of NaN, and, of the two, OpenCL is the
most strict.

For years we have implemented SPIR-V by using the same comparison
opcodes as we use for OpenGL GLSL and OpenGL assembly shaders.  This has
repeatedly caused problems where an optimization that is valid in the
NaN-relaxed world is not valid in Vulkan or OpenCL.  To fix this, set
the "exact" flag on comparisons instructions generated from SPIR-V.
This will block optimizations that may have different NaN behavior.

v2: Set the exact flag in the nir_builder, not in the vtn_builder.

v3: Add an assertion in vtn_handle_constant that the exact flag wasn't
set (because it's ignored).  Rebase on 80163bbec3 ("nir/vtn: Support
OpOrdered and OpUnordered opcodes").  Mark the NIR generated for those
opcodes as exact as well.

v4: s/unused_exact/exact/ in a couple places, and assert that exact has
the expected value (true in one place, false in the other).  Suggested
by Caio.

Closes: #3345
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Fixes: 8513b12590 ("nir/opt_if: split ALU from Phi more aggressively")

This commit doesn't really fix anything in 8513b12590.  However,
without 8513b12590, a regression is triggered in RADV on No Man's
Sky.  I want to ensure that this change is only applied on top of
8513b12590, and Fixes: seems the safest way to do that.

No shader-db changes on any Intel platform.  This only affects SPIR-V,
and we have no OpenGL SPIR-V shaders in shader-db.

124 shaders in Shadow of the Tomb Raider (Steam "native") were hurt by 1
spill and 1 fill each.

All Intel platforms had similar results. (Tiger Lake shown)
Instructions in all programs: 155668276 -> 155685764 (+0.0%)

SENDs in all programs: 6474570 -> 6474570 (+0.0%)

Loops in all programs: 35271 -> 35271 (+0.0%)

Cycles in all programs: 3198055373 -> 3198628031 (+0.0%)

Spills in all programs: 231522 -> 231646 (+0.1%)

Fills in all programs: 347571 -> 347695 (+0.0%)

Vega
Totals:
SGPRs: 20955712 -> 20956756 (+0.00%); split: -0.02%, +0.03%
VGPRs: 13476920 -> 13473132 (-0.03%); split: -0.07%, +0.04%
CodeSize: 613371940 -> 613339348 (-0.01%); split: -0.06%, +0.05%
MaxWaves: 3111886 -> 3112481 (+0.02%); split: +0.02%, -0.00%
Instrs: 120723785 -> 120746991 (+0.02%); split: -0.04%, +0.06%
Cycles: 626658992 -> 626862708 (+0.03%); split: -0.05%, +0.08%
VMEM: 216330854 -> 216343196 (+0.01%); split: +0.04%, -0.04%
SMEM: 32079391 -> 32081972 (+0.01%); split: +0.05%, -0.04%
VClause: 2688784 -> 2688789 (+0.00%); split: -0.03%, +0.03%
SClause: 6554669 -> 6556251 (+0.02%); split: -0.01%, +0.03%
Copies: 5356667 -> 5353283 (-0.06%); split: -0.36%, +0.29%
Branches: 954466 -> 954716 (+0.03%); split: -0.01%, +0.04%
PreSGPRs: 9078300 -> 9081626 (+0.04%); split: -0.01%, +0.05%
PreVGPRs: 10972090 -> 10966576 (-0.05%); split: -0.06%, +0.01%

Totals from 48239 (12.08% of 399432) affected shaders:
SGPRs: 2713984 -> 2715028 (+0.04%); split: -0.16%, +0.19%
VGPRs: 1997804 -> 1994016 (-0.19%); split: -0.46%, +0.27%
CodeSize: 172094092 -> 172061500 (-0.02%); split: -0.21%, +0.19%
MaxWaves: 337327 -> 337922 (+0.18%); split: +0.20%, -0.02%
Instrs: 33053657 -> 33076863 (+0.07%); split: -0.15%, +0.22%
Cycles: 254961228 -> 255164944 (+0.08%); split: -0.12%, +0.20%
VMEM: 15165226 -> 15177568 (+0.08%); split: +0.59%, -0.51%
SMEM: 3304938 -> 3307519 (+0.08%); split: +0.49%, -0.41%
VClause: 766225 -> 766230 (+0.00%); split: -0.12%, +0.12%
SClause: 1332645 -> 1334227 (+0.12%); split: -0.04%, +0.16%
Copies: 2040651 -> 2037267 (-0.17%); split: -0.94%, +0.77%
Branches: 743668 -> 743918 (+0.03%); split: -0.01%, +0.05%
PreSGPRs: 1697667 -> 1700993 (+0.20%); split: -0.07%, +0.27%
PreVGPRs: 1718424 -> 1712910 (-0.32%); split: -0.39%, +0.07%

Polaris
Totals:
SGPRs: 21349172 -> 21354376 (+0.02%); split: -0.02%, +0.04%
VGPRs: 13690680 -> 13686920 (-0.03%); split: -0.07%, +0.04%
CodeSize: 613745824 -> 613704988 (-0.01%); split: -0.06%, +0.05%
MaxWaves: 2775012 -> 2775189 (+0.01%); split: +0.01%, -0.00%
Instrs: 120735079 -> 120756209 (+0.02%); split: -0.04%, +0.06%
Cycles: 627906100 -> 628076156 (+0.03%); split: -0.05%, +0.08%
VMEM: 216623065 -> 216641838 (+0.01%); split: +0.04%, -0.04%
SMEM: 32295618 -> 32299338 (+0.01%); split: +0.05%, -0.04%
VClause: 2711025 -> 2711141 (+0.00%); split: -0.03%, +0.04%
SClause: 6545185 -> 6546769 (+0.02%); split: -0.01%, +0.03%
Copies: 5387723 -> 5383249 (-0.08%); split: -0.37%, +0.29%
Branches: 953775 -> 953954 (+0.02%); split: -0.01%, +0.03%
PreSGPRs: 9148814 -> 9153211 (+0.05%); split: -0.01%, +0.06%
PreVGPRs: 11029429 -> 11023915 (-0.05%); split: -0.06%, +0.01%

Totals from 48239 (12.00% of 402052) affected shaders:

SGPRs: 2682056 -> 2687260 (+0.19%); split: -0.16%, +0.35%
VGPRs: 1994436 -> 1990676 (-0.19%); split: -0.46%, +0.27%
CodeSize: 170857060 -> 170816224 (-0.02%); split: -0.21%, +0.19%
MaxWaves: 295429 -> 295606 (+0.06%); split: +0.07%, -0.01%
Instrs: 32808802 -> 32829932 (+0.06%); split: -0.16%, +0.22%
Cycles: 254633252 -> 254803308 (+0.07%); split: -0.13%, +0.20%
VMEM: 14897934 -> 14916707 (+0.13%); split: +0.65%, -0.52%
SMEM: 3289726 -> 3293446 (+0.11%); split: +0.53%, -0.42%
VClause: 775318 -> 775434 (+0.01%); split: -0.11%, +0.13%
SClause: 1304867 -> 1306451 (+0.12%); split: -0.04%, +0.16%
Copies: 2026334 -> 2021860 (-0.22%); split: -0.99%, +0.77%
Branches: 742554 -> 742733 (+0.02%); split: -0.02%, +0.04%
PreSGPRs: 1690887 -> 1695284 (+0.26%); split: -0.07%, +0.33%
PreVGPRs: 1717709 -> 1712195 (-0.32%); split: -0.40%, +0.07%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>
2021-01-05 02:07:09 +00:00