fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-21 22:20:14 +01:00

Author	SHA1	Message	Date
Rhys Perry	614ab26afd	nir/algebraic: optimize out exact a+0.0 if it's used only as a float fossil-db (GFX10): Totals from 133 (0.10% of 139391) affected shaders: SGPRs: 7864 -> 7856 (-0.10%); split: -0.20%, +0.10% VGPRs: 4884 -> 4836 (-0.98%) CodeSize: 288932 -> 287084 (-0.64%) MaxWaves: 1973 -> 1979 (+0.30%) Instrs: 53899 -> 53550 (-0.65%) fossil-db (GFX10.3): Totals from 133 (0.10% of 139391) affected shaders: SGPRs: 7832 -> 7835 (+0.04%); split: -0.06%, +0.10% VGPRs: 5144 -> 5088 (-1.09%) CodeSize: 318912 -> 316696 (-0.69%) MaxWaves: 1735 -> 1746 (+0.63%) Instrs: 65367 -> 64853 (-0.79%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Rhys Perry	2849f0b5aa	nir/algebraic: optimize out exact a*1.0 if it's used only as a float fossil-db (GFX10): Totals from 10180 (7.30% of 139391) affected shaders: SGPRs: 549392 -> 549448 (+0.01%); split: -0.00%, +0.01% VGPRs: 243228 -> 243008 (-0.09%); split: -0.11%, +0.02% CodeSize: 12939080 -> 12603996 (-2.59%); split: -2.59%, +0.00% MaxWaves: 186948 -> 186976 (+0.01%) Instrs: 2497266 -> 2414648 (-3.31%) fossil-db (GFX10.3): Totals from 10180 (7.30% of 139391) affected shaders: SGPRs: 549672 -> 549280 (-0.07%); split: -0.23%, +0.16% VGPRs: 289296 -> 283672 (-1.94%); split: -2.83%, +0.88% CodeSize: 13920180 -> 13255560 (-4.77%); split: -4.77%, +0.00% MaxWaves: 151789 -> 153165 (+0.91%) Instrs: 2756978 -> 2671517 (-3.10%); split: -3.10%, +0.00% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5523>	2021-01-26 11:36:13 +00:00
Daniel Schürmann	bd8e84eb8d	nir: replace .lower_sub with .has_fsub and .has_isub This allows more fine-grained control over whether a backend supports one of these instructions. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
Daniel Schürmann	b3ce55b445	nir,vc4: Lower fneg to fmul(x, -1.0) This patch also replaces lower_negate with lower_ineg / lower_fneg. The fneg semantics have been clarified as of Version 1.5, Revision 1 of the SPIR-V specification, which means that the previous lowering to fsub is not a viable solution anymore, and is replaced with lowering to fmul(x, -1.0). Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6597>	2021-01-11 19:13:51 +00:00
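The signed-zero concern behind this change can be checked with ordinary IEEE 754 doubles. This is a minimal sketch, assuming the earlier lowering computed `0.0 - x` (the log does not spell out the old expression); `fneg_via_fsub`, `fneg_via_fmul` and `sign_bit` are hypothetical helper names, not Mesa functions.

```python
import math

def fneg_via_fsub(x):
    # Assumed shape of the old lowering: subtract from positive zero.
    return 0.0 - x

def fneg_via_fmul(x):
    # Lowering named in the commit message: multiply by -1.0.
    return x * -1.0

def sign_bit(x):
    # True when the IEEE 754 sign bit is set (distinguishes -0.0 from +0.0).
    return math.copysign(1.0, x) < 0.0

# Negating +0.0 should set the sign bit; only the fmul form does so.
print(sign_bit(fneg_via_fsub(0.0)))  # False
print(sign_bit(fneg_via_fmul(0.0)))  # True
```

For nonzero finite inputs the two forms agree; the difference shows up only in the sign of zero.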
Ian Romanick	539c25c2da	nir/algebraic: Move the flrp -> bcsel rule earlier If multiple rules could match, the rule that appears first in the file is used. Only Tiger Lake and Ice Lake are affected. Other platforms either have a LRP instruction or can't run any shaders from shader-db that would benefit. v2: Fix issues created when this commit was rebased on top of `3c8934a644` ("nir/algebraic: add flrp patterns for 16 and 64 bits"). Noticed by Caio. Tiger Lake and Ice Lake had similar results. total instructions in shared programs: 20908672 -> 20908661 (<.01%) instructions in affected programs: 419 -> 408 (-2.63%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 3 x̄: 2.20 x̃: 3 helped stats (rel) min: 1.85% max: 3.19% x̄: 2.49% x̃: 2.65% 95% mean confidence interval for instructions value: -3.56 -0.84 95% mean confidence interval for instructions %-change: -3.24% -1.73% Instructions are helped. total cycles in shared programs: 473513940 -> 473513793 (<.01%) cycles in affected programs: 7176 -> 7029 (-2.05%) helped: 12 HURT: 0 helped stats (abs) min: 5 max: 22 x̄: 12.25 x̃: 12 helped stats (rel) min: 0.84% max: 3.24% x̄: 2.09% x̃: 1.80% 95% mean confidence interval for cycles value: -15.43 -9.07 95% mean confidence interval for cycles %-change: -2.57% -1.61% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	ec16f935fe	nir/algebraic: Mark comparisons generated from lowered fsign precise This prevents other transformations from converting them to 'a != 0'. For example, both of these transformations can do this: (('~flt', 0.0, ('fabs', a)), ('fne', a, 0.0)), (('~flt', ('fneg', ('fabs', a)), 0.0), ('fne', a, 0.0)), Both fsign(fabs(NaN)) and fsign(fneg(fabs(NaN))) should produce zero, but, since 'NaN != 0.0' is true, cascading these transformations could cause them to generate 1.0 or -1.0 respectively. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
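The cascade described in this message can be reproduced with plain Python floats. This is a sketch under the assumption that fsign is lowered to `b2f(0.0 < x) - b2f(x < 0.0)`; the helper names are illustrative and do not come from Mesa.

```python
import math

def b2f(b):
    # Boolean-to-float conversion: True -> 1.0, False -> 0.0.
    return 1.0 if b else 0.0

def fsign_lowered(x):
    # Assumed lowering of fsign; both comparisons are false when x is NaN.
    return b2f(0.0 < x) - b2f(x < 0.0)

def fsign_fabs_after_rewrite(a):
    # fsign(fabs(a)) after `0.0 < fabs(a)` has been rewritten to `a != 0.0`.
    return b2f(a != 0.0) - b2f(abs(a) < 0.0)

nan = math.nan
print(fsign_lowered(abs(nan)))        # 0.0
print(fsign_fabs_after_rewrite(nan))  # 1.0
```

For every non-NaN input the two functions agree, which is why the rewrite is only a problem when the comparison is not marked precise.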
Ian Romanick	9771af5dde	nir/algebraic: Fix broken NaN and -0.0 behavior No shader-db or fossil-db changes on any Intel platform. v2: Add a coding line to fix SCons build problems caused by the ± character. Fixes: `25bfba3335` ("nir/algebraic: Recognize open-coded copysign(1.0, a)") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	55621c6d1c	nir/algebraic: Add some compare-with-zero optimizations that are exact This prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Note that the patterns and replacements produce the same value when isnan(b). Suggested by Caio. v3: Use C99 isfinite() instead of (obsolete) BSD finite(). Fixes various Windows builds. No fossil-db changes on any Intel platform, Vega, or Polaris10. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908670 -> 20908672 (<.01%) instructions in affected programs: 69 -> 71 (2.90%) helped: 0 HURT: 1 total cycles in shared programs: 473515288 -> 473513940 (<.01%) cycles in affected programs: 4942 -> 3594 (-27.28%) helped: 2 HURT: 0 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	9167324a86	nir/algebraic: Mark some logic-joined comparison reductions as exact This also prevents some fossil-db regressions in "spir-v: Mark floating point comparisons exact". v2: Mark the fmin / fmax in the replacement exact to prevent other optimizations from ruining the NaN-cleansing property of the fmin / fmax. Suggested by Rhys. Don't assume that constants are not NaN because some components of a vector might be NaN while others are numbers. Noticed by Rhys. This causes ~8 more shaders in Age of Wonders III (dxvk) to regress on cycles (not instructions) by less than 1% when "spir-v: Mark floating point comparisons exact" is applied. This difference is too small to care. All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 20908668 -> 20908670 (<.01%) instructions in affected programs: 9196 -> 9198 (0.02%) helped: 10 HURT: 5 helped stats (abs) min: 1 max: 2 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.02% max: 5.41% x̄: 2.20% x̃: 2.16% HURT stats (abs) min: 2 max: 6 x̄: 3.20 x̃: 3 HURT stats (rel) min: 2.44% max: 16.67% x̄: 9.39% x̃: 12.50% 95% mean confidence interval for instructions value: -1.22 1.49 95% mean confidence interval for instructions %-change: -2.08% 5.41% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 473515330 -> 473515288 (<.01%) cycles in affected programs: 67146 -> 67104 (-0.06%) helped: 10 HURT: 7 helped stats (abs) min: 1 max: 36 x̄: 15.90 x̃: 17 helped stats (rel) min: 0.01% max: 1.29% x̄: 0.66% x̃: 0.89% HURT stats (abs) min: 1 max: 48 x̄: 16.71 x̃: 4 HURT stats (rel) min: 0.08% max: 1.94% x̄: 0.87% x̃: 0.19% 95% mean confidence interval for cycles value: -13.88 8.94 95% mean confidence interval for cycles %-change: -0.56% 0.49% Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	fe3c518277	nir/algebraic: Don't add reordered version of patterns for commutative instructions The reordered are automatically considered by nir_algebraic rules for commutative instructions. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	314a40c902	Revert "nir: Replace an odd comparison involving fmin of -b2f" I originally noticed that `3b30814791` ("nir/algebraic: Optimize 1-bit Booleans") caused this pattern to no longer be matched by incorrectly replacing b@32 with b@1. Making that correct had no effect on shader-db. When this pattern originally was added, it only affected 4 shaders, so it's not worth the effort to debug further. This reverts commit `f50400cc80`. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	aec0547838	nir/algebraic: Make some notes about comparison rearrangements versus infinity The original comment was a little terse and a little incorrect. The rearrangements are fine w.r.t. NaN. However, they produce incorrect results if one operand is +Inf and the other is -Inf. A later commit, "nir/algebraic: Add some compare-with-zero optimizations that are exact", will add some more patterns here. It may be reasonable to squash this commit (forward) into that commit. v2: Fix some incorrect comparisons operators in the comment (<= vs >=). Add commentary that subtraction works like addition w.r.t. NaN. Both noticed / suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	363efc2823	nir: Make some notes about fsign versus NaN This commit only documents the current behavior, even if that behavior is not the behavior preferred by the relevant specs. In SPIR-V, there are two flavors of the sign instruction, and each lives in an extended instruction set. The GLSL.std.450 FSign instruction is defined as: Result is 1.0 if x > 0, 0.0 if x = 0, or -1.0 if x < 0. This also matches the GLSL 4.60 definition. However, the OpenCL.ExtendedInstructionSet.100 sign instruction is defined as: Returns 1.0 if x > 0, -0.0 if x = -0.0, +0.0 if x = +0.0, or -1.0 if x < 0. Returns 0.0 if x is a NaN. There are two differences. Each treats -0.0 differently, and each also treats NaN differently. Specifically, GLSL.std.450 FSign does not define any specific behavior for NaN. There has been some discussion in Khronos about the NaN behavior of GLSL.std.450 FSign. As part of that discussion, I did some research into how we treat NaN for nir_op_fsign, and this commit just captures some of those notes. v2: Document the expected behavior of nir_op_fsign more thoroughly. Suggested by Rhys. Note that the current implementation of constant folding does not produce the expected result for NaN. Suggested by Caio. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Jesse Natalie	3f77901342	nir: Add an algebraic optimization for float->double->float As part of this series, it removes the need for float->double conversion, just to be able to print a single float. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8254>	2020-12-29 09:07:24 +10:00
Kenneth Graunke	531843cf2e	nir/algebraic: Avoid creating new fp64 ops when using softfp64 In commit `00b28a50b2`, Marek extended a number of optimizations that had been 32-bit specific to work on other bit-sizes. Most optimizations preserve the data type across the transformation. In other words, an optimization which generates e.g. fp64 operations only does so when the source expression also contains fp64 operations. These transformations are fine with respect to lowering, because we will lower away all expressions that would trigger the search portion of the expression, and so we'd never apply those rules. However, a few of the rules create new operations that run afoul of lowering passes. For example, ('bcsel', a, 1.0, 0.0) => ('b2f', a) where the result is a double would simply be a selection between two different 64-bit constants. The replacement expression, on the other hand, involves a nir_op_b2f64 ALU operation. If we're run after nir_lower_doubles, then it may not be legal to generate such an expression anymore (at least without running lowering again, which we don't do today). Regressions due to this are blocking the 20.3 release, so for now, we take the easy route and simply disallow those few rules when doing full softfp64 lowering, which fixes the immediate problem. But it doesn't solve the long-term problem in an extensible manner. In the future, we may want to add a `lowered_alu_ops` bitfield to the NIR shader, and as lowering passes are run, mark them as taboo. Then, we could have each algebraic transformation track which operations it creates in the replacement expression. With both of those in place, nir_replace_instr could compare the transformation's list of ALU ops against `lowered_alu_ops` and implicitly skip rules that generate forbidden ALU operations. 
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3504 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7841>	2020-12-01 06:29:31 +00:00
Daniel Schürmann	0ef5f3552f	nir: add strength reduction pattern for imod/irem with pow2 divisor. Affected games are Detroit : Become Human and Doom : Eternal. Totals from 6262 (4.54% of 138013) affected shaders (RAVEN): SGPRs: 678472 -> 678640 (+0.02%) VGPRs: 498288 -> 498360 (+0.01%) CodeSize: 67064196 -> 65926000 (-1.70%) MaxWaves: 19390 -> 19382 (-0.04%) Instrs: 13175372 -> 12932517 (-1.84%) Cycles: 1444043256 -> 1443022576 (-0.07%); split: -0.08%, +0.01% VMEM: 929560 -> 908726 (-2.24%); split: +0.39%, -2.63% SMEM: 406207 -> 400062 (-1.51%); split: +0.46%, -1.97% VClause: 215168 -> 215031 (-0.06%) SClause: 443312 -> 442324 (-0.22%); split: -0.25%, +0.03% Copies: 1350793 -> 1344326 (-0.48%); split: -0.52%, +0.04% Branches: 506432 -> 506370 (-0.01%); split: -0.02%, +0.01% PreSGPRs: 619652 -> 619619 (-0.01%) PreVGPRs: 473212 -> 473168 (-0.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/175>	2020-11-13 15:59:03 +01:00
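The idea behind this strength reduction can be sketched with Python integers. The commit message does not show the patterns themselves, so the bit-mask forms below are one possible formulation and not the exact rules added to Mesa; `irem`, `imod_pow2` and `irem_pow2` are hypothetical helpers, and the divisor is assumed to be a positive power of two.

```python
def irem(a, b):
    # Reference truncated remainder: the result takes the sign of the dividend.
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q

def imod_pow2(a, d):
    # Floored modulo by a positive power of two is a single bit mask.
    return a & (d - 1)

def irem_pow2(a, d):
    # Truncated remainder: mask, then shift a nonzero result into the
    # negative range when the dividend is negative.
    m = a & (d - 1)
    return m - d if (a < 0 and m != 0) else m

print(imod_pow2(-7, 4), irem_pow2(-7, 4))  # 1 -3
```

Python's `%` already implements floored modulo for a positive divisor, so it serves as the reference for the `imod` case.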
Samuel Pitoiset	1aa1c1aec2	nir/algebraic: optimize bitfield_select(a, iand(a, b), c) fossils-db (Vega10): Totals from 242 (0.17% of 139517) affected shaders: CodeSize: 853752 -> 852752 (-0.12%) Instrs: 165944 -> 165694 (-0.15%) Cycles: 855720 -> 854528 (-0.14%) VMEM: 83772 -> 83668 (-0.12%); split: +0.13%, -0.25% SMEM: 12360 -> 12316 (-0.36%) SClause: 8222 -> 8238 (+0.19%) Only helps Control. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7531>	2020-11-11 15:28:01 +01:00
Samuel Pitoiset	1c5271346a	nir/algebraic: optimize bitfield_select(a, b, 0) to iand(a, b) (src0 & src1) | (~src0 & src2) to (src0 & src1). fossils-db (Polaris10): Totals from 873 (0.63% of 138014) affected shaders: SGPRs: 33781 -> 33733 (-0.14%) VGPRs: 37704 -> 37520 (-0.49%); split: -0.51%, +0.02% CodeSize: 3861460 -> 3853424 (-0.21%); split: -0.21%, +0.00% MaxWaves: 5306 -> 5305 (-0.02%) Instrs: 743798 -> 743486 (-0.04%); split: -0.04%, +0.00% Cycles: 10962244 -> 10960936 (-0.01%); split: -0.01%, +0.00% VMEM: 128309 -> 128350 (+0.03%); split: +0.33%, -0.30% SMEM: 44797 -> 44113 (-1.53%); split: +0.02%, -1.54% Copies: 71875 -> 71674 (-0.28%); split: -0.31%, +0.03% PreSGPRs: 23484 -> 23479 (-0.02%) PreVGPRs: 34582 -> 34529 (-0.15%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7479>	2020-11-09 19:51:27 +00:00
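Both bitfield_select commits rest on simple Boolean identities, which can be checked on 32-bit values. This sketch uses the definition quoted in the message, `(src0 & src1) | (~src0 & src2)`. The replacement for `bitfield_select(a, iand(a, b), c)` is not stated in the log; the code assumes it is `bitfield_select(a, b, c)`, which is at least a valid identity.

```python
import random

MASK32 = 0xFFFFFFFF

def bitfield_select(src0, src1, src2):
    # Take bits from src1 where src0 is set, and from src2 elsewhere.
    return ((src0 & src1) | (~src0 & src2)) & MASK32

rng = random.Random(0)
for _ in range(1000):
    a, b, c = (rng.getrandbits(32) for _ in range(3))
    # With src2 == 0 the second term vanishes, leaving a plain AND.
    assert bitfield_select(a, b, 0) == a & b
    # Masking src1 with src0 first is redundant (assumed replacement).
    assert bitfield_select(a, a & b, c) == bitfield_select(a, b, c)
print("identities hold")
```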
Samuel Pitoiset	77d6fda0f5	nir/algebraic: distribute imul(iadd(a, b), c) when b and c are constants This distributes imul(iadd(a, b), c) to iadd(imul(a, c), b * c) when both b and c are constants. This might allow some compiler backends to create more MADs. For ACO, this allows combining more DS additions. fossils-db (Vega10): Totals from 673 (0.49% of 136546) affected shaders: VGPRs: 44548 -> 44516 (-0.07%); split: -0.11%, +0.04% CodeSize: 8301552 -> 8286220 (-0.18%); split: -0.19%, +0.01% MaxWaves: 2731 -> 2735 (+0.15%); split: +0.26%, -0.11% Instrs: 1642684 -> 1638725 (-0.24%); split: -0.24%, +0.00% Cycles: 20846156 -> 20793444 (-0.25%); split: -0.25%, +0.00% VMEM: 108870 -> 108106 (-0.70%); split: +0.03%, -0.73% SMEM: 35718 -> 35674 (-0.12%); split: +0.22%, -0.34% VClause: 20603 -> 20622 (+0.09%); split: -0.01%, +0.10% SClause: 48527 -> 48539 (+0.02%) Copies: 156735 -> 156742 (+0.00%); split: -0.05%, +0.05% PreSGPRs: 43169 -> 43166 (-0.01%); split: -0.02%, +0.02% PreVGPRs: 41369 -> 41330 (-0.09%) shader-db results on Intel: Ice Lake total instructions in shared programs: 20027588 -> 20027446 (<.01%) instructions in affected programs: 71766 -> 71624 (-0.20%) helped: 70 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 2.03 x̃: 1 helped stats (rel) min: 0.10% max: 2.50% x̄: 0.29% x̃: 0.15% 95% mean confidence interval for instructions value: -2.42 -1.64 95% mean confidence interval for instructions %-change: -0.38% -0.20% Instructions are helped. total cycles in shared programs: 977525222 -> 977494323 (<.01%) cycles in affected programs: 8884593 -> 8853694 (-0.35%) helped: 56 HURT: 16 helped stats (abs) min: 2 max: 7852 x̄: 681.29 x̃: 400 helped stats (rel) min: <.01% max: 19.84% x̄: 2.79% x̃: 0.41% HURT stats (abs) min: 2 max: 1212 x̄: 453.31 x̃: 120 HURT stats (rel) min: 0.05% max: 1.09% x̄: 0.32% x̃: 0.11% 95% mean confidence interval for cycles value: -802.75 -55.56 95% mean confidence interval for cycles %-change: -3.19% -1.01% Cycles are helped.
total sends in shared programs: 1032273 -> 1032272 (<.01%) sends in affected programs: 41 -> 40 (-2.44%) helped: 1 HURT: 0 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7445>	2020-11-06 07:49:02 +00:00
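The distribution of imul over iadd is exact for wrapping integer arithmetic, which is what makes it safe as an algebraic rule. This is a minimal sketch modelling 32-bit wraparound with a mask; `iadd`, `imul`, `original` and `distributed` are illustrative Python stand-ins and not the NIR opcodes themselves.

```python
import random

MASK32 = 0xFFFFFFFF

def iadd(a, b):
    return (a + b) & MASK32   # 32-bit wrapping add

def imul(a, b):
    return (a * b) & MASK32   # 32-bit wrapping multiply

def original(a, b, c):
    return imul(iadd(a, b), c)

def distributed(a, b, c):
    # imul(b, c) stands for the b * c product, which can be folded at
    # compile time because b and c are constants.
    return iadd(imul(a, c), imul(b, c))

rng = random.Random(1)
for _ in range(1000):
    a, b, c = (rng.getrandbits(32) for _ in range(3))
    assert original(a, b, c) == distributed(a, b, c)
print(distributed(5, 3, 4))  # 32
```

After the rewrite, the remaining `imul(a, c) + constant` shape is what a backend can match as a multiply-add.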
Rhys Perry	89c4bba8bc	nir/algebraic: better propagate constants up fadd chains Make the optimization create more mad-friendly code if the order of the fadd's operands is unlucky. fossil-db (Navi): Totals from 9259 (8.07% of 114665) affected shaders: SGPRs: 615991 -> 616191 (+0.03%); split: -0.05%, +0.08% VGPRs: 442184 -> 443568 (+0.31%); split: -0.10%, +0.41% CodeSize: 32674876 -> 32625572 (-0.15%); split: -0.17%, +0.02% MaxWaves: 108560 -> 108152 (-0.38%); split: +0.07%, -0.44% Instrs: 6126473 -> 6120463 (-0.10%); split: -0.13%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>	2020-11-03 14:56:00 +00:00
Ian Romanick	67956689bb	nir: Rename replicated-result dot-product instructions All these instructions replicate the result of a N-component dot-product to a vec4. Naming them fdot_replicatedN gives the impression that they are some sort of abstract dot-product that replicates the result to a vecN. They also deviate from fdph_replicated... which nobody would reasonably consider naming fdot_replicatedh. Naming these opcodes fdotN_replicated more closely matches what they are, and it matches the pattern of fdph_replicated. I believe that the only reason these opcodes were named this way was because it simplified the implementation of the binop_reduce function in nir_opcodes.py. I made some fairly simple changes to that function, and I think the end result is ok. The bulk of the changes come from the sed rename: sed --in-place -e 's/fdot_replicated\([234]\)/fdot\1_replicated/g' \ $(grep -r 'fdot_replicated[234]' src/) v2: Use a named parameter to binop_reduce instead of using isinstance(name, str). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>	2020-10-22 18:00:19 +00:00
Daniel Schürmann	f503699e10	nir/opt_algebraic: optimize unpack_half_2x16_split_x(ushr, a, 16) Same as extract_u16(a, 1) Totals from 2021 (1.48% of 136546) affected shaders (RAVEN): VGPRs: 129516 -> 129524 (+0.01%); split: -0.00%, +0.01% CodeSize: 12485704 -> 12486600 (+0.01%); split: -0.00%, +0.01% Instrs: 2435041 -> 2434999 (-0.00%); split: -0.00%, +0.00% Cycles: 20952552 -> 20952624 (+0.00%); split: -0.00%, +0.00% VMEM: 374492 -> 374212 (-0.07%); split: +0.01%, -0.08% SMEM: 123309 -> 123291 (-0.01%); split: +0.00%, -0.02% VClause: 64156 -> 64164 (+0.01%) Copies: 191620 -> 191616 (-0.00%); split: -0.03%, +0.03% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>	2020-10-14 15:31:38 +00:00
Jose Maria Casanova Crespo	e7127b3468	nir/algebraic: optimize iand/ior of (n)eq zero when umax/umin not available Before `8e1b75b330` ("nir/algebraic: optimize iand/ior of (n)eq zero") this optimization didn't need the use of umax/umin. VC4 HW supports only signed integer max/min operations. lower_umin and lower_umax are added to allow enabling the previous optimization behaviour for these cases. Fixes: `8e1b75b330` ("nir/algebraic: optimize iand/ior of (n)eq zero") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7083>	2020-10-10 13:16:37 +02:00
Marek Olšák	1e7d82c881	nir/algebraic: always lower idiv to shifts if bitops are allowed why would you want anything else The only platform significantly affected by this is Intel where `lower_idiv` is not set today but neither is `lower_bitops`. There it seems to still be a boon over-all. Shader-db results on Ice Lake: total instructions in shared programs: 19719051 -> 19735766 (0.08%) instructions in affected programs: 106992 -> 123707 (15.62%) helped: 0 HURT: 445 HURT stats (abs) min: 3 max: 295 x̄: 37.56 x̃: 44 HURT stats (rel) min: 0.16% max: 33.33% x̄: 19.60% x̃: 19.38% 95% mean confidence interval for instructions value: 33.60 41.53 95% mean confidence interval for instructions %-change: 18.97% 20.23% Instructions are HURT. total loops in shared programs: 5973 -> 5973 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 489405810 -> 486917482 (-0.51%) cycles in affected programs: 4759097 -> 2270769 (-52.29%) helped: 406 HURT: 34 helped stats (abs) min: 2 max: 64661 x̄: 6291.95 x̃: 3126 helped stats (rel) min: 0.02% max: 79.42% x̄: 43.32% x̃: 55.83% HURT stats (abs) min: 2 max: 29376 x̄: 1947.12 x̃: 30 HURT stats (rel) min: 0.04% max: 23.82% x̄: 4.66% x̃: 1.33% 95% mean confidence interval for cycles value: -6753.06 -4557.52 95% mean confidence interval for cycles %-change: -42.60% -36.63% Cycles are helped. total spills in shared programs: 12481 -> 12482 (<.01%) spills in affected programs: 47 -> 48 (2.13%) helped: 0 HURT: 1 total fills in shared programs: 12816 -> 12819 (0.02%) fills in affected programs: 71 -> 74 (4.23%) helped: 0 HURT: 1 total sends in shared programs: 1010124 -> 1010124 (0.00%) sends in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 1 GAINED: 0 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6963>	2020-10-07 10:50:53 -04:00
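A signed division by a power of two can be expressed with a shift plus sign handling, which is the kind of lowering this commit enables. The form below, `isign(a) * (iabs(a) >> log2(b))`, is an illustrative assumption and may differ from the exact pattern in nir_opt_algebraic.py. It also ignores the fixed-width edge case of the most negative integer, since Python integers are unbounded.

```python
def idiv_trunc(a, b):
    # Reference signed division that rounds toward zero.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

def idiv_pow2(a, b):
    # b is assumed to be a positive power of two.
    shift = b.bit_length() - 1       # log2(b)
    sign = (a > 0) - (a < 0)         # isign(a): -1, 0 or 1
    return sign * (abs(a) >> shift)  # shift the magnitude, restore the sign

print(idiv_pow2(-7, 4), idiv_pow2(7, 4))  # -1 1
```

A bare arithmetic shift of a negative value would round toward negative infinity (`-7 >> 2` is `-2`), which is why the magnitude is shifted instead.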
Kenneth Graunke	140f53e646	Revert "nir: replace lower_ffma and fuse_ffma with has_ffma" This reverts commit `939ddf3f67`. Intel has a separate pass for fusing FFMAs selectively. We split these flags in commit `1b72c31e1f` and the reasoning still stands. The patch being reverted was just a cleanup, so there should be no issue with reverting it. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849>	2020-09-24 13:11:50 -07:00
Marek Olšák	939ddf3f67	nir: replace lower_ffma and fuse_ffma with has_ffma Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	771aad3027	nir: split lower_ffma into lower_ffma16/32/64 AMD wants different behavior for each bit size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	21174dedec	nir: split fuse_ffma into fuse_ffma16/32/64 AMD wants different behavior for each bit size Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756>	2020-09-24 12:29:11 +00:00
Marek Olšák	57bf4c2028	nir,radeonsi: move ffma fusing to late optimizations for better codegen The freedreno trace changes were suggested by Rob Clark. ALU performance is higher, because ffma is used more often, but so is register usage, because trinary opcodes (such as ffma) usually need at least 3 live registers. 54793 shaders in 33659 tests Totals: SGPRS: 2639746 -> 2642938 (0.12 %) VGPRS: 1534120 -> 1536392 (0.15 %) Spilled SGPRs: 3541 -> 3618 (2.17 %) Spilled VGPRs: 33 -> 44 (33.33 %) Scratch size: 292 -> 312 (6.85 %) dwords per thread Code Size: 55639836 -> 55620116 (-0.04 %) bytes Max Waves: 964785 -> 963977 (-0.08 %) Totals from affected shaders: SGPRS: 1105800 -> 1108992 (0.29 %) VGPRS: 635292 -> 637564 (0.36 %) Spilled SGPRs: 3193 -> 3270 (2.41 %) Spilled VGPRs: 33 -> 44 (33.33 %) Scratch size: 36 -> 56 (55.56 %) dwords per thread Code Size: 31568708 -> 31548988 (-0.06 %) bytes Max Waves: 319991 -> 319183 (-0.25 %) Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6596>	2020-09-16 02:39:02 +00:00
Italo Nicola	00914e2179	nir/algebraic: fold some nested comparisons with ball and bany Signed-off-by: Italo Nicola <italonicola@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6604>	2020-09-14 17:47:39 +00:00
Marek Olšák	50d335804f	nir/algebraic: add late optimizations that optimize out mediump conversions (v3) v2: move 2mp patterns to the end of late_optimizations v3: remove ftrunc from the optimizations to fix: dEQP-GLES3.functional.shaders.builtin_functions.common.modf.vec2_lowp_vertex Reviewed-by: Rob Clark <robdclark@chromium.org> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	b86305bb57	nir/algebraic: collapse conversion opcodes (many patterns) mediump inserts a lot of conversions. This cleans up the IR. All other combinations are covered too. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	cdd498bbe8	nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp Algebraic optimizations will select them. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	3d3df8dbff	nir: remove redundant opcode u2ump Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	26fc5e1f4a	nir/algebraic: expand existing 32-bit patterns to all bit sizes using loops Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	3c8934a644	nir/algebraic: add flrp patterns for 16 and 64 bits Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	a7ece63de9	nir/algebraic: add 16-bit versions of a few 32-bit patterns Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>	2020-09-04 17:06:22 +00:00
Marek Olšák	00b28a50b2	nir/algebraic: trivially enable existing 32-bit patterns for all bit sizes Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>	2020-09-04 17:06:22 +00:00
Eric Anholt	479d9c97eb	nir: Add simplistic lowering for bany_equal/ball_inequal. It would be nice if we could do swizzling of an expression on the replacement side so that we could have a single ieq/ine of the vector after CSE. However, if you do want vector operations, nir_opt_vectorize() does just fine. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6567>	2020-09-02 09:58:44 -07:00
Samuel Pitoiset	bc123c396a	nir/algebraic: mark some optimizations with fsat(NaN) as inexact If a is NaN, fsat(NaN) is expected to be 0 and some optimizations should be marked as inexact. Fixes a GPU hang with Death Stranding and RADV/ACO (RADV/LLVM isn't affected because it lowers fsat). No fossils-db change. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3368 Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6519>	2020-09-01 11:20:03 +02:00
Jesse Natalie	d91f85f16e	nir: Remove 32bit restriction for uadd_carry optimization Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6313>	2020-08-27 16:57:42 +00:00
Daniel Schürmann	a79dad950b	nir,amd: remove trinary_minmax opcodes These consist of the variations nir_op_{i|u|f}{min|max|med}3 which are either lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>	2020-08-24 20:56:11 +00:00
Erik Faye-Lund	5e841e8b4f	nir: add iabs-lowering code Microsoft's DXIL is based on LLVM, which doesn't have an integer ABS opcode, but instead needs it lowered to NEG + MAX. We need to do this with an option, to prevent an already existing optimization rule from undoing this. Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5211>	2020-08-24 10:02:47 +00:00
Karol Herbst	e5899c1e88	nir: rename nir_op_fne to nir_op_fneu It was always fneu but naming it fne causes confusion from time to time. So let's rename it. Later we also want to add other unordered and fne, this is a smaller preparation for that. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>	2020-08-21 17:26:21 +00:00
Boris Brezillon	18e464cfc0	compiler/nir: Add new flags to lower pack/unpack split instructions And add new rules to do this lowering in nir_opt_algebraic.py. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6309>	2020-08-17 19:46:10 +00:00
Jesse Natalie	a1ed83fddd	nir: Optimize mask+downcast to just downcast Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6330>	2020-08-17 14:36:18 +00:00
Daniel Schürmann	5f79e4e69a	nir/algebraic: fold some nested bcsel Totals from 14266 (10.62% of 134368) affected shaders (Polaris): SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13% VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17% SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09% CodeSize: 30133000 -> 29949780 (-0.61%); split: -0.66%, +0.05% MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01% Instrs: 5845085 -> 5841668 (-0.06%); split: -0.08%, +0.03% Cycles: 69033140 -> 68889188 (-0.21%); split: -0.22%, +0.01% VMEM: 8479021 -> 8474978 (-0.05%); split: +0.03%, -0.08% SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18% VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01% SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02% Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32% Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01% PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09% PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	27244662f2	nir/algebraic: propagate b2i out of ior/iand Totals from 761 (0.57% of 134368) affected shaders (Polaris): SGPRs: 29496 -> 29488 (-0.03%) SpillSGPRs: 41 -> 43 (+4.88%) CodeSize: 1922036 -> 1882408 (-2.06%); split: -2.08%, +0.02% Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02% Cycles: 7692516 -> 7661216 (-0.41%); split: -0.41%, +0.01% VMEM: 365175 -> 365172 (-0.00%) VClause: 15324 -> 15322 (-0.01%) SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01% Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20% Branches: 7020 -> 7033 (+0.19%) PreSGPRs: 22103 -> 22106 (+0.01%) PreVGPRs: 26518 -> 26515 (-0.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	baee5a9812	nir/algebraic: add distributive rules for ior/iand Totals from 581 (0.43% of 134368) affected shaders (Polaris): CodeSize: 1389560 -> 1386488 (-0.22%) Instrs: 264488 -> 263984 (-0.19%) Cycles: 1057952 -> 1055936 (-0.19%) VMEM: 296016 -> 291613 (-1.49%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00
Daniel Schürmann	70d3efeb88	nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a) Totals from affected shaders: (VEGA) SGPRS: 13920 -> 13920 (0.00 %) VGPRS: 10252 -> 10252 (0.00 %) Spilled SGPRs: 62 -> 62 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 587648 -> 587224 (-0.07 %) bytes LDS: 5 -> 5 (0.00 %) blocks Max Waves: 1489 -> 1489 (0.00 %) Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830>	2020-07-20 15:56:46 +00:00

394 commits