fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-15 20:48:13 +02:00

Author	SHA1	Message	Date
Ian Romanick	5544b2cbbd	nir/algebraic: Use value range analysis to eliminate useless unary ops Sandy Bridge is the big winner because it lies at something of a crossroads. It supports a fairly high OpenGL version, and it still has the old style math box. The high OpenGL version means a lot more shaders can run on it. The old style math box means extra moves are necessary to resolve source modifiers on operands to complex math instructions like COS, SQRT, and RCP. v2: Remove a couple patterns that are now redundant. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16282006 -> 16278207 (-0.02%) instructions in affected programs: 174555 -> 170756 (-2.18%) helped: 661 HURT: 0 helped stats (abs) min: 1 max: 36 x̄: 5.75 x̃: 3 helped stats (rel) min: 0.06% max: 23.68% x̄: 2.81% x̃: 1.94% 95% mean confidence interval for instructions value: -6.16 -5.34 95% mean confidence interval for instructions %-change: -3.02% -2.60% Instructions are helped. total cycles in shared programs: 367168597 -> 367134284 (<.01%) cycles in affected programs: 1105276 -> 1070963 (-3.10%) helped: 460 HURT: 150 helped stats (abs) min: 1 max: 568 x̄: 96.60 x̃: 82 helped stats (rel) min: 0.02% max: 32.50% x̄: 7.99% x̃: 4.27% HURT stats (abs) min: 1 max: 901 x̄: 67.49 x̃: 39 HURT stats (rel) min: 0.07% max: 20.00% x̄: 4.90% x̃: 4.22% 95% mean confidence interval for cycles value: -65.68 -46.82 95% mean confidence interval for cycles %-change: -5.59% -4.05% Cycles are helped. Sandy Bridge total instructions in shared programs: 10824272 -> 10802557 (-0.20%) instructions in affected programs: 1237988 -> 1216273 (-1.75%) helped: 8199 HURT: 0 helped stats (abs) min: 1 max: 41 x̄: 2.65 x̃: 2 helped stats (rel) min: 0.12% max: 20.00% x̄: 2.04% x̃: 1.73% 95% mean confidence interval for instructions value: -2.70 -2.59 95% mean confidence interval for instructions %-change: -2.07% -2.00% Instructions are helped. total cycles in shared programs: 154009894 -> 153843598 (-0.11%) cycles in affected programs: 10650486 -> 10484190 (-1.56%) helped: 4973 HURT: 1533 helped stats (abs) min: 1 max: 3904 x̄: 40.20 x̃: 20 helped stats (rel) min: 0.02% max: 41.72% x̄: 2.63% x̃: 1.67% HURT stats (abs) min: 1 max: 453 x̄: 21.94 x̃: 8 HURT stats (rel) min: 0.02% max: 41.91% x̄: 1.54% x̃: 0.58% 95% mean confidence interval for cycles value: -28.02 -23.10 95% mean confidence interval for cycles %-change: -1.74% -1.56% Cycles are helped. LOST: 0 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8135196 -> 8134888 (<.01%) instructions in affected programs: 31920 -> 31612 (-0.96%) helped: 169 HURT: 0 helped stats (abs) min: 1 max: 12 x̄: 1.82 x̃: 2 helped stats (rel) min: 0.43% max: 3.23% x̄: 1.23% x̃: 1.16% 95% mean confidence interval for instructions value: -2.01 -1.64 95% mean confidence interval for instructions %-change: -1.32% -1.15% Instructions are helped. total cycles in shared programs: 188575724 -> 188574092 (<.01%) cycles in affected programs: 406840 -> 405208 (-0.40%) helped: 169 HURT: 0 helped stats (abs) min: 4 max: 72 x̄: 9.66 x̃: 10 helped stats (rel) min: 0.07% max: 2.16% x̄: 0.57% x̃: 0.47% 95% mean confidence interval for cycles value: -10.72 -8.59 95% mean confidence interval for cycles %-change: -0.63% -0.50% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:14 -07:00
Ian Romanick	8d14380971	nir/algebraic: Use value range analysis to convert fmin to fsat All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16297320 -> 16282006 (-0.09%) instructions in affected programs: 2434498 -> 2419184 (-0.63%) helped: 8091 HURT: 1 helped stats (abs) min: 1 max: 51 x̄: 1.89 x̃: 2 helped stats (rel) min: 0.04% max: 14.29% x̄: 0.98% x̃: 0.95% HURT stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7 HURT stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28% 95% mean confidence interval for instructions value: -1.94 -1.85 95% mean confidence interval for instructions %-change: -0.99% -0.96% Instructions are helped. total cycles in shared programs: 367221624 -> 367168597 (-0.01%) cycles in affected programs: 126409635 -> 126356608 (-0.04%) helped: 5612 HURT: 1023 helped stats (abs) min: 1 max: 2332 x̄: 31.11 x̃: 16 helped stats (rel) min: <.01% max: 30.31% x̄: 1.69% x̃: 1.42% HURT stats (abs) min: 1 max: 2372 x̄: 118.84 x̃: 16 HURT stats (rel) min: <.01% max: 46.98% x̄: 1.46% x̃: 0.35% 95% mean confidence interval for cycles value: -11.52 -4.46 95% mean confidence interval for cycles %-change: -1.26% -1.14% Cycles are helped. total spills in shared programs: 8868 -> 8870 (0.02%) spills in affected programs: 28 -> 30 (7.14%) helped: 0 HURT: 1 total fills in shared programs: 21903 -> 21904 (<.01%) fills in affected programs: 42 -> 43 (2.38%) helped: 0 HURT: 1 Haswell total instructions in shared programs: 13353925 -> 13338728 (-0.11%) instructions in affected programs: 2265850 -> 2250653 (-0.67%) helped: 8127 HURT: 5 helped stats (abs) min: 1 max: 51 x̄: 1.88 x̃: 2 helped stats (rel) min: 0.04% max: 20.00% x̄: 1.13% x̃: 1.07% HURT stats (abs) min: 5 max: 16 x̄: 9.00 x̃: 6 HURT stats (rel) min: 0.19% max: 0.52% x̄: 0.35% x̃: 0.28% 95% mean confidence interval for instructions value: -1.91 -1.83 95% mean confidence interval for instructions %-change: -1.15% -1.11% Instructions are helped. total cycles in shared programs: 375535444 -> 375536343 (<.01%) cycles in affected programs: 131206582 -> 131207481 (<.01%) helped: 5590 HURT: 1055 helped stats (abs) min: 1 max: 2844 x̄: 34.15 x̃: 16 helped stats (rel) min: <.01% max: 21.57% x̄: 2.08% x̃: 1.60% HURT stats (abs) min: 1 max: 2487 x̄: 181.78 x̃: 21 HURT stats (rel) min: <.01% max: 40.66% x̄: 1.96% x̃: 0.37% 95% mean confidence interval for cycles value: -4.74 5.01 95% mean confidence interval for cycles %-change: -1.51% -1.37% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 23401 -> 23407 (0.03%) spills in affected programs: 248 -> 254 (2.42%) helped: 2 HURT: 5 total fills in shared programs: 34850 -> 34845 (-0.01%) fills in affected programs: 383 -> 378 (-1.31%) helped: 2 HURT: 5 Ivy Bridge total instructions in shared programs: 11975423 -> 11968117 (-0.06%) instructions in affected programs: 845703 -> 838397 (-0.86%) helped: 4071 HURT: 0 helped stats (abs) min: 1 max: 51 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.08% max: 8.21% x̄: 1.04% x̃: 0.93% 95% mean confidence interval for instructions value: -1.87 -1.71 95% mean confidence interval for instructions %-change: -1.06% -1.02% Instructions are helped. total cycles in shared programs: 179674318 -> 179635552 (-0.02%) cycles in affected programs: 5100065 -> 5061299 (-0.76%) helped: 2650 HURT: 611 helped stats (abs) min: 1 max: 900 x̄: 21.85 x̃: 16 helped stats (rel) min: <.01% max: 21.55% x̄: 2.39% x̃: 1.40% HURT stats (abs) min: 1 max: 1841 x̄: 31.33 x̃: 6 HURT stats (rel) min: <.01% max: 58.71% x̄: 1.64% x̃: 0.37% 95% mean confidence interval for cycles value: -14.14 -9.64 95% mean confidence interval for cycles %-change: -1.75% -1.52% Cycles are helped. LOST: 3 GAINED: 7 Sandy Bridge total instructions in shared programs: 10828844 -> 10824272 (-0.04%) instructions in affected programs: 525678 -> 521106 (-0.87%) helped: 2386 HURT: 0 helped stats (abs) min: 1 max: 51 x̄: 1.92 x̃: 2 helped stats (rel) min: 0.11% max: 7.96% x̄: 1.05% x̃: 0.94% 95% mean confidence interval for instructions value: -2.04 -1.80 95% mean confidence interval for instructions %-change: -1.08% -1.03% Instructions are helped. total cycles in shared programs: 154024591 -> 154009894 (<.01%) cycles in affected programs: 4005766 -> 3991069 (-0.37%) helped: 1245 HURT: 506 helped stats (abs) min: 1 max: 585 x̄: 21.07 x̃: 16 helped stats (rel) min: 0.02% max: 11.57% x̄: 1.98% x̃: 0.83% HURT stats (abs) min: 1 max: 639 x̄: 22.81 x̃: 6 HURT stats (rel) min: 0.01% max: 26.21% x̄: 1.07% x̃: 0.26% 95% mean confidence interval for cycles value: -10.57 -6.21 95% mean confidence interval for cycles %-change: -1.23% -0.97% Cycles are helped. GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8137248 -> 8135196 (-0.03%) instructions in affected programs: 148322 -> 146270 (-1.38%) helped: 992 HURT: 0 helped stats (abs) min: 1 max: 32 x̄: 2.07 x̃: 2 helped stats (rel) min: 0.41% max: 9.73% x̄: 1.74% x̃: 1.51% 95% mean confidence interval for instructions value: -2.16 -1.98 95% mean confidence interval for instructions %-change: -1.80% -1.67% Instructions are helped. total cycles in shared programs: 188583424 -> 188575724 (<.01%) cycles in affected programs: 4409620 -> 4401920 (-0.17%) helped: 956 HURT: 6 helped stats (abs) min: 2 max: 168 x̄: 8.09 x̃: 8 helped stats (rel) min: 0.04% max: 6.76% x̄: 0.27% x̃: 0.18% HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6 HURT stats (rel) min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10% 95% mean confidence interval for cycles value: -8.41 -7.60 95% mean confidence interval for cycles %-change: -0.29% -0.25% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:14 -07:00
Ian Romanick	b77070e293	nir/algebraic: Use value range analysis to eliminate tautological compares It's only one application on one platform (Haswell) that's affected, but spills and fills increase quite dramatically. :( All Gen8+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16320850 -> 16297320 (-0.14%) instructions in affected programs: 448012 -> 424482 (-5.25%) helped: 1938 HURT: 0 helped stats (abs) min: 2 max: 264 x̄: 12.14 x̃: 10 helped stats (rel) min: 0.35% max: 43.75% x̄: 5.85% x̃: 5.38% 95% mean confidence interval for instructions value: -12.80 -11.48 95% mean confidence interval for instructions %-change: -5.99% -5.72% Instructions are helped. total cycles in shared programs: 367496943 -> 367221624 (-0.07%) cycles in affected programs: 8557232 -> 8281913 (-3.22%) helped: 1907 HURT: 26 helped stats (abs) min: 4 max: 12802 x̄: 147.21 x̃: 48 helped stats (rel) min: 0.03% max: 75.85% x̄: 5.55% x̃: 3.94% HURT stats (abs) min: 4 max: 1870 x̄: 208.23 x̃: 20 HURT stats (rel) min: 0.16% max: 32.11% x̄: 8.31% x̃: 0.79% 95% mean confidence interval for cycles value: -165.38 -119.48 95% mean confidence interval for cycles %-change: -5.68% -5.04% Cycles are helped. LOST: 1 GAINED: 0 Haswell total instructions in shared programs: 13374211 -> 13353925 (-0.15%) instructions in affected programs: 349868 -> 329582 (-5.80%) helped: 1669 HURT: 1 helped stats (abs) min: 1 max: 264 x̄: 12.57 x̃: 10 helped stats (rel) min: 0.12% max: 46.81% x̄: 6.86% x̃: 6.49% HURT stats (abs) min: 700 max: 700 x̄: 700.00 x̃: 700 HURT stats (rel) min: 64.34% max: 64.34% x̄: 64.34% x̃: 64.34% 95% mean confidence interval for instructions value: -13.25 -11.04 95% mean confidence interval for instructions %-change: -7.01% -6.63% Instructions are helped. total cycles in shared programs: 375763544 -> 375535444 (-0.06%) cycles in affected programs: 6932686 -> 6704586 (-3.29%) helped: 1622 HURT: 48 helped stats (abs) min: 2 max: 12229 x̄: 148.31 x̃: 68 helped stats (rel) min: 0.06% max: 74.03% x̄: 5.94% x̃: 4.12% HURT stats (abs) min: 3 max: 7451 x̄: 259.44 x̃: 41 HURT stats (rel) min: 0.05% max: 54.99% x̄: 8.52% x̃: 2.88% 95% mean confidence interval for cycles value: -159.86 -113.31 95% mean confidence interval for cycles %-change: -5.86% -5.18% Cycles are helped. total spills in shared programs: 23258 -> 23401 (0.61%) spills in affected programs: 54 -> 197 (264.81%) helped: 4 HURT: 2 total fills in shared programs: 34775 -> 34850 (0.22%) fills in affected programs: 52 -> 127 (144.23%) helped: 4 HURT: 1 LOST: 5 GAINED: 0 Ivy Bridge total instructions in shared programs: 11996051 -> 11977964 (-0.15%) instructions in affected programs: 346679 -> 328592 (-5.22%) helped: 1508 HURT: 0 helped stats (abs) min: 2 max: 198 x̄: 11.99 x̃: 10 helped stats (rel) min: 0.26% max: 19.83% x̄: 5.73% x̃: 5.43% 95% mean confidence interval for instructions value: -12.65 -11.34 95% mean confidence interval for instructions %-change: -5.86% -5.60% Instructions are helped. total cycles in shared programs: 179891389 -> 179691339 (-0.11%) cycles in affected programs: 7869479 -> 7669429 (-2.54%) helped: 1485 HURT: 23 helped stats (abs) min: 1 max: 12615 x̄: 136.16 x̃: 54 helped stats (rel) min: 0.02% max: 71.84% x̄: 4.69% x̃: 3.49% HURT stats (abs) min: 1 max: 403 x̄: 93.48 x̃: 6 HURT stats (rel) min: 0.04% max: 34.01% x̄: 8.68% x̃: 0.81% 95% mean confidence interval for cycles value: -154.59 -110.73 95% mean confidence interval for cycles %-change: -4.79% -4.19% Cycles are helped. Sandy Bridge total instructions in shared programs: 10829247 -> 10828844 (<.01%) instructions in affected programs: 21258 -> 20855 (-1.90%) helped: 88 HURT: 0 helped stats (abs) min: 2 max: 17 x̄: 4.58 x̃: 5 helped stats (rel) min: 0.52% max: 3.92% x̄: 2.05% x̃: 2.21% 95% mean confidence interval for instructions value: -5.03 -4.13 95% mean confidence interval for instructions %-change: -2.21% -1.89% Instructions are helped. total cycles in shared programs: 154035437 -> 154024591 (<.01%) cycles in affected programs: 430176 -> 419330 (-2.52%) helped: 78 HURT: 10 helped stats (abs) min: 2 max: 4649 x̄: 143.06 x̃: 32 helped stats (rel) min: 0.05% max: 6.02% x̄: 2.03% x̃: 1.07% HURT stats (abs) min: 3 max: 265 x̄: 31.30 x̃: 6 HURT stats (rel) min: 0.10% max: 8.67% x̄: 1.03% x̃: 0.21% 95% mean confidence interval for cycles value: -232.53 -13.97 95% mean confidence interval for cycles %-change: -2.13% -1.23% Cycles are helped. Iron Lake and GM45 had similar results. (Iron Lake shown) total instructions in shared programs: 8137402 -> 8137248 (<.01%) instructions in affected programs: 2280 -> 2126 (-6.75%) helped: 10 HURT: 0 helped stats (abs) min: 12 max: 19 x̄: 15.40 x̃: 15 helped stats (rel) min: 3.90% max: 11.73% x̄: 7.19% x̃: 6.95% 95% mean confidence interval for instructions value: -17.69 -13.11 95% mean confidence interval for instructions %-change: -8.99% -5.39% Instructions are helped. total cycles in shared programs: 188538716 -> 188583424 (0.02%) cycles in affected programs: 69326 -> 114034 (64.49%) helped: 0 HURT: 10 HURT stats (abs) min: 2068 max: 7686 x̄: 4470.80 x̃: 4870 HURT stats (rel) min: 27.20% max: 173.66% x̄: 69.55% x̃: 59.41% 95% mean confidence interval for cycles value: 2830.86 6110.74 95% mean confidence interval for cycles %-change: 39.18% 99.91% Cycles are HURT. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	96fcb3f95b	nir/algebraic: Use value range analysis to eliminate tautological compares not used by if-statements This just eliminates tautological / contradictory compares that are used for bcsel and other non-if-statement cases. If-statements are not affected because removing flow control can cause the i965 instrution scheduler to create some very long live ranges resulting in unncessary spilling. This causes some shaders to fall of a performance cliff. Since many small if-statements are already flattened to bcsel, this optimization covers more than 68% of the possible cases (2417 shaders helped for instructions on Skylake vs. 3554). v2: Reorder and add whitespace to make the relationship between the patterns more obvious. Suggested by Caio. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333474 -> 16322028 (-0.07%) instructions in affected programs: 438559 -> 427113 (-2.61%) helped: 1765 HURT: 0 helped stats (abs) min: 1 max: 275 x̄: 6.48 x̃: 4 helped stats (rel) min: 0.20% max: 36.36% x̄: 4.07% x̃: 1.82% 95% mean confidence interval for instructions value: -6.87 -6.10 95% mean confidence interval for instructions %-change: -4.30% -3.84% Instructions are helped. total cycles in shared programs: 367608554 -> 367511103 (-0.03%) cycles in affected programs: `8368829` -> 8271378 (-1.16%) helped: 1541 HURT: 129 helped stats (abs) min: 1 max: 4468 x̄: 66.78 x̃: 39 helped stats (rel) min: 0.01% max: 45.69% x̄: 4.10% x̃: 2.17% HURT stats (abs) min: 1 max: 973 x̄: 42.25 x̃: 10 HURT stats (rel) min: 0.02% max: 64.39% x̄: 2.15% x̃: 0.60% 95% mean confidence interval for cycles value: -64.90 -51.81 95% mean confidence interval for cycles %-change: -3.89% -3.36% Cycles are helped. total spills in shared programs: 8867 -> 8868 (0.01%) spills in affected programs: 18 -> 19 (5.56%) helped: 0 HURT: 1 total fills in shared programs: 21900 -> 21903 (0.01%) fills in affected programs: 78 -> 81 (3.85%) helped: 0 HURT: 1 All Gen6 and earlier platforms had similar results. (Sandy Bridge shown) total instructions in shared programs: 10829877 -> 10829247 (<.01%) instructions in affected programs: 30240 -> 29610 (-2.08%) helped: 177 HURT: 0 helped stats (abs) min: 1 max: 15 x̄: 3.56 x̃: 3 helped stats (rel) min: 0.37% max: 17.39% x̄: 2.68% x̃: 1.94% 95% mean confidence interval for instructions value: -3.93 -3.18 95% mean confidence interval for instructions %-change: -3.04% -2.32% Instructions are helped. total cycles in shared programs: 154036580 -> 154035437 (<.01%) cycles in affected programs: 352402 -> 351259 (-0.32%) helped: 96 HURT: 28 helped stats (abs) min: 1 max: 128 x̄: 14.73 x̃: 6 helped stats (rel) min: 0.03% max: 24.00% x̄: 1.51% x̃: 0.46% HURT stats (abs) min: 1 max: 117 x̄: 9.68 x̃: 4 HURT stats (rel) min: 0.03% max: 2.24% x̄: 0.43% x̃: 0.23% 95% mean confidence interval for cycles value: -13.40 -5.03 95% mean confidence interval for cycles %-change: -1.62% -0.53% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	fa116ce357	nir/range-analysis: Range tracking for ffma and flrp A similar technique could be used for fmin3, fmax3, and fmid3. This could be squashed with the previous commit. I kept it separate to ease review. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	586602c5d9	nir/range-analysis: Range tracking for bcsel This could be squashed with the previous commit. I kept it separate to ease review. v2: Add some missing cases. Use nir_src_is_const helper. Both suggested by Caio. Use a table for mapping source ranges to a result range. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	3009cbed50	nir/range-analysis: Tighten the range of fsat based on the range of its source This could be squashed with the previous commit. I kept it separate to ease review. v2: Use a switch statement and add more comments. Both suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	405de7ccb6	nir/range-analysis: Rudimentary value range analysis pass Most integer operations are omitted because dealing with integer overflow is hard. There are a few things that could be smarter if there was a small amount more tracking of ranges of integer types (i.e., operands are Boolean, operand values fit in 16 bits, etc.). The changes to nir_search_helpers.h are included in this patch to simplify reordering the changes to nir_opt_algebraic.py. v2: Memoize range analysis results. Without this, some shaders appear to get stuck in infinite loops. v3: Rebase on many months of Mesa changes, including 1-bit Boolean changes. v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction". v5: Use nir_alu_srcs_equal for detecting (aa). Previously just the SSA value was compared, and this incorrectly matched (a.xa.y). v6: Many code improvements including (but not limited to) better names, more comments, and better use of helper functions. All suggested by Caio. Rework the handling of several opcodes to use a table for mapping source ranges to a result range. This change fixed a bug that caused fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero. Slightly tighten the range of fmul by recognizing that xx is gt_zero if x is gt_zero. Add similar handling for -xx. v7: Use _______ in the tables as an alias for unknown. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	d24edb4b8c	nir/algebraic: Simplify some comparisons like a+constant < constant v2: Remove unsafe integer versions of the optimizations. This change had no effect on shader-db results. Suggested by Caio. All Gen6+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333713 -> 16332631 (<.01%) instructions in affected programs: 258112 -> 257030 (-0.42%) helped: 1275 HURT: 407 helped stats (abs) min: 1 max: 7 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.20% max: 8.33% x̄: 1.33% x̃: 0.86% HURT stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.11% max: 2.94% x̄: 0.98% x̃: 0.98% 95% mean confidence interval for instructions value: -0.70 -0.59 95% mean confidence interval for instructions %-change: -0.84% -0.70% Instructions are helped. total cycles in shared programs: 367596791 -> 367601268 (<.01%) cycles in affected programs: 3420062 -> 3424539 (0.13%) helped: 1553 HURT: 783 helped stats (abs) min: 1 max: 742 x̄: 24.36 x̃: 6 helped stats (rel) min: 0.05% max: 21.12% x̄: 1.47% x̃: 0.65% HURT stats (abs) min: 1 max: 557 x̄: 54.04 x̃: 14 HURT stats (rel) min: 0.01% max: 33.66% x̄: 3.36% x̃: 1.43% 95% mean confidence interval for cycles value: -1.60 5.43 95% mean confidence interval for cycles %-change: -0.03% 0.33% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Iron Lake total instructions in shared programs: 8137992 -> 8137874 (<.01%) instructions in affected programs: 17501 -> 17383 (-0.67%) helped: 104 HURT: 2 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.25% max: 2.63% x̄: 0.87% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.22 -1.00 95% mean confidence interval for instructions %-change: -0.94% -0.76% Instructions are helped. total cycles in shared programs: 188540038 -> 188539650 (<.01%) cycles in affected programs: 704574 -> 704186 (-0.06%) helped: 125 HURT: 84 helped stats (abs) min: 2 max: 96 x̄: 6.45 x̃: 4 helped stats (rel) min: <.01% max: 3.47% x̄: 0.42% x̃: 0.25% HURT stats (abs) min: 2 max: 58 x̄: 4.98 x̃: 4 HURT stats (rel) min: 0.01% max: 2.75% x̄: 0.36% x̃: 0.33% 95% mean confidence interval for cycles value: -3.20 -0.52 95% mean confidence interval for cycles %-change: -0.19% -0.03% Cycles are helped. GM45 total instructions in shared programs: 5008889 -> 5008830 (<.01%) instructions in affected programs: 8824 -> 8765 (-0.67%) helped: 52 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.25% max: 2.38% x̄: 0.86% x̃: 0.72% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45% 95% mean confidence interval for instructions value: -1.27 -0.95 95% mean confidence interval for instructions %-change: -0.96% -0.71% Instructions are helped. total cycles in shared programs: 128969426 -> 128969128 (<.01%) cycles in affected programs: 399798 -> 399500 (-0.07%) helped: 74 HURT: 30 helped stats (abs) min: 2 max: 22 x̄: 6.76 x̃: 6 helped stats (rel) min: <.01% max: 1.83% x̄: 0.46% x̃: 0.29% HURT stats (abs) min: 2 max: 58 x̄: 6.73 x̃: 6 HURT stats (rel) min: 0.06% max: 2.75% x̄: 0.42% x̃: 0.21% 95% mean confidence interval for cycles value: -4.60 -1.14 95% mean confidence interval for cycles %-change: -0.32% -0.08% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	7c64cbf49d	nir/algebraic: Recognize (a < 0 \|\| 0 < b) as min(a, -b) < 0 Similar to commit `97e6c1b9` and `f5cf74d8ba`. First apply 0 < b => -b < 0 to get (a < 0 \|\| -b < 0), then apply some pre-existing rules to get min(a, -b) < 0. v2: Substantially update the comment explaining the use of is_used_once and the duplication of patterns. Suggested by Caio. Also, while flt and fge are not commutative, ior and iand are. Half of the original patterns were redundant, so delete them. As alternate justification for deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b). Proof left as an exercise for the reader. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16333789 -> 16333713 (<.01%) instructions in affected programs: 11424 -> 11348 (-0.67%) helped: 32 HURT: 0 helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2 helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69% 95% mean confidence interval for instructions value: -3.03 -1.72 95% mean confidence interval for instructions %-change: -0.89% -0.62% Instructions are helped. total cycles in shared programs: 367598295 -> 367596791 (<.01%) cycles in affected programs: 141414 -> 139910 (-1.06%) helped: 23 HURT: 6 helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20 helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76% HURT stats (abs) min: 4 max: 88 x̄: 27.33 x̃: 12 HURT stats (rel) min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59% 95% mean confidence interval for cycles value: -93.51 -10.21 95% mean confidence interval for cycles %-change: -1.10% -0.05% Cycles are helped. total instructions in shared programs: 10830836 -> 10830779 (<.01%) instructions in affected programs: 6895 -> 6838 (-0.83%) helped: 12 HURT: 0 helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1 helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33% 95% mean confidence interval for instructions value: -8.46 -1.04 95% mean confidence interval for instructions %-change: -1.03% -0.27% Instructions are helped. total cycles in shared programs: 154028477 -> 154032740 (<.01%) cycles in affected programs: 178433 -> 182696 (2.39%) helped: 3 HURT: 9 helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10 helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09% HURT stats (abs) min: 27 max: 1415 x̄: 477.33 x̃: 262 HURT stats (rel) min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76% 95% mean confidence interval for cycles value: 28.68 681.82 95% mean confidence interval for cycles %-change: 0.37% 3.30% Cycles are HURT. Iron Lake total instructions in shared programs: 8137966 -> 8137992 (<.01%) instructions in affected programs: 3281 -> 3307 (0.79%) helped: 0 HURT: 6 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64% 95% mean confidence interval for instructions value: 2.17 6.50 95% mean confidence interval for instructions %-change: 0.56% 0.96% Instructions are HURT. total cycles in shared programs: 188539386 -> 188540038 (<.01%) cycles in affected programs: 103826 -> 104478 (0.63%) helped: 0 HURT: 7 HURT stats (abs) min: 16 max: 218 x̄: 93.14 x̃: 80 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46% 95% mean confidence interval for cycles value: 10.26 176.02 95% mean confidence interval for cycles %-change: 0.24% 0.81% Cycles are HURT. GM45 total instructions in shared programs: 5008876 -> 5008889 (<.01%) instructions in affected programs: 1645 -> 1658 (0.79%) helped: 0 HURT: 3 HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3 HURT stats (rel) min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63% total cycles in shared programs: 128968950 -> 128969426 (<.01%) cycles in affected programs: 64854 -> 65330 (0.73%) helped: 0 HURT: 4 HURT stats (abs) min: 18 max: 218 x̄: 119.00 x̃: 120 HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66% 95% mean confidence interval for cycles value: -62.92 300.92 95% mean confidence interval for cycles %-change: -0.05% 1.26% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Ian Romanick	92b75c126b	nir/algebraic: Replace checks that a value is between (or not) [0, 1] v2: Add an extra line to one of the proofs. Suggested by Caio. All Gen7+ platforms had similar results. (Ice Lake shown) total instructions in shared programs: 16329772 -> 16329427 (<.01%) instructions in affected programs: 41980 -> 41635 (-0.82%) helped: 110 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.14 x̃: 2 helped stats (rel) min: 0.19% max: 5.56% x̄: 1.12% x̃: 0.94% 95% mean confidence interval for instructions value: -4.10 -2.17 95% mean confidence interval for instructions %-change: -1.28% -0.96% Instructions are helped. total cycles in shared programs: 367551273 -> 367549979 (<.01%) cycles in affected programs: 492462 -> 491168 (-0.26%) helped: 76 HURT: 25 helped stats (abs) min: 1 max: 400 x̄: 42.86 x̃: 12 helped stats (rel) min: 0.06% max: 10.72% x̄: 1.23% x̃: 0.75% HURT stats (abs) min: 2 max: 730 x̄: 78.52 x̃: 16 HURT stats (rel) min: 0.17% max: 6.89% x̄: 2.08% x̃: 1.23% 95% mean confidence interval for cycles value: -37.79 12.16 95% mean confidence interval for cycles %-change: -0.90% 0.07% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 2 Sandy Bridge total instructions in shared programs: 10831115 -> 10830836 (<.01%) instructions in affected programs: 37830 -> 37551 (-0.74%) helped: 70 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 3.99 x̃: 2 helped stats (rel) min: 0.33% max: 7.14% x̄: 1.21% x̃: 0.97% 95% mean confidence interval for instructions value: -5.47 -2.50 95% mean confidence interval for instructions %-change: -1.49% -0.92% Instructions are helped. total cycles in shared programs: 154029323 -> 154028477 (<.01%) cycles in affected programs: 247909 -> 247063 (-0.34%) helped: 52 HURT: 6 helped stats (abs) min: 2 max: 254 x̄: 25.81 x̃: 4 helped stats (rel) min: 0.07% max: 4.39% x̄: 0.81% x̃: 0.19% HURT stats (abs) min: 4 max: 403 x̄: 82.67 x̃: 8 HURT stats (rel) min: 0.18% max: 1.60% x̄: 0.71% x̃: 0.53% 95% mean confidence interval for cycles value: -34.83 5.65 95% mean confidence interval for cycles %-change: -0.98% -0.32% Inconclusive result (value mean confidence interval includes 0). Iron Lake total instructions in shared programs: 8138007 -> 8137966 (<.01%) instructions in affected programs: 4060 -> 4019 (-1.01%) helped: 31 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.68% max: 8.33% x̄: 1.45% x̃: 0.90% 95% mean confidence interval for instructions value: -1.50 -1.15 95% mean confidence interval for instructions %-change: -2.11% -0.79% Instructions are helped. total cycles in shared programs: 188539492 -> 188539386 (<.01%) cycles in affected programs: 26280 -> 26174 (-0.40%) helped: 25 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.24 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.54% x̃: 0.50% 95% mean confidence interval for cycles value: -5.08 -3.40 95% mean confidence interval for cycles %-change: -0.70% -0.37% Cycles are helped. GM45 total instructions in shared programs: 5008897 -> 5008876 (<.01%) instructions in affected programs: 2096 -> 2075 (-1.00%) helped: 16 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1 helped stats (rel) min: 0.68% max: 7.69% x̄: 1.41% x̃: 0.89% 95% mean confidence interval for instructions value: -1.57 -1.06 95% mean confidence interval for instructions %-change: -2.32% -0.49% Instructions are helped. total cycles in shared programs: 128969020 -> 128968950 (<.01%) cycles in affected programs: 18490 -> 18420 (-0.38%) helped: 15 HURT: 0 helped stats (abs) min: 2 max: 8 x̄: 4.67 x̃: 4 helped stats (rel) min: 0.08% max: 2.11% x̄: 0.51% x̃: 0.48% 95% mean confidence interval for cycles value: -6.03 -3.30 95% mean confidence interval for cycles %-change: -0.78% -0.24% Cycles are helped. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-08-05 20:14:13 -07:00
Eric Engestrom	d2d85b950d	meson: replace libmesa_util with idep_mesautil This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Acked-by: Eric Anholt <eric@anholt.net> Tested-by: Vinson Lee <vlee@freedesktop.org>	2019-08-03 00:08:37 +00:00
Connor Abbott	f41516bdb5	nir/find_array_copies: Reject copies with mismatched type When we detect a scalar/vector copy through load_deref/store_deref, we have to be careful since those can bitcast an int to a float and vice-versa even though copy_deref can't. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251 Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>	2019-08-02 10:34:29 +02:00
Jason Ekstrand	078dcb7ccd	nir/lower_io: Add an option to lower 64-bit varyings Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 18:14:09 -05:00
Dave Airlie	7ad6ec80d9	nir: use common deref has indirect code in scratch lowering. This doesn't seem to need it's own copy here. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-01 08:32:12 +10:00
Eric Engestrom	5d7bcac4e7	nir: remove explicit nir_intrinsic_index_flag values These were left after a rebase and happen to make NIR_INTRINSIC_SWIZZLE_MASK == NIR_INTRINSIC_SRC_ACCESS, which is how it was noticed. Fixes: `6f20643b47` ("nir: Allow qualifiers on copy_deref and image instructions") Cc: Connor Abbott <cwabbott0@gmail.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-31 23:28:20 +01:00
Erico Nunes	b3676a6548	nir/algebraic: rename lower_bitshift to lower_bitops Optimizations that insert bitshift or bitwise operations should not be applied on GPUs that don't support integer operations. The .lower_bitshift could be used to control the bitshift related ones, but there was also one bitwise optimization uncovered. Since only lima and freedreno use this option and the use case is that no bit operations are wanted, let's rename it to .lower_bitops and use it to control all bitops related optimizations. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca>	2019-07-31 23:06:04 +02:00
Erico Nunes	4a407df682	nir/algebraic: add new fsum ops and fdot lowering The Mali400 pp doesn't implement fdot but has fsum3 and fsum4, which can be used to optimize fdot lowering. fsum2 is not implemented and can be further lowered to an add with the vector components. Currently lima ppir handles this lowering internally, however this happens in a very late stage and requires a big chunk of code compared to a nir_opt_algebraic lowering. By having fsum in nir, we can reduce ppir complexity and enable the lowered ops to be part of other nir optimizations in the optimization loop. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-31 21:35:58 +02:00
Eric Engestrom	bbeb507543	compiler/nir: add an ASSERTED Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	abc226cf41	tree-wide: replace MAYBE_UNUSED with ASSERTED Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Eric Engestrom	5febd4d575	compiler: replace MAYBE_UNUSED with UNUSED MAYBE_UNUSED is going away, so let's replace legitimate uses of it with UNUSED, which the former aliased to so far anyway. Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-31 09:41:05 +01:00
Connor Abbott	a094928abc	nir/find_array_copies: Use correct parent array length instr->type is the type of the array element, not the type of the array being dereferenced. Rather than fishing out the parent type, just use parent->num_children which should be the length plus 1. While we're here add another assert for the issue fixed by the previous commit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251 Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-30 17:14:33 +02:00
Connor Abbott	7788992bc6	nir: Fix comparison for nir_deref_instr_is_known_out_of_bounds() There was an off-by-one error. Fixes: `156306e5e6` ("nir/find_array_copies: Handle wildcards and overlapping copies") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-30 17:14:28 +02:00
Jason Ekstrand	4bb6e6817e	intel: Use a system value for gl_FragCoord It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Eric Anholt	3c46778b75	nir: Fix helgrind complaints about data race in trivial_swizzle init. Even if the data race wasn't real (I'm not great at reasoning about this), helgrind is a nice enough tool that keeping noise out of it is probably worthwhile. Besides, typing out the numbers keeps the data in the read-only data section instead of emitting code to initialize it every time. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2019-07-29 12:50:49 -07:00
Connor Abbott	156306e5e6	nir/find_array_copies: Handle wildcards and overlapping copies This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like: foo[0].a = bar[0].a foo[0].b = bar[0].b foo[1].a = bar[1].a foo[1].b = bar[1].b ... that wasn't recognized by the previous pass. In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways. All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the result for radv: Totals from affected shaders: SGPRS: 4576 -> 4576 (0.00 %) VGPRS: 13784 -> 5560 (-59.66 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread Code Size: 329940 -> 263268 (-20.21 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 330 -> 898 (172.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 11:36:25 +02:00
Connor Abbott	c6543efe7a	nir: Print array deref indices as decimal We print the size as decimal too, and using hex without a leading "0x" was very confusing. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 11:36:19 +02:00
Sagar Ghuge	d5992ab134	nir: Optimize umod lowering We don't have calculate final quotient in order to calculate unsigned modulo result. Once we are done with error correction we have partial result which can be used to find out modulo operation result Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-26 11:19:23 -07:00
Lionel Landwerlin	8c330728f3	nir: add access to image_deref intrinsics SPIRV added the ability to access variables and have expressions non dynamically uniform and because spirv_to_nir generates deref instructions, we'll need to have that access there. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-26 14:09:55 +00:00
Jonathan Marek	97c8314c5f	nir/algebraic: add scmp algebraic optimizations When 'x' is the result of a scmp op: x != 0.0 or x == 1.0: passthrough x == 0.0 or x != 1.0: invert Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	9be902097c	nir/algebraic: add option to lower fall_equalN/fany_nequalN Add generic lowerings for fall_equalN/fany_nequalN. These should be optimal for vec4 backends that doesn't have any special instructions for it, as long as they support saturate. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	397375d3f3	nir/algebraic: add fdot2 optimizations Add simple fdot2 optimizations that are missing. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	1e089d0575	nir/algebraic: add option to lower fdph For backends that don't have a 'fdph' instructions Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	bc3b6168ba	nir: replace lower_sincos with algebraic opt This version has less ops for the same precision. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Jonathan Marek	5a4e71c082	nir/algebraic: allow swizzle in nir_algebraic replace expression This is to allow optimizations in nir_opt_algebraic not otherwise possible Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Acked-by: Matt Turner <mattst88@gmail.com>	2019-07-24 17:36:21 -04:00
Daniel Schürmann	e272fdd508	nir,intel: lower if (cond) demote() to new intrinsic demote_if(cond) This will effectively enable the optimization in anv. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-24 13:02:18 -05:00
Jason Ekstrand	799f0f7b28	nir/lower_subgroups: Properly lower masks when subgroup_size == 0 Instead of building a constant mask (which depends on knowing the subgroup size), we build an expression. Because the pass uses the nir_shader_lower_instructions helper, subgroup lowering will be run on any newly emitted instructions as well as the previously existing instructions. In particular, if the subgroup size is known, the newly emitted subgroup_size intrinsic will get turned into a constant and a later constant folding pass will clean it up. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Sagar Ghuge	87cef718e1	nir: Add lowering for nir_op_irem and nir_op_imod Tested on Gen > 9. v2: 1) Fix lowering 2) Keep a consistent i/u order (Matt Turner) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-24 10:33:09 -07:00
Jason Ekstrand	9700e45463	nir/lower_io: Return SSA defs from helpers I can't find a single place where nir_lower_io is called after going out of SSA which is the only real reason why you wouldn't do this. Returning SSA defs is more idiomatic and is required for the next commit. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-23 17:48:49 -05:00
Jason Ekstrand	ae392d73c9	nir/gather_info: Look for uses of helper invocations The one obvious omission here is gl_HelperInvocation itself. However, the spec doesn't require that we generate then when gl_HelperInvocation is used, it merely mandates that we report them if they are there. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	41ab92a327	nir/gather_info: Move setting uses_64bit out of the switch Otherwise, as we add things to the switch, we're going to forget and add some 64-bit op at some point in the future and it'll stop getting flagged. There's no reason why we can't do the check for derivatives. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	0e6cb481fa	nir: Add a nir_tex_instr_has_implicit_derivatives helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Jason Ekstrand	7a98c7804c	nir: Move nir_alu_instr_is_comparison to the ALU section Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-23 13:40:41 -05:00
Andrii Simiklit	79ab2c3e57	nir: use \| instead of \|\| operator warning: use of logical '\|\|' with constant operand note: use '\|' for a bitwise operation Fixes: `758fdce9fe` ("nir: Add some generic helpers for writing lowering passes") Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-07-23 18:08:58 +03:00
Eric Engestrom	3acc4278ad	nir: don't return void Fixes: `14531d676b` ("nir: make nir_const_value scalar") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-07-23 16:02:37 +01:00
Jason Ekstrand	5c5f11d1dd	nir: Remove a bunch of large stack arrays Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-22 16:17:18 -05:00
Jason Ekstrand	6301f80b84	nir: Only rematerialize comparisons with all SSA sources Otherwise, you may end up moving a register read and that could result in an incorrect shader. This commit fixes a rendering issue in Elite: Dangerous. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152 Fixes: `3ee2e84c60` "nir: Rematerialize compare instructions" Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-19 19:45:36 +00:00
Caio Marcelo de Oliveira Filho	4061a3f6c9	nir: use a switch when printing intrinsic indices Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-19 10:04:52 -07:00
Rhys Perry	e8644122ed	nir/algebraic: mark a few comparison simplifications as precise No vkpipeline-db changes found. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-19 16:33:01 +00:00
Rhys Perry	79801b9d7d	nir/algebraic: optimize contradictory iand operands Some of these were found in a few GTAV, Rise of the Tomb Raider and Shadow of the Tomb Raider shaders. Results from vkpipeline-db run with ACO: Totals from affected shaders: SGPRS: 376 -> 376 (0.00 %) VGPRS: 220 -> 220 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 13492 -> 11560 (-14.32 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 69 -> 69 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: use False instead of 0 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-07-19 16:33:01 +00:00

1 2 3 4 5 ...

1779 commits