fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2026-05-20 02:38:07 +02:00

Author	SHA1	Message	Date
Rhys Perry	ec4b425f59	nir/algebraic: fix imod by negative power-of-two If "a" is a multiple of "b", then the result would have been "b" instead of 0. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `0ef5f3552f` ("nir: add strength reduction pattern for imod/irem with pow2 divisor.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
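To see why the multiple-of-"b" case needs special handling, here is a minimal Python sketch. It assumes that the strength-reduced form of `imod` by a negative power of two is a bitwise OR of the two operands; the commit message does not show the pattern, so treat that as an assumption. The function names are hypothetical. Python's `%` takes the sign of the divisor, which matches `imod`, so it serves as the reference.

```python
def imod_reference(a, b):
    """Reference result: Python's % takes the sign of the divisor, like imod."""
    return a % b


def imod_neg_pow2_unfixed(a, b):
    """Assumed pre-fix reduction for b == -(2**k): OR the operands together."""
    return a | b


def imod_neg_pow2_fixed(a, b):
    """Same reduction, but map the 'a is a multiple of b' case back to 0."""
    r = a | b
    # When the low k bits of a are all zero, a | b collapses to b itself.
    return 0 if r == b else r
```

With `a = 8` and `b = -4`, the unfixed form yields `-4` while the reference and the fixed form yield `0`; for non-multiples such as `a = 5`, all three agree.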
Dave Airlie	ad92c2b253	nir: add fisnormal lowering just lower the 32-bit version for now. Thanks to alyssa for this suggested lowering. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 14:27:48 +10:00
Dave Airlie	330e28155f	nir: add 32-bit bool of fisfinite Add the bool lowering as well. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 12:06:21 +10:00
Connor Abbott	8115cde3ba	tu, freedreno/a6xx, ir3: Rewrite tess PrimID handling The previous handling conflated RelPatchID and PrimID, which would result in incorrect gl_PrimitiveID when doing draw splitting and didn't work with PrimID passthrough which fills the VPC slot with the "correct" PrimID value from the tess factor BO which we left 0. Replace PrimID in the tess lowering pass with a new RelPatchID sysval, and replace PrimID with RelPatchID in the VS input code in turnip/freedreno at the same time so that there is no net change in the tess lowering code. However, now we have to add new mechanisms for getting the user-level PrimID: - In the TCS it comes from the VS, just like gl_PrimitiveIDIn in the GS. This means we have to add another register to our VS->TCS ABI. I decided to put PrimID in r0.z, after the TCS header and RelPatchID, because it might not be read in the TCS. - If any stage after the TCS uses PrimID, the TCS stores it in the first dword of the tess factor BO, and it is read by the fixed-function tessellator and accessed in the TES via the newly-uncovered DSPRIMID field. If we have tess and GS, the TES passes this value through to the GS in the same way as the VS does. PrimID passthrough for reading it in the FS when there's tess but no GS also "just works" once we start storing it in the TCS. In particular this fixes dEQP-VK.pipeline.misc.primitive_id_from_tess which tests exactly that. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12166>	2021-08-05 16:35:41 +00:00
Jason Ekstrand	0ddac113f8	nir: Removing uses of SSA defs destroys SSA liveness The liveness information will be a superset of real liveness so it's unlikely something will explode if it tries to use it. However, it is out-of-date and should be re-run if someone really wants it. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12186>	2021-08-03 21:36:53 +00:00
Ian Romanick	72259a870f	util: Add and use functions to calculate min and max int for a size Many places need to know the maximum or minimum possible value for a given size integer... so everyone just open-codes their favorite version. There is some potential to hit either undefined or implementation-defined behavior, so having one version that Just Works seems beneficial. v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in u_intmin. Noticed by CI. Lol. Rename functions `s/u_(uint\|int)(min\|max)/u_\1N_\2/g`. Suggested by Jason. Add some unit tests that would have caught the copy-and-paste bug before wasting CI time. Change the implementation of u_intN_min to use the same pattern as stdint.h. This avoids the integer division. Noticed by Jason. v3: Add changes to convert_clear_color (src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>	2021-08-03 12:55:02 -07:00
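The min/max helpers described above can be sketched in a few lines. This is a Python illustration only, not the C implementation: Python integers never overflow, so it shows the values and the stdint.h-style `(-MAX - 1)` pattern, not the undefined-behaviour concerns the commit addresses. The names `u_intN_max` and `u_intN_min` follow the rename in the message; `u_uintN_max` and the 1 to 64 bit range are assumptions.

```python
def u_intN_max(bit_size):
    """Largest signed two's-complement value for the given width in bits."""
    assert 1 <= bit_size <= 64  # assumed valid range for this sketch
    return (1 << (bit_size - 1)) - 1


def u_intN_min(bit_size):
    """Smallest signed value, written stdint.h-style as (-MAX - 1)."""
    return -u_intN_max(bit_size) - 1


def u_uintN_max(bit_size):
    """Largest unsigned value for the given width in bits."""
    assert 1 <= bit_size <= 64  # assumed valid range for this sketch
    return (1 << bit_size) - 1
```

Deriving the minimum from the maximum means only one expression has to be right, and no division is needed.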
Timothy Arceri	6538b3e566	nir: add heuristic for instructions in loops with GCM Moving instructions out of large loops tends to cause excessive spilling. This appears to be a good limit. In future it might make sense to make this a NIR options so other drivers can set their own limits. Tiger Lake total instructions in shared programs: 20930180 -> 20926952 (-0.02%) instructions in affected programs: 280768 -> 277540 (-1.15%) helped: 734 HURT: 192 helped stats (abs) min: 1 max: 61 x̄: 5.16 x̃: 4 helped stats (rel) min: 0.04% max: 10.64% x̄: 3.23% x̃: 3.14% HURT stats (abs) min: 1 max: 52 x̄: 2.90 x̃: 1 HURT stats (rel) min: 0.03% max: 9.76% x̄: 1.13% x̃: 0.61% 95% mean confidence interval for instructions value: -3.89 -3.08 95% mean confidence interval for instructions %-change: -2.49% -2.16% Instructions are helped. total cycles in shared programs: 841825217 -> 838817552 (-0.36%) cycles in affected programs: 122088078 -> 119080413 (-2.46%) helped: 941 HURT: 100 helped stats (abs) min: 1 max: 160080 x̄: 3274.31 x̃: 2660 helped stats (rel) min: <.01% max: 41.64% x̄: 5.50% x̃: 4.80% HURT stats (abs) min: 1 max: 41856 x̄: 734.62 x̃: 26 HURT stats (rel) min: <.01% max: 7.29% x̄: 0.44% x̃: 0.27% 95% mean confidence interval for cycles value: -3236.56 -2541.85 95% mean confidence interval for cycles %-change: -5.26% -4.60% Cycles are helped. total sends in shared programs: 977905 -> 977782 (-0.01%) sends in affected programs: 2279 -> 2156 (-5.40%) helped: 119 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.93% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.42% -6.45% Sends are helped. 
LOST: 2 GAINED: 0 Ice Lake total instructions in shared programs: 19865361 -> 19861747 (-0.02%) instructions in affected programs: 185789 -> 182175 (-1.95%) helped: 593 HURT: 47 helped stats (abs) min: 1 max: 27 x̄: 6.17 x̃: 4 helped stats (rel) min: 0.19% max: 8.65% x̄: 4.53% x̃: 4.60% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.03% max: 0.23% x̄: 0.11% x̃: 0.04% 95% mean confidence interval for instructions value: -5.93 -5.37 95% mean confidence interval for instructions %-change: -4.32% -4.06% Instructions are helped. total loops in shared programs: 6120 -> 6117 (-0.05%) loops in affected programs: 6 -> 3 (-50.00%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% total cycles in shared programs: 961777176 -> 959404350 (-0.25%) cycles in affected programs: 172224180 -> 169851354 (-1.38%) helped: 936 HURT: 80 helped stats (abs) min: 1 max: 9566 x̄: 2621.08 x̃: 2550 helped stats (rel) min: <.01% max: 41.77% x̄: 4.22% x̃: 3.84% HURT stats (abs) min: 1 max: 59146 x̄: 1006.34 x̃: 24 HURT stats (rel) min: <.01% max: 3.78% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for cycles value: -2513.72 -2157.20 95% mean confidence interval for cycles %-change: -4.13% -3.57% Cycles are helped. total sends in shared programs: 1019995 -> 1019872 (-0.01%) sends in affected programs: 2283 -> 2160 (-5.39%) helped: 119 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.91% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.39% -6.42% Sends are helped. 
LOST: 4 GAINED: 0 Skylake total instructions in shared programs: 17994337 -> 17993846 (<.01%) instructions in affected programs: 146294 -> 145803 (-0.34%) helped: 190 HURT: 47 helped stats (abs) min: 1 max: 12 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.14% max: 4.29% x̄: 1.08% x̃: 0.90% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.03% max: 0.22% x̄: 0.11% x̃: 0.04% 95% mean confidence interval for instructions value: -2.30 -1.84 95% mean confidence interval for instructions %-change: -0.95% -0.74% Instructions are helped. total loops in shared programs: 6029 -> 6023 (-0.10%) loops in affected programs: 12 -> 6 (-50.00%) helped: 6 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 939062940 -> 938023548 (-0.11%) cycles in affected programs: 169671482 -> 168632090 (-0.61%) helped: 980 HURT: 134 helped stats (abs) min: 1 max: 25000 x̄: 1075.57 x̃: 1052 helped stats (rel) min: <.01% max: 42.75% x̄: 2.51% x̃: 1.32% HURT stats (abs) min: 1 max: 837 x̄: 109.45 x̃: 20 HURT stats (rel) min: <.01% max: 5.71% x̄: 0.73% x̃: 0.21% 95% mean confidence interval for cycles value: -1005.89 -860.17 95% mean confidence interval for cycles %-change: -2.39% -1.84% Cycles are helped. total sends in shared programs: 1026848 -> 1026724 (-0.01%) sends in affected programs: 2302 -> 2178 (-5.39%) helped: 120 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 6.91% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.40% -6.43% Sends are helped. 
LOST: 1 GAINED: 1 Broadwell total instructions in shared programs: 17605621 -> 17605154 (<.01%) instructions in affected programs: 145691 -> 145224 (-0.32%) helped: 184 HURT: 48 helped stats (abs) min: 1 max: 12 x̄: 2.83 x̃: 3 helped stats (rel) min: 0.13% max: 4.29% x̄: 1.09% x̃: 0.93% HURT stats (abs) min: 1 max: 7 x̄: 1.12 x̃: 1 HURT stats (rel) min: 0.03% max: 0.48% x̄: 0.12% x̃: 0.04% 95% mean confidence interval for instructions value: -2.26 -1.77 95% mean confidence interval for instructions %-change: -0.95% -0.73% Instructions are helped. total loops in shared programs: 5968 -> 5963 (-0.08%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 1000679489 -> 998592756 (-0.21%) cycles in affected programs: 173421234 -> 171334501 (-1.20%) helped: 993 HURT: 153 helped stats (abs) min: 1 max: 766608 x̄: 2118.49 x̃: 1080 helped stats (rel) min: <.01% max: 54.61% x̄: 2.61% x̃: 1.73% HURT stats (abs) min: 1 max: 2200 x̄: 110.61 x̃: 11 HURT stats (rel) min: <.01% max: 5.68% x̄: 0.63% x̃: 0.06% 95% mean confidence interval for cycles value: -3191.23 -450.54 95% mean confidence interval for cycles %-change: -2.47% -1.89% Cycles are helped. total sends in shared programs: 996341 -> 996222 (-0.01%) sends in affected programs: 2151 -> 2032 (-5.53%) helped: 115 HURT: 0 helped stats (abs) min: 1 max: 4 x̄: 1.03 x̃: 1 helped stats (rel) min: 0.60% max: 14.29% x̄: 7.07% x̃: 6.67% 95% mean confidence interval for sends value: -1.09 -0.98 95% mean confidence interval for sends %-change: -7.55% -6.58% Sends are helped. 
Haswell total instructions in shared programs: 16038375 -> 16038121 (<.01%) instructions in affected programs: 216797 -> 216543 (-0.12%) helped: 185 HURT: 217 helped stats (abs) min: 1 max: 12 x̄: 2.84 x̃: 3 helped stats (rel) min: 0.13% max: 4.23% x̄: 1.30% x̃: 1.20% HURT stats (abs) min: 1 max: 6 x̄: 1.25 x̃: 1 HURT stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.40% 95% mean confidence interval for instructions value: -0.85 -0.41 95% mean confidence interval for instructions %-change: -0.40% -0.14% Instructions are helped. total loops in shared programs: 5947 -> 5942 (-0.08%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 967655093 -> 965746713 (-0.20%) cycles in affected programs: 197288924 -> 195380544 (-0.97%) helped: 950 HURT: 195 helped stats (abs) min: 1 max: 782820 x̄: 2274.79 x̃: 1260 helped stats (rel) min: <.01% max: 54.26% x̄: 3.02% x̃: 1.71% HURT stats (abs) min: 1 max: 15790 x̄: 1295.73 x̃: 21 HURT stats (rel) min: <.01% max: 119.85% x̄: 7.76% x̃: 0.11% 95% mean confidence interval for cycles value: -3014.22 -319.19 95% mean confidence interval for cycles %-change: -1.83% -0.55% Cycles are helped. total sends in shared programs: 934894 -> 934765 (-0.01%) sends in affected programs: 2192 -> 2063 (-5.89%) helped: 115 HURT: 2 helped stats (abs) min: 1 max: 4 x̄: 1.14 x̃: 1 helped stats (rel) min: 0.60% max: 28.57% x̄: 7.68% x̃: 6.67% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 16.67% max: 16.67% x̄: 16.67% x̃: 16.67% 95% mean confidence interval for sends value: -1.23 -0.98 95% mean confidence interval for sends %-change: -8.28% -6.24% Sends are helped. 
LOST: 1 GAINED: 18 Ivy Bridge total instructions in shared programs: 15269357 -> 15269398 (<.01%) instructions in affected programs: 190484 -> 190525 (0.02%) helped: 77 HURT: 206 helped stats (abs) min: 1 max: 6 x̄: 2.47 x̃: 3 helped stats (rel) min: 0.14% max: 5.31% x̄: 1.46% x̃: 1.65% HURT stats (abs) min: 1 max: 3 x̄: 1.12 x̃: 1 HURT stats (rel) min: 0.03% max: 2.38% x̄: 0.42% x̃: 0.40% 95% mean confidence interval for instructions value: -0.06 0.35 95% mean confidence interval for instructions %-change: -0.21% 0.03% Inconclusive result (value mean confidence interval includes 0). total loops in shared programs: 4001 -> 3996 (-0.12%) loops in affected programs: 10 -> 5 (-50.00%) helped: 5 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 562045564 -> 561063543 (-0.17%) cycles in affected programs: 200924872 -> 199942851 (-0.49%) helped: 748 HURT: 160 helped stats (abs) min: 2 max: 14926 x̄: 1692.94 x̃: 1620 helped stats (rel) min: <.01% max: 53.29% x̄: 3.17% x̃: 1.87% HURT stats (abs) min: 2 max: 15726 x̄: 1776.86 x̃: 36 HURT stats (rel) min: <.01% max: 114.43% x̄: 10.66% x̃: 0.21% 95% mean confidence interval for cycles value: -1237.33 -925.71 95% mean confidence interval for cycles %-change: -1.54% 0.08% Inconclusive result (%-change mean confidence interval includes 0). total sends in shared programs: 893348 -> 893330 (<.01%) sends in affected programs: 187 -> 169 (-9.63%) helped: 14 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.29 x̃: 1 helped stats (rel) min: 4.08% max: 22.22% x̄: 11.70% x̃: 10.10% 95% mean confidence interval for sends value: -1.56 -1.02 95% mean confidence interval for sends %-change: -14.92% -8.48% Sends are helped. 
LOST: 1 GAINED: 19 Sandy Bridge total instructions in shared programs: 11785227 -> 11785774 (<.01%) instructions in affected programs: 78403 -> 78950 (0.70%) helped: 65 HURT: 505 helped stats (abs) min: 1 max: 4 x̄: 2.22 x̃: 3 helped stats (rel) min: 0.14% max: 4.17% x̄: 1.19% x̃: 1.38% HURT stats (abs) min: 1 max: 5 x̄: 1.37 x̃: 1 HURT stats (rel) min: 0.24% max: 3.33% x̄: 1.57% x̃: 1.72% 95% mean confidence interval for instructions value: 0.85 1.07 95% mean confidence interval for instructions %-change: 1.16% 1.36% Instructions are HURT. total loops in shared programs: 2441 -> 2437 (-0.16%) loops in affected programs: 8 -> 4 (-50.00%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for loops value: -1.00 -1.00 95% mean confidence interval for loops %-change: -50.00% -50.00% Loops are helped. total cycles in shared programs: 497178796 -> 496669298 (-0.10%) cycles in affected programs: 51483322 -> 50973824 (-0.99%) helped: 476 HURT: 137 helped stats (abs) min: 2 max: 7502 x̄: 1079.36 x̃: 1260 helped stats (rel) min: <.01% max: 42.50% x̄: 2.31% x̃: 0.86% HURT stats (abs) min: 2 max: 754 x̄: 31.23 x̃: 18 HURT stats (rel) min: <.01% max: 3.01% x̄: 0.09% x̃: 0.02% 95% mean confidence interval for cycles value: -901.99 -760.32 95% mean confidence interval for cycles %-change: -2.20% -1.36% Cycles are helped. total sends in shared programs: 642919 -> 642915 (<.01%) sends in affected programs: 32 -> 28 (-12.50%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 11.11% max: 14.29% x̄: 12.70% x̃: 12.70% 95% mean confidence interval for sends value: -1.00 -1.00 95% mean confidence interval for sends %-change: -15.61% -9.78% Sends are helped. 
Iron Lake total instructions in shared programs: 8180061 -> 8180248 (<.01%) instructions in affected programs: 65004 -> 65191 (0.29%) helped: 59 HURT: 253 helped stats (abs) min: 1 max: 4 x̄: 2.24 x̃: 3 helped stats (rel) min: 0.16% max: 2.23% x̄: 1.04% x̃: 1.29% HURT stats (abs) min: 1 max: 5 x̄: 1.26 x̃: 1 HURT stats (rel) min: 0.21% max: 3.85% x̄: 0.93% x̃: 0.60% 95% mean confidence interval for instructions value: 0.43 0.77 95% mean confidence interval for instructions %-change: 0.45% 0.68% Instructions are HURT. total loops in shared programs: 863 -> 861 (-0.23%) loops in affected programs: 4 -> 2 (-50.00%) helped: 2 HURT: 0 total cycles in shared programs: 239357490 -> 238907668 (-0.19%) cycles in affected programs: 17314006 -> 16864184 (-2.60%) helped: 176 HURT: 34 helped stats (abs) min: 4 max: 13400 x̄: 2558.05 x̃: 2920 helped stats (rel) min: 0.01% max: 35.58% x̄: 3.76% x̃: 2.69% HURT stats (abs) min: 2 max: 14 x̄: 11.59 x̃: 14 HURT stats (rel) min: <.01% max: 0.06% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -2440.68 -1843.34 95% mean confidence interval for cycles %-change: -3.78% -2.51% Cycles are helped. GM45 total instructions in shared programs: 4985293 -> 4985401 (<.01%) instructions in affected programs: 58807 -> 58915 (0.18%) helped: 57 HURT: 202 helped stats (abs) min: 1 max: 4 x̄: 2.26 x̃: 3 helped stats (rel) min: 0.15% max: 2.23% x̄: 1.06% x̃: 1.29% HURT stats (abs) min: 1 max: 5 x̄: 1.17 x̃: 1 HURT stats (rel) min: 0.21% max: 3.85% x̄: 0.76% x̃: 0.48% 95% mean confidence interval for instructions value: 0.22 0.61 95% mean confidence interval for instructions %-change: 0.24% 0.48% Instructions are HURT. 
total loops in shared programs: 639 -> 638 (-0.16%) loops in affected programs: 2 -> 1 (-50.00%) helped: 1 HURT: 0 total cycles in shared programs: 153794236 -> 153546274 (-0.16%) cycles in affected programs: 9947778 -> 9699816 (-2.49%) helped: 110 HURT: 31 helped stats (abs) min: 4 max: 13400 x̄: 2257.51 x̃: 1796 helped stats (rel) min: 0.01% max: 35.58% x̄: 4.33% x̃: 2.45% HURT stats (abs) min: 2 max: 14 x̄: 11.74 x̃: 14 HURT stats (rel) min: <.01% max: 0.06% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -2113.77 -1403.42 95% mean confidence interval for cycles %-change: -4.27% -2.47% Cycles are helped. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2899 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timothy Arceri	a7f2e683de	nir: move nir_block_ends_in_break() to nir.h Will be used in a following commit. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timothy Arceri	a9ed4538ab	nir: add indirect loop unrolling to compiler options This is where it should be rather than having to pass it into the optimisation pass every time. It also allows us to call the loop analysis pass without having to duplicate these options which we will do later in this series. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>	2021-08-03 10:54:50 +00:00
Timur Kristóf	da9f4b2e67	nir, aco: Remove vertex and primitive count overwrite intrinsic. It's no longer needed. No Fossil DB changes. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Timur Kristóf	1bbea90f50	aco, nir, ac: Simplify sequence of getting initial NGG VS edge flags. Instead of v_bfe + v_lshl_or for each vertex, get all 3 edge flags at once of every vertex. This takes fewer VALU instructions than previously. Fossil DB results on Sienna Cichlid (with NGGC on): Totals from 56917 (44.24% of 128647) affected shaders: CodeSize: 161028288 -> 158751628 (-1.41%) Instrs: 30917985 -> 30519571 (-1.29%) Latency: 130617204 -> 129975532 (-0.49%); split: -0.50%, +0.01% InvThroughput: 21280238 -> 20927401 (-1.66%) Copies: 3011120 -> 3011125 (+0.00%); split: -0.00%, +0.00% No Fossil DB changed with NGGC off. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11908>	2021-08-02 11:38:25 +00:00
Emma Anholt	9ffd00bcf1	nir_to_tgsi: Pack our tex coords into vec4 nir_tex_src_backend[12]. For TGSI, we need the coordinate, comparator, bias, and LOD all together in the first two vec4 args, and by doing it in the backend we were generating extra MOVs. softpipe shader-db results: total instructions in shared programs: 2985416 -> 2953625 (-1.06%) instructions in affected programs: 499937 -> 468146 (-6.36%) total temps in shared programs: 544769 -> 565869 (3.87%) temps in affected programs: 105469 -> 126569 (20.01%) i915g shader-db: total instructions in shared programs: 371625 -> 369594 (-0.55%) instructions in affected programs: 24903 -> 22872 (-8.16%) total tex_indirect in shared programs: 11381 -> 11365 (-0.14%) tex_indirect in affected programs: 43 -> 27 (-37.21%) LOST: 7 GAINED: 16 The temps increase is the pre-existing issue that we never release temps for NIR regs, which doesn't matter much for softpipe (just memory/cache footprint) but does for i915g as seen by shaders that no longer compile (though overall we seem to win). Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11912>	2021-07-29 09:05:05 -07:00
Enrico Galli	16ef26ffcb	nir_lower_readonly_images_to_tex: Fix typo on image arrays Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12119>	2021-07-29 01:44:45 +00:00
Lionel Landwerlin	7e3bad0f8e	nir/lower_shader_calls: adding missing stack offset alignment Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8dfb240b1f` ("nir: Add raytracing shader call lowering pass.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12112>	2021-07-28 23:04:21 +00:00
Daniel Schürmann	bc500da67d	nir/shrink_vectors: shrink vecN properly This patch allows shrinking vecN instructions where one or more components at any position are unused. Stat changes for softpipe: total instructions in shared programs: 2986101 -> 2985416 (-0.02%) instructions in affected programs: 51216 -> 50531 (-1.34%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Daniel Schürmann	36fe7398c0	nir/shrink_vectors: shrink ALU properly ALU instructions of which not all components are read can be shrunk to the number of read components. Previously, this would only remove trailing components. This patch enables removing components from any position. Stat changes for softpipe: total instructions in shared programs: 3001291 -> 2984698 (-0.55%) instructions in affected programs: 225585 -> 208992 (-7.36%) total loops in shared programs: 1389 -> 1358 (-2.23%) loops in affected programs: 36 -> 5 (-86.11%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
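The difference between trailing-only shrinking and shrinking at any position can be illustrated with a component read mask. The sketch below is a Python illustration under assumed names (`shrink_components`, `trailing_only_shrink`); it is not the pass's actual code, only the bookkeeping the two commit messages above describe.

```python
def shrink_components(num_components, read_mask):
    """Keep exactly the components that are read, wherever they sit.

    Returns (kept, remap):
      kept  -- old component indices that survive, in order
      remap -- old index -> new index, for rewriting reader swizzles
    """
    kept = [c for c in range(num_components) if read_mask & (1 << c)]
    remap = {old: new for new, old in enumerate(kept)}
    return kept, remap


def trailing_only_shrink(num_components, read_mask):
    """Earlier behaviour per the message: drop only trailing unread components."""
    # bit_length() is the index of the highest read component plus one.
    return list(range(min(num_components, read_mask.bit_length())))
```

For a vec4 where only components 1 and 3 are read (mask `0b1010`), the trailing-only version keeps all four components, while the any-position version keeps two and remaps reader swizzles `1 -> 0` and `3 -> 1`.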
Daniel Schürmann	8317fe314c	nir/opt_shrink_vectors: reverse iteration order This pass should be backwards in order to reach the fixed point in linear time. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Daniel Schürmann	d27417b597	nir: consider write_mask in nir_ssa_def_components_read() Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Daniel Schürmann	73905c4d01	nir/opt_shrink_vectors: don't shrink vectors used by intrinsics Store intrinsics shrink the sources by creating a new vecN. Other intrinsics cannot shrink their sources. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Daniel Schürmann	ece99eb69f	nir/lower_alu_to_scalar: don't skip gaps in write_mask Otherwise, this may lead to segmentation faults. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11411>	2021-07-26 09:24:37 +00:00
Jason Ekstrand	1431f6c765	nir: Validate newly documented texture restrictions Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Mike Blumenkrantz	499cc7a9ec	nir/validate: refactor validate_assert to have a return value Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	74ec2b12be	nir/lower_tex: Rework invalid implicit LOD lowering Only fragment and some compute shaders support implicit derivatives. They're totally meaningless without helper invocations and some understanding of the dispatch pattern. We've got code to lower nir_texop_tex in these shader stages to use an explicit derivative of 0 but it was pretty badly broken: 1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod. 2. It didn't take min_lod into account 3. It was conflated with adding a missing LOD parameter to opcodes which expect one such as nir_texop_txf. While not really a bug, this does make it way harder to reason about the code. 4. Unless you set a flag (which most drivers don't), it left the opcode nir_texop_tex instead of nir_texop_txl which it should have been. This reworks it to go through roughly the same path as other LOD lowering only with a constant lod of 0 instead of calling out to nir_texop_lod. We also get rid of the lower_tex_without_implicit_lod flag because most drivers set it and those that don't are probably subtly broken. If someone really wants to get nir_texop_tex in their vertex shaders, they can write a new patch to add the flag back in. Fixes: `e382890e25` "nir: set default lod to texture opcodes that..." Fixes: `d5ac5d6e83` "nir: Add option to lower tex to txl when..." Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	fa717a202c	docs,nir: Document NIR texture instructions Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	4465ca296d	nir: Suffix all the MCS texture stuff _intel It's intel-specific, used to get at MSAA compression information. Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>	2021-07-23 15:53:57 +00:00
Jason Ekstrand	60b5faf572	nir/lower_tex: Add a lower_txs_cube_array option Several bits of hardware require the division by 6 to happen in the shader. May as well have common lowering for it. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12005>	2021-07-22 14:22:35 -05:00
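As a rough illustration of the division by 6 mentioned above, the Python sketch below assumes the hardware size query for a cube array reports the total number of faces in its third component. The tuple layout and the function name are assumptions made for this example, not the pass's interface.

```python
def lower_txs_cube_array(raw_size):
    """Convert an assumed (width, height, face_layers) result to cube counts."""
    width, height, face_layers = raw_size
    # Each cube in the array contributes 6 faces, so cubes = faces // 6.
    return (width, height, face_layers // 6)
```

For example, a raw result of `(64, 64, 24)` corresponds to an array of 4 cubes.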
Jason Ekstrand	c6102dda0a	nir/lower_image: Handle index and bindless image_size Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12005>	2021-07-22 14:22:35 -05:00
Jordan Justen	6898549d56	nir: Add nir_lower_image() to lower cube image sizes Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466>	2021-07-21 11:02:15 -07:00
Jason Ekstrand	b0fba89cf6	nir/lower_subgroups: Handle down-casts in uint_to_ballot_type This is required for Zink where the API ballot type is a uint64_t and the "hardware" ballot type is uvec4. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11989>	2021-07-21 16:41:56 +00:00
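One way to picture converting a 64-bit API ballot into a four-component 32-bit ballot is shown below. This Python sketch assumes the low word goes first and the unused upper components are zero; the commit message does not spell out the layout, so both points are assumptions, as is the function name.

```python
def uint64_to_uvec4_ballot(value):
    """Split a 64-bit ballot into four 32-bit words (assumed low word first)."""
    assert 0 <= value < (1 << 64)
    mask32 = 0xFFFFFFFF
    low = value & mask32
    high = (value >> 32) & mask32
    # Invocations 64..127 cannot be represented in a 64-bit ballot, so pad with 0.
    return (low, high, 0, 0)
```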
Timothy Arceri	5cc36887ab	nir/gcm: be less destructive with instruction order This changes the pass to extract pinned instructions and not just unpinned instructions when rescheduling instructions. This stops pinned instructions from being bunched together when instructions are reinserted into the blocks which can result in regressions with regards to cycles and instruction counts on i965 and register use/Max Waves on AMD hardware. In order to do this we also throw away the post-order depth-first search linearization algorithm used to re-insert the instructions, which itself causes possible regressions when instructions are reinserted into a less than ideal new order (of which the bunched together pinned instructions is one example). Instead we simply insert instructions in the reverse order they were extracted. This will simply place instructions that were scheduled earlier onto the end of their new block and instructions that were scheduled later to the start of their new block. With this everything should remain in order without the need to run over uses. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>	2021-07-21 14:24:00 +00:00
Ian Romanick	436668874a	nir/gcm: Clear out pass_flags before starting With this pass enabled in Intel drivers, running shader-db on shaders/unity/38.shader_test resulted in Program received signal SIGSEGV, Segmentation fault. gcm_schedule_early_src (src=0x555555d45348, void_state=0x7fffffffba40) at ../../SOURCE/master/src/compiler/nir/nir_opt_gcm.c:297 297 if (info->early_block->index < src_info->early_block->index) (gdb) print src_info->early_block $1 = (nir_block *) 0x0 I tracked this down to an early exit from gcm_schedule_early_instr on the parent instruction because instr->pass_flags was 0x1c. That should be an impossible value for this pass, so I inferred that pass_flags must have dirt left from some previous pass. Fixes: `8dfe6f672f` ("nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned") Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>	2021-07-21 14:24:00 +00:00
Mike Blumenkrantz	3ab74d0ffa	nir: add nir_imm_ivec3 builder the other ones exist, so why not this one too Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11983>	2021-07-21 13:57:14 +00:00
Jason Ekstrand	393ee837fb	nir: Add a format field to _deref image intrinsics The rules here are the same as for texture instructions. The bits on the intrinsic are the ground truth and are allowed to vary from the deref a bit as-needed. If the intrinsic says PIPE_FORMAT_NONE, then we can look at the variable, if visible, to get format information. This means that we need to be careful when we rewrite intrinsics based on the deref to only override the format on the _deref intrinsic with the one from the image variable if the intrinsic is PIPE_FORMAT_NONE. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11849>	2021-07-20 23:18:22 +00:00
Jason Ekstrand	0b57272af8	nir: Set src_components = -1 for image intrinsic deref sources Semantically, -1 means "Unknown; don't validate" but it's really only used for derefs because they often need to be flexible. We don't really need that flexibility for image intrinsics but this makes it more consistent. More immediately useful is that this gives us the ability to tell _deref forms of these intrinsics apart from the lowered ones. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11849>	2021-07-20 23:18:22 +00:00
Jason Ekstrand	c0afb60258	nir: Set IMAGE_DIM and IMAGE_ARRAY on deref intrinsics The rules here are the same as for texture instructions. The bits on the intrinsic are the ground truth and are allowed to vary from the deref a bit as-needed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11849>	2021-07-20 23:18:22 +00:00
Mike Blumenkrantz	50f9519ea5	nir/lower_point_size_mov: zero nir_state_slot::swizzle in new variable this is otherwise uninitialized during nir_serialize calls Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11932>	2021-07-20 16:34:51 +00:00
Sagar Ghuge	06ab737686	nir: Add optimizations for iadd3 This patch also adds has_iadd3 bit to give more control if backend supports ternary add instruction or not. v2: - Add patterns in late optimization (Connor Abbott) Suggested-by: Alyssa/Jason Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:56 +00:00
Sagar Ghuge	e8dff256c0	nir: Add new opcode for ternary addition v2: - Make it 2src commutative (Connor Abbott) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:55 +00:00
Jason Ekstrand	0ee322acdb	nir: Better document the Boissinot algorithm in nir_from_ssa() Reviewed-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8815>	2021-07-16 06:19:25 +00:00
Emma Anholt	bb35195b73	nir: Validate after deserialization. It's a particularly relevant place for NIR bugs to occur, and if you make a mistake in this code it gets caught in your debug build in something like mesa/st's call to nir_split_var_copies() during finalization, which is rather misleading. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11860>	2021-07-15 18:43:42 +00:00
Timur Kristóf	48e638ab29	nir: Add AMD specific intrinsics for NGG shader based culling. The new intrinsics fall into the following categories: 1. New viewport intrinsics: For missing components that we need. RADV will emit new SGPR arguments which will contain the viewport information for culling shaders. These are used to compute the screen space coordinates for small primitive culling. 2. load_cull_xxx: Load the culling settings in runtime. These will be a new SGPR argument in RADV. 3. overwrite_xxx: These are needed because system values such as vertex and instance ID are not writeable, but we need to change them after repacking shader invocations of VS and TES. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10525>	2021-07-13 23:56:33 +00:00
Jason Ekstrand	2111551485	Convert a few files to UTF-8 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11788>	2021-07-12 23:45:34 +00:00
Jason Ekstrand	a195ef123e	nir/lower_subgroups: Pad ballot values before bitcasting Otherwise, if we cast from a uint32_t to a uint64_t, the bitcast will fail before we pad. This happens on Intel. Fixes: `e4e79de2a4` "nir/subgroups: Support > 1 ballot components" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5045 Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11786>	2021-07-09 14:21:26 +00:00
Jason Ekstrand	624e799cc3	nir: Drop nir_ssa_def::name and nir_register::name We say that they're for debug only but we don't really have a good policy around when to set them and when not to. In particular, nir_lower_system_values and nir_lower_vars_to_ssa which are the chief producers of SSA values which might reasonably have a name do not bother to set one. We have some names set from things like BLORP and RADV's meta shaders but AFAICT, they're setting a name more because it's there than because they actually care. Also, most things other than nir_clone and nir_serialize don't bother to try and preserve them. You can see in the diffstat of this commit exactly what passes attempt to preserve names. Notably missing from the list is opt_algebraic which is the single largest source of SSA def churn and it happily throws names away. These observations lead me to question whether or not names are actually useful at all or if they're just taking up space (8B per instruction) and wasting CPU cycles (to ralloc_strdup on the off chance we do have one). I don't think I can think of a single time in recent history where I've been debugging a shader issue and a SSA value name has been there and been useful. If anything, the few times they are there, they just throw me off because they mess up the indentation in nir_print. iris shader-db on my system gets runtime -2.07734% +/- 1.26933% (n=5) Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5439>	2021-07-08 17:34:41 +00:00
Connor Abbott	68b8b9e9e1	tu, ir3: Plumb through support for CS subgroup size/id The way that the blob obtains the subgroup id on compute shaders is by just and'ing gl_LocalInvocationIndex with 63, since it advertises a subgroupSize of 64. In order to support VK_EXT_subgroup_size_control and expose a subgroupSize of 128, we'll have to do something a little more flexible. Sometimes we have to fall back to a subgroup size of 64 due to various constraints, and in that case we have to fake a subgroup size of 128 while actually using 64 under the hood, by just pretending that the upper 64 invocations are all disabled. However when computing the subgroup id we need to use the "real" subgroup size. For this purpose we plumb through a driver param which exposes the real subgroup size. If the user forces a particular subgroup size then we lower load_subgroup_size in nir_lower_subgroups, otherwise we let it through, and we assume when translating to ir3 that load_subgroup_size means "give me the actual subgroup size that you decided in RA" and give you the driver param. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	cc514bfa0e	nir: Add read_invocation_cond_ir3 intrinsic On Qualcomm, we have shared registers similar to SGPRs on AMD. However, there is no readlane or readfirstlane primitive. Shared registers can only be written to when just one lane is active. This means that we have to lower readInvocation(val, id) to something like: if (gl_SubgroupInvocation == id) { scalar_reg = val; } return scalar_reg; However, it's a bit difficult to actually get the value of gl_SubgroupInvocation in the backend, because for compute it requires some calculations and we don't have any CSE support in the backend. This intrinsic lets us turn it into "readInvocationCond(val, id == gl_SubgroupInvocation)" in NIR at which point the backend code generation is a lot easier. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	e4e79de2a4	nir/subgroups: Support > 1 ballot components Qualcomm has a mode with a subgroup size of 128, so just emitting larger integer operations and then lowering them later isn't an option. This makes the pass able to handle the lowering itself, so that we don't have to go down to 64-thread wavefronts when ballots are used. (The GLSL and legacy SPIR-V extensions only support a maximum of 64 threads, but I guess we'll cross that bridge when we come to it...) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	90819b9b0e	nir/subgroups: Replace lower_vote_eq_to_ballot with lower_vote_eq Lower it to a vote instead of a ballot. This was only used for AMD, and in that case they're pretty much the same. However Qualcomm has a vote builtin, which we want to use instead of ballots. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Mike Blumenkrantz	b67a4ba4ad	nir/format_convert: add ssa version of uint packing Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10619>	2021-07-07 13:41:37 +00:00
Mike Blumenkrantz	c948251d2b	nir/format_convert: nir_shift -> nir_shift_imm Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10619>	2021-07-07 13:41:37 +00:00
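
The lowering described in commit `cc514bfa0e` (a conditional write to a shared register, followed by a read from every lane) can be modelled outside the compiler. The following is a minimal sketch in plain Python, not Mesa code: the function name `read_invocation`, the list-based lane model, and the use of `None` for an undefined register are all assumptions made for illustration.

```python
def read_invocation(values, active, target_id):
    """Model readInvocation(val, id) without a readlane primitive.

    values[i] is the per-lane value held by lane i, active[i] says whether
    lane i is currently executing, and target_id is the lane to read from.
    All names here are hypothetical; this is not the ir3 implementation.
    """
    scalar_reg = None  # models the shared register; undefined until written

    for lane, val in enumerate(values):
        # Equivalent of "id == gl_SubgroupInvocation": at most one lane
        # satisfies the condition, so the shared register is written while
        # only a single lane is active.
        cond = active[lane] and lane == target_id
        if cond:
            scalar_reg = val

    # After the conditional write, every active lane reads the same register.
    return [scalar_reg if is_active else None for is_active in active]
```

Calling `read_invocation([10, 20, 30, 40], [True] * 4, 2)` returns the value of lane 2 in every lane. If the requested lane is inactive or out of range, the register is never written and the model returns `None`, which stands in for an undefined result.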

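Commit `e4e79de2a4` mentions ballots that need more than one component once the subgroup size reaches 128. As a rough illustration of what a multi-component ballot looks like, the sketch below packs one predicate bit per lane into a list of 32-bit words. It is plain Python and not how `nir_lower_subgroups` is written; the function name `ballot_components` and the choice of 32 bits per component are assumptions.

```python
def ballot_components(predicates, bits_per_component=32):
    """Pack one predicate bit per lane into fixed-width integer components.

    predicates[i] is the boolean vote of lane i. The result is a list of
    integers, lowest lanes first, each holding bits_per_component votes.
    Hypothetical helper for illustration only.
    """
    # Ceiling division: 128 lanes with 32-bit components gives 4 components.
    num_components = -(-len(predicates) // bits_per_component)
    components = [0] * num_components

    for lane, pred in enumerate(predicates):
        if pred:
            word = lane // bits_per_component  # which component holds this lane
            bit = lane % bits_per_component    # bit position inside that component
            components[word] |= 1 << bit

    return components
```

With 128 lanes where only lanes 0, 33 and 127 vote true, the result is `[1, 2, 0, 0x80000000]`: four 32-bit components rather than a single integer wider than the hardware can handle directly.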

3236 commits