fdo-mirrors/mesa

mirror of https://gitlab.freedesktop.org/mesa/mesa.git synced 2025-12-22 17:50:12 +01:00

Author	SHA1	Message	Date
Timur Kristóf	be68aeafdc	nir/opt_algebraic: Add various bitfield extract patterns. v2 (Georg Lehmann): - fixed incorrect imin in ubfe_ubfe - simplied outer_bits of ushr((ubfe, ...), ...) opt - added is_used_once to iand(ushr(), ...) opt to improve stats For-DB Navi21: Totals from 3309 (4.18% of 79206) affected shaders: Instrs: 5295291 -> 5282128 (-0.25%); split: -0.28%, +0.03% CodeSize: 28299320 -> 28298456 (-0.00%); split: -0.07%, +0.06% Latency: 51566173 -> 51521923 (-0.09%); split: -0.09%, +0.01% InvThroughput: 13222050 -> 13204557 (-0.13%); split: -0.14%, +0.01% VClause: 116451 -> 116458 (+0.01%); split: -0.02%, +0.02% SClause: 160356 -> 160324 (-0.02%); split: -0.03%, +0.01% Copies: 424152 -> 423670 (-0.11%); split: -0.20%, +0.09% Branches: 156701 -> 156192 (-0.32%); split: -0.33%, +0.01% PreSGPRs: 168507 -> 168500 (-0.00%); split: -0.02%, +0.01% PreVGPRs: 151477 -> 151474 (-0.00%) VALU: 3486077 -> 3476675 (-0.27%); split: -0.31%, +0.04% SALU: 786467 -> 783109 (-0.43%); split: -0.45%, +0.03% VMEM: 188035 -> 188060 (+0.01%) SMEM: 259632 -> 259630 (-0.00%) Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31852>	2024-10-29 10:51:09 +00:00
Rhys Perry	8efc765a3d	nir/algebraic: fix shfr optimization with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Georg Lehmann	1f9b82bb2a	nir/opt_algebraic: optimize -0.0 + a Foz-DB Navi21: Totals from 428 (0.54% of 79395) affected shaders: MaxWaves: 8510 -> 8512 (+0.02%) Instrs: 731062 -> 729665 (-0.19%); split: -0.19%, +0.00% CodeSize: 3735788 -> 3728324 (-0.20%); split: -0.20%, +0.00% VGPRs: 27328 -> 27336 (+0.03%); split: -0.03%, +0.06% SpillSGPRs: 315 -> 314 (-0.32%) Latency: 3872986 -> 3873236 (+0.01%); split: -0.08%, +0.09% InvThroughput: 971001 -> 970056 (-0.10%); split: -0.17%, +0.08% VClause: 11954 -> 11956 (+0.02%); split: -0.02%, +0.03% SClause: 17361 -> 17358 (-0.02%) Copies: 59038 -> 59045 (+0.01%); split: -0.22%, +0.24% Branches: 17685 -> 17656 (-0.16%) PreSGPRs: 26103 -> 26102 (-0.00%) PreVGPRs: 23220 -> 23206 (-0.06%) VALU: 515293 -> 513963 (-0.26%); split: -0.26%, +0.00% SALU: 91591 -> 91544 (-0.05%) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31770>	2024-10-23 08:58:34 +00:00
Georg Lehmann	f9d2aad7a3	nir: remove alu ddx/ddy Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31014>	2024-10-17 09:50:19 +00:00
Gert Wollny	f19f1ec17b	nir/opt_algebraic: Allow two-step lowering of ftrunc@64 to use ffract@64 If ftrunc@64 is lowered by nir_lower_doubles it is turned into a comparable long series of 32 bit operations. If the hardware supports ffract@64 then nir_opt_algebraic can first lower ftrunc@64 to use some combinations with ffloor@64. They can then be turned into a combination of fsub@64 and ffract@64 resulting in less all-over instructions. Fixes: `5218cff34b` nir/algebraic: avoid double lowering of some fp64 operations Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29281>	2024-09-30 23:51:02 +00:00
Ian Romanick	057c7c9f53	nir/algebraic: Recognize open-coded bitfield_reverse in XCOM 2 The XCOM 2 shaders in my shader-db use iadd instead of ior. No fossil-db changes on any Intel platform. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19787210 -> 19787034 (<.01%) instructions in affected programs: 1187 -> 1011 (-14.83%) helped: 6 / HURT: 0 total cycles in shared programs: 906024436 -> 906012612 (<.01%) cycles in affected programs: 72978 -> 61154 (-16.20%) helped: 6 / HURT: 0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>	2024-09-13 00:21:00 +00:00
Ian Romanick	a780305818	nir/algebraic: Optimize more comparisons with b2f shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19781108 -> 19772614 (-0.04%) instructions in affected programs: 372638 -> 364144 (-2.28%) helped: 2915 / HURT: 0 total cycles in shared programs: 905907644 -> 905822682 (<.01%) cycles in affected programs: 5573453 -> 5488491 (-1.52%) helped: 2363 / HURT: 234 LOST: 42 GAINED: 16 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 152519634 -> 152519610 (-0.00%) Cycle count: 17122707642 -> 17122710974 (+0.00%); split: -0.00%, +0.00% Totals from 5 (0.00% of 633222) affected shaders: Instrs: 2827 -> 2803 (-0.85%) Cycle count: 83089 -> 86421 (+4.01%); split: -0.12%, +4.13% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31068>	2024-09-10 04:15:58 +00:00
Georg Lehmann	6378bbaa82	nir/opt_algebraic: reassociate constants in ior(iand) chains Mostly affects one F1_23 shader that packs bitfields bit by bit. Totals from 3 (0.00% of 79395) affected shaders: Instrs: 5004 -> 4202 (-16.03%) CodeSize: 30992 -> 23952 (-22.72%) Latency: 28894 -> 28464 (-1.49%) InvThroughput: 4095 -> 3934 (-3.93%) Copies: 363 -> 376 (+3.58%) PreVGPRs: 110 -> 109 (-0.91%) VALU: 3035 -> 2504 (-17.50%) SALU: 463 -> 459 (-0.86%) Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31009>	2024-09-05 22:04:05 +00:00
Caio Oliveira	74be809237	compiler: Allow derivative_group to be used for all stages in shader_info These will now also be used by stages that have workgroups. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30950>	2024-09-03 20:03:18 +00:00
Ian Romanick	f11a414645	nir/algebraic: Remove incorrect bfi of iand pattern The comment says, "This expands to (b & 3) & ~0xc which is (b & 3) & 3." This is not correct. ~0xc is actually 0xfffffff3. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Closes: #11695 Fixes: `1c7e35d4e0` ("nir/algebraic: Optimize some bit operation nonsense observed in some shaders") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30913>	2024-08-29 22:21:55 +00:00
Georg Lehmann	ef970c5a9d	nir: optimize pack_uint_2x16 of pack_half(a, 0) Foz-DB Navi31: Totals from 31 (0.04% of 79395) affected shaders: Instrs: 6157 -> 6065 (-1.49%) CodeSize: 35676 -> 34936 (-2.07%) Latency: 23979 -> 23805 (-0.73%); split: -0.79%, +0.07% InvThroughput: 5248 -> 5124 (-2.36%) VALU: 3224 -> 3162 (-1.92%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30855>	2024-08-28 07:16:55 +00:00
Ian Romanick	198d8d9c03	nir/algebraic: Improve some find_lsb and ifind_msb patterns These patterns were observed in shaders from parallel-rdp. No shader-db changes on any Intel platform. fossil-db: Meteor Lake, DG2, Ice Lake had Skylake similar results. (Meteor Lake shown) Totals: Instrs: 152535883 -> 152535673 (-0.00%); split: -0.00%, +0.00% Cycle count: 17112406110 -> 17122827810 (+0.06%); split: -0.01%, +0.07% Spill count: 78525 -> 78523 (-0.00%) Fill count: 148132 -> 148127 (-0.00%); split: -0.01%, +0.00% Max live registers: 31855320 -> 31855314 (-0.00%) Totals from 206 (0.03% of 633223) affected shaders: Instrs: 797124 -> 796914 (-0.03%); split: -0.03%, +0.00% Cycle count: 4716743323 -> 4727165023 (+0.22%); split: -0.05%, +0.27% Spill count: 18781 -> 18779 (-0.01%) Fill count: 31381 -> 31376 (-0.02%); split: -0.03%, +0.01% Max live registers: 31872 -> 31866 (-0.02%) Tiger Lake Totals: Instrs: 150560465 -> 150560343 (-0.00%); split: -0.00%, +0.00% Cycle count: 15482372893 -> 15479328542 (-0.02%); split: -0.02%, +0.00% Fill count: 103509 -> 103512 (+0.00%) Max live registers: 31760378 -> 31760374 (-0.00%) Totals from 199 (0.03% of 632445) affected shaders: Instrs: 679513 -> 679391 (-0.02%); split: -0.02%, +0.00% Cycle count: 4258406125 -> 4255361774 (-0.07%); split: -0.09%, +0.02% Fill count: 30609 -> 30612 (+0.01%) Max live registers: 30502 -> 30498 (-0.01%) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30650>	2024-08-16 14:52:04 +00:00
Marek Olšák	ecfefe823e	nir/opt_algebraic: use fmulz for fpow lowering to fix incorrect rendering The original implementation in all radeon drivers had this behavior. Fixes: `9bc1fb4c07` - ac/llvm,radeonsi: lower nir_fpow for aco and llvm Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11464 Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30069>	2024-07-23 15:23:27 +00:00
Ian Romanick	faee9426ab	nir/algebraic: Optimize some masking of extract_u8 operations I observed this pattern in several Red Dead Redemption 2 shaders. No shader-db changes on any Intel platform. v2: Remove duplicated patterns. Noticed by Georg. fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 151519393 -> 151507192 (-0.01%); split: -0.01%, +0.00% Cycle count: 17208246858 -> 17177437340 (-0.18%); split: -0.25%, +0.07% Spill count: 80830 -> 80759 (-0.09%); split: -0.09%, +0.00% Fill count: 152754 -> 152179 (-0.38%); split: -0.40%, +0.02% Totals from 7531 (1.20% of 630198) affected shaders: Instrs: 12606141 -> 12593940 (-0.10%); split: -0.10%, +0.00% Cycle count: 5466605514 -> 5435795996 (-0.56%); split: -0.79%, +0.22% Spill count: 25251 -> 25180 (-0.28%); split: -0.29%, +0.01% Fill count: 45143 -> 44568 (-1.27%); split: -1.36%, +0.08% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30158>	2024-07-20 00:19:05 +00:00
Ian Romanick	1c7e35d4e0	nir/algebraic: Optimize some bit operation nonsense observed in some shaders In updates (not post at the time of this writing) to !29884, a change caused many spill and fill regressions shader for OpenGL Tomb Raider. While looking at that shader, I noticed some odd patterns. I initially added these patterns to counteract the regressions caused by the other change, but I had no luck. On Ice Lake... this cuts 99 instructions from the shader. shader-db: All Intel platforms had simliar results. (Meteor Lake shown) total instructions in shared programs: 19732341 -> 19732295 (<.01%) instructions in affected programs: 1744 -> 1698 (-2.64%) helped: 1 / HURT: 0 total cycles in shared programs: 916273716 -> 916273068 (<.01%) cycles in affected programs: 14266 -> 13618 (-4.54%) helped: 1 / HURT: 0 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 151519575 -> 151519393 (-0.00%) Cycle count: 17208402120 -> 17208246858 (-0.00%); split: -0.00%, +0.00% Totals from 159 (0.03% of 630198) affected shaders: Instrs: 51970 -> 51788 (-0.35%) Cycle count: 11474176 -> 11318914 (-1.35%); split: -1.36%, +0.01% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30158>	2024-07-20 00:19:05 +00:00
Christian Gmeiner	87786a7a7e	nak: Move imad late optimization to nir It is more or less just a code move, but I touched is_only_used_by_iadd(..) to match the style of the other functions in that file. Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30099>	2024-07-12 05:54:46 +00:00
Konstantin Seurer	d9e41e8a8c	nir: Stop using "capture : true" for nir_opt_algebraic "calture : true" is suboptimal and and prevents the script from writing multiple files in one go. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30041>	2024-07-06 15:51:06 +00:00
Georg Lehmann	3e86d2452f	nir/opt_algebraic: add various unordered/ordered patterns from aco Foz-DB Navi21: Totals from 6747 (8.50% of 79395) affected shaders: MaxWaves: 134646 -> 134642 (-0.00%) Instrs: 7830299 -> 7828851 (-0.02%); split: -0.03%, +0.01% CodeSize: 43045532 -> 43010260 (-0.08%); split: -0.09%, +0.00% VGPRs: 378960 -> 378968 (+0.00%) SpillSGPRs: 1209 -> 1208 (-0.08%) Latency: 74667977 -> 74670405 (+0.00%); split: -0.02%, +0.02% InvThroughput: 20124981 -> 20124768 (-0.00%); split: -0.02%, +0.02% VClause: 162870 -> 162868 (-0.00%); split: -0.00%, +0.00% SClause: 277280 -> 277315 (+0.01%); split: -0.00%, +0.02% Copies: 528627 -> 528667 (+0.01%); split: -0.00%, +0.01% PreSGPRs: 319526 -> 319508 (-0.01%) PreVGPRs: 334264 -> 334265 (+0.00%); split: -0.00%, +0.00% VALU: 5485412 -> `5485408` (-0.00%); split: -0.02%, +0.02% SALU: 743882 -> 742301 (-0.21%); split: -0.21%, +0.00% Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:30 +00:00
Georg Lehmann	434dfb51ca	nir/opt_algebraic: optimize cmp(fneg(a), #b) and feq with fabs Foz-DB Navi21: Totals from 2483 (3.13% of 79395) affected shaders: Instrs: 4067533 -> 4067756 (+0.01%); split: -0.00%, +0.01% CodeSize: 22525156 -> 22499904 (-0.11%); split: -0.12%, +0.01% Latency: 51967223 -> 51963654 (-0.01%); split: -0.01%, +0.00% InvThroughput: 16685020 -> 16683045 (-0.01%); split: -0.01%, +0.00% SClause: 131890 -> 131907 (+0.01%) Copies: 402557 -> 402510 (-0.01%); split: -0.01%, +0.00% Branches: 146962 -> 146958 (-0.00%) PreSGPRs: 118404 -> 118401 (-0.00%) PreVGPRs: 123791 -> 123787 (-0.00%) VALU: 2709846 -> 2710174 (+0.01%); split: -0.00%, +0.01% SALU: 565883 -> 565786 (-0.02%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:30 +00:00
Georg Lehmann	98cc57bccb	nir/optimize cmp(a, -0.0) +0.0 can use an inline constant for AMD hardware, -0.0 needs a literal. Foz-DB Navi21: Totals from 1014 (1.28% of 79395) affected shaders: Instrs: 3037490 -> 3036849 (-0.02%); split: -0.02%, +0.00% CodeSize: 17060228 -> 17051276 (-0.05%); split: -0.05%, +0.00% Latency: 45916788 -> 45916600 (-0.00%); split: -0.00%, +0.00% InvThroughput: 12982201 -> 12982187 (-0.00%); split: -0.00%, +0.00% VClause: 79475 -> 79478 (+0.00%) SClause: 119935 -> 119934 (-0.00%); split: -0.00%, +0.00% Copies: 301641 -> 300964 (-0.22%); split: -0.23%, +0.00% PreSGPRs: 59155 -> 59144 (-0.02%) VALU: 2032016 -> 2032034 (+0.00%) SALU: 386424 -> 385729 (-0.18%) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:30 +00:00
Georg Lehmann	8e6bf596cb	nir/opt_algebraic: look through fabs/fneg when matching fmulz/ffmaz Prevents regressions when removing input modifiers from a == 0.0. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29467>	2024-06-27 08:12:30 +00:00
Georg Lehmann	75b1fa9263	nir/opt_algebraic: alternative 8bit pack_[us]norm_4x8 lowering Foz-DB Navi21: Totals from 42 (0.05% of 79395) affected shaders: Instrs: 2709529 -> 2705848 (-0.14%) CodeSize: 14720732 -> 14711384 (-0.06%); split: -0.06%, +0.00% VGPRs: 4096 -> 4104 (+0.20%) Latency: 17907612 -> 17904468 (-0.02%); split: -0.02%, +0.00% InvThroughput: 4723551 -> 4722649 (-0.02%); split: -0.02%, +0.00% Copies: 223516 -> 219819 (-1.65%) Branches: 109578 -> 109594 (+0.01%); split: -0.00%, +0.02% VALU: 1730848 -> 1727151 (-0.21%) Tested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28882>	2024-06-04 17:00:29 +00:00
Ian Romanick	7b7e5cf5d4	nir/algebraic: intel/fs: Optimize some patterns before lowering 64-bit integers v2: Add some comments explaining some of the nuance of the shift optimizations. Fix a bug in the shift count calculation of the upper 32-bits. Move the @64 from the variable to the opcode. All suggested by Jordan. No shader-db changes on any Intel platform. fossil-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 154507026 -> 154506576 (-0.00%) Cycle count: 17436298868 -> 17436295016 (-0.00%) Max live registers: 32635309 -> 32635297 (-0.00%) Totals from 42 (0.01% of 632575) affected shaders: Instrs: 5616 -> 5166 (-8.01%) Cycle count: 133680 -> 129828 (-2.88%) Max live registers: 1158 -> 1146 (-1.04%) No fossil-db changes on any other Intel platform. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Ian Romanick	4834df82e2	nir/algebraic: More patterns to generate iadd3 I noticed some shaders with patterns similar to these while working on cooperative matrix lowering. Meteor Lake and DG2 are the only platforms that support iadd3, so there were no shader-db or fossil-db changes on any other platforms. shader-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) total instructions in shared programs: 19869445 -> 19868343 (<.01%) instructions in affected programs: 419426 -> 418324 (-0.26%) helped: 913 / HURT: 2 total cycles in shared programs: 936010029 -> 935909811 (-0.01%) cycles in affected programs: 31746523 -> 31646305 (-0.32%) helped: 495 / HURT: 356 LOST: 10 GAINED: 12 fossil-db: Meteor Lake and DG2 had similar results. (Meteor Lake shown) Totals: Instrs: 154514596 -> 154505466 (-0.01%); split: -0.01%, +0.00% Cycle count: 17540226067 -> 17436266198 (-0.59%); split: -0.63%, +0.04% Spill count: 146887 -> 146886 (-0.00%) Fill count: 272499 -> 272489 (-0.00%); split: -0.01%, +0.00% Max live registers: 32634290 -> 32634739 (+0.00%); split: -0.00%, +0.00% Max dispatch width: 5550128 -> 5550368 (+0.00%) Totals from 4401 (0.70% of 632560) affected shaders: Instrs: `3095239` -> 3086109 (-0.29%); split: -0.30%, +0.00% Cycle count: 7327352564 -> 7223392695 (-1.42%); split: -1.51%, +0.10% Spill count: 28105 -> 28104 (-0.00%) Fill count: 45830 -> 45820 (-0.02%); split: -0.04%, +0.02% Max live registers: 264376 -> 264825 (+0.17%); split: -0.05%, +0.22% Max dispatch width: 43768 -> 44008 (+0.55%) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Ian Romanick	22095c60bc	nir/algebraic: Add nir_lower_int64_options::nir_lower_iadd3_64 This allows us to not generate 64-bit iadd3 on Intel but continue generating it for NVIDIA. No shader-db or fossil-db changes. v2: Add nir_lower_iadd3_64 flag so we can continue to generate 64-bit iadd3 on NVIDIA platforms. v3: s/bit_size == 64/s == 64/. This cut-and-paste bug prevented any of the optimizations from ever occuring. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29148>	2024-05-31 09:13:23 -07:00
Georg Lehmann	dcab408a6c	nir: remove unpack_half_flush_to_zero It doesn't make sense to have two sets of opcodes for this when all backends that support the flush_to_zero variant just rely on the global floating point mode anyway. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29433>	2024-05-31 09:46:35 +00:00
Marek Olšák	b4bd380704	nir/algebraic: eliminate pack+unpack and unpack+pack pairs A new NIR shader for AMD drivers will need this. Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29233>	2024-05-17 22:04:00 +00:00
Francisco Jerez	15a10786e3	nir: Add option to lower 64-bit uadd_sat. C.f. `16be909936`. Intel Xe2 won't support saturation for 64-bit integer addition, regardless of signedness. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28283>	2024-05-15 17:16:51 +00:00
Ian Romanick	1b8cf06fc7	nir/algebraic: Optimize some extract_* expressions v2: Add missing '!options->lower_extract_byte' to the last two patterns. Every driver except Asahi sets both or neither. shader-db: All Intel platforms had similar results. (DG2 shown) total instructions in shared programs: 19659360 -> 19659356 (<.01%) instructions in affected programs: 44 -> 40 (-9.09%) helped: 2 / HURT: 0 total cycles in shared programs: 823432524 -> 823432520 (<.01%) cycles in affected programs: 1722 -> 1718 (-0.23%) helped: 2 / HURT: 0 fossil-db: All Intel platforms had similar results. (DG2 shown) Totals: Instrs: 153989787 -> 153989617 (-0.00%) Cycle count: 17562079230 -> 17562079493 (+0.00%); split: -0.00%, +0.00% Totals from 24 (0.00% of 631369) affected shaders: Instrs: 13733 -> 13563 (-1.24%) Cycle count: 341392 -> 341655 (+0.08%); split: -0.25%, +0.33% Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> [v1] Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>	2024-05-03 15:01:43 -07:00
Jesse Natalie	894f7f4387	nir_opt_algebraic: Add a couple optimizations for lowered unpack(pack()) I noticed some unnecessary 64-bit ints in shaders that were using doubles. Perhaps there's a different missing optimization that should run on the actual pack/unpack instructions before they're lowered, or maybe I'm just lowering them too early, but these seem simple enough that we might want them even for hand-rolled pack/unpack pairs. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27314>	2024-05-01 21:55:20 +00:00
Connor Abbott	32308fe9f1	ir3/nir: Fix imadsh_mix16 definition The constant-folding definition and comments say that it takes the high 16 bits of the first source and low 16 bits of the second source, but actually it's the opposite. The algebraic optimization, which actually happens and needs to be correct, was correct but the comment above it was wrong. Note that in the way we use it when lowering multiplications, the ordering doesn't matter. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22075>	2024-04-26 12:55:14 +00:00
Iván Briano	7f97fa6df0	nir/algebraic: move float control conditions to be per instruction Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27281>	2024-04-25 12:13:41 +00:00
Iván Briano	8c4cd3e74e	nir/algebraic: support float controls conditions per instruction v?: - Make the Python not awful (Dylan) Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27281>	2024-04-25 12:13:41 +00:00
Iván Briano	5218cff34b	nir/algebraic: avoid double lowering of some fp64 operations The ffloor@64 case, which lowers to use ffract, is already ignored if nir_lower_dfract is set. Do the same thing for ftrunc@64 and ffract@64 and let nir_lower_doubles take care of them directly instead. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28702>	2024-04-16 13:34:36 -07:00
Rhys Perry	08903bbe89	nir: add mqsad_4x8, shfr and nir_opt_mqsad Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26251>	2024-04-05 11:01:39 +00:00
Rhys Perry	b37804c8de	nir/algebraic: optimize 64-bit comparisons with zero'd halves to 32-bit These expect nir_lower_int64 to replace u2u64 to pack_64_2x32_split(, 0). fossil-db (navi31): Totals from 149 (0.19% of 79242) affected shaders: Instrs: 433095 -> 431830 (-0.29%); split: -0.29%, +0.00% CodeSize: 2165980 -> 2160284 (-0.26%); split: -0.27%, +0.00% SpillSGPRs: 689 -> 688 (-0.15%) Latency: 3801497 -> 3799901 (-0.04%); split: -0.05%, +0.01% InvThroughput: 1547916 -> 1546567 (-0.09%); split: -0.09%, +0.01% VClause: 4698 -> 4693 (-0.11%) SClause: 9981 -> 9977 (-0.04%); split: -0.05%, +0.01% Copies: 66148 -> 65431 (-1.08%); split: -1.09%, +0.01% PreSGPRs: 6732 -> 6729 (-0.04%) PreVGPRs: 7976 -> 7945 (-0.39%) VALU: 252936 -> 252336 (-0.24%) SALU: 51794 -> 51274 (-1.00%); split: -1.03%, +0.02% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Rhys Perry	417eb390c6	nir/algebraic: remove duplicated iand(ien, ine)/ior(ieq, ieq) patterns These don't seem useful, since they're already done in the early optimizations. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Rhys Perry	6952bb359c	nir/algebraic: don't create 64-bit min/max/ior if lowered fossil-db (navi31): Totals from 58 (0.07% of 79242) affected shaders: Instrs: 11692 -> 11304 (-3.32%) CodeSize: 65836 -> 62412 (-5.20%) VGPRs: 1320 -> 1344 (+1.82%) Latency: 51712 -> 50234 (-2.86%) InvThroughput: 10190 -> 10160 (-0.29%) Copies: 460 -> 688 (+49.57%) VALU: 6130 -> 5897 (-3.80%) SALU: 1231 -> 1284 (+4.31%); split: -0.32%, +4.63% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27335>	2024-03-06 15:23:18 +00:00
Karol Herbst	f2b7c4ce29	nir: rework and fix rotate lowering No driver supports urol/uror on all bit sizes. Intel gen11+ only for 16 and 32 bit, Nvidia GV100+ only for 32 bit. Etnaviv can support it on 8, 16 and 32 bit. Also turn the `lower` into a `has` option as only two drivers actually support `uror` and `urol` at this momemt. Fixes crashes with CL integer_rotate on iris and nouveau since we emit urol for `rotate`. v2: always lower 64 bit Fixes: `fe0965afa6` ("spirv: Don't use libclc for rotate") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by (Intel and nir): Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27090>	2024-01-22 10:27:44 +00:00
Rhys Perry	e86ab8173b	nir/algebraic: optimize vkd3d-proton's MSAD Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26907>	2024-01-05 18:55:22 +00:00
Konstantin Seurer	b88ac6b381	nir: Optimize fpow with small constant exponents They would be turned into exp(log(a)*b) instead, which is slow. Totals from 2146 (2.52% of 85071) affected shaders: MaxWaves: 35769 -> 35779 (+0.03%); split: +0.03%, -0.01% Instrs: 6476835 -> 6465494 (-0.18%); split: -0.18%, +0.00% CodeSize: 35382288 -> 35347092 (-0.10%); split: -0.10%, +0.00% SpillSGPRs: 1055 -> 1017 (-3.60%) Latency: 75211743 -> 75063623 (-0.20%); split: -0.20%, +0.00% InvThroughput: 17525115 -> 17501745 (-0.13%); split: -0.14%, +0.00% VClause: 200089 -> 200077 (-0.01%); split: -0.01%, +0.01% SClause: 293566 -> 293480 (-0.03%); split: -0.03%, +0.00% Copies: 649631 -> 640516 (-1.40%); split: -1.44%, +0.03% Branches: 268441 -> 268325 (-0.04%) PreSGPRs: 146868 -> 146045 (-0.56%) PreVGPRs: 134125 -> 134128 (+0.00%); split: -0.00%, +0.01% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26727>	2024-01-02 11:16:14 +01:00
Faith Ekstrand	aac1e3f595	nir: Add a new has_fmulz_no_denorms flag Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26569>	2023-12-11 15:29:17 +00:00
Faith Ekstrand	d2ffcb6092	nir: Lower [su]dot_4x8_[ui]add_sat to [su]dot_4x8_[ui]add Since nir_opt_algebraic runs on its own results, if the driver doesn't have [su]dot_4x8_[ui]add then the [su]dot_4x8_[ui]add lowering rules will kick in and lower that to what we had originally. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26533>	2023-12-06 23:15:33 +00:00
Faith Ekstrand	09fc5e1c4d	nir: Split has_[su]dot_4x8 bits into regular and _sat versions Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26533>	2023-12-06 23:15:33 +00:00
Qiang Yu	7e4aac46ad	nir: add force_f2f16_rtz option to lower f2f16 to f2f16_rtz Used by OpenGL driver like radeonsi which has undefined rounding mode. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25990>	2023-11-20 02:20:17 +00:00
Faith Ekstrand	3af5af429e	nir: Optimize boolean ieq/ine with an immediate Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26120>	2023-11-10 21:46:55 +00:00
Eric Anholt	b416248cb5	nir: Add nir_lower_dsign as 64-bit fsign lowering. Right now some drivers are doing dsign lowering in GLSL and haven't had to have a NIR path due to there not being a corresponding vulkan driver. We want this in NIR now so that we can retire that batch of GLSL lowering code. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25777>	2023-10-24 00:16:30 +00:00
Marek Olšák	8ff4847b64	nir/algebraic: use only signed_zero_preserve_* for addition by 0 patterns, etc. Some GLSL versions will set inf_preserve but not the other flags. Additions by 0 only affect signed zeros. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25392>	2023-10-17 17:27:12 +00:00
Alyssa Rosenzweig	be0ab37bac	nir/opt_algebraic: Optimize LLVM booleans Helps OpenCL kernels. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25687>	2023-10-13 02:55:48 +00:00
Alyssa Rosenzweig	8df8d1e2f2	nir/opt_algebraic: Reduce int64 If we just want the bottom 32-bits we don't need a full 64-bit operation. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25625>	2023-10-12 21:03:31 +00:00

1 2 3 4 5 ...

552 commits