As it turns out, the original implementation using
`hb_ot_shape_glyphs_closure` is extremely slow if a font has a rich set of
OpenType features. For example, this function was called 66954 times while
loading font `arial.ttf` version 7.00, increasing FreeType's startup time by
a factor of 10, which is unacceptable.
The new algorithm uses a completely different, more low-level approach, no
longer working with OpenType features but with OpenType lookups. It relies
on function `hb_ot_layout_lookup_get_glyph_alternates`, also replacing
recursion with a simple loop. In total, this brings the additional startup
time back to an acceptable range of a few percent.
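
The "loop instead of recursion" idea can be illustrated with a minimal
sketch. It does not use HarfBuzz: `ToyLookup`, `MAX_GLYPHS`, and
`collect_variants` are hypothetical stand-ins for the alternates data that
`hb_ot_layout_lookup_get_glyph_alternates` would supply, and the actual loop
in `afadjust.c` may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GLYPHS  16

/* Hypothetical toy lookup: up to two alternates per glyph, -1 = none. */
typedef struct
{
  int  alt[2];
} ToyLookup;

/* Collect `start` and all glyphs reachable through alternates.  The  */
/* output array doubles as the work list, so no recursion is needed.  */
size_t
collect_variants( const ToyLookup*  table, /* MAX_GLYPHS entries  */
                  int               start,
                  int*              out )  /* room for MAX_GLYPHS */
{
  unsigned char  seen[MAX_GLYPHS] = { 0 };
  size_t         count            = 0; /* glyphs collected so far */
  size_t         next             = 0; /* next glyph to process   */

  assert( table != NULL && out != NULL );

  if ( start < 0 || start >= MAX_GLYPHS )
    return 0;

  seen[start]  = 1;
  out[count++] = start;

  while ( next < count )
  {
    int  glyph = out[next++];
    int  i;

    for ( i = 0; i < 2; i++ )
    {
      int  alt = table[glyph].alt[i];

      /* Each glyph is appended at most once, which bounds the loop. */
      if ( alt >= 0 && alt < MAX_GLYPHS && !seen[alt] )
      {
        seen[alt]    = 1;
        out[count++] = alt;
      }
    }
  }

  return count;
}
```

The `seen` array guarantees termination even if two glyphs list each other
as alternates.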
A side effect of the new approach is that it catches more alternate forms:
the old code didn't properly handle script-specific features.
To make the change more readable, this commit only adds new code.
Reported as
https://bugs.ghostscript.com/show_bug.cgi?id=708295
* src/autofit/aflatin.c
(af_glyph_hints_apply_vertical_separation_adjustments): Initialize all
array elements of `contour_y_minima` and `contour_y_maxima`.
Call the functions once per font instead of once per glyph.
* src/autofit/afadjust.c (af_all_glyph_variants): Move code to compute the
`feature_tags` and `type_3_lookup_indices` sets to...
(af_reverse_character_map_new): ...this function.
Due to the way the reverse map array gets constructed with HarfBuzz, there
might be multiple, identical glyph index entries with different character
values in the array. As an example, an OpenType feature like 'unic' might
map lowercase glyph 'ae' to uppercase glyph 'AE', in addition to the already
present cmap entry for 'AE'.
In most cases, this incorrect mapping is harmless (but still wrong).
However, there exist some lowercase/uppercase character pairs where the
diacritic of the uppercase character is on the opposite vertical side of
the base character compared to the lowercase character. An example is
U+0122 (LATIN CAPITAL LETTER G WITH CEDILLA) and U+0123 (LATIN SMALL LETTER
G WITH CEDILLA): the former has the cedilla below, the latter above. A
wrong mapping would thus shift the base glyph 'G' up by a pixel instead of
shifting the cedilla down.
We fix this by always giving precedence to cmap entries.
* src/autofit/afadjust.c (af_reverse_character_map_entry_compare): Do a
secondary sort on the character code.
(af_reverse_character_map_lookup): Adjust binary search to return the
first occurrence of an entry (i.e., the one with the lowest array index).
(af_reverse_character_map_new)[FT_CONFIG_OPTION_USE_HARFBUZZ]: Implement
cmap priority.
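
The two-key sort and the first-occurrence lookup described above can be
sketched as follows. This is an illustration only: `MapEntry`, `map_sort`,
and `map_lookup_first` are hypothetical names, not the actual structures or
functions in `afadjust.c`, and the sketch does not show how cmap priority
itself is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry type for this sketch. */
typedef struct
{
  unsigned int   glyph_index;
  unsigned long  codepoint;
} MapEntry;

/* Primary key: glyph index; secondary key: character code. */
int
map_entry_compare( const void*  a,
                   const void*  b )
{
  const MapEntry*  ea = (const MapEntry*)a;
  const MapEntry*  eb = (const MapEntry*)b;

  if ( ea->glyph_index != eb->glyph_index )
    return ea->glyph_index < eb->glyph_index ? -1 : 1;
  if ( ea->codepoint != eb->codepoint )
    return ea->codepoint < eb->codepoint ? -1 : 1;
  return 0;
}

void
map_sort( MapEntry*  entries,
          size_t     count )
{
  assert( entries != NULL || count == 0 );
  qsort( entries, count, sizeof ( MapEntry ), map_entry_compare );
}

/* Lower-bound binary search: return the entry with the lowest array */
/* index that has the given glyph index, or NULL if there is none.   */
const MapEntry*
map_lookup_first( const MapEntry*  entries,
                  size_t           count,
                  unsigned int     glyph_index )
{
  size_t  lo = 0;
  size_t  hi = count;

  while ( lo < hi )
  {
    size_t  mid = lo + ( hi - lo ) / 2;

    if ( entries[mid].glyph_index < glyph_index )
      lo = mid + 1;
    else
      hi = mid; /* keep searching to the left on equality */
  }

  if ( lo < count && entries[lo].glyph_index == glyph_index )
    return &entries[lo];

  return NULL;
}
```

Because the search continues to the left on equality, duplicate glyph
indices always resolve to the same entry, regardless of where the binary
search first lands.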
Test vertical maxima instead of vertical minima to identify the highest
contour (and vice versa to identify the lowest contour). Doing so will
allow support of diacritics that consist of more than a single outline.
This works because of the topological constraints ensured by the adjustment
database.
* src/autofit/aflatin.c (af_find_highest_contour,
af_glyph_hints_apply_vertical_separation_adjustments): Implement it.
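
As a rough illustration of the maxima test, a minimal sketch that assumes
the per-contour vertical maxima are already available in an array.
`find_highest_contour` is a hypothetical helper and not the code of
`af_find_highest_contour`.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the contour whose vertical maximum is largest, */
/* or -1 if there are no contours.  On ties, the first contour wins.  */
int
find_highest_contour( const long*  contour_y_maxima,
                      int          num_contours )
{
  int  highest = -1;
  int  i;

  assert( num_contours <= 0 || contour_y_maxima != NULL );

  for ( i = 0; i < num_contours; i++ )
    if ( highest < 0                                            ||
         contour_y_maxima[i] > contour_y_maxima[highest] )
      highest = i;

  return highest;
}
```

The lowest contour would be found symmetrically by comparing the
per-contour minima.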
We need this for better positioning support of diacritics.
* src/autofit/afhints.h (AF_GlyphHintsRec): New fields `contour_y_minima`
and `contour_y_maxima`, together with their embedded variants.
* src/autofit/afhints.c (af_glyph_hints_done, af_glyph_hints_reload): Handle
new arrays.
If HarfBuzz is enabled, the reverse character map generation now considers
GSUB entries when looking for glyphs that correspond to a code point.
* src/autofit/afadjust.c (af_all_glyph_variants_helper,
af_all_glyph_variants) [FT_CONFIG_OPTION_USE_HARFBUZZ]: New functions.
(af_reverse_character_map_new) [FT_CONFIG_OPTION_USE_HARFBUZZ]: Call new
code.
With 64-bit platforms widely available, it is more efficient to use
64-bit variables directly. This results in a noticeable 10% improvement
in glyph loading speed.
* src/truetype/ttinterp.c (TT_MulFix14, TT_DotFix14) [FT_INT64]:
Prioritize available implementation with arguments adjusted based on
the use cases.
Resolves inconsistencies in 64-bit multiplication discussed in !355.
Importantly, FT_MulFix arguments and return value are FT_Long,
whatever sizeof FT_Long is on 64-bit platforms: 8 bytes on Linux or
4 bytes on Windows.
* include/freetype/internal/ftcalc.h (FT_MulFix_x86_64): Removed.
(FT_MulFix_64): Generalize and prioritize the inline implementation
for all 64-bit platforms if FT_INT64 is defined.
* src/base/ftcalc.c (FT_MulFix)[FT_INT64]: Call 'FT_MulFix_64'.
* src/base/ftbase.c: Include 'ftcalc.c' after the FT_MulFix callers
to enable its inlining.
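
For illustration, a 16.16 fixed-point multiplication through a 64-bit
intermediate can be sketched as below. `mulfix_64` is a hypothetical
stand-in and not the code in `ftcalc.h`; the sketch assumes that both
operands fit into 32 bits and that right-shifting a negative signed value
is an arithmetic shift, which the C standard leaves
implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 16.16 fixed-point values, rounding half away from zero. */
/* Arguments and return value are `long`, whatever its size is.         */
long
mulfix_64( long  a,
           long  b )
{
  int64_t  ab;

  /* Assumption of this sketch: operands fit into 32 bits, */
  /* so the 64-bit product cannot overflow.                */
  assert( a >= INT32_MIN && a <= INT32_MAX );
  assert( b >= INT32_MIN && b <= INT32_MAX );

  ab = (int64_t)a * (int64_t)b;

  /* Add 0x8000 for positive products and 0x7FFF for negative ones */
  /* (`ab >> 63` is 0 or -1 with an arithmetic shift).             */
  ab += 0x8000 + ( ab >> 63 );

  return (long)( ab >> 16 );
}
```

Using `long` for the interface while computing in `int64_t` keeps the
function signature independent of the platform's `long` size.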
Previously, we reloaded the palette again and again even if the same
one was requested, and even if the font had only one palette.
For a font like NotoColorEmoji that has over 5000 colors in its
palette, this was dominating the COLRv1 loading times for HarfBuzz
(and I believe all other clients) because they have to set the
palette to get access to the colors.
* src/base/ftcolor.c (FT_Palette_Select): Check the current palette.
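
The "check the current palette" idea can be sketched as follows. All names
here (`FaceSketch`, `load_palette`, `palette_select`) are hypothetical and
only stand in for the real face object and `FT_Palette_Select`; the
`load_count` field exists purely to make the effect of the check visible.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
  unsigned char  blue, green, red, alpha;
} ColorSketch;

/* Hypothetical, heavily simplified face object. */
typedef struct
{
  int             palette_loaded; /* non-zero once a palette is cached */
  unsigned short  palette_index;  /* index of the cached palette       */
  ColorSketch     palette[4];     /* cached colors                     */
  unsigned int    load_count;     /* number of (expensive) loads       */
} FaceSketch;

/* Stand-in for the expensive step of reading all palette entries. */
void
load_palette( FaceSketch*     face,
              unsigned short  palette_index )
{
  size_t  i;

  for ( i = 0; i < 4; i++ )
    face->palette[i].red = (unsigned char)( palette_index + i );

  face->palette_loaded = 1;
  face->palette_index  = palette_index;
  face->load_count++;
}

/* Select a palette, reloading only if a different one is requested. */
int
palette_select( FaceSketch*     face,
                unsigned short  palette_index,
                ColorSketch**   apalette )
{
  assert( face != NULL );

  if ( !face->palette_loaded || face->palette_index != palette_index )
    load_palette( face, palette_index );

  if ( apalette )
    *apalette = face->palette;

  return 0;
}
```

With this check, repeated requests for the same palette cost one
comparison instead of a full reload.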
* src/bdf/bdflib.c (bdf_parse_start_): Reject fonts with initial
COMMENTs.
(bdf_parse_properties_): Skip COMMENTs so that...
(bdf_add_property_): Do not make exception for COMMENT.
(bdf_parse_glyphs_, bdf_add_comments): Updated.