Python fontTools: check whether a font supports multi-code-point emoji

I am trying to check whether a font has glyphs for multi-code-point emoji such as "👱🏼‍♂️", "🐱‍🐉" or "🖐🏼" in Python 3.x.

For single-code-point emoji such as "😄" or "😉", I can verify support with the following code, which uses Python fontTools:

from fontTools.ttLib import TTFont

def __isEmojiSupportedByFont(emoji: str) -> bool:
    font = TTFont(r"C:\Windows\Fonts\seguiemj.ttf")
    emojiCodepoint = ord(emoji)  # Only works for single codepoint emoji
    # Look for the code point in every cmap subtable of the font:
    for table in font['cmap'].tables:
        if emojiCodepoint in table.cmap:
            return True
    return False

How can I do this for multi-code-point emoji, given that the cmap only contains single code points?
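To illustrate the limitation: `ord()` raises a `TypeError` for any string longer than one character, and an emoji sequence is several code points joined together. The sketch below splits a sequence into its code points and reports which ones a cmap lacks. The helper names and the `ignorable` default (ZWJ and VS16) are assumptions made for this illustration, and `cmap` can be any mapping keyed by code point, for example the dict returned by fontTools' `font["cmap"].getBestCmap()`. Passing this check is necessary but not sufficient: it does not show that the font can combine the parts into one glyph.

```python
from typing import Container, List, Mapping

# Assumption for this sketch: ZWJ and VS16 only join or style their
# neighbours, so the font is not required to map them in its cmap.
ZWJ = 0x200D
VS16 = 0xFE0F


def code_points(text: str) -> List[int]:
    """Return the Unicode code points that make up `text`, in order."""
    return [ord(ch) for ch in text]


def missing_code_points(
    text: str,
    cmap: Mapping[int, str],
    ignorable: Container[int] = (ZWJ, VS16),
) -> List[int]:
    """Return the code points of `text` that have no entry in `cmap`."""
    return [
        cp for cp in code_points(text)
        if cp not in ignorable and cp not in cmap
    ]


if __name__ == "__main__":
    # Person with blond hair + skin tone + ZWJ + male sign + VS16
    sequence = "\U0001F471\U0001F3FC\u200D\u2642\uFE0F"
    print([f"U+{cp:04X}" for cp in code_points(sequence)])
```

An empty result from `missing_code_points` only means every component has its own glyph; the combined emoji may still render as separate pieces.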

To check for multi-code-point emoji, I would have to "query" the GSUB lookup list.
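As a rough idea of what such a query could look like, the sketch below flattens the ligature substitutions of a fontTools `TTFont` into a plain dict and then applies them greedily to a list of glyph names. Both function names are hypothetical, and this is a simplified model under strong assumptions: it looks only at ligature lookups, and it ignores scripts, features, lookup order, contextual lookups and variation selectors, so it can disagree with a real shaping engine.

```python
from typing import Dict, List, Tuple

# Maps a run of glyph names to the single glyph that replaces it.
LigatureRules = Dict[Tuple[str, ...], str]


def collect_ligature_rules(font) -> LigatureRules:
    """Flatten all GSUB ligature substitutions of a fontTools TTFont."""
    # Imported lazily so apply_ligatures() works without fontTools installed.
    from fontTools.ttLib.tables import otTables

    rules: LigatureRules = {}
    if "GSUB" not in font:
        return rules
    lookup_list = font["GSUB"].table.LookupList
    if lookup_list is None:
        return rules
    for lookup in lookup_list.Lookup:
        for subtable in lookup.SubTable:
            # Extension lookups wrap the real subtable.
            if isinstance(subtable, otTables.ExtensionSubst):
                subtable = subtable.ExtSubTable
            if not isinstance(subtable, otTables.LigatureSubst):
                continue
            for first_glyph, ligatures in subtable.ligatures.items():
                for lig in ligatures:
                    rules[(first_glyph, *lig.Component)] = lig.LigGlyph
    return rules


def apply_ligatures(glyphs: List[str], rules: LigatureRules) -> List[str]:
    """Replace the longest matching glyph run until nothing matches."""
    glyphs = list(glyphs)
    changed = True
    while changed:
        changed = False
        for start in range(len(glyphs)):
            # Try the longest run first; a run needs at least two glyphs.
            for end in range(len(glyphs), start + 1, -1):
                lig = rules.get(tuple(glyphs[start:end]))
                if lig is not None:
                    glyphs[start:end] = [lig]
                    changed = True
                    break
            if changed:
                break
    return glyphs
```

The input glyph names would come from mapping each code point through the font's cmap. If the result collapses to a single glyph name, the font probably has a dedicated glyph for the sequence.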

An easier way to check in Python whether a font supports an emoji is to use HarfBuzz, or more precisely its Python binding uharfbuzz.

The solution I came up with is:

from uharfbuzz import Face, Font, Buffer, ot_font_set_funcs, shape

def __isEmojiSupportedByFont(self, emoji: str) -> bool:
    # Load font:
    with open(r"C:\Windows\Fonts\seguiemj.ttf", 'rb') as fontfile:
        self.fontdata = fontfile.read()
    # Load font (has to be done for call):
    face = Face(self.fontdata)
    font = Font(face)
    upem = face.upem
    font.scale = (upem, upem)
    ot_font_set_funcs(font)
    # Create text buffer:
    buf = Buffer()
    buf.add_str(emoji)
    buf.guess_segment_properties()
    # Shape text:
    features = {"kern": True, "liga": True}
    shape(font, buf, features)
    # After shaping, info.codepoint holds a glyph ID, not a Unicode code point.
    # The glyph IDs checked below are specific to this font.
    infos = buf.glyph_infos
    # Remove all variant selectors:
    while len(infos) > 0 and infos[-1].codepoint == 3:
        infos = infos[:-1]
    # Filter empty:
    if len(infos) <= 0:
        return False
    # Remove uncombined, ending with skin tone like "👭🏿":
    lastCp = infos[-1].codepoint
    if lastCp in (1076, 1079, 1082, 1085, 1088):
        return False
    # If there is a code point 0 or 3 => Emoji not fully supported by font:
    return all(info.codepoint != 0 and info.codepoint != 3 for info in infos)

Thanks to khaledhosny and justvanrossum on GitHub/fonttools!
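The glyph IDs hard-coded in the solution belong to one particular font. As a possible font-independent variation, the sketch below shapes the text with uharfbuzz and applies a simpler heuristic: the emoji counts as supported only if shaping yields exactly one glyph and that glyph is not `.notdef` (glyph ID 0). Both function names are hypothetical and the heuristic is an assumption, not part of the original solution. It is stricter than the code above and may report false negatives, for example when a trailing variation selector is left as its own invisible glyph.

```python
from typing import List


def shape_to_glyph_ids(font_path: str, text: str) -> List[int]:
    """Shape `text` with the font at `font_path` and return the glyph IDs."""
    # Imported lazily so glyphs_look_supported() works without uharfbuzz.
    import uharfbuzz as hb

    with open(font_path, "rb") as fontfile:
        face = hb.Face(fontfile.read())
    font = hb.Font(face)
    buf = hb.Buffer()
    buf.add_str(text)
    buf.guess_segment_properties()
    hb.shape(font, buf, {"kern": True, "liga": True})
    # After shaping, info.codepoint is a glyph ID, not a Unicode code point.
    return [info.codepoint for info in buf.glyph_infos]


def glyphs_look_supported(glyph_ids: List[int]) -> bool:
    """Heuristic: one output glyph that is not .notdef (glyph ID 0)."""
    return len(glyph_ids) == 1 and glyph_ids[0] != 0
```

A call would look like `glyphs_look_supported(shape_to_glyph_ids(path, emoji))`, where `path` points to the font file to test.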
