Linux: Finding out what characters a given font supports
Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors on StackOverflow (not to this page).
Original: http://stackoverflow.com/questions/4458696/
Asked by Till Ulen
How do I extract the list of supported Unicode characters from a TrueType or embedded OpenType font on Linux?
Is there a tool or a library I can use to process a .ttf or a .eot file and build a list of code points (like U+0123, U+1234, etc.) provided by the font?
Accepted answer by Janus Troelsen
Here is a method using the FontTools module (which you can install with something like pip install fonttools):
#!/usr/bin/env python
import sys
from itertools import chain

from fontTools.ttLib import TTFont
from fontTools.unicode import Unicode

ttf = TTFont(sys.argv[1], 0, ignoreDecompileErrors=True, fontNumber=-1)

# One (code point, glyph name, Unicode character name) tuple per cmap entry.
# Stored as a list so it can be printed and then searched again below.
chars = list(chain.from_iterable(
    [y + (Unicode[y[0]],) for y in x.cmap.items()]
    for x in ttf["cmap"].tables
))
print(chars)

# Use this for just checking if the font contains the codepoint given as
# second argument:
#char = int(sys.argv[2], 0)
#print(Unicode[char])
#print(char in (x[0] for x in chars))

ttf.close()
The script takes the font path as its argument:
python checkfont.py /path/to/font.ttf
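The question asks for a list written like U+0123, U+1234. As a rough sketch (not part of the original answer), the per-subtable mappings could be merged and formatted as below. The function name format_codepoints is hypothetical, and it assumes each input is a dict keyed by integer code point, which is the shape of x.cmap in the script above.

```python
def format_codepoints(cmaps):
    """Merge cmap-style dicts and return their code points as sorted 'U+XXXX' strings.

    `cmaps` is assumed to be an iterable of dicts keyed by integer code point
    (hypothetical helper, for illustration only).
    """
    merged = set()
    for mapping in cmaps:
        merged.update(mapping)  # iterating a dict yields its keys, the code points
    return ["U+%04X" % cp for cp in sorted(merged)]


if __name__ == "__main__":
    # Two made-up subtables that overlap on U+0041.
    sample = [{0x41: "A", 0x123: "gbreve"}, {0x41: "A", 0x1F600: "u1F600"}]
    print(", ".join(format_codepoints(sample)))
```

With the script above, this would presumably be called as format_codepoints(x.cmap for x in ttf["cmap"].tables) before ttf.close().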
Answer by hippietrail
Answer by wschang
The character code points for a ttf/otf font are stored in the cmap table.
You can use ttx to generate an XML representation of the cmap table.
You can run the command ttx.exe -t cmap MyFont.ttf (plain ttx on Linux) and it should output a file MyFont.ttx. Open it in a text editor and it should show you all the character codes it found in the font.
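Instead of reading the .ttx file by eye, the code points can be pulled out with a few lines of Python. This is only a sketch: the function name codepoints_from_ttx is hypothetical, and it assumes the dump has the usual layout of a cmap element containing one element per subtable, each holding map entries with a hexadecimal code attribute.

```python
import io
import xml.etree.ElementTree as ET


def codepoints_from_ttx(source):
    """Return the sorted, de-duplicated code points listed in a TTX cmap dump.

    `source` is a file path or an open file object. The element layout
    (<cmap>/<subtable>/<map code="0x41" .../>) is an assumption about ttx output.
    """
    root = ET.parse(source).getroot()
    found = set()
    for entry in root.iterfind("./cmap/*/map"):
        found.add(int(entry.get("code"), 16))  # base 16 accepts the "0x" prefix
    return sorted(found)


if __name__ == "__main__":
    # Tiny hand-written stand-in for a real MyFont.ttx file.
    sample = io.StringIO(
        '<ttFont><cmap><tableVersion version="0"/>'
        '<cmap_format_4 platformID="3" platEncID="1" language="0">'
        '<map code="0x20" name="space"/><map code="0x41" name="A"/>'
        '</cmap_format_4></cmap></ttFont>'
    )
    print(["U+%04X" % cp for cp in codepoints_from_ttx(sample)])
```

For a real font, pass the path of the generated file, for example codepoints_from_ttx("MyFont.ttx").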
Answer by ecmanaut
I just had the same problem, and made a HOWTO that goes one step further, baking a regexp of all the supported Unicode code points.
If you just want the array of code points, you can use this when peeking at your ttx XML in Chrome devtools, after running ttx -t cmap myfont.ttf and, probably, renaming myfont.ttx to myfont.xml to invoke Chrome's XML mode:
function codepoint(node) { return Number(node.nodeValue); }
$x('//cmap/*[@platformID="0"]/*/@code').map(codepoint);
(Also relies on fonttools from gilamesh's suggestion; sudo apt-get install fonttools if you're on an Ubuntu system.)
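The HOWTO itself is not reproduced here, so the following is not its method, only one possible way to "bake" a regexp from a list of code points: collapse consecutive code points into ranges and emit them as a character class. Both function names are hypothetical.

```python
import re


def codepoint_ranges(codepoints):
    """Collapse code points into sorted, inclusive (start, end) runs."""
    ranges = []
    for cp in sorted(set(codepoints)):
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], cp)  # extend the current run
        else:
            ranges.append((cp, cp))  # start a new run
    return ranges


def supported_chars_regex(codepoints):
    """Compile a regex matching strings made only of the given code points."""
    parts = []
    for start, end in codepoint_ranges(codepoints):
        # \UXXXXXXXX escapes sidestep quoting of ']', '-', '^' and '\'.
        if start == end:
            parts.append("\\U%08x" % start)
        else:
            parts.append("\\U%08x-\\U%08x" % (start, end))
    if not parts:
        return re.compile("")  # no supported characters: only "" can match fully
    return re.compile("[" + "".join(parts) + "]*")


if __name__ == "__main__":
    pattern = supported_chars_regex([0x41, 0x42, 0x43, 0x20AC])
    print(pattern.pattern)
    print(bool(pattern.fullmatch("ABC")), bool(pattern.fullmatch("ABD")))
```

Use fullmatch rather than match or search, since the pattern ends in * and would otherwise succeed on an empty prefix of any string.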
Answer by nim
fc-query my-font.ttf will give you a map of supported glyphs and all the locales the font is appropriate for, according to fontconfig.
Since pretty much all modern Linux apps are fontconfig-based, this is much more useful than a raw Unicode list.
The actual output format is discussed here: http://lists.freedesktop.org/archives/fontconfig/2013-September/004915.html
Answer by Spencer
The Linux program xfd can do this. It's provided in my distro as 'xorg-xfd'. To see all characters for a font, you can run this in a terminal:
xfd -fa "DejaVu Sans Mono"
Answer by deceleratedcaviar
If you ONLY want to "view" the font's characters, the following might be helpful (if your terminal supports the font in question):
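#!/usr/bin/env python
import sys

from fontTools.ttLib import TTFont

with TTFont(sys.argv[1], 0, ignoreDecompileErrors=True) as ttf:
    for x in ttf["cmap"].tables:
        for (_, code) in x.cmap.items():
            # Glyph names of the form uniXXXX become \uXXXX escapes for echo -e;
            # any other glyph name is echoed as lowercased text.
            # The backslash is doubled because a bare '\u' is a syntax error in Python 3.
            point = code.replace('uni', '\\u').lower()
            print("echo -e '" + point + "'")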
An unsafe, but easy way to view:
python font.py my-font.ttf | sh
Thanks to Janus (https://stackoverflow.com/a/19438403/431528) for the answer above.
Answer by Neil Mayhew
The fontconfig commands can output the glyph list as a compact list of ranges, e.g.:
$ fc-match --format='%{charset}\n' OpenSans
20-7e a0-17f 192 1a0-1a1 1af-1b0 1f0 1fa-1ff 218-21b 237 2bc 2c6-2c7 2c9
2d8-2dd 2f3 300-301 303 309 30f 323 384-38a 38c 38e-3a1 3a3-3ce 3d1-3d2 3d6
400-486 488-513 1e00-1e01 1e3e-1e3f 1e80-1e85 1ea0-1ef9 1f4d 2000-200b
2013-2015 2017-201e 2020-2022 2026 2030 2032-2033 2039-203a 203c 2044 2070
2074-2079 207f 20a3-20a4 20a7 20ab-20ac 2105 2113 2116 2120 2122 2126 212e
215b-215e 2202 2206 220f 2211-2212 221a 221e 222b 2248 2260 2264-2265 25ca
fb00-fb04 feff fffc-fffd
Use fc-query for a .ttf file and fc-match for an installed font name.
This likely doesn't involve installing any extra packages, and doesn't involve translating a bitmap.
Use fc-match --format='%{file}\n' to check whether the right font is being matched.
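If individual code points are needed, not ranges, the charset text can be expanded with a short helper. This is a minimal sketch assuming the output looks like the sample above, that is, whitespace-separated hexadecimal values or start-end ranges. The name parse_charset is hypothetical.

```python
def parse_charset(text):
    """Expand fontconfig charset text such as '20-7e a0 192' into code points.

    Assumes whitespace-separated hex tokens, each either a single value
    or an inclusive 'start-end' range, as in the sample output above.
    """
    codepoints = []
    for token in text.split():
        start, _, end = token.partition("-")
        first = int(start, 16)
        last = int(end, 16) if end else first
        codepoints.extend(range(first, last + 1))
    return codepoints


if __name__ == "__main__":
    sample = "20-22 a0 192\nfb00-fb04 feff"
    print(["U+%04X" % cp for cp in parse_charset(sample)])
```

The input text can come from saving the fc-match or fc-query output to a file, or from capturing the command's standard output with the subprocess module.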
Answer by zhk_tiger
Janus's answer above (https://stackoverflow.com/a/19438403/431528) works. But Python is too slow, especially for Asian fonts. It takes minutes for a 40MB font file on my E5 computer.
So I wrote a little C++ program to do that. It depends on FreeType2 (https://www.freetype.org/). It is a VS2015 project, but it is easy to port to Linux since it is a console application.
The code can be found here: https://github.com/zhk/AllCodePoints
For the 40MB Asian font, it takes about 30 ms on my E5 computer.
Answer by brunoob
If you want to get all characters supported by a font, you may use the following (based on Janus's answer):
from fontTools.ttLib import TTFont

def get_font_characters(font_path):
    with TTFont(font_path) as font:
        characters = {chr(y[0]) for x in font["cmap"].tables for y in x.cmap.items()}
    return characters
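Once such a set of characters is available, a natural follow-up is checking whether a given string can be rendered by the font. The helper below is a hypothetical sketch, not part of the answer. It only assumes that `supported` is a set of single-character strings like the one returned by get_font_characters.

```python
def missing_characters(text, supported):
    """Return the characters of `text` not found in `supported`, in first-seen order."""
    missing = []
    for ch in text:
        if ch not in supported and ch not in missing:
            missing.append(ch)
    return missing


if __name__ == "__main__":
    # Stand-in for the set returned by get_font_characters(font_path).
    supported = set("helo wrd")
    print(missing_characters("héllo wörld", supported))
```

An empty result means every character of the text has an entry in the font's cmap. It says nothing about shaping or ligature support.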