Bash: Convert non-ASCII characters to ASCII
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/1975057/
Asked by watain
How can I convert a string like Žvaigždės aukštybėj užges or äüöÖÜÄ to Zvaigzdes aukstybej uzges or auoOUA, respectively, using Bash?
Basically I just want to convert all characters which aren't in the Latin alphabet.
Thanks
Answer by Michael Krelin - hacker
Depending on your machine, you can try piping your strings through
iconv -f utf-8 -t ascii//translit
(or whatever your encoding is, if it's not utf-8)
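To make the pipe above reusable, it can be wrapped in a small shell function. This is only a minimal sketch: the name to_ascii and its optional encoding argument are made up for illustration and are not from the answer. How well //TRANSLIT works depends on your iconv implementation and the current locale, so characters that cannot be transliterated may come out as ? or cause iconv to report an error.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name to_ascii is an assumption, not part of iconv.
# Reads text on stdin and writes an ASCII transliteration to stdout.
# Optional first argument: the source encoding (defaults to UTF-8).
to_ascii() {
    local from="${1:-UTF-8}"
    # //TRANSLIT asks iconv to approximate characters that have no ASCII form.
    iconv -f "$from" -t 'ASCII//TRANSLIT'
}
```

With this in place, something like echo "$string" | to_ascii should behave the same as the bare iconv pipe, and to_ascii ISO-8859-1 would cover input that is not UTF-8.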
Answer by Steve De Caux
You might be able to use iconv.
For example, suppose the string:
Žvaigždės aukštybėj užges or äüöÖÜÄ
is in the file testutf8.txt, encoded as UTF-8.
Running the command:
iconv -f UTF8 -t US-ASCII//TRANSLIT testutf8.txt
results in:
Zvaigzdes aukstybej uzges or auoOUA
Answer by Emil Vikström
echo Hej på dig, du den dära | iconv -f utf-8 -t us-ascii//TRANSLIT
gives:
Hej pa dig, du den dara
Answer by Emil Vikström
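import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

String name = "Žvaigždės aukštybėj užges ";
// Decompose each accented letter into a base letter plus combining marks.
String s1 = Normalizer.normalize(name, Normalizer.Form.NFKD);
// Match combining diacritics, modifier letters and modifier symbols (backslashes must be doubled in a Java string).
String regex = "[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+";
// Strip the marks, then round-trip through ASCII; StandardCharsets avoids the checked exception.
String s2 = new String(s1.replaceAll(regex, "").getBytes(StandardCharsets.US_ASCII), StandardCharsets.US_ASCII);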