Warning: this content comes from a StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1975057/

Date: 2020-09-09 18:48:45  Source: igfitidea

Bash: Convert non-ASCII characters to ASCII

Tags: bash, ascii

Asked by watain

How can I convert a string like Žvaigždės aukštybėj užges or äüöÖÜÄ to Zvaigzdes aukstybej uzges or auoOUA, respectively, using Bash?

Basically I just want to convert all characters which aren't in the Latin alphabet.

Thanks

Answered by Michael Krelin - hacker

Depending on your machine, you can try piping your strings through:

iconv -f utf-8 -t ascii//translit

(or whatever your encoding is, if it's not utf-8)
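
As a minimal sketch of the pipe above, the command can be wrapped in a small Bash function. The name `to_ascii` and its optional second argument for the source encoding are illustrative assumptions, not part of the answer. The replacement characters chosen by `//translit` depend on the iconv implementation and the current locale, so the output for accented input may differ between machines.

```shell
#!/usr/bin/env bash

# to_ascii STRING [SOURCE_ENCODING]
# Hypothetical helper: pipes STRING through iconv with transliteration.
# SOURCE_ENCODING defaults to utf-8, matching the command above.
to_ascii() {
    local text=$1
    local from=${2:-utf-8}
    printf '%s\n' "$text" | iconv -f "$from" -t ascii//translit
}

# ASCII input passes through unchanged.
to_ascii 'plain text'

# Accented input: the exact result depends on the iconv build and locale,
# and some builds signal a problem through a non-zero exit status.
to_ascii 'Hej på dig' || echo "iconv reported a conversion problem" >&2
```

Keeping the encoding as a parameter follows the answer's remark that the input may not be UTF-8.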

Answered by Steve De Caux

You might be able to use iconv.

For example, the string:

Žvaigždės aukštybėj užges or äüöÖÜÄ

is stored in the file testutf8.txt, encoded as UTF-8.

Running the command:

iconv -f UTF8 -t US-ASCII//TRANSLIT testutf8.txt

results in:

Zvaigzdes aukstybej uzges or auoOUA
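
The steps above can be sketched end to end as follows. The temporary directory held in `workdir` and the output file name `testascii.txt` are assumptions made for illustration; only `testutf8.txt` and the iconv command come from the answer. The transliterated text may differ from the result shown above on systems with a different iconv implementation or locale.

```shell
#!/usr/bin/env bash

# Work in a throwaway directory so no existing files are overwritten.
workdir=$(mktemp -d)

# Recreate the sample file; this assumes the script itself is saved as UTF-8.
printf '%s\n' 'Žvaigždės aukštybėj užges or äüöÖÜÄ' > "$workdir/testutf8.txt"

# Convert into a new file (testascii.txt is a hypothetical name) instead of stdout.
if iconv -f UTF8 -t US-ASCII//TRANSLIT "$workdir/testutf8.txt" > "$workdir/testascii.txt"; then
    cat "$workdir/testascii.txt"
else
    # iconv signals conversion problems through a non-zero exit status.
    echo "iconv reported a conversion problem; inspect $workdir/testascii.txt" >&2
fi
```

Writing to a separate file keeps the original intact, which makes it easy to compare the two if the transliteration is not what you expected.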

Answered by Emil Vikström

echo Hej på dig, du den dära | iconv -f utf-8 -t us-ascii//TRANSLIT

gives:

Hej pa dig, du den dara

Answered by Emil Vikström

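import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class ToAscii {
    public static void main(String[] args) {
        String name = "Žvaigždės aukštybėj užges ";
        // NFKD decomposes each accented letter into a base letter plus combining marks
        String s1 = Normalizer.normalize(name, Normalizer.Form.NFKD);
        // Backslashes must be doubled inside a Java string literal
        String regex = "[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+";
        // Drop the marks; the ASCII round trip maps any remaining non-ASCII character to '?'
        byte[] ascii = s1.replaceAll(regex, "").getBytes(StandardCharsets.US_ASCII);
        String s2 = new String(ascii, StandardCharsets.US_ASCII);
        System.out.println(s2);
    }
}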