Original URL: http://stackoverflow.com/questions/4122170/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Java change áéőűú to aeouu
Asked by lacas
Possible Duplicates:
- Remove diacritical marks (ń ǹ ň ñ ṅ ņ ṇ ṋ ṉ ̈ ɲ ƞ ᶇ ɳ ȵ) from Unicode chars
- Is there a way to get rid of accents and convert a whole string to regular letters?
How can I do this? Thanks for the help.
Accepted answer by Sean Patrick Floyd
I think your question is the same as these:
and hence the answer is also the same:
// requires: import java.text.Normalizer;
String convertedString =
    Normalizer
        .normalize(input, Normalizer.Form.NFD)
        .replaceAll("[^\\p{ASCII}]", "");
See:
- JavaDoc: Normalizer.normalize(String, Normalizer.Form)
- JavaDoc: Normalizer.Form.NFD
- Sun Java Tutorial: Normalizer's API
Example Code:
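// requires: import java.text.Normalizer;
final String input = "Tĥïŝ ĩš a fůňķŷ Šťŕĭńġ";
System.out.println(
    Normalizer
        .normalize(input, Normalizer.Form.NFD)   // split letters into base + combining marks
        .replaceAll("[^\\p{ASCII}]", "")         // drop everything that is not ASCII
);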
Output:
This is a funky String
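To make the two steps in this answer more concrete, here is a minimal sketch that prints the code points of a string before and after NFD normalization. The class name NfdDemo and the helpers codePoints and toAscii are made up for this illustration and are not part of the original answer; the input is written with Unicode escapes so the source file's encoding does not matter.

```java
import java.text.Normalizer;

class NfdDemo {
    // Lists the code points of a string as "U+XXXX" values separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Same two steps as the answer: decompose, then drop every non-ASCII char.
    static String toAscii(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        String accented = "\u00E1\u00E9\u0151\u0171\u00FA"; // áéőűú
        String decomposed = Normalizer.normalize(accented, Normalizer.Form.NFD);

        // One code point per letter before normalization.
        System.out.println(codePoints(accented));
        // After NFD, each letter is a base letter followed by a combining mark.
        System.out.println(codePoints(decomposed));
        // The combining marks are outside ASCII, so the regex removes them.
        System.out.println(toAscii(accented)); // aeouu
    }
}
```

After NFD, é (U+00E9) becomes U+0065 followed by the combining acute accent U+0301, which is why removing the non-ASCII characters leaves the plain letter behind.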
Answer by Bozho
First - you shouldn't. These symbols carry special phonetic properties which should not be ignored.
The way to convert them is to create a Map that holds each pair:
Map<Character, Character> map = new HashMap<Character, Character>();
map.put('á', 'a');
map.put('é', 'e');
//etc..
Then loop over the chars in the string, building a new string by calling map.get(currentChar) for each one.
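The loop described in this answer could be sketched roughly as follows. The class name AccentMap, the method replaceAccents and the five pairs in the table are placeholders chosen for this example. Copying unmapped characters through unchanged is also an assumption: the answer does not say what to do when map.get(currentChar) returns null.

```java
import java.util.HashMap;
import java.util.Map;

class AccentMap {
    // Hypothetical lookup table; a real one would need many more pairs.
    private static final Map<Character, Character> REPLACEMENTS = new HashMap<>();

    static {
        REPLACEMENTS.put('\u00E1', 'a'); // á
        REPLACEMENTS.put('\u00E9', 'e'); // é
        REPLACEMENTS.put('\u0151', 'o'); // ő
        REPLACEMENTS.put('\u0171', 'u'); // ű
        REPLACEMENTS.put('\u00FA', 'u'); // ú
    }

    static String replaceAccents(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char current = input.charAt(i);
            // Assumption: characters without a mapping are kept as they are.
            char mapped = REPLACEMENTS.getOrDefault(current, current);
            out.append(mapped);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAccents("\u00E1\u00E9\u0151\u0171\u00FA")); // aeouu
    }
}
```

Compared with the Normalizer-based answers, this gives full control over each replacement but only handles the characters that were explicitly listed.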
Answer by Michael Borgwardt
You can use java.text.Normalizer to separate base letters and diacritics, then remove the latter via a regexp:
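// requires: import java.text.Normalizer; import java.text.Normalizer.Form;
public static String stripDiacriticas(String s) {
    // NFD splits each accented letter into a base letter plus combining marks
    return Normalizer.normalize(s, Form.NFD)
        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}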
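The two regular expressions used on this page behave differently for letters that have no canonical decomposition, such as Ł (U+0141). The sketch below puts both variants side by side; the class name StripComparison and its method names are invented for this comparison and do not come from either answer.

```java
import java.text.Normalizer;

class StripComparison {
    // Variant from the accepted answer: drops every non-ASCII character.
    static String stripNonAscii(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Variant from this answer: drops only combining marks (U+0300 to U+036F).
    static String stripCombiningMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        String input = "\u0141\u00F3d\u017A"; // Łódź

        // Ł does not decompose under NFD, so the ASCII filter removes it entirely.
        System.out.println(stripNonAscii(input));       // odz
        // The combining-marks filter keeps Ł and only strips the accents.
        System.out.println(stripCombiningMarks(input)); // Łodz
    }
}
```

Which variant fits depends on whether losing non-decomposable letters is acceptable: the ASCII filter guarantees pure ASCII output, while the combining-marks filter keeps letters such as Ł, and with them any other non-ASCII characters like CJK text.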