Java: comparing strings while ignoring accented characters
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/28833797/
Compare strings ignoring accented characters
Asked by alexandre1985
I would like to know if there is a method that compares two strings and ignores accents, making "noção" equal to "nocao". It would be something like string1.methodCompareIgnoreAccent(string2);
Answered by Kennedy Oliveira
You can use a Java Collator to compare the texts while ignoring accents. Here is a simple example:
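import java.text.Collator;

/**
 * @author Kennedy
 */
public class SimpleTest
{
    public static void main(String[] args)
    {
        String a = "nocao";
        String b = "noção";

        final Collator instance = Collator.getInstance();

        // PRIMARY strength compares base letters only, so accents (and case) are ignored.
        // Collator.NO_DECOMPOSITION is a decomposition mode, not a strength; it only
        // appeared to work because it shares the numeric value 0 with PRIMARY.
        instance.setStrength(Collator.PRIMARY);

        // Prints 0 because the strings are considered equal
        System.out.println(instance.compare(a, b));
    }
}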
Documentation: see the JavaDoc for java.text.Collator.
I won't explain it in detail because I have only used Collators a little and I'm not an expert in them, but you can search for it; there are some articles about it.
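To make the Collator approach reusable, it can be wrapped in a small helper. The following is only a minimal sketch: the class name CollatorAccentDemo and the method equalsIgnoringAccents are hypothetical names, and the use of Locale.ROOT (rather than the default locale) is an assumption made so the result does not depend on the machine's settings. The accented word is written with Unicode escapes so the source compiles the same way regardless of file encoding.

```java
import java.text.Collator;
import java.util.Locale;

public class CollatorAccentDemo {

    // Hypothetical helper: returns true when the two strings have the same
    // base letters. At PRIMARY strength, accent and case differences are ignored.
    public static boolean equalsIgnoringAccents(String a, String b) {
        // Locale.ROOT is an assumption; pick the locale that fits your data
        Collator collator = Collator.getInstance(Locale.ROOT);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String accented = "no\u00E7\u00E3o"; // "noção"

        System.out.println(equalsIgnoringAccents("nocao", accented));
        System.out.println(equalsIgnoringAccents("nocao", "nocoes"));
    }
}
```

Note that PRIMARY strength also ignores letter case. If case should be ignored but accents should count, Collator.SECONDARY is typically the strength to try, though the exact behavior depends on the locale's collation rules.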
Answered by intrigus
There is no built-in method to do this, so you have to build your own:
Part of this solution is borrowed from elsewhere. It first splits every accented character into its unaccented base character followed by its combining diacritical marks, and then simply removes all the combining marks. Also see https://stackoverflow.com/a/1215117/4095834
And then your equals method will look like this:
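import java.text.Normalizer;
import java.text.Normalizer.Form;

@Override
public boolean equals(Object o) {
    // Code omitted
    // Strip accents from both fields so the comparison is symmetric
    if (removeAccents(yourField).equals(removeAccents(anotherField))) {
        return true;
    }
    return false;
}

public static String removeAccents(String text) {
    // NFD splits each accented character into a base letter plus combining marks;
    // the regex (note the escaped backslash) then deletes those marks
    return text == null ? null : Normalizer.normalize(text, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}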
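For anyone who wants to try the Normalizer approach on its own, here is a self-contained sketch. The class name AccentStripper and the helper equalsIgnoreAccents are hypothetical names added for illustration; removeAccents follows the idea from the answer above, with the regex precompiled. Unlike the Collator approach, this comparison stays case-sensitive.

```java
import java.text.Normalizer;
import java.util.Objects;
import java.util.regex.Pattern;

public class AccentStripper {

    // Matches runs of characters in the Combining Diacritical Marks block (U+0300..U+036F)
    private static final Pattern DIACRITICS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    public static String removeAccents(String text) {
        if (text == null) {
            return null;
        }
        // NFD turns e.g. U+00E7 (c with cedilla) into 'c' followed by U+0327
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // Dropping the combining marks leaves only the base letters
        return DIACRITICS.matcher(decomposed).replaceAll("");
    }

    // Hypothetical helper: null-safe, accent-insensitive, case-sensitive equality
    public static boolean equalsIgnoreAccents(String a, String b) {
        return Objects.equals(removeAccents(a), removeAccents(b));
    }

    public static void main(String[] args) {
        String accented = "no\u00E7\u00E3o"; // "noção"

        System.out.println(removeAccents(accented));                // prints "nocao"
        System.out.println(equalsIgnoreAccents("nocao", accented)); // prints "true"
        System.out.println(equalsIgnoreAccents("Nocao", accented)); // prints "false" (case still matters)
    }
}
```

Letters that have no canonical decomposition (for example "ø" or "ß") are not affected by this approach, so it only covers characters that Unicode defines as a base letter plus combining marks.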