Java: compare strings ignoring accented characters

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/28833797/

Time: 2020-11-02 14:15:02  Source: igfitidea

Compare strings ignoring accented characters

Tags: java, string, compare, accent-insensitive

Asked by alexandre1985

I would like to know if there is a method that compares two strings while ignoring accents, making "noção" equal to "nocao". It would be something like string1.methodCompareIgnoreAccent(string2);

Answered by Kennedy Oliveira

You can use a Java Collator to compare the texts while ignoring accents. Here is a simple example:

import java.text.Collator;

/**
 * @author Kennedy
 */
public class SimpleTest
{

  public static void main(String[] args)
  {
    String a = "nocao";
    String b = "noção";

    // Uses the collation rules of the default locale
    final Collator instance = Collator.getInstance();

    // PRIMARY strength compares base letters only, so accent (and case) differences are ignored
    instance.setStrength(Collator.PRIMARY);

    // Prints 0 when the locale treats the strings as equal (true for most locales, e.g. English or Portuguese)
    System.out.println(instance.compare(a, b));
  }
}

Documentation: JavaDoc

I won't explain this in detail because I have only used Collators a little and am not an expert on them, but you can search online; there are some articles about them.
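
As a rough illustration of how collator strength affects the result, here is a minimal sketch. The class name CollatorStrengthSketch and the helper equalsIgnoringAccents are hypothetical names chosen for this example, and Locale.ENGLISH is an assumption; other locales may rank accented letters differently.

```java
import java.text.Collator;
import java.util.Locale;

final class CollatorStrengthSketch {

    // Hypothetical helper (not part of the JDK): true when the two strings
    // differ at most by accents or case under the given locale's rules.
    static boolean equalsIgnoringAccents(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        // PRIMARY compares base letters only.
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        String plain = "nocao";
        String accented = "no\u00e7\u00e3o"; // "noção", written with escapes to avoid encoding issues

        // PRIMARY strength: accents are ignored.
        System.out.println(equalsIgnoringAccents(plain, accented, Locale.ENGLISH)); // prints true

        // TERTIARY strength: accent and case differences count.
        Collator tertiary = Collator.getInstance(Locale.ENGLISH);
        tertiary.setStrength(Collator.TERTIARY);
        System.out.println(tertiary.compare(plain, accented) == 0); // prints false
    }
}
```

Note that PRIMARY strength also ignores case, so "NOCAO" and "noção" would compare as equal too. If case should still matter, the Normalizer-based approach in the next answer may be a better fit.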

Answered by intrigus

String has no built-in method for this, so you have to build your own:

Part of this solution is from here: it first splits each accented character into its unaccented counterpart followed by its combining diacritics. Then you simply remove all the combining diacritics. See also https://stackoverflow.com/a/1215117/4095834
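
To make the splitting step concrete, the sketch below prints the code points of a string before and after NFD normalization. The class NfdDecompositionSketch and the helper codePoints are hypothetical names used only for this illustration.

```java
import java.text.Normalizer;

final class NfdDecompositionSketch {

    // Hypothetical helper: renders each code point of the text as U+XXXX.
    static String codePoints(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> out.append(String.format("U+%04X ", cp)));
        return out.toString().trim();
    }

    public static void main(String[] args) {
        String accented = "\u00e7\u00e3"; // "çã" in precomposed form
        String decomposed = Normalizer.normalize(accented, Normalizer.Form.NFD);

        // Each precomposed letter becomes a base letter plus a combining mark.
        System.out.println(codePoints(accented));   // U+00E7 U+00E3
        System.out.println(codePoints(decomposed)); // U+0063 U+0327 U+0061 U+0303
    }
}
```

Here U+0327 is the combining cedilla and U+0303 the combining tilde; removing them leaves the plain letters "c" and "a".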

And then your equals method will look like this:

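import java.text.Normalizer;
import java.text.Normalizer.Form;
import java.util.Objects;

public boolean equals(Object o) {
    // Code omitted
    // Strip accents from both sides so the comparison is symmetric and null-safe
    if (Objects.equals(removeAccents(yourField), removeAccents(anotherField))) {
        return true;
    }
    return false;
}

public static String removeAccents(String text) {
    // NFD splits each accented character into a base letter plus combining marks,
    // which the regex then deletes; the backslash must be doubled in a Java string literal
    return text == null ? null : Normalizer.normalize(text, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}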
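
For a version that can be run as-is, here is a minimal self-contained sketch of the same idea. The class name AccentInsensitiveSketch and the helper equalsIgnoreAccents are hypothetical names for this example; the regular expression is the one from the answer above, precompiled so it is not recompiled on every call.

```java
import java.text.Normalizer;
import java.util.Objects;
import java.util.regex.Pattern;

final class AccentInsensitiveSketch {

    // Matches characters in the Unicode "Combining Diacritical Marks" block.
    private static final Pattern COMBINING_MARKS =
            Pattern.compile("\\p{InCombiningDiacriticalMarks}+");

    static String removeAccents(String text) {
        if (text == null) {
            return null;
        }
        // Decompose first, then drop the combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    // Hypothetical helper: null-safe, case-sensitive, accent-insensitive equality.
    static boolean equalsIgnoreAccents(String a, String b) {
        return Objects.equals(removeAccents(a), removeAccents(b));
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreAccents("nocao", "no\u00e7\u00e3o")); // prints true
        System.out.println(equalsIgnoreAccents("Nocao", "no\u00e7\u00e3o")); // prints false (case still matters)
        // "ø" (U+00F8) has no canonical decomposition, so it is left unchanged.
        System.out.println(removeAccents("\u00f8").equals("\u00f8"));        // prints true
    }
}
```

Unlike a Collator at PRIMARY strength, this approach is locale-independent and keeps case differences. Letters without a canonical decomposition, such as "ø" or "ł", are not mapped to "o" or "l" and would need separate handling.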