How can I make queries case-insensitive and accent-insensitive in PostgreSQL and JPA 2?

Question

Asked by user1180339

I have a Java EE project using PostgreSQL 9.X and JPA2 (Hibernate implementation). How can I force a LIKE query to be case-insensitive and accent-insensitive?

I'm able to change the charset of the DB because it's the first project using it.

Answer 1

Answered by Craig Ringer

In general there is no standard way to write "accent-insensitive" code, or to compare words for equality while ignoring accents. The whole idea makes very little sense, as different accented characters mean different things in different languages/dialects, and their "plain ASCII" substitutions/expansions vary by language. Please don't do this; resume and résumé are different words, and the situation gets even worse when considering any language other than English.

For case-insensitivity you can use lower(the_col) like lower('%match_expression') in JPQL. As far as I know ilike isn't supported in JPQL, but I have not checked the standard to verify this. The JPA2 spec is fairly readable, so consider downloading and reading it. Hibernate's own Criteria API (not the JPA2 Criteria API) offers Restrictions.ilike for the purpose. Neither approach will normalize, strip, or ignore accented characters.
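
As a rough illustration of the lower(...) like ... approach, the sketch below builds a lowercased "contains" pattern that can be bound as a query parameter. The class name, the Person entity, the firstName field, and the choice of '!' as the escape character are all assumptions made for this example; they are not from the question. Note that Java's toLowerCase and PostgreSQL's lower() may disagree on a few characters, depending on locale.

```java
import java.util.Locale;

/** Hypothetical helper; the names here are illustrative, not part of JPA or Hibernate. */
final class CaseInsensitiveLike {

    // JPQL sketch: lower() is applied to the column, the parameter is lowercased in Java.
    // "Person" and "firstName" are assumed names. '!' is declared as the LIKE escape character.
    static final String JPQL =
        "select p from Person p where lower(p.firstName) like :pattern escape '!'";

    /** Builds a lowercased "contains" pattern, escaping the LIKE wildcards % and _. */
    static String containsPattern(String term) {
        String escaped = term
            .replace("!", "!!")   // escape the escape character first
            .replace("%", "!%")
            .replace("_", "!_");
        return "%" + escaped.toLowerCase(Locale.ROOT) + "%";
    }
}
```

With an EntityManager named em (again an assumption), this could be used as em.createQuery(CaseInsensitiveLike.JPQL, Person.class).setParameter("pattern", CaseInsensitiveLike.containsPattern(input)). Binding the pattern as a parameter avoids concatenating user input into the query text.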

For stripping accents, etc., you will probably need to use database-engine-specific stored functions or native queries. See, e.g., this prior answer, or, if you intend to substitute accented characters with an unaccented alternative, this PostgreSQL wiki entry. But again, please don't do this except for very limited purposes, like finding places where words may have been "unaccented" by misguided software or users.

Answer 2

Answered by Clodoaldo Neto

If the unaccent extension is installed:

select unaccent(lower('?óê'));
 unaccent 
----------
 aoe
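
If the column is compared through unaccent(lower(...)) in a native query, the search term needs the same treatment. One option is to also wrap the bound parameter in unaccent(lower(...)) on the database side. Another, sketched below, is to fold the term in Java before binding it. This sketch uses java.text.Normalizer to decompose characters and drop combining marks; it is an application-side approximation, not a reimplementation of the unaccent extension, and the two will not agree on every character (letters such as ø or ł have no decomposition and pass through unchanged). The class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

/** Hypothetical helper approximating accent folding on the Java side. */
final class AccentFolding {

    /** Decomposes to NFD, removes combining marks, then lowercases. */
    static String fold(String input) {
        // NFD splits an accented letter into its base letter plus combining mark(s).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches Unicode combining marks; removing them leaves the base letters.
        return decomposed.replaceAll("\\p{M}+", "").toLowerCase(Locale.ROOT);
    }
}
```

The folded term could then be bound as a parameter in a native query that applies unaccent(lower(...)) to the column.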

Answer 3

Answered by motus

I had this issue, and I couldn't use database functions. So instead I used a REGEX restriction in my criteria code:

searchText = unaccent(searchText);
String expression = "firstName ~* '.*" + searchText + ".*'";
Criterion searchCriteria = Restrictions.sqlRestriction(expression);

Then I wrote a function called unaccent that turns each character into an alternation group; for example, any letter e becomes (e|é|è). A query for "hello" will become something like "h(e|é|è)llo".

Here is the function, inspired by this thread: Postgres accent insensitive LIKE search in Rails 3.1 on Heroku

private String unaccent(String text) {
    String charactersProcessed = ""; // To avoid doing a replace multiple times.
    String newText = text.toLowerCase();
    text = newText; // Case statement is expecting lowercase.
    for (int i = 0; i < text.length(); i++) {
        char c = text.charAt(i);
        if (charactersProcessed.contains(c + "")) {
            continue; // We have already processed this character.
        }
        String replacement = "";
        switch (c) {
        case '1': {
            replacement = "1";
            break;
        }
        case '2': {
            replacement = "2";
            break;
        }
        case '3': {
            replacement = "3";
            break;
        }
        case 'a': {
            replacement = "á|à|a|?|?|?|ā|?|?|à|á|?|?|?|?|ā|?|?|?";
            break;
        }
        case 'c': {
            replacement = "?|?|?|?|?|?|?";
            break;
        }
        case 'd': {
            replacement = "?|D";
            break;
        }
        case 'e': {
            replacement = "è|é|ê|ё|?|ē|?|?|?|ě|è|ê|?|Ё|ē|?|?|?|ě";
            break;
        }
        case 'g': {
            replacement = "?|?";
            break;
        }
        case 'i': {
            replacement = "?|ì|í|?|?|ì|?|ī|?|ì|í|?|?|?|ì|?|ī|?";
            break;
        }
        case 'l': {
            replacement = "?|?";
            break;
        }
        case 'n': {
            replacement = "ń|ň|?|?|?|?";
            break;
        }
        case 'o': {
            replacement = "ò|ó|?|?|?|ō|?|?|?|ò|ó|?|?|?|ō|?|?|?|?";
            break;
        }
        case 'r': {
            replacement = "?|?|?";
            break;
        }
        case 's': {
            replacement = "?|?|?|?|?|?|?";
            break;
        }
        case 'u': {
            replacement = "ù|ú|?|ü|?|ū|?|?|ù|ú|?|ü|?|ū|?|?";
            break;
        }
        case 'y': {
            replacement = "y|?|Y|?";
            break;
        }
        case 'z': {
            replacement = "?|?|?|?|?|?";
            break;
        }
        }
        if (!replacement.isEmpty()) {
            charactersProcessed = charactersProcessed + c;
            newText = newText.replace(c + "", "(" + c + "|" + replacement + ")");
        }
    }

    return newText;
}
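
A more compact variant of the same idea is sketched below. It walks the input once and emits a bracket expression such as [eéè] per letter instead of calling replace repeatedly, and it backslash-escapes ASCII punctuation so that user input like "." or "(" is matched literally rather than interpreted as regex syntax. The variant table is deliberately tiny and purely illustrative (it is not the list from the function above), and the class and method names are hypothetical.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: builds an accent-tolerant regex from a search term. */
final class AccentRegex {

    // Illustrative table only; extend it for the languages you actually need.
    private static final Map<Character, String> VARIANTS = Map.of(
        'a', "a\u00e1\u00e0\u00e2",   // a, a-acute, a-grave, a-circumflex
        'e', "e\u00e9\u00e8\u00ea",   // e, e-acute, e-grave, e-circumflex
        'o', "o\u00f3\u00f2\u00f4");  // o, o-acute, o-grave, o-circumflex

    /** Lowercases the text and maps each known letter to a bracket expression. */
    static String toPattern(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            String variants = VARIANTS.get(c);
            if (variants != null) {
                out.append('[').append(variants).append(']');
            } else if (c < 128 && !Character.isLetterOrDigit(c)) {
                out.append('\\').append(c); // match ASCII punctuation and spaces literally
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the Criteria snippet earlier concatenates searchText directly into the SQL string, a quote character in the input can break the query or open it to SQL injection. If your Hibernate version provides the three-argument Restrictions.sqlRestriction(sql, value, type) overload, passing the pattern as a bound value (for example "firstName ~* ?" with the pattern as the parameter) is likely the safer route; treat that call shape as an assumption to verify against your version.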
