Java: Replace all non-alphanumeric characters with empty strings

Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow, original URL: http://stackoverflow.com/questions/1805518/

Replacing all non-alphanumeric characters with empty strings

Tags: java, regex, non-alphanumeric

Asked by Alex Gomes

I tried using this, but it didn't work:

return value.replaceAll("/[^A-Za-z0-9 ]/", "");

Accepted answer by Mirek Pluta

Use [^A-Za-z0-9].

Note: removed the space since that is not typically considered alphanumeric.

Answer by erickson

return value.replaceAll("[^A-Za-z0-9 ]", "");

This will leave spaces intact. I assume that's what you want; otherwise, remove the space from the regex.
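
To make the difference between the two character classes concrete, here is a minimal sketch. The class name, method names, and sample string are made up for illustration; only the two regexes come from the answers above.

```java
// Hypothetical demo class; only the two regexes are taken from the answers.
class StripNonAlphanumeric {

    // Removes everything except ASCII letters and digits (spaces go too).
    static String stripAll(String value) {
        return value.replaceAll("[^A-Za-z0-9]", "");
    }

    // Same idea, but the space inside the character class keeps spaces intact.
    static String stripKeepSpaces(String value) {
        return value.replaceAll("[^A-Za-z0-9 ]", "");
    }

    public static void main(String[] args) {
        String sample = "Hello, World! 123";
        System.out.println(stripAll(sample));        // HelloWorld123
        System.out.println(stripKeepSpaces(sample)); // Hello World 123
    }
}
```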

Answer by Andrew Duffy

Try

return value.replaceAll("[^A-Za-z0-9]", "");

or

return value.replaceAll("[\\W]|_", "");
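
The second variant needs the extra |_ because \W on its own treats the underscore as a word character and leaves it in place. A small sketch of that difference, with an illustrative class name and sample string that are not part of the original answer:

```java
// Hypothetical demo class showing why the answer adds "|_" to \W.
class NonWordDemo {

    // \W matches anything outside [A-Za-z0-9_], so underscores survive.
    static String stripNonWord(String value) {
        return value.replaceAll("\\W", "");
    }

    // Adding the "|_" alternative removes underscores as well.
    static String stripNonWordAndUnderscore(String value) {
        return value.replaceAll("[\\W]|_", "");
    }

    public static void main(String[] args) {
        String sample = "snake_case-name 42";
        System.out.println(stripNonWord(sample));              // snake_casename42
        System.out.println(stripNonWordAndUnderscore(sample)); // snakecasename42
    }
}
```

Note that the backslash has to be doubled inside a Java string literal; a single \W does not compile.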

Answer by abyx

Java's regular expressions don't require you to put a forward-slash (/) or any other delimiter around the regex, as opposed to other languages like Perl, for example.
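
As a rough illustration of why the slashes in the question break the pattern: Java treats them as literal characters, so the regex only matches a slash, one disallowed character, and another slash. The class name, method names, and sample strings below are assumptions for the sake of the example.

```java
// Hypothetical demo class; the first regex is the one from the question.
class DelimiterDemo {

    // The slashes are part of the pattern here, not delimiters.
    static String withSlashes(String value) {
        return value.replaceAll("/[^A-Za-z0-9 ]/", "");
    }

    // Without the slashes, the character class is applied on its own.
    static String withoutSlashes(String value) {
        return value.replaceAll("[^A-Za-z0-9 ]", "");
    }

    public static void main(String[] args) {
        System.out.println(withSlashes("Hello, World!"));    // Hello, World! (nothing matched)
        System.out.println(withSlashes("a/!/b"));            // ab (only "/!/" matches)
        System.out.println(withoutSlashes("Hello, World!")); // Hello World
    }
}
```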

Answer by zneo

I made this method for creating filenames:

public static String safeChar(String input)
{
    char[] allowed = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_".toCharArray();
    char[] charArray = input.toString().toCharArray();
    StringBuilder result = new StringBuilder();
    for (char c : charArray)
    {
        for (char a : allowed)
        {
            if(c==a) result.append(a);
        }
    }
    return result.toString();
}
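
For comparison, the same whitelist can probably be expressed as a single regex. This is only a sketch, not the answer author's code: it assumes the allowed set is exactly the digits, ASCII letters, hyphen, and underscore listed in the method above, and the class name is made up.

```java
// Hypothetical regex-based alternative to the loop above.
class SafeFileName {

    // Keeps digits, ASCII letters, '-' and '_'; removes everything else.
    // The hyphen is placed last in the class so it is read literally.
    static String safeChar(String input) {
        return input.replaceAll("[^0-9A-Za-z_-]", "");
    }

    public static void main(String[] args) {
        System.out.println(safeChar("my report (final).txt")); // myreportfinaltxt
        System.out.println(safeChar("a-b_c.d"));               // a-b_cd
    }
}
```

The nested loop compares every input character against all 64 allowed characters; the regex version avoids that inner scan, at the cost of compiling the pattern on each call.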

Answer by saurav

You could also try this simpler regex:

 str = str.replaceAll("\\P{Alnum}", "");
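
A quick way to see what this does: \P{Alnum} is the negation of the POSIX class \p{Alnum}, which by default covers only ASCII letters and digits. The sketch below uses a made-up class name and sample string.

```java
// Hypothetical demo class for the \P{Alnum} shorthand.
class PosixAlnumDemo {

    // Removes every character that is not an ASCII letter or digit.
    static String strip(String str) {
        return str.replaceAll("\\P{Alnum}", "");
    }

    public static void main(String[] args) {
        String sample = "user_name@example.com";
        System.out.println(strip(sample)); // usernameexamplecom
        // Same result as the explicit character class:
        System.out.println(strip(sample).equals(sample.replaceAll("[^A-Za-z0-9]", ""))); // true
    }
}
```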

Answer by Andre Steingress

You should be aware that [^a-zA-Z] will replace every character that is not in the range A-Z/a-z. That means special characters such as é, as well as Cyrillic characters and the like, will be removed.

If the replacement of these characters is not wanted use pre-defined character classes instead:

 str.replaceAll("[^\\p{IsAlphabetic}\\p{IsDigit}]", "");

PS: \p{Alnum} does not achieve this effect; it acts the same as [A-Za-z0-9].
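
The following sketch contrasts the ASCII-only class with the Unicode-aware one on a string containing an accented letter and two Cyrillic letters. The class name, method names, and sample are illustrative assumptions; the non-ASCII characters are written as Unicode escapes so the example does not depend on the source file encoding.

```java
// Hypothetical demo class comparing ASCII-only and Unicode-aware stripping.
class UnicodeLettersDemo {

    // Keeps only ASCII letters and digits.
    static String asciiOnly(String s) {
        return s.replaceAll("[^A-Za-z0-9]", "");
    }

    // Keeps any Unicode alphabetic character or digit.
    static String unicodeAware(String s) {
        return s.replaceAll("[^\\p{IsAlphabetic}\\p{IsDigit}]", "");
    }

    public static void main(String[] args) {
        // "Caf" + e-acute, a space, two Cyrillic letters (U+0414, U+0430), then ", 7!"
        String sample = "Caf\u00e9 \u0414\u0430, 7!";
        System.out.println(asciiOnly(sample));    // Caf7
        System.out.println(unicodeAware(sample)); // keeps the accented and Cyrillic letters
    }
}
```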

Answer by Alberto Cerqueira

Simple method:

public boolean isBlank(String value) {
    return (value == null || value.equals("") || value.equals("null") || value.trim().equals(""));
}

public String normalizeOnlyLettersNumbers(String str) {
    if (!isBlank(str)) {
        return str.replaceAll("[^\\p{L}\\p{Nd}]+", "");
    } else {
        return "";
    }
}
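
A short usage sketch of the two methods above, copied into a made-up holder class as static methods so they can be called from main. One behaviour worth noticing is that isBlank also treats the literal string "null" as blank, so that input comes back as an empty string.

```java
// Hypothetical holder class; the method bodies mirror the answer above.
class LettersNumbersNormalizer {

    static boolean isBlank(String value) {
        return value == null || value.equals("") || value.equals("null") || value.trim().equals("");
    }

    // Keeps Unicode letters (\p{L}) and decimal digits (\p{Nd}) only.
    static String normalizeOnlyLettersNumbers(String str) {
        if (!isBlank(str)) {
            return str.replaceAll("[^\\p{L}\\p{Nd}]+", "");
        }
        return "";
    }

    public static void main(String[] args) {
        // "S" + a-tilde + "o Paulo, 2020!" keeps the accented letter
        System.out.println(normalizeOnlyLettersNumbers("S\u00e3o Paulo, 2020!"));
        System.out.println(normalizeOnlyLettersNumbers(null).isEmpty());   // true
        System.out.println(normalizeOnlyLettersNumbers("null").isEmpty()); // true
    }
}
```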

Answer by Albin

public static void main(String[] args) {
    String value = " Chlamydia_spp. IgG, IgM & IgA Abs (8006) ";

    System.out.println(value.replaceAll("[^A-Za-z0-9]", ""));

}

output: ChlamydiasppIgGIgMIgAAbs8006

Github: https://github.com/AlbinViju/Learning/blob/master/StripNonAlphaNumericFromString.java

Answer by snap

If you also want to allow alphanumeric characters that are not part of the ASCII character set, such as German umlauts, you can consider the following solution:

 String value = "your value";

 // this could be placed as a static final constant, so the compiling is only done once
 Pattern pattern = Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

 value = pattern.matcher(value).replaceAll("");

Please note that using the UNICODE_CHARACTER_CLASS flag may impose a performance penalty (see the Javadoc for this flag).
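
One way to follow the comment about compiling only once is to hold the pattern in a static final field. The sketch below assumes a made-up class name and sample string, and adds an ASCII-only pattern purely for comparison. Keep in mind that \w also matches the underscore, so this approach leaves underscores in place.

```java
import java.util.regex.Pattern;

// Hypothetical demo class with precompiled patterns.
class UnicodeWordStripper {

    // Compiled once and reused; \w covers Unicode letters and digits with this flag.
    private static final Pattern NON_WORD =
            Pattern.compile("[^\\w]", Pattern.UNICODE_CHARACTER_CLASS);

    // ASCII-only \w, included only to show the difference.
    private static final Pattern NON_WORD_ASCII = Pattern.compile("[^\\w]");

    static String strip(String value) {
        return NON_WORD.matcher(value).replaceAll("");
    }

    static String stripAscii(String value) {
        return NON_WORD_ASCII.matcher(value).replaceAll("");
    }

    public static void main(String[] args) {
        // "Gr" + u-umlaut + sharp s + "e_2020!"
        String sample = "Gr\u00fc\u00dfe_2020!";
        System.out.println(strip(sample));      // keeps the umlaut, the sharp s, and the underscore
        System.out.println(stripAscii(sample)); // Gre_2020
    }
}
```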
