How to replace special characters in a string in Java?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/4283351/

Time: 2020-08-14 15:22:18  Source: igfitidea

How to replace special characters in a string?

java string

Asked by Tanu

I have a string with lots of special characters. I want to remove all those, but keep alphabetical characters.

How can I do this?

Answered by Sean Patrick Floyd

That depends on what you mean. If you just want to get rid of them, do this:
(Update: apparently you want to keep digits as well; in that case, use the second line of each pair.)

String alphaOnly = input.replaceAll("[^a-zA-Z]+","");
String alphaAndDigits = input.replaceAll("[^a-zA-Z0-9]+","");

or the equivalent:

String alphaOnly = input.replaceAll("[^\\p{Alpha}]+","");
String alphaAndDigits = input.replaceAll("[^\\p{Alpha}\\p{Digit}]+","");

(All of these can be significantly improved by precompiling the regex pattern and storing it in a constant)
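
As a minimal sketch of that precompiling advice: String.replaceAll compiles its regex on every call, whereas a Pattern stored in a constant is compiled once. The class and method names below (SpecialCharFilter, alphaOnly, alphaAndDigits) are hypothetical and chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this sketch.
final class SpecialCharFilter {
    // Compiled once and reused. Pattern instances are immutable and thread-safe.
    private static final Pattern NON_ALPHA = Pattern.compile("[^a-zA-Z]+");
    private static final Pattern NON_ALNUM = Pattern.compile("[^a-zA-Z0-9]+");

    private SpecialCharFilter() {}

    // Removes every character that is not an ASCII letter.
    static String alphaOnly(String input) {
        return NON_ALPHA.matcher(input).replaceAll("");
    }

    // Removes every character that is not an ASCII letter or digit.
    static String alphaAndDigits(String input) {
        return NON_ALNUM.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(alphaOnly("Hello, World! 123"));      // HelloWorld
        System.out.println(alphaAndDigits("Hello, World! 123")); // HelloWorld123
    }
}
```

The gain matters mostly when the replacement runs many times, for example in a loop over many strings.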

Or, with Guava:

private static final CharMatcher ALNUM =
  CharMatcher.inRange('a', 'z').or(CharMatcher.inRange('A', 'Z'))
  .or(CharMatcher.inRange('0', '9')).precomputed();
// ...
String alphaAndDigits = ALNUM.retainFrom(input);

But if you want to turn accented characters into something sensible that's still ASCII, look at these questions:
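
One common standard-library approach to turning accented characters into ASCII, offered here as an assumption rather than as something taken from this answer, is to decompose the text with java.text.Normalizer and then drop the combining marks. The class and method names (AccentStripper, toAscii) are hypothetical.

```java
import java.text.Normalizer;

// Hypothetical helper class; the name is an assumption for this sketch.
final class AccentStripper {
    private AccentStripper() {}

    static String toAscii(String input) {
        // NFD splits an accented letter into its base letter plus combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // \p{M} matches combining marks; removing them leaves the base letters.
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        // Drop whatever is still outside the ASCII range.
        return withoutMarks.replaceAll("[^\\p{ASCII}]+", "");
    }

    public static void main(String[] args) {
        // The escapes spell "Crème brûlée".
        System.out.println(toAscii("Cr\u00e8me br\u00fbl\u00e9e")); // Creme brulee
    }
}
```

Letters that have no decomposition, such as ß or ø, are simply dropped by the last step in this sketch, so it is not a full transliteration.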

Answered by Madhu Nandan

You can use basic regular expressions on strings to find all special characters, or use the Pattern and Matcher classes to search, modify, or delete user-defined strings. This link has some simple, easy-to-understand examples of regular expressions: http://www.vogella.de/articles/JavaRegularExpressions/article.html
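
A short sketch of the Pattern and Matcher route, assuming that "special" means anything other than an ASCII letter, a digit, or whitespace. The class and method names (SpecialCharFinder, findSpecial) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this sketch.
final class SpecialCharFinder {
    // Assumption: "special" = not an ASCII letter, not a digit, not whitespace.
    private static final Pattern SPECIAL = Pattern.compile("[^a-zA-Z0-9\\s]");

    private SpecialCharFinder() {}

    // Collects every special character in order of appearance.
    static List<String> findSpecial(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = SPECIAL.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findSpecial("a*b c#d!")); // [*, #, !]
    }
}
```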

Answered by Dhiral Pandya

I am using this.

s = s.replaceAll("\\W", "");

It removes all special characters from the string.

Here

\w : A word character, short for [a-zA-Z_0-9]

\W : A non-word character
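
Because the underscore counts as a word character, \W leaves it in place. The sketch below shows that behaviour next to a variant that also removes underscores; the class and method names (WordCharDemo, stripNonWord, stripNonWordAndUnderscore) are hypothetical.

```java
// Hypothetical demo class; the name is an assumption for this sketch.
final class WordCharDemo {
    private WordCharDemo() {}

    // Removes non-word characters; underscores survive because \w includes '_'.
    static String stripNonWord(String s) {
        return s.replaceAll("\\W", "");
    }

    // Removes non-word characters and underscores.
    static String stripNonWordAndUnderscore(String s) {
        return s.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("user_name@example.com"));              // user_nameexamplecom
        System.out.println(stripNonWordAndUnderscore("user_name@example.com")); // usernameexamplecom
    }
}
```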

Answered by Mundroid

You can look up the Unicode code point of the junk character with the Character Map tool on a Windows PC and write it with a \u prefix, e.g. \u00a9 for the copyright symbol. You can then use the string with that particular junk character: don't remove the junk character, but replace it with the proper Unicode escape.

Answered by dhuma1981

You can use the following method to keep alphanumeric characters.

replaceAll("[^a-zA-Z0-9]", "");

And if you want to keep only alphabetical characters, use this:

replaceAll("[^a-zA-Z]", "");

Answered by Mike Clark

string Output = Regex.Replace(Input, @"[^ a-zA-Z0-9&,_]", "");

Note that this is C# (.NET), not Java. Here all special characters except space, comma, ampersand, and underscore are removed. To remove commas and ampersands as well, use the following regular expression:

string Output = Regex.Replace(Input, @"[^ a-zA-Z0-9_]", "");

where Input is the string whose characters need to be replaced.

Answered by Muhammad Ahsan

To keep spaces as well, use the pattern "[^a-z A-Z 0-9]".

Answered by krishnamurthy

Replace any special character with:

replaceAll("\\your special character", "new character");

Example: to replace all occurrences of * with white space:

replaceAll("\\*", " ");

Note: this statement can replace only one type of special character at a time.
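
If several special characters need the same replacement, a character class can handle them in one call, and Pattern.quote avoids escaping a literal by hand. This is a sketch under those assumptions; the class and method names (MultiCharReplacer, replaceSeveral, replaceLiteral) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for this sketch.
final class MultiCharReplacer {
    private MultiCharReplacer() {}

    // A character class matches any one of the listed characters;
    // inside brackets, * # and @ are literals and need no backslash.
    static String replaceSeveral(String input) {
        return input.replaceAll("[*#@]", " ");
    }

    // Pattern.quote treats the whole argument as a literal, and
    // Matcher.quoteReplacement does the same for the replacement text.
    static String replaceLiteral(String input, String literal, String replacement) {
        return input.replaceAll(Pattern.quote(literal), Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceSeveral("a*b#c@d"));               // a b c d
        System.out.println(replaceLiteral("1+1=2", "+", " plus "));  // 1 plus 1=2
    }
}
```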

Answered by Marco Sulla

Following the example in Andrzej Doyle's answer, I think a better solution is to use org.apache.commons.lang3.StringUtils.stripAccents():

package bla.bla.utility;

import org.apache.commons.lang3.StringUtils;

public class UriUtility {
    public static String normalizeUri(String s) {
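        // Replace accented characters with their unaccented equivalents.
        String r = StringUtils.stripAccents(s);
        // Use underscores instead of spaces.
        r = r.replace(" ", "_");
        // Drop everything except dots, ASCII letters, digits, and underscores.
        r = r.replaceAll("[^\\.A-Za-z0-9_]", "");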
        return r;
    }
}