Java: How to replace all special characters in a string?

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18598996/

Time: 2020-08-12 09:22:07  Source: igfitidea

How to replaceAll special characters in a string?

java string replaceall

Asked by user2743857

To remove all the spaces in my string, I wrote a method that consists of:

message = message.replaceAll("\\s", "");

I was wondering whether there is a command to remove any special character, like a comma or a period, and leave just the plain string. Do I have to remove them one by one, or is there a piece of code that I am missing?

Answered by Rohit Jain

You can go the other way round. Replace everything that is not a word character, using a negated character class:

message = message.replaceAll("[^\\w]", "");

or

message = message.replaceAll("\\W", "");

Both of them remove every character except those in [a-zA-Z0-9_]. If you want to remove the underscore too, use this pattern:

[\W_]
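
As a minimal sketch of the two patterns above (the class and method names here are illustrative, not from the answer), note that each regex backslash has to be doubled inside a Java string literal:

```java
class NonWordStripper {
    // Removes everything except [a-zA-Z0-9_], which is what \w means in Java by default.
    static String stripNonWord(String message) {
        return message.replaceAll("\\W", "");
    }

    // Same as above, but the underscore is removed as well.
    static String stripNonWordAndUnderscore(String message) {
        return message.replaceAll("[\\W_]", "");
    }

    public static void main(String[] args) {
        String message = "Hello, wor_ld. 42!";
        System.out.println(stripNonWord(message));              // Hellowor_ld42
        System.out.println(stripNonWordAndUnderscore(message)); // Helloworld42
    }
}
```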

Answered by Chris Lohfink

\w is the same as [A-Za-z0-9_], so replacing \W will strip all spaces and such (but not _). It is much safer to whitelist what is allowed than to remove individual characters.
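
One way the whitelist idea might look in practice is sketched below. The allowed set (ASCII letters, digits, and the space character) and the names `WhitelistFilter` and `keepAllowed` are assumptions chosen for illustration; adjust the character class to whatever your input should permit.

```java
import java.util.regex.Pattern;

class WhitelistFilter {
    // Hypothetical whitelist: anything that is NOT an ASCII letter, digit, or space is removed.
    private static final Pattern NOT_ALLOWED = Pattern.compile("[^A-Za-z0-9 ]");

    static String keepAllowed(String message) {
        return NOT_ALLOWED.matcher(message).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("Total: 3 items, $4.50 each.")); // Total 3 items 450 each
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which `String.replaceAll` would otherwise do.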

Answered by Bohemian

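Contrary to what some may claim, \w is not always the same as [a-zA-Z0-9_]. In Java it matches only those ASCII characters by default, but when the UNICODE_CHARACTER_CLASS flag is enabled (for example with the embedded flag (?U)), \w also includes all characters from all languages (Chinese, Arabic, etc.) that are letters or numbers (and the underscore).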

Since you probably consider non-Latin letters and digits to be "special", this explicit character class removes all "non-normal" characters, leaving only ASCII letters and digits:

message = message.replaceAll("[^a-zA-Z0-9]", "");
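
In Java, whether \w covers non-ASCII letters depends on the UNICODE_CHARACTER_CLASS flag, so it can help to compare the variants side by side. The following is a small sketch with illustrative names (`WordClassDemo` and its methods are not from the answers); the sample string is written with Unicode escapes to avoid source-encoding issues.

```java
class WordClassDemo {
    // Default mode: \w is [a-zA-Z_0-9], so non-ASCII letters are removed too.
    static String stripDefault(String s) {
        return s.replaceAll("\\W", "");
    }

    // (?U) enables UNICODE_CHARACTER_CLASS: letters and digits from any script are kept.
    static String stripUnicodeAware(String s) {
        return s.replaceAll("(?U)\\W", "");
    }

    // Explicit ASCII whitelist: behaves the same regardless of flags, and drops the underscore.
    static String stripToAsciiAlnum(String s) {
        return s.replaceAll("[^a-zA-Z0-9]", "");
    }

    public static void main(String[] args) {
        String message = "caf\u00e9 \u4e2d\u6587_1!"; // "café 中文_1!"
        System.out.println(stripDefault(message));      // caf_1
        System.out.println(stripUnicodeAware(message)); // café中文_1
        System.out.println(stripToAsciiAlnum(message)); // caf1
    }
}
```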