
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must apply the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/15249047/

Time: 2020-10-31 18:59:22  Source: igfitidea

Remove all characters from string which are not on whitelist

Tags: java, regex, replace, character, whitelist

Asked by PerwinCZ

I am trying to write Java code that removes all unwanted characters from a string and keeps only the whitelisted ones.

Example:


String[] whitelist = {"a", "b", "c", /* ... */ "z", "0", /* ... */ "9", "[", "]" /* , ... */};

I want to keep only letters (lowercase and uppercase) and digits, plus a few other characters that I would add later. I would then run a for() loop over every character in the string and replace it with an empty string if it isn't on the whitelist.

But that isn't a good solution. Maybe it could be done with a pattern (regex)? Thanks.
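For comparison, the loop-based approach described in the question could be sketched roughly as follows. The class name WhitelistLoop and the helper keepWhitelisted are hypothetical names chosen for illustration, and the whitelist is passed as a plain string of allowed characters rather than a String[]; this is one possible reading of the idea, not the asker's actual code.

```java
import java.util.HashSet;
import java.util.Set;

class WhitelistLoop {
    // Hypothetical helper: keeps only the characters of input that appear in allowed.
    static String keepWhitelisted(String input, String allowed) {
        // Put the allowed characters in a set so each lookup is cheap.
        Set<Character> whitelist = new HashSet<>();
        for (char c : allowed.toCharArray()) {
            whitelist.add(c);
        }
        // Append whitelisted characters to a builder instead of replacing in place.
        StringBuilder kept = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (whitelist.contains(c)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String allowed = "abcdefghijklmnopqrstuvwxyz0123456789[]";
        System.out.println(keepWhitelisted("BAD good {} []", allowed)); // good[]
    }
}
```

Building the result with a StringBuilder avoids creating a new string for every removed character, which repeated replace calls would do.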

Answered by Jon Skeet

Yes, you can use String.replaceAll, which takes a regex:

String input = "BAD good {} []";
// Backslashes must be doubled inside a Java string literal.
String output = input.replaceAll("[^a-z0-9\\[\\]]", "");
System.out.println(output); // good[]

Or, in Guava, you could use a CharMatcher:

CharMatcher matcher = CharMatcher.inRange('a', 'z')
                          .or(CharMatcher.inRange('0', '9'))
                          .or(CharMatcher.anyOf("[]"));
String input = "BAD good {} []";
String output = matcher.retainFrom(input);

That just shows the lowercase version, which makes it easier to demonstrate. To include uppercase letters, use "[^A-Za-z0-9\\[\\]]" in the regex (plus any other symbols you want). For the CharMatcher, you can or it with CharMatcher.inRange('A', 'Z').
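As a minimal sketch of the uppercase variant, the regex can also be compiled once and reused when many strings need filtering. The class name WhitelistRegex and the helper keepWhitelisted are hypothetical; only Pattern.compile and Matcher.replaceAll come from the standard library.

```java
import java.util.regex.Pattern;

class WhitelistRegex {
    // Matches any run of characters that are NOT letters, digits, '[' or ']'.
    private static final Pattern NOT_ALLOWED = Pattern.compile("[^A-Za-z0-9\\[\\]]+");

    // Hypothetical helper: removes every character outside the whitelist.
    static String keepWhitelisted(String input) {
        return NOT_ALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(keepWhitelisted("BAD good {} []")); // BADgood[]
    }
}
```

String.replaceAll compiles the pattern on every call, so holding a compiled Pattern in a constant is a reasonable choice when the same whitelist is applied repeatedly.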

Answered by Thomas

You could try and match everything that is not in your whitelist and replace it with an empty string:


String in = "asng $%& 123";
// This assumes your whitelist contains word characters and whitespace; adapt as needed.
// Backslashes must be doubled inside a Java string literal.
System.out.println(in.replaceAll("[^\\w\\s]+", ""));
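To make it clearer what this whitelist keeps, here is a small sketch. By default, \w in Java matches [a-zA-Z_0-9], so underscores survive along with letters and digits, and \s keeps whitespace. The class name WordCharDemo and the helper keepWordAndSpace are hypothetical.

```java
class WordCharDemo {
    // Hypothetical helper: removes everything except word characters and whitespace.
    static String keepWordAndSpace(String input) {
        return input.replaceAll("[^\\w\\s]+", "");
    }

    public static void main(String[] args) {
        // "$%&" is removed; the spaces on either side of it both remain.
        System.out.println(keepWordAndSpace("asng $%& 123"));    // asng  123
        // The underscore is a word character and is kept; the hyphen is not.
        System.out.println(keepWordAndSpace("snake_case-name")); // snake_casename
    }
}
```

If underscores or whitespace should not be allowed, an explicit character class such as the one in the previous answer gives tighter control.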