Notice: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/25451331/

Time: 2020-11-02 08:03:47  Source: igfitidea

How to remove empty results after splitting with regex in Java?

java regex

Asked by Xelian

I want to find all numbers in a given string (the numbers are mixed with letters but are separated by spaces). I tried to split the input String, but when I check the result array I find a lot of empty Strings. How can I change my split regex to get rid of these empty strings?

// "\\D0*" splits on one non-digit followed by any number of zeroes
Pattern reg = Pattern.compile("\\D0*");
String[] numbers = reg.split("asd0085 sa223 9349x");
for (String s : numbers) {
    System.out.println(s);
}

And the result:

85


223
9349

I know that I can iterate over the array and remove the empty results. But how can I do it with regex alone?

Accepted answer by Pshemo

Don't use split. Use the find method of Matcher, which lets you iterate over all matching substrings. You can do it like this:

Pattern reg = Pattern.compile("\\d+");
Matcher m = reg.matcher("asd0085 sa223 9349x");
while (m.find())
    System.out.println(m.group());

which will print

0085
223
9349


Based on your regex, it seems that your goal is also to remove leading zeroes, as in the case of 0085. If that is true, you can use a regex like 0*(\\d+) and take the part matched by group 1 (the one in parentheses), letting the leading zeroes be matched outside of that group.

Pattern reg = Pattern.compile("0*(\\d+)");
Matcher m = reg.matcher("asd0085 sa223 9349x");
while (m.find())
    System.out.println(m.group(1));

Output:

85
223
9349
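The find-based approach above can be wrapped in a small helper that returns the numbers as a list. This is only a minimal sketch: the class name NumberExtractor and the method extractNumbers are made up for illustration and do not appear in the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class NumberExtractor {
    // 0* consumes leading zeroes outside the group; group 1 holds the remaining digits.
    private static final Pattern NUMBER = Pattern.compile("0*(\\d+)");

    static List<String> extractNumbers(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("asd0085 sa223 9349x")); // prints [85, 223, 9349]
    }
}
```

Because \\d+ needs at least one digit, a run made up only of zeroes (such as 000) still yields a single 0 in group 1 rather than an empty string.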


But if you really want to use split, change "\\D0*" to "\\D+0*" so that you split on one or more non-digits (\\D+) rather than on a single non-digit (\\D). With this solution you may still need to ignore the first, empty element of the result array, depending on whether the string starts with something that should be split on.
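A sketch of that split-based variant, including the step of ignoring a leading empty element, might look like the following. The class and method names (SplitNumbers, splitNumbers) are assumptions made for this example, not something from the answer.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class SplitNumbers {
    static String[] splitNumbers(String input) {
        // Split on one or more non-digits followed by any leading zeroes.
        String[] parts = input.split("\\D+0*");
        // split keeps a leading empty string when the input starts with a delimiter.
        if (parts.length > 0 && parts[0].isEmpty()) {
            return Arrays.copyOfRange(parts, 1, parts.length);
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitNumbers("asd0085 sa223 9349x"))); // prints [85, 223, 9349]
    }
}
```

Note that leading zeroes are only consumed when they follow a non-digit, so an input that begins with a number such as 0085 keeps its zeroes in this sketch.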

Answer by Akın Özer

If you are using Java 8, you can do it in one statement like this:

String[] array = Arrays.asList(s1.split("[,]")).stream().filter(str -> !str.isEmpty()).collect(Collectors.toList()).toArray(new String[0]);
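A slightly shorter variant of the same idea, applied to the question's input, is sketched below. It streams the array directly with Arrays.stream instead of going through a list; the class and method names (NonEmptySplit, splitNonEmpty) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class NonEmptySplit {
    static String[] splitNonEmpty(String input, String regex) {
        return Arrays.stream(input.split(regex))
                     .filter(str -> !str.isEmpty()) // drop the empty pieces
                     .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] numbers = splitNonEmpty("asd0085 sa223 9349x", "\\D0*");
        System.out.println(Arrays.toString(numbers)); // prints [85, 223, 9349]
    }
}
```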

Answer by Rakesh KR

One way I think this problem can be solved is:

String urStr = "asd0085   sa223 9349x";
urStr = urStr.replaceAll("[a-zA-Z]", "");
String[] urStrAry = urStr.split("\\s+");

  1. Remove all letters from the string.
  2. Then split it on runs of whitespace (\\s+), so that consecutive spaces do not produce empty strings.
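The two steps above can be sketched as a runnable example. The class and method names (LetterStripSplit, numbers) are invented for illustration, and the trim() call is an extra assumption added here so that whitespace left at either end after removing the letters does not produce an empty first element.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are illustrative only.
class LetterStripSplit {
    static String[] numbers(String input) {
        // Step 1: remove all ASCII letters.
        String digitsAndSpaces = input.replaceAll("[a-zA-Z]", "");
        // Step 2: split on runs of whitespace; trim() is an addition not in the answer.
        return digitsAndSpaces.trim().split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(numbers("asd0085   sa223 9349x"))); // prints [0085, 223, 9349]
    }
}
```

Unlike the find-based answers, this approach keeps leading zeroes and only handles letters in the range a-z and A-Z.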

Answer by Braj

You can try with Pattern and Matcher as well.

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("asd0085 sa223 9349x");
while (m.find()) {
    System.out.println(m.group());
}

Answer by deamon

Pattern reg = Pattern.compile("\\D+");
// ...

results in (after one leading empty string, because the input starts with a non-digit):

0085
223
9349

Answer by Rahul Tripathi

You may try this:

// As written this does not compile: split returns a String[], which has no replace method
reg.split("asd0085 sa223 9349x").replace("^/", "")

Answer by Rohit Jain

With String.split(), you get an empty string as an array element whenever the string contains back-to-back delimiters of the kind you're splitting on.

For example, if you split xyyz on y, the 2nd element will be an empty string. To avoid that, you can add a quantifier to the delimiter, y+, so that the split happens on one or more consecutive occurrences.
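The xyyz case can be checked directly with a short sketch. The class BackToBackDelimiters and its show method are hypothetical names used only for this example.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class BackToBackDelimiters {
    // Returns the split result rendered as a string, so empty elements are visible.
    static String show(String input, String regex) {
        return Arrays.toString(input.split(regex));
    }

    public static void main(String[] args) {
        System.out.println(show("xyyz", "y"));  // prints [x, , z]
        System.out.println(show("xyyz", "y+")); // prints [x, z]
    }
}
```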

In your case it happens because you've used \\D0*, which matches each non-digit character individually and splits on it, so you get back-to-back delimiters. You can of course wrap the pattern in a quantifier here:

Pattern reg = Pattern.compile("(\\D0*)+");

But what you really need there is \\D+0*.
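To see how the three patterns discussed in this answer behave on the question's input, a small comparison can help. This is an illustrative sketch; the class SplitPatternComparison and its show method are assumed names.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitPatternComparison {
    static final String INPUT = "asd0085 sa223 9349x";

    // Renders the split result as a string, so empty elements are visible.
    static String show(String regex) {
        return Arrays.toString(INPUT.split(regex));
    }

    public static void main(String[] args) {
        System.out.println(show("\\D0*"));    // prints [, , , 85, , , 223, 9349]
        System.out.println(show("(\\D0*)+")); // prints [, 85, 223, 9349]
        System.out.println(show("\\D+0*"));   // prints [, 85, 223, 9349]
    }
}
```

In both quantified versions one empty string remains at the front, because the input starts with a delimiter; trailing empty strings are dropped by split itself.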

However, if all you want is the numeric sequences from your string, I would use the Matcher#find() method instead, with \\d+ as the regex.
