Java regex: split number and letter groups without spaces

Notice: this page is a Chinese-English parallel translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/11232801/

Date: 2020-10-31 04:19:21  Source: igfitidea

Regex split numbers and letter groups without spaces

Tags: java, regex

Asked by Steve C

If I have a string like "11E12C108N", which is a concatenation of letter groups and digit groups, how do I split them when there is no delimiter (such as a space) in between?

For example, I want the resulting split to be:

tokens[0] = "11"
tokens[1] = "E"
tokens[2] = "12"
tokens[3] = "C"
tokens[4] = "108"
tokens[5] = "N"

I have this right now.

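import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {

        String stringToSplit = "11E12C108N";

        // Backslashes must be doubled inside a Java string literal.
        Pattern pattern = Pattern.compile("\\d+\\D+");
        Matcher matcher = pattern.matcher(stringToSplit);

        // Each match is a digit run followed by a non-digit run.
        while (matcher.find()) {
            System.out.println(matcher.group());
        }
    }
}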

Which gives me:

11E
12C
108N

Can I make the original regex do a complete split in one go, instead of having to run the regex again on the intermediate tokens?

Answered by Kendall Frey

Use the following regex, and get a list of all matches. That will be what you are looking for.

\d+|\D+

In Java, I think the code would look something like this:

Matcher matcher = Pattern.compile("\\d+|\\D+").matcher(theString);
while (matcher.find())
{
    // append matcher.group() to your list
}
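
A minimal, self-contained sketch of the approach above, in case a complete program is useful. The class name DigitLetterTokenizer and the method name tokenize are illustrative choices, not part of the original answer; the regex itself is the one given above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class DigitLetterTokenizer {
    // Matches either a run of digits or a run of non-digits.
    private static final Pattern TOKEN = Pattern.compile("\\d+|\\D+");

    // Collects every match, in order, into a list.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("11E12C108N")); // prints [11, E, 12, C, 108, N]
    }
}
```

Because the two alternatives never overlap, each call to find() consumes exactly one group, so no second pass over intermediate tokens is needed.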

Answered by Pshemo

You can also use lookaround assertions in the regex passed to split:

String stringToSplit = "11E12C108N";
String[] tokens = stringToSplit.split("(?<=\\d)(?=\\D)|(?=\\d)(?<=\\D)");
System.out.println(Arrays.toString(tokens));

Output: [11, E, 12, C, 108, N]

The idea is to split at the positions that lie between a digit (\d) and a non-digit (\D). In other words, each split point is a zero-width position (an empty string) that has either:

  • a digit before it (?<=\d) and a non-digit after it (?=\D)
  • a non-digit before it (?<=\D) and a digit after it (?=\d)
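
To see what each of the two conditions contributes, the following sketch splits the same input with each alternative on its own and then with both combined. The class and constant names are illustrative assumptions, not part of the original answer.

```java
import java.util.Arrays;

// Hypothetical demo class, named here only for illustration.
class LookaroundSplitDemo {
    // Zero-width position with a digit before it and a non-digit after it.
    static final String DIGIT_THEN_NON_DIGIT = "(?<=\\d)(?=\\D)";
    // Zero-width position with a non-digit before it and a digit after it.
    static final String NON_DIGIT_THEN_DIGIT = "(?<=\\D)(?=\\d)";

    public static void main(String[] args) {
        String input = "11E12C108N";

        // First condition only: splits just before each letter.
        System.out.println(Arrays.toString(input.split(DIGIT_THEN_NON_DIGIT)));
        // prints [11, E12, C108, N]

        // Second condition only: splits just after each letter that is followed by a digit.
        System.out.println(Arrays.toString(input.split(NON_DIGIT_THEN_DIGIT)));
        // prints [11E, 12C, 108N]

        // Both conditions joined with | give the full split.
        System.out.println(Arrays.toString(
                input.split(DIGIT_THEN_NON_DIGIT + "|" + NON_DIGIT_THEN_DIGIT)));
        // prints [11, E, 12, C, 108, N]
    }
}
```

Since lookarounds match positions rather than characters, nothing is consumed by the split and every character of the input ends up in some token.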

More information about (?<=..) and (?=..) (and a few related constructs) can be found at http://www.regular-expressions.info/lookaround.html
