Java Regex Help: Splitting a String on Spaces, "=>", and Commas

Disclaimer: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/3654446/

Time: 2020-08-14 03:13:14  Source: igfitidea

Java Regex Help: Splitting String on spaces, "=>", and commas

Tags: java, regex

Asked by meteoritepanama

I need to split a string on any of the following sequences:

  • 1 or more spaces
  • 0 or more spaces, followed by a comma, followed by 0 or more spaces
  • 0 or more spaces, followed by "=>", followed by 0 or more spaces

I haven't worked with Java regexes before, so I'm a little confused. Thanks!

Example:
add r10,r12 => r10
store r10 => r1

Accepted answer by Nikita Rybak

Just create a regex that matches any of your three cases and pass it to the split method:

string.split("\\s*(=>|,|\\s)\\s*");

Read literally, the regex means:

  1. Zero or more whitespaces (\\s*)
  2. Arrow, or comma, or whitespace (=>|,|\\s)
  3. Zero or more whitespaces (\\s*)

If necessary, you can replace the whitespace class \\s (which matches spaces, tabs, line breaks, etc.) with a plain space character.
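
As a quick check, the pattern above can be applied to the two example lines from the question. This is only a minimal sketch: the class name SplitInstructionDemo and the helper tokens are made up for illustration and are not part of the original answer.

```java
import java.util.Arrays;

class SplitInstructionDemo {
    // Optional whitespace, then "=>", a comma, or one whitespace character,
    // then optional whitespace. Backslashes are doubled because this is a
    // Java string literal.
    static final String DELIMITER = "\\s*(=>|,|\\s)\\s*";

    // Hypothetical helper: splits one instruction line into its tokens.
    static String[] tokens(String line) {
        return line.split(DELIMITER);
    }

    public static void main(String[] args) {
        // prints [add, r10, r12, r10]
        System.out.println(Arrays.toString(tokens("add r10,r12 => r10")));
        // prints [store, r10, r1]
        System.out.println(Arrays.toString(tokens("store r10 => r1")));
    }
}
```

Note that a line starting with whitespace would yield an empty first token, so trimming the input first may be worthwhile.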

Answered by Tim Pietzcker

String[] splitArray = subjectString.split(" *(,|=>| ) *");

should do it.
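
The plain-space pattern behaves differently from a \s-based one when the input contains tabs. The sketch below illustrates this with a hypothetical tab-separated line; the class and method names are assumptions for the example only.

```java
import java.util.Arrays;

class SpaceVsWhitespaceDemo {
    // Plain-space variant: only the space character counts as blank.
    static final String SPACES_ONLY = " *(,|=>| ) *";
    // Whitespace-class variant: tabs and line breaks count as well.
    static final String ANY_WHITESPACE = "\\s*(,|=>|\\s)\\s*";

    static String[] splitOn(String regex, String line) {
        return line.split(regex);
    }

    public static void main(String[] args) {
        String line = "add\tr10,r12 => r10"; // a tab after "add"
        // The tab is not a delimiter here, so "add\tr10" stays one token.
        System.out.println(splitOn(SPACES_ONLY, line).length);    // prints 3
        // With \s the tab separates "add" from "r10".
        System.out.println(Arrays.toString(splitOn(ANY_WHITESPACE, line)));
        // prints [add, r10, r12, r10]
    }
}
```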

Answered by Bert F

Strictly translated

For simplicity, I'm going to interpret your indication of "space" ( ) as "any whitespace" (\s).

Translated more or less "word for word", your spec says to delimit on any of:

  • 1 or more spaces
    • \s+
  • 0 or more spaces (\s*), followed by a comma (,), followed by 0 or more spaces (\s*)
    • \s*,\s*
  • 0 or more spaces (\s*), followed by a "=>" (=>), followed by 0 or more spaces (\s*)
    • \s*=>\s*

To match any of the above: (\s+|\s*,\s*|\s*=>\s*)

Reduced form

However, your spec can be "reduced" to:

  • 0 or more spaces
    • \s*
  • followed by either a space, comma, or "=>"
    • (\s|,|=>)
  • followed by 0 or more spaces
    • \s*

Put it all together: \s*(\s|,|=>)\s*

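The reduced form avoids some corner cases in which the strictly translated form produces unexpected empty "matches". For example, in "four , five" the strict form's first alternative (\s+) consumes the space before the comma as its own delimiter, so the comma then starts a second delimiter and an empty string appears between the two.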
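
The corner case can be reproduced with a short comparison. This is a sketch under assumptions: the class name StrictVsReducedDemo, the helper split, and the sample input are invented for illustration.

```java
import java.util.Arrays;

class StrictVsReducedDemo {
    // Strictly translated form: three alternatives tried left to right.
    static final String STRICT = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";
    // Reduced form: surrounding whitespace is absorbed around one separator.
    static final String REDUCED = "\\s*(\\s|,|=>)\\s*";

    static String[] split(String regex, String input) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "four , five";
        // Strict: " " matches \s+ first, then ", " matches the comma branch,
        // leaving an empty string between the two delimiters.
        System.out.println(Arrays.toString(split(STRICT, input)));
        // prints [four, , five]

        // Reduced: " , " is consumed as a single delimiter.
        System.out.println(Arrays.toString(split(REDUCED, input)));
        // prints [four, five]
    }
}
```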

Code

Here's some code:

import java.util.regex.Pattern;

public class Temp {

    // Strictly translated form (backslashes are doubled inside a Java string literal):
    //private static final String REGEX = "(\\s+|\\s*,\\s*|\\s*=>\\s*)";

    // "Reduced" form:
    private static final String REGEX = "\\s*(\\s|=>|,)\\s*";

    private static final String INPUT =
        "one two,three=>four , five   six   => seven,=>";

    public static void main(final String[] args) {
        final Pattern p = Pattern.compile(REGEX);
        final String[] items = p.split(INPUT);
        // Shorthand for above:
        // final String[] items = INPUT.split(REGEX);
        for(final String s : items) {
            System.out.println("Match: '"+s+"'");
        }
    }
}

Output:

Match: 'one'
Match: 'two'
Match: 'three'
Match: 'four'
Match: 'five'
Match: 'six'
Match: 'seven'
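
The input ends with ",=>", yet no empty items are printed. That is because split without a limit argument discards trailing empty strings. The sketch below (class and method names are hypothetical) uses the two-argument split with a negative limit to keep them.

```java
class TrailingEmptyDemo {
    static final String REGEX = "\\s*(\\s|=>|,)\\s*";
    static final String INPUT =
        "one two,three=>four , five   six   => seven,=>";

    // Default behaviour: trailing empty strings are removed.
    static String[] splitDefault() {
        return INPUT.split(REGEX);
    }

    // A negative limit keeps every piece, including trailing empty ones.
    static String[] splitKeepingEmpties() {
        return INPUT.split(REGEX, -1);
    }

    public static void main(String[] args) {
        System.out.println(splitDefault().length);        // prints 7
        // The "," and "=>" at the end each act as a delimiter, which adds
        // two empty strings after "seven".
        System.out.println(splitKeepingEmpties().length); // prints 9
    }
}
```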