Splitting a Java string with a multi-character delimiter

Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/12375003/

Date: 2020-10-31 08:38:00  Source: igfitidea

Java String split with multicharacter delimiter

Tags: java, string, split, character

Asked by David Homes

I'm fairly new to Java and I thought this worked the same as with other languages.

For a string:

String line = "3::Daniel::Louis||##2::Leon: the Professional::1994||6::Jean::Reno||7::Gary::Oldman||8::Natalie::Portman||##3::Scarface::1983||9::Al::Pacino||10::Michelle::Pfeiffer";

I want to split it at every ||##.

But:

for (String s : line.split("||##")) {
    System.out.println("|" + s + "|");
}

returns:

||
|3|
|:|
|:|
|D|
|a|
|n|
|i|

... etc.

I was expecting:

3::Daniel::Louis

Leon: the Professional

... etc.

What am I doing wrong?

Answered by gtgaxiola

You have to escape the | character, since it is the regex metacharacter for logical OR.

So I would use

line.split("\\|\\|##")

Note that you have to escape the backslash as well, which is why I use

\\|

instead of

\|

to escape that metacharacter.
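
As a minimal sketch of the escaped form applied to a shortened version of the string from the question: the class name EscapedSplitDemo and the helper splitOnDelimiter are illustrative names that do not appear in the original answers.

```java
public class EscapedSplitDemo {
    // Splits on the literal text "||##". Each | is escaped for the regex engine,
    // and each backslash is doubled because it sits inside a Java string literal.
    static String[] splitOnDelimiter(String line) {
        return line.split("\\|\\|##");
    }

    public static void main(String[] args) {
        // Shortened sample of the input from the question
        String line = "3::Daniel::Louis||##2::Leon: the Professional::1994||6::Jean::Reno";
        for (String s : splitOnDelimiter(line)) {
            System.out.println("|" + s + "|");
        }
    }
}
```

A lone "||" that is not followed by "##" (as in "1994||6") is left untouched, because the whole four-character sequence has to match.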

Answered by Clyde

public String[] split(String regex) 

Answered by paulsm4

It sounds like you want something like this:

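import java.util.regex.Pattern;
// Pattern.LITERAL treats the pattern as plain text, so "|" needs no escaping here
Pattern p = Pattern.compile("||##", Pattern.LITERAL);
String[] result = p.split(myString);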

I know you can have multiple characters in your delimiter, and that you can exclude your delimiter from the output strings.

I don't know if the example above will work exactly for your scenario; you might have to experiment a bit (for example, by "escaping" regex "metacharacters" with "\").

See the Javadoc for Pattern.compile and the documentation on Java regex syntax for more information.
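
One way to experiment with this idea is a precompiled literal pattern. The sketch below assumes the delimiter is written without any backslashes, because Pattern.LITERAL matches every pattern character as-is; the class name LiteralPatternDemo and the helper splitLiteral are made up for illustration.

```java
import java.util.regex.Pattern;

public class LiteralPatternDemo {
    // Compiled once and reused. With Pattern.LITERAL, | and # have no special meaning.
    private static final Pattern DELIMITER = Pattern.compile("||##", Pattern.LITERAL);

    static String[] splitLiteral(String input) {
        return DELIMITER.split(input);
    }

    public static void main(String[] args) {
        String line = "3::Daniel::Louis||##2::Leon: the Professional::1994";
        for (String s : splitLiteral(line)) {
            System.out.println("|" + s + "|");
        }
    }
}
```

Compiling the pattern once can be worthwhile when many lines are split with the same delimiter, since String.split would otherwise process the regex on each call.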

Answered by skiller3

Gilberto's solution will work just fine in this case, but you might want to check out Guava. It has a lot of very useful utility classes, including a string Splitter. With it you could write:

Iterable<String> frags = Splitter.on("||##").split(line);
// Do whatever with the iterable...maybe you just want a list?
// List<String> fragList = Lists.newArrayList(frags);

Answered by pb2q

You need to escape the bars: | is a special character in a regex.

Use:

for (String s : line.split("\\|\\|##")) {

Alternatively, you can use \Q and \E to force the entire pattern to be treated literally:

for (String s : line.split("\\Q||##\\E")) {

This is probably the same pattern that you'll get from Pattern.quote.
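
A small sketch using Pattern.quote, which avoids writing the quoting by hand. The class name QuoteDemo and the helper splitQuoted are illustrative; the exact text that Pattern.quote returns is an implementation detail, so the code prints it rather than assuming it.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Pattern.quote returns a regex that matches the given text literally.
    static String[] splitQuoted(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // Inspect the quoted form produced by the running JDK
        System.out.println(Pattern.quote("||##"));

        String line = "3::Daniel::Louis||##2::Leon: the Professional::1994";
        for (String s : splitQuoted(line, "||##")) {
            System.out.println("|" + s + "|");
        }
    }
}
```

This form is convenient when the delimiter comes from a variable or user input, where escaping each metacharacter manually is error-prone.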

| lets you specify alternative patterns in a regex. Your regex is equivalent to |##, that is: nothing OR ##. The empty alternative matches everywhere, so the string is split between every character in the input.

See the Javadoc for Pattern.
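
To see the empty alternative at work, the following sketch counts how many matches of a regex actually consume characters. The class name AlternationDemo and the helper countNonEmptyMatches are hypothetical names introduced for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationDemo {
    // Counts the matches of regex in input that consume at least one character.
    static int countNonEmptyMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            if (m.end() > m.start()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a||##b";
        // Unescaped: the empty first alternative wins at every position
        System.out.println(countNonEmptyMatches("||##", input));
        // Escaped: the single match is the full "||##" delimiter
        System.out.println(countNonEmptyMatches("\\|\\|##", input));
    }
}
```

With the unescaped regex every match is zero-length, which is why split cuts the input at every character boundary instead of at the delimiter.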

Answered by Reimeus

You should escape your | characters:

for (String s : line.split("\\|\\|##"))

Answered by srini.venigalla

You have to escape the '|' like this: \\|