Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/21575452/

Date: 2020-08-13 09:27:27  Source: igfitidea

Java String's split method ignores empty substrings

Tags: java, regex, arrays, string, split

Asked by Sachin Verma

It occurred to me today that the behavior of Java's String.split() is very strange.

Actually, I want to split the string "aa,bb,cc,dd,,,ee" into an array with .split(","), which gives me a String array ["aa","bb","cc","dd","","","ee"] of length 7.

But when I try to split the string "aa,bb,cc,dd,,,," into an array, I get an array of length 4, i.e. only ["aa","bb","cc","dd"], with all the trailing empty strings dropped.

I want a procedure that splits a string like "aa,bb,cc,dd,,,," into the array ["aa","bb","cc","dd","","",""].

Is this possible with the java.lang.String API? Thanks in advance.

Accepted answer by nhahtdh

Use String.split(String regex, int limit) with a negative limit (e.g. -1).

"aa,bb,cc,dd,,,,".split(",", -1)

When String.split(String regex) is called, it is called with limit = 0, which removes all trailing empty strings from the array (in most cases, see below).
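As a minimal sketch of the difference, the following compares the one-argument form with a negative limit on the string from the question. The class name SplitTrailingDemo and its two helper methods are made up for illustration; only String.split itself comes from the JDK.

```java
import java.util.Arrays;

class SplitTrailingDemo {
    // Hypothetical helper: the one-argument form, which behaves like limit = 0.
    static String[] splitDefault(String input) {
        return input.split(",");
    }

    // Hypothetical helper: a negative limit keeps trailing empty strings.
    static String[] splitKeepTrailing(String input) {
        return input.split(",", -1);
    }

    public static void main(String[] args) {
        String input = "aa,bb,cc,dd,,,,";
        // Trailing empty strings are dropped: [aa, bb, cc, dd]
        System.out.println(Arrays.toString(splitDefault(input)));
        System.out.println(splitDefault(input).length);      // 4
        // All four trailing empty strings are kept.
        System.out.println(splitKeepTrailing(input).length); // 8
    }
}
```

Note that the input ends with four commas, so the negative-limit result has four trailing empty strings (length 8), one more than the array written out in the question.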

The actual behavior of String.split(String regex) is quite confusing:

  • Splitting an empty string always results in an array of length 1 containing the empty string.
  • Splitting ";" or ";;;" with regex being ";" results in an empty array. For a non-empty string, all trailing empty strings are removed from the array.

The behavior above can be observed from at least Java 5 to Java 8.
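The two edge cases listed above can be checked with a short sketch. The class name SplitEdgeCases and the helper countParts are hypothetical names introduced here; the calls to String.split are the standard ones.

```java
class SplitEdgeCases {
    // Hypothetical helper: number of parts produced by the one-argument split.
    static int countParts(String input) {
        return input.split(";").length;
    }

    public static void main(String[] args) {
        // No delimiter is found, so the result is the original (empty) string.
        System.out.println(countParts(""));    // 1
        // Every part is an empty string, so all of them are removed.
        System.out.println(countParts(";"));   // 0
        System.out.println(countParts(";;;")); // 0
        // With a negative limit nothing is removed.
        System.out.println(";;;".split(";", -1).length); // 4
    }
}
```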

There was an attempt in JDK-6559590 to change the behavior so that splitting an empty string returns an empty array. However, it was soon reverted in JDK-8028321 when it caused regressions in various places. The change never made it into the initial Java 8 release.

Answer by Maroun

You can use public String[] split(String regex, int limit):

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

import java.util.Arrays;

String st = "aa,bb,cc,dd,,,,";
// The negative limit (-1) keeps the trailing empty strings.
System.out.println(Arrays.deepToString(st.split(",", -1)));

Prints:

[aa, bb, cc, dd, , , , ]
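To see the three cases from the quoted documentation side by side (positive, zero, and negative limit), here is a small sketch. The input "a,b,c,," and the class name SplitLimitCases are chosen only for illustration.

```java
import java.util.Arrays;

class SplitLimitCases {
    // Hypothetical helper: split on commas with the given limit.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";
        // Positive limit: the pattern is applied at most once, and the rest stays in the last entry.
        System.out.println(Arrays.toString(splitWithLimit(input, 2)));  // [a, b,c,,]
        // Zero: split as often as possible, then discard trailing empty strings.
        System.out.println(Arrays.toString(splitWithLimit(input, 0)));  // [a, b, c]
        // Negative: split as often as possible and keep everything.
        System.out.println(splitWithLimit(input, -1).length);           // 5
    }
}
```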