Split string to equal length substrings in Java

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same CC BY-SA license and attribute it to the original authors (not me) on StackOverflow. Original question: http://stackoverflow.com/questions/3760152/

Time: 2020-08-14 04:32:40  Source: igfitidea

Split string to equal length substrings in Java

java regex string split

Asked by Emil

How can I split the string "Thequickbrownfoxjumps" into substrings of equal size in Java? E.g., "Thequickbrownfoxjumps" split into pieces of size 4 should give the output:

["Theq","uick","brow","nfox","jump","s"]

Similar Question:

Split string into equal-length substrings in Scala

Accepted answer by Alan Moore

Here's the regex one-liner version:

System.out.println(Arrays.toString(
    "Thequickbrownfoxjumps".split("(?<=\\G.{4})")
));

\G is a zero-width assertion that matches the position where the previous match ended. If there was no previous match, it matches the beginning of the input, the same as \A. The enclosing lookbehind matches the position that's four characters along from the end of the last match.

Both lookbehind and \G are advanced regex features, not supported by all flavors. Furthermore, \G is not implemented consistently across the flavors that do support it. This trick will work (for example) in Java, Perl, .NET and JGSoft, but not in PHP (PCRE), Ruby 1.9+ or TextMate (both Oniguruma). JavaScript's /y (sticky flag) isn't as flexible as \G, and couldn't be used this way even if JS did support lookbehind.

I should mention that I don't necessarily recommend this solution if you have other options. The non-regex solutions in the other answers may be longer, but they're also self-documenting; this one's just about the opposite of that. ;)

Also, this doesn't work on Android, which doesn't support the use of \G in lookbehinds.
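
If the chunk size varies, the pattern can be built from a parameter. The following is a minimal sketch of that idea, not part of the original answer: the class name RegexChunker and the helper splitByRegex are hypothetical names chosen for illustration, and the size check is an added assumption.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class RegexChunker {
    // Hypothetical helper: builds the lookbehind pattern for a given chunk size.
    static String[] splitByRegex(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        // In Java source the backslash is doubled: "\\G" is the regex \G.
        Pattern pattern = Pattern.compile("(?<=\\G.{" + size + "})");
        return pattern.split(text);
    }

    public static void main(String[] args) {
        // Prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(Arrays.toString(splitByRegex("Thequickbrownfoxjumps", 4)));
    }
}
```

Because "." does not match line terminators by default, this sketch only behaves as expected on input without line breaks.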

Answer by pakore

You can use substring from the String class (handling exceptions yourself) or from Apache Commons Lang (it handles exceptions for you):

static String substring(String str, int start, int end)

Put it inside a loop and you are good to go.
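
A minimal sketch of such a loop, using only the JDK's String.substring so that it runs without extra dependencies. The class name SubstringLoop and the method chunk are hypothetical names, and the size check is an added assumption. The end index is clamped because String.substring throws StringIndexOutOfBoundsException when the end index exceeds the string length.

```java
import java.util.ArrayList;
import java.util.List;

class SubstringLoop {
    // Hypothetical helper: cuts str into pieces of at most `size` characters.
    static List<String> chunk(String str, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < str.length(); start += size) {
            // Clamp the end index so the last piece may be shorter than size.
            int end = Math.min(start + size, str.length());
            parts.add(str.substring(start, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        // Prints [Theq, uick, brow, nfox, jump, s]
        System.out.println(chunk("Thequickbrownfoxjumps", 4));
    }
}
```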

Answer by Jon Skeet

Well, it's fairly easy to do this with simple arithmetic and string operations:

public static List<String> splitEqually(String text, int size) {
    // Give the list the right capacity to start with. You could use an array
    // instead if you wanted.
    List<String> ret = new ArrayList<String>((text.length() + size - 1) / size);

    for (int start = 0; start < text.length(); start += size) {
        ret.add(text.substring(start, Math.min(text.length(), start + size)));
    }
    return ret;
}

I don't think it's really worth using a regex for this.

EDIT: My reasoning for not using a regex:

  • This doesn't use any of the real pattern matching of regexes. It's just counting.
  • I suspect the above will be more efficient, although in most cases it won't matter.
  • If you need to use variable sizes in different places, you've either got repetition or a helper function to build the regex itself based on a parameter - ick.
  • The regex provided in another answer firstly didn't compile (invalid escaping), and then didn't work. My code worked first time. That's more a testament to the usability of regexes vs plain code, IMO.

Answer by Grodriguez

public String[] splitInParts(String s, int partLength)
{
    int len = s.length();

    // Number of parts
    int nparts = (len + partLength - 1) / partLength;
    String parts[] = new String[nparts];

    // Break into parts
    int offset= 0;
    int i = 0;
    while (i < nparts)
    {
        parts[i] = s.substring(offset, Math.min(offset + partLength, len));
        offset += partLength;
        i++;
    }

    return parts;
}

Answer by Saul

public static String[] split(String src, int len) {
    String[] result = new String[(int)Math.ceil((double)src.length()/(double)len)];
    for (int i=0; i<result.length; i++)
        result[i] = src.substring(i*len, Math.min(src.length(), (i+1)*len));
    return result;
}

Answer by Sean Patrick Floyd

This is very easy with Google Guava:

for(final String token :
    Splitter
        .fixedLength(4)
        .split("Thequickbrownfoxjumps")){
    System.out.println(token);
}

Output:

Theq
uick
brow
nfox
jump
s

Or if you need the result as an array, you can use this code:

String[] tokens =
    Iterables.toArray(
        Splitter
            .fixedLength(4)
            .split("Thequickbrownfoxjumps"),
        String.class
    );

Note: Splitter construction is shown inline above, but since Splitters are immutable and reusable, it's a good practice to store them in constants:

private static final Splitter FOUR_LETTERS = Splitter.fixedLength(4);

// more code

for(final String token : FOUR_LETTERS.split("Thequickbrownfoxjumps")){
    System.out.println(token);
}

Answer by Cowan

If you're using Google's guava general-purpose libraries (and quite honestly, any new Java project probably should be), this is insanely trivial with the Splitter class:

for (String substring : Splitter.fixedLength(4).split(inputString)) {
    doSomethingWith(substring);
}

and that's it. Easy as!

Answer by Ravichandra

import java.util.Arrays;
import java.util.Scanner;

public class string123 {

    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter String");
        String r = sc.nextLine();
        int len = r.length();
        System.out.println("Enter length Of Sub-string");
        int l = sc.nextInt();
        // Size the array to the number of pieces (rounded up) rather than a
        // fixed 10, which overflows on long input and leaves nulls on short input.
        String[] s = new String[(len + l - 1) / l];
        int f = 0;
        for (int i = 0; i < s.length; i++) {
            // The last piece may be shorter than l, so clamp the end index.
            int last = Math.min(f + l, len);
            s[i] = r.substring(f, last);
            f = last;
        }
        System.out.print(Arrays.toString(s));
    }
}

Result

 Enter String
 Thequickbrownfoxjumps
 Enter length Of Sub-string
 4

 [Theq, uick, brow, nfox, jump, s]

Answer by joensson

In a comment on the accepted solution, I asked @Alan Moore how strings with newlines could be handled. He suggested using DOTALL.

Using his suggestion, I created a small sample of how that works:

public void regexDotAllExample() throws UnsupportedEncodingException {
    final String input = "The\nquick\nbrown\r\nfox\rjumps";
    final String regex = "(?<=\\G.{4})";

    Pattern splitByLengthPattern;
    String[] split;

    splitByLengthPattern = Pattern.compile(regex);
    split = splitByLengthPattern.split(input);
    System.out.println("---- Without DOTALL ----");
    for (int i = 0; i < split.length; i++) {
        byte[] s = split[i].getBytes("utf-8");
        System.out.println("[Idx: "+i+", length: "+s.length+"] - " + s);
    }
    /* Output is a single entry longer than the desired split size:
    ---- Without DOTALL ----
    [Idx: 0, length: 26] - [B@17cdc4a5
     */


    //DOTALL suggested in Alan Moores comment on SO: https://stackoverflow.com/a/3761521/1237974
    splitByLengthPattern = Pattern.compile(regex, Pattern.DOTALL);
    split = splitByLengthPattern.split(input);
    System.out.println("---- With DOTALL ----");
    for (int i = 0; i < split.length; i++) {
        byte[] s = split[i].getBytes("utf-8");
        System.out.println("[Idx: "+i+", length: "+s.length+"] - " + s);
    }
    /* Output is as desired 7 entries with each entry having a max length of 4:
    ---- With DOTALL ----
    [Idx: 0, length: 4] - [B@77b22abc
    [Idx: 1, length: 4] - [B@5213da08
    [Idx: 2, length: 4] - [B@154f6d51
    [Idx: 3, length: 4] - [B@1191ebc5
    [Idx: 4, length: 4] - [B@30ddb86
    [Idx: 5, length: 4] - [B@2c73bfb
    [Idx: 6, length: 2] - [B@6632dd29
     */

}
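
The sample above prints byte-array identities rather than the chunk contents. As a rough sketch, the same comparison can be made with the line breaks shown as visible escapes. The class name DotallChunks and the helpers split and visible are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

class DotallChunks {
    // Hypothetical helper: shows CR and LF as visible escape sequences.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    // Splits into 4-character pieces using the given Pattern flags.
    static String[] split(String input, int flags) {
        return Pattern.compile("(?<=\\G.{4})", flags).split(input);
    }

    public static void main(String[] args) {
        String input = "The\nquick\nbrown\r\nfox\rjumps";

        // Without DOTALL, "." stops at line terminators: one unsplit entry.
        System.out.println(split(input, 0).length);

        // With DOTALL, "." also matches line terminators: seven entries.
        for (String part : split(input, Pattern.DOTALL)) {
            System.out.println("[" + visible(part) + "]");
        }
    }
}
```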

But I also like @Jon Skeet's solution in https://stackoverflow.com/a/3760193/1237974. For maintainability in larger projects, where not everyone is equally experienced with regular expressions, I would probably use Jon's solution.

Answer by Hubbly

Another brute-force solution could be:

    String input = "thequickbrownfoxjumps";
    // Round up so the trailing partial piece ("s") is not dropped.
    int n = (input.length() + 3) / 4;
    String[] num = new String[n];

    for (int i = 0, x = 0; i < n; i++, x += 4) {
        // Clamp the end index so the last piece may be shorter than 4.
        num[i] = input.substring(x, Math.min(x + 4, input.length()));
        System.out.println(num[i]);
    }

The code simply steps through the string, taking one substring at a time.