Split large string in n-size chunks in JavaScript

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/7033639/

Time: 2020-08-24 00:21:19. Source: igfitidea

Tags: javascript, regex, string, split

Asked by tribe84

I would like to split a very large string (let's say, 10,000 characters) into N-size chunks.

What would be the best way in terms of performance to do this?

For instance: "1234567890" split by 2 would become ["12", "34", "56", "78", "90"].

Would something like this be possible using String.prototype.match, and if so, would that be the best way to do it in terms of performance?

Answered by Vivin Paliath

You can do something like this:

"1234567890".match(/.{1,2}/g);
// Results in:
["12", "34", "56", "78", "90"]

The method will still work with strings whose size is not an exact multiple of the chunk-size:

"123456789".match(/.{1,2}/g);
// Results in:
["12", "34", "56", "78", "9"]

In general, for any string out of which you want to extract at-most n-sized substrings, you would do:

str.match(/.{1,n}/g); // Replace n with the size of the substring

If your string can contain newlines or carriage returns, you would do:

str.match(/(.|[\r\n]){1,n}/g); // Replace n with the size of the substring
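On engines that support the ES2018 `s` (dotAll) flag, the dot also matches line terminators, so the alternation above is not needed. A minimal sketch, assuming such an engine; the name `chunkDotAll` is made up for illustration and is not part of the answer:

```javascript
// Hypothetical helper: chunk a string that may contain \r or \n.
function chunkDotAll(str, n) {
  // 's' lets '.' match line terminators; 'g' collects every match.
  const pattern = new RegExp('.{1,' + n + '}', 'gs');
  // Fall back to an empty array when nothing matches (empty input).
  return str.match(pattern) || [];
}

console.log(chunkDotAll('ab\ncd\re', 2));
```

Building the pattern with `new RegExp` avoids having to edit the literal each time the chunk size changes.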

As far as performance, I tried this out with approximately 10k characters and it took a little over a second on Chrome. YMMV.

This can also be used in a reusable function:

function chunkString(str, length) {
  return str.match(new RegExp('.{1,' + length + '}', 'g'));
}
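
One thing to watch for: with the `g` flag, `String.prototype.match` returns `null` rather than an empty array when nothing matches, so `chunkString('', 2)` yields `null`. A hedged variant follows; the name `chunkStringSafe` and the argument check are additions for illustration, not part of the answer:

```javascript
// Hypothetical wrapper around the regex approach above.
function chunkStringSafe(str, length) {
  // Reject sizes that would build an invalid or meaningless quantifier.
  if (!Number.isInteger(length) || length < 1) {
    throw new RangeError('length must be a positive integer');
  }
  // Normalize the "no match" case to an empty array.
  return str.match(new RegExp('.{1,' + length + '}', 'g')) || [];
}

console.log(chunkStringSafe('123456789', 2)); // [ '12', '34', '56', '78', '9' ]
console.log(chunkStringSafe('', 2));          // []
```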

Answered by Justin Warkentin

I created several faster variants which you can see on jsPerf. My favorite one is this:

function chunkSubstr(str, size) {
  const numChunks = Math.ceil(str.length / size)
  const chunks = new Array(numChunks)

  for (let i = 0, o = 0; i < numChunks; ++i, o += size) {
    chunks[i] = str.substr(o, size)
  }

  return chunks
}
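
`String.prototype.substr` is marked as deprecated in current documentation (it is specified in the legacy Annex B of the language standard). The same loop can be written with `slice`. This is a sketch under a hypothetical name, `chunkSlice`, and is not one of the benchmarked variants:

```javascript
// Hypothetical slice-based equivalent of chunkSubstr.
function chunkSlice(str, size) {
  const numChunks = Math.ceil(str.length / size);
  const chunks = new Array(numChunks);

  for (let i = 0, o = 0; i < numChunks; ++i, o += size) {
    // slice takes (start, end), whereas substr takes (start, length).
    chunks[i] = str.slice(o, o + size);
  }

  return chunks;
}

console.log(chunkSlice('1234567890', 3)); // [ '123', '456', '789', '0' ]
```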

Answered by Tgr

Bottom line:

  • match is very inefficient, slice is better, and on Firefox substr/substring is better still
  • match is even more inefficient for short strings (even with a cached regex, probably due to regex parsing setup time)
  • match is even more inefficient for large chunk sizes (probably due to its inability to "jump")
  • for longer strings with a very small chunk size, match outperforms slice on older IE but still loses on all other systems
  • jsperf rocks
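
Relative numbers like these depend heavily on the engine and its version, so it is worth measuring in your own environment. A rough timing sketch follows; the helper names (`viaMatch`, `viaSlice`, `time`) and the iteration count are arbitrary choices, and a loop like this is far less rigorous than a proper benchmark harness:

```javascript
// Regex-based chunking, as in the match answers.
function viaMatch(str, size) {
  return str.match(new RegExp('[\\s\\S]{1,' + size + '}', 'g')) || [];
}

// Index-based chunking with slice.
function viaSlice(str, size) {
  const out = [];
  for (let i = 0; i < str.length; i += size) {
    out.push(str.slice(i, i + size));
  }
  return out;
}

// Run fn repeatedly and print the elapsed wall-clock time.
function time(label, fn, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn();
  console.log(label + ': ' + (Date.now() - start) + ' ms for ' + iterations + ' runs');
}

const input = 'helloworld'.repeat(1000); // 10,000 characters
time('match', () => viaMatch(input, 3), 200);
time('slice', () => viaSlice(input, 3), 200);
```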

Answered by Thank you

This is a fast and straightforward solution -

function chunkString (str, len) {
  const size = Math.ceil(str.length/len)
  const r = Array(size)
  let offset = 0
  
  for (let i = 0; i < size; i++) {
    r[i] = str.substr(offset, len)
    offset += len
  }
  
  return r
}

console.log(chunkString("helloworld", 3))
// => [ "hel", "low", "orl", "d" ]

// 10,000 char string
const bigString = "helloworld".repeat(1000)
console.time("perf")
const result = chunkString(bigString, 3)
console.timeEnd("perf")
console.log(result)
// => perf: 0.385 ms
// => [ "hel", "low", "orl", "dhe", "llo", "wor", ... ]

Answered by Fozi

Surprise! You can use split to split.

var parts = "1234567890 ".split(/(.{2})/).filter(O=>O)

Results in [ '12', '34', '56', '78', '90', ' ' ]
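
This works because `split` includes captured groups in its output: each full chunk is a capture, the empty strings between adjacent matches are dropped by the filter, and any short remainder is left over as the final element. A sketch of a generalized version follows; `chunkBySplit` is a made-up name, and `[\s\S]` is swapped in for `.` so that newlines are chunked too:

```javascript
// Hypothetical generalization of the split trick to any chunk size.
function chunkBySplit(str, size) {
  // The capture group makes split() keep each matched chunk in the result.
  const pattern = new RegExp('([\\s\\S]{' + size + '})');
  // Remove the empty strings that split() emits between adjacent matches.
  return str.split(pattern).filter((part) => part !== '');
}

console.log(chunkBySplit('123456789', 2)); // [ '12', '34', '56', '78', '9' ]
```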

Answered by FishBasketGordo

var str = "123456789";
var chunks = [];
var chunkSize = 2;

while (str) {
    if (str.length < chunkSize) {
        chunks.push(str);
        break;
    }
    else {
        chunks.push(str.substr(0, chunkSize));
        str = str.substr(chunkSize);
    }
}

alert(chunks); // chunks == 12,34,56,78,9

Answered by Egon Schmid

I have written an extended function in which the chunk length can also be an array of numbers, like [1,3]:

String.prototype.chunkString = function(len) {
    var _ret;
    if (this.length < 1) {
        return [];
    }
    if (typeof len === 'number' && len > 0) {
        var _size = Math.ceil(this.length / len), _offset = 0;
        _ret = new Array(_size);
        for (var _i = 0; _i < _size; _i++) {
            _ret[_i] = this.substring(_offset, _offset = _offset + len);
        }
    }
    else if (typeof len === 'object' && len.length) {
        var n = 0, l = this.length, chunk, that = this;
        _ret = [];
        do {
            len.forEach(function(o) {
                chunk = that.substring(n, n + o);
                if (chunk !== '') {
                    _ret.push(chunk);
                    n += chunk.length;
                }
            });
            if (n === 0) {
                return undefined; // prevent an endless loop when len = [0]
            }
        } while (n < l);
    }
    return _ret;
};

The code

"1234567890123".chunkString([1,3])

will return:

[ '1', '234', '5', '678', '9', '012', '3' ]
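
If extending `String.prototype` is undesirable, the same cycling idea can be written as a standalone function. This is a simplified sketch, not the answer's code: the name `chunkByPattern` is hypothetical, it handles only the array case, and it returns an empty array (rather than `undefined`) when no usable size is given:

```javascript
// Hypothetical standalone version: cycle through the given chunk sizes.
function chunkByPattern(str, sizes) {
  // Keep only positive integers so the offset always advances.
  const valid = sizes.filter((n) => Number.isInteger(n) && n > 0);
  if (valid.length === 0) return [];

  const chunks = [];
  let offset = 0;
  for (let i = 0; offset < str.length; i++) {
    const size = valid[i % valid.length];
    chunks.push(str.slice(offset, offset + size));
    offset += size;
  }
  return chunks;
}

console.log(chunkByPattern('1234567890123', [1, 3]));
// [ '1', '234', '5', '678', '9', '012', '3' ]
```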

Answered by Haseeb

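This splits a large string into smaller strings at word boundaries. Note that the `words` argument is compared against the character length of the chunk being built, not against a word count.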

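function chunkSubstr(str, words) {
  var parts = str.split(" "), values = [], tmpVar = "";
  // Array.prototype.forEach replaces jQuery's $.each, so no library is needed.
  parts.forEach(function(value) {
      if (tmpVar.length < words) {
          // Keep appending words until the chunk reaches the length limit.
          tmpVar += " " + value;
      } else {
          // Collapse repeated whitespace and drop the leading space.
          values.push(tmpVar.replace(/\s+/g, " ").trim());
          tmpVar = value;
      }
  });
  // Always flush the last chunk; the original kept it only when no
  // other chunk had been produced, so trailing text was lost.
  if (tmpVar.trim().length > 0) {
      values.push(tmpVar.replace(/\s+/g, " ").trim());
  }
  return values;
}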

Answered by Poetro

var l = str.length, lc = 0, chunks = [], c = 0, chunkSize = 2;
for (; lc < l; c++) {
  chunks[c] = str.slice(lc, lc += chunkSize);
}

Answered by alex

I would use a regex...

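var chunkStr = function(str, chunkLength) {
    // Backslashes must be doubled inside a string literal; '[\s\S]' would
    // become '[sS]' and match only the letters "s" and "S".
    return str.match(new RegExp('[\\s\\S]{1,' + +chunkLength + '}', 'g'));
};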