JavaScript split string with .match(regex)

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/37838532/

Time: 2020-08-23 20:47:06  Source: igfitidea

javascript regex split

Asked by Audite Marlow

From the Mozilla Developer Network documentation for split():

The split() method returns the new array.

When found, separator is removed from the string and the substrings are returned in an array. If separator is not found or is omitted, the array contains one element consisting of the entire string. If separator is an empty string, str is converted to an array of characters.

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array. However, not all browsers support this capability.
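The cases described in the quoted documentation can be checked with a short sketch. The sample strings and variable names below are illustrative assumptions, not part of the documentation.

```javascript
// Separator found: it is removed and the pieces are returned.
const pieces = 'a,b,c'.split(',');        // ["a", "b", "c"]

// Separator not found, or omitted: one element holding the whole string.
const notFound = 'a,b,c'.split(';');      // ["a,b,c"]
const omitted = 'a,b,c'.split();          // ["a,b,c"]

// Empty-string separator: the string becomes an array of characters.
const chars = 'abc'.split('');            // ["a", "b", "c"]

// Capturing parentheses: each capture, including undefined ones,
// is spliced into the output next to the pieces.
const spliced = 'a1b'.split(/(\d)|(x)/);  // ["a", "1", undefined, "b"]

console.log(pieces, notFound, omitted, chars, spliced);
```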

Take the following example:

var string1 = 'one, two, three, four';
var splitString1 = string1.split(', ');
console.log(splitString1); // Outputs ["one", "two", "three", "four"]

This is a really clean approach. I tried the same with a regular expression and a somewhat different string:

var string2 = 'one split two split three split four';
var splitString2 = string2.split(/\ split\ /);
console.log(splitString2); // Outputs ["one", "two", "three", "four"]

This works just as well as the first example. In the following example, I have altered the string once more, with 3 different delimiters:

var string3 = 'one split two splat three splot four';
var splitString3 = string3.split(/\ split\ |\ splat\ |\ splot\ /);
console.log(splitString3); // Outputs ["one", "two", "three", "four"]

However, the regular expression is getting rather messy at this point. I can group the different delimiters, but the result will then include them:

var string4 = 'one split two splat three splot four';
var splitString4 = string4.split(/\ (split|splat|splot)\ /);
console.log(splitString4); // Outputs ["one", "split", "two", "splat", "three", "splot", "four"]

So I tried removing the spaces from the regular expression while keeping the group, but to little avail:

var string5 = 'one split two splat three splot four';
var splitString5 = string5.split(/(split|splat|splot)/);
console.log(splitString5); // Outputs ["one ", "split", " two ", "splat", " three ", "splot", " four"]

However, when I remove the parentheses from the regular expression, the delimiters disappear from the result, although the surrounding spaces remain:

var string6 = 'one split two splat three splot four';
var splitString6 = string6.split(/split|splat|splot/);
console.log(splitString6); // Outputs ["one ", " two ", " three ", " four"]

An alternative would be to use match() to filter out the delimiters, except I don't really understand how negative lookaheads work:

var string7 = 'one split two split three split four';
var splitString7 = string7.match(/((?!split).)*/g);
console.log(splitString7); // Outputs ["one ", "", "plit two ", "", "plit three ", "", "plit four", ""]

It doesn't match the whole word to begin with. And to be honest, I don't even know what's going on here exactly.
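One way to see what this pattern is doing is to print where each match starts. The sketch below uses matchAll() for that; the variable names are illustrative. The lookahead only refuses to consume a character at a position where "split" begins, so the pattern stops just before each "s", produces an empty match there, and then resumes one character later at "plit".

```javascript
const text7 = 'one split two split three split four';

// ((?!split).)* consumes characters one at a time, as long as the
// current position is not the start of the word "split".
const trace = [...text7.matchAll(/((?!split).)*/g)]
  .map(m => ({ index: m.index, text: m[0] }));

console.log(trace);
// Start indices: 0, 4, 5, 14, 15, 26, 27, 36
// Each "split" begins at 4, 14 and 26, which is where the empty matches sit.
```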

How do I properly split a string using regular expressions without having the delimiter in my result?

Answered by anubhava

Use a non-capturing group as the split regex. With a non-capturing group, the matched delimiters are not included in the resulting array.

var string4 = 'one split two splat three splot four';
var splitString4 = string4.split(/\s+(?:split|splat|splot)\s+/);
console.log(splitString4);

// Output => ["one", "two", "three", "four"]
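If the delimiter words are only known at run time, the same idea can be wrapped in a small helper. This is a minimal sketch, not part of the answer: splitOnWords is a hypothetical name, and it assumes the delimiters are whole words surrounded by whitespace.

```javascript
// Hypothetical helper (not from the answer): split `text` on any of the
// given delimiter words, swallowing the whitespace around each one.
function splitOnWords(text, words) {
  // Escape regex metacharacters so each word is matched literally.
  const escaped = words.map(w => w.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  // (?:...) groups the alternatives without capturing them.
  const separator = new RegExp('\\s+(?:' + escaped.join('|') + ')\\s+');
  return text.split(separator);
}

const input = 'one split two splat three splot four';
const withoutDelimiters = splitOnWords(input, ['split', 'splat', 'splot']);
console.log(withoutDelimiters);
// ["one", "two", "three", "four"]

// For comparison, a capturing group splices the delimiters back in.
const withDelimiters = input.split(/\s+(split|splat|splot)\s+/);
console.log(withDelimiters);
// ["one", "split", "two", "splat", "three", "splot", "four"]
```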

Answered by nu11p01n73R

If you want to use match, you can write it like this:

'one split two split three split four'.match(/(\b(?!split\b)[^ $]+\b)/g)
// Output => ["one", "two", "three", "four"]

What does it do?

  • \b matches a word boundary.

  • (?!split\b) is a negative lookahead; it checks that the word starting here is not split.

  • [^ $]+ matches one or more characters other than a space or a literal $ (inside a character class, $ does not mean end of string). This part matches a word, and the lookahead ensures that the word is not split.

  • \b matches the end of the word.
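The same match() idea can be extended to the three delimiters from the question. The sketch below is an adaptation, not the answer's own code: it assumes \S+ (any run of non-whitespace) in place of [^ $]+ and drops the trailing \b, and the variable names are illustrative. It also shows that $ inside a character class is an ordinary dollar sign.

```javascript
const sentence = 'one split two splat three splot four';

// \b                          start at a word boundary
// (?!(?:split|splat|splot)\b) refuse to start on a delimiter word
// \S+                         assumption: stands in for the answer's [^ $]+
const wordsOnly = sentence.match(/\b(?!(?:split|splat|splot)\b)\S+/g);
console.log(wordsOnly); // ["one", "two", "three", "four"]

// Inside [...] the $ is a literal character, so it is excluded like the space.
const noDollar = 'cost $5'.match(/[^ $]+/g);
console.log(noDollar); // ["cost", "5"]

// match() collects words, split() collects segments between delimiters.
const phrase = 'red apple split green pear';
const byMatch = phrase.match(/\b(?!split\b)\S+/g); // ["red", "apple", "green", "pear"]
const bySplit = phrase.split(/\s+(?:split)\s+/);   // ["red apple", "green pear"]
console.log(byMatch, bySplit);
```

Note the difference in the last two lines: when a segment contains more than one word, match() returns each word separately, so split() with a non-capturing group is usually the closer fit for the original question.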
