Regex match for beginning of multiple words in string
Original question: http://stackoverflow.com/questions/3507453/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Abadaba
In JavaScript I want to be able to match strings that begin with a certain phrase. However, I want it to match the start of any word in the phrase, not just the beginning of the phrase.
For example:
Phrase: "This is the best"
Need to match: "th"
Result: matches "Th" and "th"
EDIT: \b works great; however, it raises another issue:
It will also match characters that follow foreign (non-ASCII) letters. For example, if my string is "Männ" and I search for "n", it will match the n after "Mä". Any ideas?
Answer by Peter Ajtai
"This is the best moth".match(/\bth/gi);
Or with a variable for your string:
var string = "This is the best moth";
alert(string.match(/\bth/gi));
\b in a regex is a word boundary, so \bth will only match a th that is at the beginning of a word.
gi is for a global match (look for all occurrences) and case-insensitive matching.
(I threw moth in there as a reminder to check that it is not matched.)
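When the search term comes from a variable rather than a literal, the same \b idea can be wrapped in a small function. The sketch below is only an illustration: matchWordStarts and escapeRegExp are hypothetical helper names that do not appear in the original answer, and it assumes the term begins with a letter, digit, or underscore so that \b behaves as described.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return every occurrence of `prefix` found at the start of a word.
function matchWordStarts(text, prefix) {
  // The backslash is doubled because the pattern is built from a string.
  const re = new RegExp("\\b" + escapeRegExp(prefix), "gi");
  // match() returns null when nothing is found, so fall back to an empty array.
  return text.match(re) || [];
}

console.log(matchWordStarts("This is the best moth", "th")); // ["Th", "th"]
```

The th inside moth is skipped because it is preceded by a word character, so there is no word boundary in front of it.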
Edit:
So, the above only returns the part that you match (th). If you want to return the entire words, you have to match the entire word.
This is where things get tricky fast. First, with no HTML entity letters:
string.match(/\bth[^\b]*?\b/gi);
To match the entire word, start at the word boundary \b, grab the th, and then lazily consume characters with [^\b]*? until you reach the next word boundary \b. Note that inside a character class \b means the backspace character, not a word boundary, so [^\b] matches any character except backspace. The * means you want 0 or more of the previous item, and the ? makes the match lazy. In other words, it doesn't expand as far as possible, but stops at the first opportunity, which here is the end of the word.
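As a quick check of the whole-word pattern, the sketch below runs it next to a variant that uses \w* for the rest of the word. The \w* variant is an alternative swapped in here for comparison, not something the original answer proposes; on this sample both give the same result.

```javascript
const text = "This is the best moth";

// Pattern from the answer: lazily consume characters up to the next word boundary.
const lazyWords = text.match(/\bth[^\b]*?\b/gi);
console.log(lazyWords); // ["This", "the"]

// Alternative: after "th", take the remaining word characters directly.
const wordCharWords = text.match(/\bth\w*/gi);
console.log(wordCharWords); // ["This", "the"]

// Inside a character class, \b is the backspace character (U+0008).
console.log(/[\b]/.test("\u0008")); // true
console.log(/[^\b]/.test("\u0008")); // false
```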
If you have HTML entity characters like ä (&auml;), things get complicated really fast, because \b treats only ASCII letters, digits, and the underscore as word characters. You then have to use whitespace, or whitespace plus a set of defined characters that may be at word boundaries.
string.match(/\sth[^\s]*|^th[^\s]*/gi);
Since we're not using word boundaries, we have to take care of the beginning of the string separately (|^).
The above will capture the whitespace at the beginning of words. Using \b will not capture whitespace, since \b has no width.
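The sketch below compares three ways of handling a word such as Männ. It is an illustration under assumptions, not part of the original answer: the sample string is made up, the capture-group variant relies on String.prototype.matchAll (ES2020), and the last variant assumes an engine that supports lookbehind and Unicode property escapes (ES2018).

```javascript
// "M\u00e4nn" is "Männ"; the escape keeps the example independent of file encoding.
const sample = "M\u00e4nn nett";

// \b sees ä as a non-word character, so the "n" right after it counts as a word start.
const withBoundary = sample.match(/\bn/gi);
console.log(withBoundary); // ["n", "n"]

// Whitespace-based: the capture group holds the word without the leading space.
const withWhitespace = Array.from(sample.matchAll(/(?:^|\s)(n\S*)/gi), (m) => m[1]);
console.log(withWhitespace); // ["nett"]

// Unicode-aware: the lookbehind rejects an "n" preceded by any letter, digit, or underscore.
const withLookbehind = sample.match(/(?<![\p{L}\p{N}_])n/giu);
console.log(withLookbehind); // ["n"]
```

The capture group avoids the leading whitespace that the plain \s pattern returns, and the lookbehind variant has no width, much like \b.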
Answer by Michael Robinson
Use this:
string.match(/^th|\sth/gi);
Examples:
'is this is a string'.match(/^th|\sth/gi);
'the string: This is a string'.match(/^th|\sth/gi);
Results:
[" th"]
["th", " Th"]
Answer by Vivin Paliath
var matches = "This is the best".match(/\bth/ig);
returns:
["Th", "th"]
The regular expression means: match "th", ignoring case and globally (meaning, don't stop at just one match), wherever "th" begins at a word boundary, that is, at the start of the string or right after a non-word character such as a space.
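To make the word-boundary behavior concrete, the sketch below contrasts \b with a whitespace-only check on a made-up sample string (the string is an assumption chosen for illustration). \b also fires after punctuation such as a hyphen or parenthesis, while the whitespace version does not.

```javascript
const sample = "so-then (this) the";

// \b matches after "-", "(" and " " because none of them is a word character.
const byBoundary = sample.match(/\bth/gi);
console.log(byBoundary); // ["th", "th", "th"]

// Start-of-string or whitespace only: just the final " the" qualifies.
const byWhitespace = sample.match(/(?:^|\s)th/gi);
console.log(byWhitespace); // [" th"]
```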
Answer by Frxstrem
Use the g flag in the regex. It stands for "global", and it searches for all matches instead of only the first one.
You should also use the i flag for case-insensitive matching.
You add flags to the end of the regex (/<regex>/<flags>) or pass them as the second parameter to new RegExp(pattern, flags).
For instance:
var matches = "This is the best".match(/\bth/gi);
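or, using RegExp objects:
// In a string literal the backslash must be doubled; "\b" alone is a backspace character.
var re = new RegExp("\\bth", "gi");
// match() returns every match; re.exec() would return only one match per call.
var matches = "This is the best".match(re);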
EDIT: Use \b in the regex to match the boundary of a word. Note that it does not match any specific character, but rather the position at the beginning or end of a word (including at the start or end of the string).
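Two details matter when the pattern is built with new RegExp from a string: the backslash in \b has to be doubled, and exec() on a global regex returns one match per call. The sketch below shows both; the variable names are chosen for illustration only.

```javascript
const text = "This is the best";

// "\bth" as a string literal is backspace + "th", so this pattern finds nothing here.
const wrong = new RegExp("\bth", "gi");
console.log(text.match(wrong)); // null

// Doubling the backslash passes a real \b (word boundary) to the regex engine.
const right = new RegExp("\\bth", "gi");
console.log(text.match(right)); // ["Th", "th"]

// exec() on a global regex returns one match per call and advances lastIndex.
const stepper = new RegExp("\\bth", "gi");
const found = [];
let m;
while ((m = stepper.exec(text)) !== null) {
  found.push({ match: m[0], index: m.index });
}
console.log(found); // [{ match: "Th", index: 0 }, { match: "th", index: 8 }]
```

Looping over exec() is useful when the position of each match is needed; when only the matched text matters, match() with the g flag is shorter.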

