JavaScript regex to match the beginning of multiple words in a string

Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/3507453/

Time: 2020-08-23 04:57:39  Source: igfitidea

Regex match for beginning of multiple words in string

javascript regex

Asked by Abadaba

In JavaScript, I want to be able to match strings that begin with a certain phrase. However, I want to match the start of any word in the phrase, not just the beginning of the phrase.

For example:

Phrase: "This is the best"

Need to Match: "th"

Result: Matches Th and th

EDIT: \b works great; however, it raises another issue:

It will also match characters after foreign ones. For example, if my string is "Männ" and I search for "n", it will match the n after Mä. Any ideas?

Answered by Peter Ajtai

"This is the best moth".match(/\bth/gi);

or with a variable for your string

var string = "This is the best moth";
alert(string.match(/\bth/gi));

\b in a regex is a word boundary, so \bth will only match a th that is at the beginning of a word.

gi is for a global match (look for all occurrences) and case-insensitive matching.

(I threw moth in there as a reminder to check that it is not matched.)
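To make the effect of \b visible, here is a minimal sketch that runs the same search with and without the word boundary. The variable names are only illustrative and are not part of the original answer.

```javascript
// Sample sentence from the answer; "moth" contains "th" in the middle of a word.
const sentence = "This is the best moth";

// Without \b, every occurrence of "th" is found, including the one inside "moth".
const anywhere = sentence.match(/th/gi);

// With \b, only occurrences at the start of a word are found.
const atWordStart = sentence.match(/\bth/gi);

console.log(anywhere);    // ["Th", "th", "th"]
console.log(atWordStart); // ["Th", "th"]
```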


Edit:

So, the above only returns the part that you match (th). If you want to return the entire words, you have to match the entire word.

This is where things get tricky fast. First, without any HTML entity letters:

string.match(/\bth[^\b]*?\b/gi);

To match the entire word, start at the word boundary \b, grab the th, then lazily consume further characters with [^\b]*? until you reach another word boundary \b. Note that inside a character class \b stands for the backspace character rather than a word boundary, so [^\b] simply matches any character except a backspace; it is the lazy quantifier together with the closing \b that ends the match at the end of the word. The * means 0 or more of the preceding item, and the ? makes the match lazy: it does not expand as far as possible, but stops at the first opportunity.
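As a rough check of the whole-word pattern, the sketch below compares it with a shorter alternative, \bth\w*, which is not from the original answer but should behave the same on plain ASCII words. The variable names are illustrative.

```javascript
const sentence = "This is the best moth";

// Pattern from the answer: lazy match up to the next word boundary.
const lazyWords = sentence.match(/\bth[^\b]*?\b/gi);

// Assumed simpler alternative: "th" at a word start followed by word characters.
const simpleWords = sentence.match(/\bth\w*/gi);

console.log(lazyWords);   // ["This", "the"]
console.log(simpleWords); // ["This", "the"]
```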

If you have HTML entity characters like &auml; (ä), things get complicated really fast, and you have to use whitespace, or whitespace plus a set of defined characters that may occur at word boundaries.

string.match(/\sth[^\s]*|^th[^\s]*/gi);

Since we're not using word boundaries, we have to take care of the beginning of the string separately (|^).

The above will capture the white space at the beginning of words. Using \b will not capture white space, since \b has no width.
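For the non-ASCII case raised in the question, one possible approach in newer JavaScript engines (not part of the original answers) is a lookbehind combined with Unicode property escapes, so that letters such as ä count as word characters. The sketch below assumes an engine with ES2018 support; escapeRegExp and findWordStarts are hypothetical helper names.

```javascript
// Hypothetical helper: escape regex syntax characters in user input.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return whole words that start with `prefix`,
// treating any Unicode letter, digit, or underscore as part of a word.
function findWordStarts(text, prefix) {
  const wordChar = "[\\p{L}\\p{N}_]";
  const re = new RegExp(
    "(?<!" + wordChar + ")" + escapeRegExp(prefix) + wordChar + "*",
    "giu"
  );
  return text.match(re) || [];
}

// "M\u00e4nn" is "Männ"; plain \b sees a boundary between "ä" and "n".
const asciiBoundary = "M\u00e4nn".match(/\bn/gi);
const unicodeAware = findWordStarts("M\u00e4nn", "n");

console.log(asciiBoundary); // ["n"]
console.log(unicodeAware);  // []
console.log(findWordStarts("This is the best moth", "th")); // ["This", "the"]
```

The lookbehind has no width, so, like \b, it does not pull the preceding whitespace into the match.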

Answered by Michael Robinson

Use this:

string.match(/^th|\sth/gi);

Examples:

'is this is a string'.match(/^th|\sth/gi);


'the string: This is a string'.match(/^th|\sth/gi);

Results:

[" th"]

["th", " Th"]

Note that \sth includes the leading whitespace character in each match.
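If the leading whitespace is unwanted, one option is to wrap the th in a capture group and collect only that group. This is a sketch under the assumption that String.prototype.matchAll is available (ES2020); wordStartMatches is a hypothetical name.

```javascript
// Hypothetical helper: match "th" at the string start or after whitespace,
// but return only the captured "th", not the whitespace before it.
function wordStartMatches(text) {
  const re = /(?:^|\s)(th)/gi;
  return Array.from(text.matchAll(re), (m) => m[1]);
}

console.log(wordStartMatches("is this is a string"));          // ["th"]
console.log(wordStartMatches("the string: This is a string")); // ["th", "Th"]
```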

Answered by Vivin Paliath

var matches = "This is the best".match(/\bth/ig);

returns:

["Th", "th"]

The regular expression means: match "th", ignoring case and globally (meaning, don't stop at just one match), wherever "th" starts a word, that is, at the start of the string or right after a non-word character such as a space.

Answered by Frxstrem

Use the g flag in the regex. It stands for "global", I think, and it searches for all matches instead of only the first one.

You should also use the i flag for case-insensitive matching.

You add flags to the end of the regex (/<regex>/<flags>) or as a second parameter to new RegExp(pattern, flags)

For instance:

var matches = "This is the best".match(/\bth/gi);

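or, using RegExp objects:

var re = new RegExp("\\bth", "gi"); // the backslash is doubled: "\b" in a string literal is a backspace
var matches = "This is the best".match(re); // match() returns all matches; exec() returns one per call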

EDIT: Use \b in the regex to match the boundary of a word. Note that it does not really match any specific character, but the beginning or end of a word or the string.
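When the pattern is built from a string with the RegExp constructor, two details are easy to miss: the backslash in \b has to be doubled inside the string literal, and exec returns a single match per call, so a global search needs a loop. The following is a minimal sketch of both points; the variable names are illustrative.

```javascript
// "\bth" in a string literal is backspace + "th"; "\\bth" is the regex \bth.
const wrongSource = "\bth";
const rightSource = "\\bth";
console.log(wrongSource.charCodeAt(0)); // 8 (backspace)

const re = new RegExp(rightSource, "gi");
const found = [];
let m;
// With the g flag, each exec call resumes from re.lastIndex.
while ((m = re.exec("This is the best moth")) !== null) {
  found.push({ text: m[0], index: m.index });
}

console.log(found); // [{ text: "Th", index: 0 }, { text: "th", index: 8 }]
```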