Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/8808783/

Time: 2020-10-26 04:38:56  Source: igfitidea

regex javascript - match multiple search terms ignoring their order

javascript, regex

Asked by Ranch

I would like to find all the matches of given strings (separated by spaces) in a string, the way the iTunes search box works, for example.

For example, both "ab de" and "de ab" should return true on "abcde" (so should "bc e a", or the terms in any other order).

If I replace the white space with a wild card, "ab*de" would return true on "abcde", but not "de*ab". [I use * and not Regex syntax just for this explanation]

I could not find any pure regex solution for that. The only solution I could think of is splitting the search term and running multiple regexes.

Is it possible to find a pure regex that covers all these options?

Answered by Karl Adler

Returns true when all parts (divided by , or ' ') of a searchString occur in text. Otherwise false is returned.

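function filter(text, searchString) {
    // Build one lookahead per term: (?=.*term1)(?=.*term2)...
    // Terms are not escaped, so regex metacharacters in searchString are treated as regex syntax.
    const regexStr = '(?=.*' + searchString.split(/,|\s/).join(')(?=.*') + ')';
    const searchRegEx = new RegExp(regexStr, 'i');
    return searchRegEx.test(text);
}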

Answered by RoccoC5

I'm pretty sure you could come up with a regex to do what you want, but it may not be the most efficient approach.

For example, the regex pattern (?=.*bc)(?=.*e)(?=.*a) will match any string that contains bc, e, and a.

var isMatch = 'abcde'.match(/(?=.*bc)(?=.*e)(?=.*a)/) != null; // equals true

var isMatch = 'bcde'.match(/(?=.*bc)(?=.*e)(?=.*a)/) != null; // equals false

You could write a function to dynamically create an expression based on your search terms, but whether it's the best way to accomplish what you are doing is another question.
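A minimal sketch of such a function, assuming the terms are separated by whitespace and should be matched literally and case-insensitively. The names escapeRegExp and buildAllTermsRegex are hypothetical and do not come from the original answer.

```javascript
// Hypothetical helper: escape regex metacharacters so each term is matched literally.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build one lookahead per term, e.g. ^(?=.*bc)(?=.*e)(?=.*a).
function buildAllTermsRegex(searchString) {
  const terms = searchString.split(/\s+/).filter(Boolean);
  const lookaheads = terms.map(term => '(?=.*' + escapeRegExp(term) + ')');
  // '^' means the lookaheads can only succeed from the start of the string.
  // The 's' flag lets '.' also match line breaks; 'i' ignores case.
  return new RegExp('^' + lookaheads.join(''), 'is');
}

const re = buildAllTermsRegex('bc e a');
console.log(re.test('abcde')); // true
console.log(re.test('bcde'));  // false
```

Escaping matters once the terms come from user input: without it, a term such as "a.c" would be read as a pattern rather than as literal text.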

Answered by maerics

Alternations are order insensitive:

"abcde".match(/(ab|de)/g); // => ['ab', 'de']
"abcde".match(/(de|ab)/g); // => ['ab', 'de']

So if you have a list of words to match you can build a regex with an alternation on the fly like so:

function regexForWordList(words) {
  return new RegExp('(' + words.join('|') + ')', 'g');
}
'abcde'.match(regexForWordList(['a', 'e'])); // => ['a', 'e']
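Note that an alternation succeeds as soon as any one of the words occurs, so on its own it does not check that every search term is present. A short illustration, with sample strings made up for this example:

```javascript
function regexForWordList(words) {
  return new RegExp('(' + words.join('|') + ')', 'g');
}

const re = regexForWordList(['ab', 'de']);
console.log('abcde'.match(re)); // [ 'ab', 'de' ]
console.log('abxyz'.match(re)); // [ 'ab' ]  ("de" is missing, but there is still a match)
console.log('xyz'.match(re));   // null
```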

Answered by Michael Sazonov

Try this:

var str = "your string";
str = str.split( " " );
for( var i = 0 ; i < str.length ; i++ ){
    // your regexp match
}
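One way to fill in that loop, as a sketch: split the search string on whitespace and require every term to occur somewhere in the text. This assumes plain case-insensitive substring matching, so no regex is needed for the matching itself. The name matchesAllTerms is hypothetical.

```javascript
// Hypothetical helper: true when every whitespace-separated term occurs in text.
function matchesAllTerms(text, searchString) {
  const haystack = text.toLowerCase();
  const terms = searchString.toLowerCase().split(/\s+/).filter(Boolean);
  // every() returns true for an empty list, so an empty search matches everything.
  return terms.every(term => haystack.includes(term));
}

console.log(matchesAllTerms('abcde', 'ab de'));  // true
console.log(matchesAllTerms('abcde', 'de ab'));  // true
console.log(matchesAllTerms('abcde', 'bc e a')); // true
console.log(matchesAllTerms('abcde', 'ab xy'));  // false
```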

Answered by Jan Šafránek

This is the script I use; it also works with single-word search strings.

var what = "test string with search cool word";
var searchString = "search word";
var search = new RegExp(searchString, "gi"); // one-word searching

// multiple search words
if (searchString.indexOf(' ') != -1) {
    search = "";
    var words = searchString.split(" ");
    for (var i = 0; i < words.length; i++) {
        search += "(?=.*" + words[i] + ")";
    }
    search = new RegExp(search + ".+", "gi");
}

if (search.test(what)) {
    // found
} else {
    // not found
}

Answered by MetaEd

I assume you are matching words, or parts of words. You want space-separated search terms to limit search results, and it seems you intend to return only those entries which have all the words that the user supplies. And you intend a wildcard character * to stand for 0 or more characters in a matching word.

For example, if the user searches for the words term1 term2, you intend to return only those items which have both words term1 and term2. If the user searches for the word term*, it would match any word beginning with term.

There are suitable regular expressions which are equivalent to this search language and can be generated from it.

A simple example, the word term, can be asserted in regex by converting to \bterm\b. But two or more words which must match in any order require lookahead assertions. Using extended syntax, the equivalent regex is:

(?= .* \b term1 \b )
(?= .* \b term2 \b )

The asterisk wildcard can be asserted in regex with a character class followed by an asterisk. The character class identifies which characters you consider to be part of a word. For example, you might find that [A-Za-z0-9]* fits the bill.

In short, you might be satisfied if you convert an expression such as:

foo ba* quux

to:

(?= .* \b foo            \b )
(?= .* \b ba[A-Za-z0-9]* \b )
(?= .* \b quux           \b )

That is a simple matter of search and replace. But do be careful to sanitize the input string (for example, by removing punctuation) to avoid injection attacks.
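The conversion described above can be sketched as follows. JavaScript's RegExp has no free-spacing (extended) mode, so the pattern is built without the whitespace shown in the listing. The name queryToRegex and the choice to drop wildcard-only terms are assumptions made for this example.

```javascript
// Hypothetical helper: convert a query such as "foo ba* quux" into a RegExp
// made of one word-bounded lookahead per term.
function queryToRegex(query) {
  const lookaheads = query
    .split(/\s+/)
    // Sanitize: keep only letters, digits and the * wildcard in each term.
    .map(term => term.replace(/[^A-Za-z0-9*]/g, ''))
    // Drop terms that are empty or consist only of wildcards.
    .filter(term => term.replace(/\*/g, '') !== '')
    // Replace each * with the word-character class suggested in the answer.
    .map(term => '(?=.*\\b' + term.replace(/\*+/g, '[A-Za-z0-9]*') + '\\b)');
  return new RegExp('^' + lookaheads.join(''), 'i');
}

const re = queryToRegex('foo ba* quux');
console.log(re.test('quux bar foo'));  // true
console.log(re.test('quux xbar foo')); // false ("ba" does not start a word)
console.log(re.test('foo bar'));       // false ("quux" is missing)
```

Stripping everything except letters, digits and * before building the pattern is one simple way to keep user input from injecting regex syntax.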

Answered by Demian Brecht

I think you may be barking up the wrong tree with RegEx. What you might want to look at is the Levenshtein distance of two input strings.

There's a JavaScript implementation here and a usage example here.
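For reference, a minimal sketch of the standard dynamic-programming Levenshtein distance. It is not necessarily the implementation the answer links to, and the name levenshtein is chosen here for illustration.

```javascript
// Levenshtein distance: minimum number of single-character insertions,
// deletions and substitutions needed to turn a into b.
function levenshtein(a, b) {
  // prev[j] is the distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein('kitten', 'sitting')); // 3
console.log(levenshtein('abcde', 'abcde'));    // 0
```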
