regex javascript - match multiple search terms ignoring their order
Original question: http://stackoverflow.com/questions/8808783/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors on StackOverflow, including the original address and author information.
Asked by Ranch
I would like to find all matches of the given search strings (separated by spaces) in a string, the way the iTunes search box works, for example.
For example, both "ab de" and "de ab" should return true on "abcde" (and "bc e a", or any other order, should return true as well).
If I replace the whitespace with a wildcard, "ab*de" would return true on "abcde", but "de*ab" would not. [I use * rather than regex syntax just for this explanation.]
I could not find any pure regex solution for that. The only solution I could think of is splitting the search term and running multiple regexes.
Is it possible to find a pure regex that will cover all these options?
Answered by Karl Adler
Returns true when all parts (divided by ',' or ' ') of a searchString occur in text. Otherwise false is returned.
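function filter(text, searchString) {
  // Build one lookahead per term, e.g. "ab de" -> (?=.*ab)(?=.*de)
  // Note: terms are not escaped, so regex metacharacters in searchString act as regex syntax.
  const regexStr = '(?=.*' + searchString.split(/,|\s/).join(')(?=.*') + ')';
  const searchRegEx = new RegExp(regexStr, 'i');
  return searchRegEx.test(text);
}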
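The filter above passes the search terms into the pattern unescaped. A minimal sketch of a more defensive variant follows. The names escapeRegExp and matchesAllTerms are hypothetical and not part of the original answer. The sketch assumes that terms should be treated as literal text and that empty pieces produced by repeated separators should be ignored.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(term) {
  return term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of the filter above: true when every term occurs in text.
function matchesAllTerms(text, searchString) {
  // Split on commas or whitespace, drop empty pieces, escape each term.
  const terms = searchString.split(/[,\s]+/).filter(Boolean).map(escapeRegExp);
  // One lookahead per term; [\s\S] also crosses line breaks, unlike "."
  const pattern = '^' + terms.map(t => '(?=[\\s\\S]*' + t + ')').join('');
  return new RegExp(pattern, 'i').test(text);
}

console.log(matchesAllTerms('abcde', 'bc e a')); // true
console.log(matchesAllTerms('bcde', 'bc e a'));  // false
console.log(matchesAllTerms('a.b', '.'));        // true
console.log(matchesAllTerms('ab', '.'));         // false
```

Anchoring the pattern with ^ means the lookaheads are evaluated once from the start of the text instead of being retried at every position. With no terms at all the pattern is just ^, so an empty search string matches every text.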
Answered by RoccoC5
I'm pretty sure you could come up with a regex to do what you want, but it may not be the most efficient approach.
For example, the regex pattern (?=.*bc)(?=.*e)(?=.*a) will match any string that contains bc, e, and a.
var isMatch = 'abcde'.match(/(?=.*bc)(?=.*e)(?=.*a)/) != null; // equals true
var isMatch = 'bcde'.match(/(?=.*bc)(?=.*e)(?=.*a)/) != null; // equals false
You could write a function to dynamically create an expression based on your search terms, but whether it's the best way to accomplish what you are doing is another question.
Answered by maerics
Alternations are order insensitive:
"abcde".match(/(ab|de)/g); // => ['ab', 'de']
"abcde".match(/(de|ab)/g); // => ['ab', 'de']
So if you have a list of words to match you can build a regex with an alternation on the fly like so:
function regexForWordList(words) {
  return new RegExp('(' + words.join('|') + ')', 'g');
}
'abcde'.match(regexForWordList(['a', 'e'])); // => ['a', 'e']
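An alternation on its own reports a match as soon as any one of the words is present, so an extra step is needed to require all of them. One possible sketch is shown below. The helper containsAllWords is hypothetical and not part of the original answer. It assumes the words contain no regex metacharacters and do not overlap each other in the text, because a global match consumes the characters it matches.

```javascript
// Same builder as in the answer above.
function regexForWordList(words) {
  return new RegExp('(' + words.join('|') + ')', 'g');
}

// Hypothetical helper: true when every word was matched at least once.
// Assumes non-overlapping words without regex metacharacters.
function containsAllWords(text, words) {
  // match() returns null when nothing matches, so fall back to an empty list.
  const found = new Set(text.match(regexForWordList(words)) || []);
  return words.every(word => found.has(word));
}

console.log(containsAllWords('abcde', ['de', 'ab']));     // true
console.log(containsAllWords('abcde', ['bc', 'e', 'a'])); // true
console.log(containsAllWords('abcde', ['ab', 'xy']));     // false
```

If the search words can overlap in the text (for example "bc" and "cd" in "abcde"), this check would miss one of them, and a lookahead-based pattern is the safer choice.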
Answered by Michael Sazonov
Try this:
var str = "your string";
str = str.split( " " );
for( var i = 0 ; i < str.length ; i++ ){
// your regexp match
}
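The loop above leaves the per-term check open. As one way to fill it in, the sketch below uses a plain, case-insensitive substring test in place of a regexp match, which avoids having to escape the terms. The function name containsEveryTerm and its text parameter are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: true when every space-separated term occurs in text.
function containsEveryTerm(text, searchString) {
  var haystack = text.toLowerCase();
  var terms = searchString.toLowerCase().split(' ');
  for (var i = 0; i < terms.length; i++) {
    if (terms[i] === '') continue; // skip empty pieces from repeated spaces
    // Plain substring test, so regex metacharacters need no escaping.
    if (haystack.indexOf(terms[i]) === -1) {
      return false;
    }
  }
  return true;
}

console.log(containsEveryTerm('abcde', 'de ab'));  // true
console.log(containsEveryTerm('abcde', 'bc e a')); // true
console.log(containsEveryTerm('abcde', 'ab xy'));  // false
```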
Answered by Jan Šafránek
This is the script I use; it also works with single-word search strings.
var what = "test string with search cool word";
var searchString = "search word";
var search = new RegExp(searchString, "gi"); // one-word searching
// multiple search words
if (searchString.indexOf(' ') != -1) {
  search = "";
  var words = searchString.split(" ");
  for (var i = 0; i < words.length; i++) {
    search += "(?=.*" + words[i] + ")";
  }
  search = new RegExp(search + ".+", "gi");
}
if (search.test(what)) {
  // found
} else {
  // not found
}
Answered by MetaEd
I assume you are matching words, or parts of words. You want space-separated search terms to limit search results, and it seems you intend to return only those entries which have all the words that the user supplies. And you intend a wildcard character * to stand for 0 or more characters in a matching word.
For example, if the user searches for the words term1 term2, you intend to return only those items which have both words term1 and term2. If the user searches for the word term*, it would match any word beginning with term.
There are suitable regular expressions which are equivalent to this search language and can be generated from it.
A simple example, the word term, can be asserted in regex by converting to \bterm\b. But two or more words which must match in any order require lookahead assertions. Using extended syntax, the equivalent regex is:
(?= .* \b term1 \b )
(?= .* \b term2 \b )
The asterisk wildcard can be asserted in regex with a character class followed by an asterisk. The character class identifies which characters you consider to be part of a word. For example, you might find that [A-Za-z0-9]* fits the bill.
In short, you might be satisfied if you convert an expression such as:
foo ba* quux
to:
(?= .* \b foo \b )
(?= .* \b ba[A-Za-z0-9]* \b )
(?= .* \b quux \b )
That is a simple matter of search and replace. But do be careful to sanitize the input string, for example by removing punctuation, to avoid injection attacks.
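The conversion described above could be sketched in JavaScript roughly as follows. JavaScript regexes have no extended (free-spacing) syntax, so the generated pattern is written without the spaces. The function name searchToRegExp is hypothetical, and the sanitizing rule (keep only letters, digits and the * wildcard) is an assumption chosen to match the [A-Za-z0-9] character class used in the answer.

```javascript
// Hypothetical converter for the search language described above.
function searchToRegExp(search) {
  const lookaheads = search
    .split(/\s+/)
    // Sanitize: keep only letters, digits and the * wildcard.
    .map(term => term.replace(/[^A-Za-z0-9*]/g, ''))
    .filter(term => term.length > 0)
    // Expand * to "zero or more word characters" and require word boundaries.
    .map(term => '(?=.*\\b' + term.replace(/\*/g, '[A-Za-z0-9]*') + '\\b)');
  return new RegExp('^' + lookaheads.join(''));
}

const re = searchToRegExp('foo ba* quux');
console.log(re.source);               // ^(?=.*\bfoo\b)(?=.*\bba[A-Za-z0-9]*\b)(?=.*\bquux\b)
console.log(re.test('quux bar foo')); // true
console.log(re.test('quux foobar'));  // false
```

The match is case-sensitive here; passing the 'i' flag to RegExp would be one way to relax that if the search should ignore case.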
Answered by Demian Brecht
I think you may be barking up the wrong tree with RegEx. What you might want to look at is the Levenshtein distance of two input strings.
There's a JavaScript implementation here and a usage example here.
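For reference, a minimal sketch of the Levenshtein distance is given below. It is the textbook dynamic-programming formulation, not the implementation the answer links to, and the function name levenshtein is chosen here for illustration.

```javascript
// Textbook dynamic-programming Levenshtein distance (illustrative sketch).
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein('kitten', 'sitting')); // 3
console.log(levenshtein('abcde', 'abcde'));    // 0
console.log(levenshtein('', 'abc'));           // 3
```

A small distance indicates that two strings are similar, which suits fuzzy matching of whole strings. It does not by itself check that every search term occurs in the text.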