JavaScript Negative Lookahead Regular Expression
Original question: http://stackoverflow.com/questions/6851921/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Negative lookahead Regular Expression
Asked by gilly3
I want to match all strings ending in ".htm" unless they end in "foo.htm". I'm generally decent with regular expressions, but negative lookaheads have me stumped. Why doesn't this work?
/(?!foo)\.htm$/i.test("/foo.htm"); // returns true. I want false.
What should I be using instead? I think I need a "negative lookbehind" expression (if JavaScript supported such a thing, which I know it doesn't).
Answered by ridgerunner
The problem is pretty simple really. This will do it:
/^(?!.*foo\.htm$).*\.htm$/i
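A quick way to see what this pattern accepts is to run it over a few strings. This is only a sketch: the sample paths and the variable names are made up for illustration.

```javascript
// Reject anything ending in "foo.htm" up front, then require a ".htm" ending.
const htmButNotFooHtm = /^(?!.*foo\.htm$).*\.htm$/i;

// Sample strings are hypothetical, chosen to cover the interesting cases.
const samplePaths = ["/foo.htm", "/bar.htm", "/barfoo.htm", "/foo.html", "/FOO.HTM"];
const matches = samplePaths.map((s) => htmButNotFooHtm.test(s));

console.log(matches); // [false, true, false, false, false]
```

Because the lookahead sits right after ^, it inspects the whole string before anything is consumed, which is why it can act like a check on the ending.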
Answered by Nicole
What you are describing (your intention) is a negative look-behind, and JavaScript has no support for look-behinds.
Look-aheads look forward from the character at which they are placed, and you've placed it before the ".". So what you've got is actually saying "anything ending in .htm as long as the first three characters starting at that position (.ht) are not foo", which is always true.
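To illustrate the point, the following sketch compares the pattern from the question with the same pattern minus the lookahead. The sample strings and variable names are assumptions for the demo.

```javascript
// The lookahead is evaluated at the position of the dot, where the next
// three characters are ".ht", so it can never see "foo".
const withLookahead = /(?!foo)\.htm$/i;
const withoutLookahead = /\.htm$/i;

// Hypothetical sample inputs.
const inputs = ["/foo.htm", "/bar.htm", "/foo.html"];
const sameResult = inputs.every(
  (s) => withLookahead.test(s) === withoutLookahead.test(s)
);

console.log(sameResult);                     // true
console.log(withLookahead.test("/foo.htm")); // true
```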
Usually, the substitute for negative look-behinds is to match more than you need, and extract only the part you actually do need. This is hacky, and depending on your precise situation you can probably come up with something else, but something like this:
// Checks that the last 3 characters before the dot are not foo:
/(?!foo).{3}\.htm$/i.test("/foo.htm"); // returns false
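One thing worth keeping in mind with this kind of pattern is that it insists on three characters before the dot. A small sketch, with made-up inputs, of what that means in practice:

```javascript
const threeCharCheck = /(?!foo).{3}\.htm$/i;

console.log(threeCharCheck.test("/foo.htm")); // false
console.log(threeCharCheck.test("/bar.htm")); // true
// Fewer than three characters before the dot: no match at all,
// even though the string ends in ".htm" and not in "foo.htm".
console.log(threeCharCheck.test("ab.htm"));   // false
```

Whether that last case matters depends on the inputs you expect.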
Answered by Floern
As mentioned, JavaScript does not support negative look-behind assertions.
But you could use a workaround:
/(foo)?\.htm$/i.test("/foo.htm") && RegExp.$1 != "foo";
This will match everything that ends with .htm but it will store "foo" into RegExp.$1 if it matches foo.htm, so you can handle it separately.
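RegExp.$1 is a legacy static property, so the same idea can also be expressed by reading the capture group from the match result. A minimal sketch, where the function name is hypothetical:

```javascript
// Returns true when s ends in ".htm" but the optional "foo" group did not match.
function endsInHtmWithoutFoo(s) {
  const m = /(foo)?\.htm$/i.exec(s);
  // An optional group that did not participate in the match is undefined.
  return m !== null && m[1] === undefined;
}

console.log(endsInHtmWithoutFoo("/foo.htm"));  // false
console.log(endsInHtmWithoutFoo("/bar.htm"));  // true
console.log(endsInHtmWithoutFoo("/bar.html")); // false
```

Unlike the comparison against the literal "foo", this check also treats "FOO.htm" as excluded, which seems closer to what the /i flag suggests.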
Answered by petho
Like Renesis mentioned, "lookbehind" is not supported in JavaScript, so maybe just use two regexps in combination:
!/foo\.htm$/i.test(teststring) && /\.htm$/i.test(teststring)
Answered by Roko C. Buljan
String.prototype.endsWith (ES6)
console.log( /* !(not)endsWith */
!"foo.html".endsWith("foo.htm"), // true
!"barfoo.htm".endsWith("foo.htm"), // false (here you go)
!"foo.htm".endsWith("foo.htm"), // false (here you go)
!"test.html".endsWith("foo.htm"), // true
!"test.htm".endsWith("foo.htm") // true
);
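To cover the full requirement from the question (ends in ".htm", but not in "foo.htm"), the two endsWith checks can be combined. This is a sketch; the helper name is made up, and the lowercasing is an assumption meant to mirror the /i flag used in the regex answers.

```javascript
// Hypothetical helper: true for ".htm" endings other than "foo.htm".
function isHtmExceptFoo(s) {
  const lower = s.toLowerCase(); // endsWith is case-sensitive
  return lower.endsWith(".htm") && !lower.endsWith("foo.htm");
}

const names = ["foo.htm", "barfoo.htm", "test.htm", "test.html", "FOO.HTM"];
const verdicts = names.map(isHtmExceptFoo);
console.log(verdicts); // [false, false, true, false, false]
```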
Answered by Igor Bykov
This answer has probably arrived a little later than necessary, but I'll leave it here in case someone runs into the same issue now (7 years and 6 months after this question was asked).
Lookbehinds are now included in the ECMAScript 2018 standard and are supported at least in the latest version of Chrome. However, you can solve the puzzle with or without them.
A solution with negative lookahead:
let testString = `html.htm app.htm foo.tm foo.htm bar.js 1to3.htm _.js _.htm`;
testString.match(/\b(?!foo)[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
A solution with negative lookbehind:
testString.match(/\b[\w-.]+(?<!foo)\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
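Applied to the single-string test from the original question, the same lookbehind can be placed directly in front of the extension. A minimal sketch, assuming an engine with ES2018 lookbehind support; the variable name is made up.

```javascript
// Negative lookbehind: ".htm" at the end, not preceded by "foo".
const notFooHtm = /(?<!foo)\.htm$/i;

console.log(notFooHtm.test("/foo.htm"));  // false
console.log(notFooHtm.test("/bar.htm"));  // true
console.log(notFooHtm.test("/foo.html")); // false
```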
A solution with (technically) positive lookahead:
testString.match(/\b(?=[^f])[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
etc.
All these RegExps tell the JS engine roughly the same thing in different ways. The message they pass to the JS engine is something like the following.
Please, find in this string all sequences of characters that are:
- Separated from other text (like words);
- Consist of one or more letters of the English alphabet, underscores, hyphens, dots or digits;
- End with ".htm";
- Apart from that, the part of the sequence before ".htm" can be anything but "foo".
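The three patterns agree on the test string above, but they may differ on other inputs. The sketch below tries them on a few made-up file names. It writes the character class as [\w.-] (an equivalent spelling) and drops the g flag so that test() keeps no state between calls; both choices are assumptions for the demo.

```javascript
const patterns = {
  negativeLookahead: /\b(?!foo)[\w.-]+\.htm\b/i,   // name must not start with "foo"
  negativeLookbehind: /\b[\w.-]+(?<!foo)\.htm\b/i, // name must not end with "foo"
  positiveLookahead: /\b(?=[^f])[\w.-]+\.htm\b/i,  // name must not start with "f"
};

// Hypothetical file names that separate the three behaviours.
const files = ["foobar.htm", "barfoo.htm", "file.htm"];

const table = {};
for (const [label, re] of Object.entries(patterns)) {
  table[label] = files.map((f) => re.test(f));
}
console.log(table);
// negativeLookahead:  [false, true,  true]
// negativeLookbehind: [true,  false, true]
// positiveLookahead:  [false, true,  false]
```

For the original question (reject only names ending in "foo.htm"), the lookbehind variant is the one that matches the stated intent on these inputs.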
Answered by ngn
You could emulate the negative lookbehind with something like
/^(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/
but a programmatic approach would be better.
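A quick check of this alternation. The sketch assumes the pattern is anchored with a leading ^, because without it the engine is free to start the match partway into the string; the variable names and sample strings are made up.

```javascript
// Each alternative describes a prefix whose last three characters are not "foo"
// (or a prefix shorter than three characters).
const anchored = /^(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/;
const unanchored = /(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/;

console.log(anchored.test("foo.htm"));   // false
console.log(anchored.test("/foo.htm"));  // false
console.log(anchored.test("/bar.htm"));  // true
// Without ^, the match can begin at the last "o", so "o.htm" satisfies it.
console.log(unanchored.test("foo.htm")); // true
```

Note that this version is case-sensitive and needs at least one character before ".htm".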