JavaScript negative lookahead regular expression

Notice: this page is a translated copy of a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/6851921/

Date: 2020-08-23 23:28:31  Source: igfitidea

Negative lookahead Regular Expression

Tags: javascript, regex, regex-lookarounds

Asked by gilly3

I want to match all strings ending in ".htm" unless it ends in "foo.htm". I'm generally decent with regular expressions, but negative lookaheads have me stumped. Why doesn't this work?

/(?!foo)\.htm$/i.test("/foo.htm");  // returns true. I want false.

What should I be using instead? I think I need a "negative lookbehind" expression (if JavaScript supported such a thing, which I know it doesn't).

Answered by ridgerunner

The problem is pretty simple really. This will do it:

/^(?!.*foo\.htm$).*\.htm$/i
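A quick way to see how this pattern behaves is to run it over a few strings. This is only a sketch: the variable name and the sample paths are made up for illustration.

```javascript
// Pattern from the answer above. The lookahead is evaluated at the start of
// the string, so ".*foo\.htm$" can scan ahead to the very end before the
// main ".*\.htm$" part is matched.
const htmButNotFoo = /^(?!.*foo\.htm$).*\.htm$/i;

// Hypothetical sample inputs.
const samples = ["/foo.htm", "/barfoo.htm", "/bar.htm", "/FOO.HTM", "/foo.html"];

for (const s of samples) {
  console.log(s, htmButNotFoo.test(s));
}
// Only "/bar.htm" passes: the others either end in "foo.htm"
// (case-insensitively) or do not end in ".htm" at all.
```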

Answered by Nicole

What you are describing (your intention) is a negative look-behind, and JavaScript (at the time of this answer) has no support for look-behinds.

Look-aheads look forward from the character at which they are placed, and you've placed it before the ".". So what you've got is actually saying "anything ending in .htm as long as the first three characters starting at that position (.ht) are not foo", which is always true.
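To make this concrete, the following sketch (variable names are illustrative) shows where the original pattern matches and which three characters the lookahead actually inspects:

```javascript
// The pattern from the question.
const original = /(?!foo)\.htm$/i;
const input = "/foo.htm";

const m = original.exec(input);

// The match starts at the dot, so that is where (?!foo) was evaluated.
console.log(m.index); // 4

// The three characters the lookahead compared against "foo":
console.log(input.slice(m.index, m.index + 3)); // ".ht"
```

Since ".ht" can never equal "foo", the lookahead never rejects anything here.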

Usually, the substitute for negative look-behinds is to match more than you need and extract only the part you actually do need. This is hacky, and depending on your precise situation you can probably come up with something else, but it would look something like this:

// Checks that the last 3 characters before the dot are not foo:
/(?!foo).{3}\.htm$/i.test("/foo.htm"); // returns false 
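One caveat worth checking with this workaround: .{3} requires at least three characters before the dot, so very short names are rejected too. The second pattern below is an assumed adjustment, not part of the original answer, and the sample strings are made up.

```javascript
// Workaround from the answer above.
const threeBefore = /(?!foo).{3}\.htm$/i;

console.log(threeBefore.test("/foo.htm")); // false
console.log(threeBefore.test("/bar.htm")); // true
console.log(threeBefore.test("a.htm"));    // false: fewer than 3 characters before ".htm"

// Assumed variant (not from the answer): additionally accept strings that
// have only zero to two characters before ".htm".
const threeBeforeOrShort = /(?:^.{0,2}|(?!foo).{3})\.htm$/i;

console.log(threeBeforeOrShort.test("a.htm"));    // true
console.log(threeBeforeOrShort.test("/foo.htm")); // false
```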

Answered by Floern

As mentioned, JavaScript does not support negative look-behind assertions.

But you could use a workaround:

/(foo)?\.htm$/i.test("/foo.htm") && RegExp.$1 != "foo";

This will match everything that ends with .htm, but it will store "foo" into RegExp.$1 if it matches foo.htm, so you can handle it separately.
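RegExp.$1 is a legacy static property, so the same idea can also be sketched by reading the capture group from the match result instead. The function name below is hypothetical, and this is a variation on the answer rather than the answer's own code.

```javascript
// Returns true when str ends in ".htm" but not in "foo.htm".
function endsInHtmButNotFoo(str) {
  const m = /(foo)?\.htm$/i.exec(str);
  // m is null when the string does not end in ".htm".
  // m[1] is undefined when the optional "foo" group did not take part
  // in the match, which is the case we want to accept.
  return m !== null && m[1] === undefined;
}

console.log(endsInHtmButNotFoo("/foo.htm")); // false
console.log(endsInHtmButNotFoo("/bar.htm")); // true
```

Checking for undefined rather than comparing against "foo" also covers inputs such as "/FOO.HTM", where the captured text is upper case.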

Answered by petho

Like Renesis mentioned, "lookbehind" is not supported in JavaScript, so maybe just use two regexps in combination:

!/foo\.htm$/i.test(teststring) && /\.htm$/i.test(teststring)
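Wrapped in a small helper, the two-regexp check might look like the sketch below. The function name is made up; the parameter name is taken from the expression above.

```javascript
// Two simple patterns instead of one lookaround: first require the ".htm"
// ending, then exclude the "foo.htm" ending.
function isHtmButNotFoo(teststring) {
  return /\.htm$/i.test(teststring) && !/foo\.htm$/i.test(teststring);
}

console.log(isHtmButNotFoo("/foo.htm")); // false
console.log(isHtmButNotFoo("/bar.htm")); // true
```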

Answered by Roko C. Buljan

String.prototype.endsWith (ES6)

console.log( /* !(not)endsWith */

    !"foo.html".endsWith("foo.htm"), // true
  !"barfoo.htm".endsWith("foo.htm"), // false (here you go)
     !"foo.htm".endsWith("foo.htm"), // false (here you go)
   !"test.html".endsWith("foo.htm"), // true
    !"test.htm".endsWith("foo.htm")  // true

);
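Note that the snippet above only tests the "does not end in foo.htm" half, and that endsWith is case-sensitive. A sketch that covers both halves of the question, assuming case-insensitive matching is wanted as in the question's /i flag (the function name is hypothetical):

```javascript
// Returns true when str ends in ".htm" but not in "foo.htm",
// ignoring case.
function endsWithHtmNotFoo(str) {
  const lower = str.toLowerCase(); // endsWith itself is case-sensitive
  return lower.endsWith(".htm") && !lower.endsWith("foo.htm");
}

console.log(endsWithHtmNotFoo("test.htm"));   // true
console.log(endsWithHtmNotFoo("barfoo.htm")); // false
console.log(endsWithHtmNotFoo("test.html"));  // false
```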

Answered by Igor Bykov

This answer has probably arrived a little later than necessary, but I'll leave it here in case someone runs into the same issue now (7 years and 6 months after this question was asked).

Lookbehinds are now included in the ECMAScript 2018 standard and supported at least in the latest version of Chrome. However, you can solve the puzzle with or without them.

A solution with negative lookahead:

let testString = `html.htm app.htm foo.tm foo.htm bar.js 1to3.htm _.js _.htm`;

testString.match(/\b(?!foo)[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]

A solution with negative lookbehind:

testString.match(/\b[\w-.]+(?<!foo)\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]
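Applied to the original question (testing a single string rather than extracting words), a lookbehind version could be sketched as follows. It assumes an engine with ES2018 lookbehind support; the variable name is illustrative.

```javascript
// The lookbehind is evaluated at the dot and inspects the three
// characters immediately before it.
const notFooHtm = /(?<!foo)\.htm$/i;

console.log(notFooHtm.test("/foo.htm")); // false
console.log(notFooHtm.test("/bar.htm")); // true
console.log(notFooHtm.test("/FOO.HTM")); // false, the i flag applies inside the lookbehind too
```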

A solution with (technically) positive lookahead:

testString.match(/\b(?=[^f])[\w-.]+\.htm\b/gi);
> (4) ["html.htm", "app.htm", "1to3.htm", "_.htm"]

etc.

All these RegExps tell the JS engine the same thing in different ways. The message they pass to the JS engine is something like the following.

Please, find in this string all sequences of characters that are:

  • Separated from other text (like words);
  • Consist of one or more letters of the English alphabet, underscores, hyphens, dots, or digits;
  • End with ".htm";
  • Apart from that, the part of the sequence before ".htm" can be anything but "foo".

Answered by ngn

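You could emulate the negative lookbehind with something like /^(.|..|.*[^f]..|.*f[^o].|.*fo[^o])\.htm$/ (the leading ^ matters: without it, the two-character alternative can match just the "oo" of "foo", so "/foo.htm" would still pass), but a programmatic approach would be better.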
