JavaScript: how to properly escape characters in a regular expression

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original question: http://stackoverflow.com/questions/5663987/

Time: 2020-08-23 18:20:59  Source: igfitidea

How to properly escape characters in regexp

Tags: javascript, regex, escaping

Asked by user1651105

I want to do a string search inside a string. Simply saying MySTR.search(Needle).

The problem occurs when the needle string contains special regex characters like *, + and so on. It fails with the error "invalid quantifier".

I have browsed the web and found out that string can be escaped with \Q some string \E.

However, this does not always produce the desired behavior. For example:

var sNeedle = '*Stars!*';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

Result is -1. OK.

var sNeedle = '**Stars!**';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

Result is "invalid quantifier". This happens because 2 or more special characters are 'touching' each other, because:

var sNeedle = '*Dont touch me*Stars!*Dont touch me*';
var sMySTR = 'The contents of this string have no importance';
sMySTR.search('\Q' + sNeedle + '\E');

Will work OK.
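A minimal sketch of one way to see why these three cases behave differently. It relies on two facts about JavaScript: an unrecognized escape such as '\Q' in a string literal is just the letter 'Q', and search() turns a string argument into a regular expression. The helper name compiles is made up for this illustration.

```javascript
// In a JavaScript string literal, '\Q' and '\E' are unrecognized escapes,
// so the backslash is dropped and only the letter remains.
const wrapped = '\Q' + '*Stars!*' + '\E';
console.log(wrapped); // Q*Stars!*E

// String.prototype.search converts a string argument with new RegExp(...),
// so 'Q*' and '!*' are parsed as ordinary quantified characters.
const haystack = 'The contents of this string have no importance';
console.log(haystack.search(wrapped)); // -1

// Hypothetical helper: reports whether a pattern string is a valid regex.
function compiles(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (e) {
    return false;
  }
}

// With '**' the second '*' has nothing to repeat, so the pattern is invalid.
console.log(compiles('\Q' + '**Stars!**' + '\E')); // false
// Here every '*' follows an ordinary character, so the pattern is valid.
console.log(compiles('\Q' + '*Dont touch me*Stars!*Dont touch me*' + '\E')); // true
```

So the "touching" characters matter only because each * needs something in front of it to quantify; nothing is being escaped in any of the three cases.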

I know I could make a function escapeAllBadChars(sInStr) and just add double slashes before every possible special regex character, but I'm wondering if there is a simpler way to do it?

Answer by Bart Kiers

\Q...\E doesn't work in JavaScript (at least, it doesn't escape anything...) as you can see:

var s = "*";
print(s.search(/\Q*\E/));
print(s.search(/\*/));

produces:

-1
0

as you can see on Ideone.

The following chars need to be escaped:

  • (
  • )
  • [
  • {
  • *
  • +
  • .
  • $
  • ^
  • \
  • |
  • ?

So, something like this would do:

function quote(regex) {
  // Prefix each regex metacharacter with a backslash.
  return regex.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}

No, ] and } don't need to be escaped: they have no special meaning on their own, only their opening counterparts do.

Note that when using a literal regex, /.../, you also need to escape the / char. However, / is not a regex meta character: when using it in a RegExp object, it doesn't need an escape.
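As a rough sketch of how such a function might be used, the escaped needle can be passed to the RegExp constructor. The sample strings and the variable names needle, haystack and pattern are assumptions made for this illustration.

```javascript
// Escape regex metacharacters so the text is matched literally.
// Same character set as listed in this answer.
function quote(text) {
  return text.replace(/([()[{*+.$^\\|?])/g, '\\$1');
}

const needle = '**Stars!**';
const haystack = 'Look at the **Stars!** tonight';

// Build the pattern from the escaped needle and search for it.
const pattern = new RegExp(quote(needle));
console.log(pattern.source);           // \*\*Stars!\*\*
console.log(haystack.search(pattern)); // 12

// '/' is left alone by quote(), which is fine for the RegExp constructor.
console.log('a/b*c'.search(new RegExp(quote('/b*')))); // 1
```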

Answer by Mark Peters

I'm just dipping my feet in Javascript, but is there a reason you need to use the regex engine at all? How about

var sNeedle = '*Stars!*';
var sMySTR = 'The contents of this string have no importance';
if ( sMySTR.indexOf(sNeedle) > -1 ) {
   //found it
}
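A small sketch of the same idea with a haystack that does contain the needle. The sample text is an assumption; includes is the ES2015 boolean counterpart of indexOf.

```javascript
const needles = ['*Stars!*', '**Stars!**'];
const haystack = 'Wish upon the **Stars!** tonight';

for (const needle of needles) {
  // indexOf treats the needle as plain text, so no escaping is needed.
  const position = haystack.indexOf(needle);
  // includes performs the same check and returns a boolean.
  console.log(needle, position, haystack.includes(needle));
}
```

Neither call goes through the regex engine, so special characters in the needle cannot cause an "invalid quantifier" error.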

Answer by Taylor Gerring

I performed a quick Google search to see what's out there and it appears that you've got a few options for escaping regular expression characters. According to one page, you can define & run a function like below to escape problematic characters:

RegExp.escape = function(text) {
    // "$&" is the matched character; "\\" puts a backslash in front of it.
    return text.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

Alternatively, you can try and use a separate library such as XRegExp, which already handles nuances you're trying to re-solve.

Answer by CoolAJ86

Duplicate of https://stackoverflow.com/a/6969486/151312

This is proper as per MDN (see explanation in post above):

function escapeRegExp(str) {
  // "$&" is the whole match, so each special character gets a leading backslash.
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\^$\|]/g, "\\$&");
}
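A sketch of how the three needles from the question might behave once escaped this way. The case-insensitive example at the end uses sample text that is an assumption, added only to show a case where going through RegExp is still useful.

```javascript
function escapeRegExp(str) {
  return str.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\^$\|]/g, '\\$&');
}

const needles = ['*Stars!*', '**Stars!**', '*Dont touch me*Stars!*Dont touch me*'];
const haystack = 'The contents of this string have no importance';

for (const needle of needles) {
  // Every needle now compiles; none occurs in haystack, so each search gives -1.
  console.log(haystack.search(new RegExp(escapeRegExp(needle))));
}

// A case-insensitive literal match, which indexOf alone does not offer.
const loose = new RegExp(escapeRegExp('**Stars!**'), 'i');
console.log('seeing **STARS!** again'.search(loose)); // 7
```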