JavaScript: list of all characters that should be escaped before being put into a RegEx?
Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/5105143/
List of all characters that should be escaped before put in to RegEx?
Asked by Somebody
Could someone please give a complete list of special characters that should be escaped?
I fear I don't know some of them.
Answered by Tatu Ulmanen
Take a look at PHP.JS's implementation of PHP's preg_quote function, which should do what you need:
The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -
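As a rough illustration of that list, a preg_quote-style helper can be sketched in JavaScript as below. The name `pregQuote` is made up for this example and this is not the PHP.JS source; it simply prefixes each character from the list with a backslash.

```javascript
// Hypothetical helper (not the PHP.JS code): escapes the characters from
// the list above by prefixing each one with a backslash.
function pregQuote(text) {
  // Inside the character class, \\ matches a backslash, and \[ \^ \] \-
  // are escaped so they are taken literally rather than as class syntax.
  return String(text).replace(/[.\\+*?\[\^\]$(){}=!<>|:\-]/g, '\\$&');
}

const quoted = pregQuote('1+1=2 (approx.)');
console.log(quoted);

// The quoted text can then be used as a literal pattern.
const re = new RegExp('^' + quoted + '$');
console.log(re.test('1+1=2 (approx.)')); // true
```

Some of these escapes (for example `\=` or `\:`) are only accepted in patterns without the `u` flag, so this sketch assumes a non-Unicode pattern.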
Answered by Andrea
According to this site, the characters to escape are the opening square bracket [, the backslash \, the caret ^, the dollar sign $, the period or dot ., the vertical bar or pipe symbol |, the question mark ?, the asterisk or star *, the plus sign +, the opening round bracket ( and the closing round bracket ).
In addition to that, you need to escape characters that the JavaScript interpreter treats as the end of the string, that is, either ' or ", depending on which one delimits the string.
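To make the string-literal point concrete, here is a small sketch comparing a regex literal with the same pattern passed to the `RegExp` constructor as a string. Note that the doubled backslash is an addition to what the answer states: in a string literal the backslash itself also has to be escaped. The sample patterns and inputs are arbitrary.

```javascript
// Regex literal: each backslash is written once.
const fromLiteral = /\d+\.\d+/;

// Same pattern as a string: every backslash is doubled, because the string
// parser consumes one level of escaping before the regex engine sees it.
const fromString = new RegExp('\\d+\\.\\d+');

console.log(fromLiteral.source === fromString.source); // true

// A quote only needs escaping when it matches the string's own delimiter.
const singleQuoted = new RegExp('it\'s');      // delimiter is ', so ' is escaped
const doubleQuoted = new RegExp("say \"hi\""); // delimiter is ", so " is escaped

console.log(singleQuoted.test("it's"));     // true
console.log(doubleQuoted.test('say "hi"')); // true
```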
Answered by jj2005
Inside a character set, a literal hyphen - needs to be escaped when it is not positioned at the start or the end. For example, given the position of the last hyphen in the following pattern, it needs to be escaped:
[a-z0-9\-_]+
But it doesn't need to be escaped here:
[a-z0-9_-]+
If you fail to escape a hyphen, the engine will attempt to interpret it as a range between the preceding character and the next character (just like a-z matches any character between a and z).
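The range behaviour is easy to check with a small sketch. The character set `+`, `-`, `.` below is chosen only for illustration: those three characters sit next to each other in the character table, so an unescaped hyphen between `+` and `.` silently lets the comma through as well.

```javascript
// Unescaped hyphen between two characters: "+-." is the range
// U+002B..U+002E, i.e. "+", ",", "-" and ".".
const asRange = /^[+-.]$/;
// Escaped hyphen: the set is exactly "+", "-" and ".".
const escaped = /^[+\-.]$/;
// Hyphen placed last: same set, no escape needed.
const hyphenLast = /^[+.-]$/;

console.log(asRange.test(','));    // true
console.log(escaped.test(','));    // false
console.log(hyphenLast.test(',')); // false
console.log(escaped.test('-'));    // true

// A range whose end points are in the wrong order is rejected outright.
let rangeError = null;
try {
  new RegExp('[z-a]');
} catch (err) {
  rangeError = err;
}
console.log(rangeError instanceof SyntaxError); // true
```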
Additionally, /s do not need to be escaped inside a character set (though they do need to be escaped when outside a character set). So, the following syntax is valid:
const pattern = /[/]/;
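A short sketch of the slash rule, for comparison. The `RegExp` constructor line is an addition to what the answer says: in a string there is no literal delimiter to clash with, so the slash can stay unescaped there too. The sample input is arbitrary.

```javascript
const inSet = /[/]/;    // unescaped "/" inside a character set
const outside = /a\/b/; // outside a set, "/" would end the literal, so it is escaped
const viaConstructor = new RegExp('a/b'); // string pattern: no delimiter, no escape

console.log(inSet.test('a/b'));          // true
console.log(outside.test('a/b'));        // true
console.log(viaConstructor.test('a/b')); // true
```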
Answered by hngr18
Based on Tatu Ulmanen's answer, my solution in C# took this form:
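private static List<string> RegexSpecialCharacters = new List<string>
{
    "\\", // backslash first, so backslashes added later are not escaped again
    ".",
    "+",
    "*",
    "?",
    "[",
    "^",
    "]",
    "$",
    "(",
    ")",
    "{",
    "}",
    "=",
    "!",
    "<",
    ">",
    "|",
    ":",
    "-"
};

// Start from the raw input and escape one character class per pass,
// feeding each result into the next pass.
rgxPattern = input;
foreach (var rgxSpecialChar in RegexSpecialCharacters)
    rgxPattern = rgxPattern.Replace(rgxSpecialChar, "\\" + rgxSpecialChar);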
Note that I have switched the positions of '\' and '.'; failing to process the backslashes first will lead to doubling up of the '\'s.
Edit
Here is a JavaScript translation:
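var regexSpecialCharacters = [
  "\\", ".", "+", "*", "?",
  "[", "^", "]", "$", "(",
  ")", "{", "}", "=", "!",
  "<", ">", "|", ":", "-"
];
// "\\" comes first so that backslashes added for later characters are not escaped again.
// `input` is assumed to be a string variable declared elsewhere with let or var.
regexSpecialCharacters.forEach(rgxSpecChar =>
  input = input.replace(new RegExp("\\" + rgxSpecChar, "gm"), "\\" + rgxSpecChar))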
Answered by Michael S
I was looking for this list in connection with ESLint's "no-useless-escape" rule for regular expressions, and found that some of the characters mentioned do not need to be escaped in a JS regular expression. The longer list in the other answer here is for PHP, which does require the additional characters to be escaped.
In this GitHub issue for ESLint, about halfway down, user not-an-aardvark explains why the character referenced in the issue is a character that should maybe be escaped.
In JavaScript, a character that NEEDS to be escaped is a syntax character, that is, one of these:
^ $ \ . * + ? ( ) [ ] { } |
The response to the GitHub issue I linked to above includes an explanation of "Annex B" semantics (which I don't know much about), which allow 4 of the above-mentioned characters to be left unescaped: ) ] { }.
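One way to see the Annex B allowance in practice is to try compiling each of those characters as a lone pattern, with and without the `u` flag (which switches the legacy allowances off). This is only a sketch; the helper name `compiles` is made up for the example. As far as I can tell, engines accept a lone `]`, `{` or `}` in a non-Unicode pattern but still reject a lone `)`, so treat `)` in the list above with caution.

```javascript
// Hypothetical helper: true when `source` compiles as a pattern with `flags`.
function compiles(source, flags = '') {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    return false;
  }
}

for (const ch of [')', ']', '{', '}']) {
  console.log(ch, 'legacy:', compiles(ch), 'unicode:', compiles(ch, 'u'));
}
```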
Another thing to note is that escaping a character that doesn't require escaping won't do any harm (except maybe if you're trying to escape the escape character). So, my personal rule of thumb is: "When in doubt, escape"
Answered by haravares
The problem:
const character = '+'
new RegExp(character, 'gi') // error
Smart solutions:
// with babel-polyfill
// Warning: will be removed from babel-polyfill v7
const character = '+'
const escapeCharacter = RegExp.escape(character)
new RegExp(escapeCharacter, 'gi') // /\+/gi
// ES5
const character = '+'
const escapeCharacter = escapeRegExp(character)
new RegExp(escapeCharacter, 'gi') // /\+/gi
function escapeRegExp(string){
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}
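As a usage sketch for the ES5 approach, the snippet below builds a pattern from arbitrary text and counts its occurrences. `countOccurrences` is a made-up helper for this example, and `escapeRegExp` is repeated (with the backslash included in the character class) so that the snippet runs on its own.

```javascript
function escapeRegExp(string) {
  // "$&" in the replacement stands for the matched character.
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: counts case-insensitive occurrences of a literal needle.
function countOccurrences(haystack, needle) {
  if (needle === '') return 0; // an empty pattern would match at every position
  const matches = haystack.match(new RegExp(escapeRegExp(needle), 'gi'));
  return matches ? matches.length : 0;
}

console.log(countOccurrences('1+1=2, 2+2=4, C++', '+')); // 4
console.log(countOccurrences('a.b axb A.B', 'a.b'));     // 2
```

Without the escaping step, the first call would throw and the second would also count "axb", because the dot would act as a wildcard.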