Original question: http://stackoverflow.com/questions/4202538/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Escape regex special characters in a Python string
Asked by Wolfy
Does Python have a function that I can use to escape special characters in a regular expression?
For example, I'm "stuck" :\ should become I\'m \"stuck\" :\\.
Accepted answer by pyfunc
Use re.escape
>>> import re
>>> re.escape(r'\ a.*$')
'\\\\\\ a\\.\\*\\$'
>>> print(re.escape(r'\ a.*$'))
\\\ a\.\*\$
>>> re.escape('www.stackoverflow.com')
'www\\.stackoverflow\\.com'
>>> print(re.escape('www.stackoverflow.com'))
www\.stackoverflow\.com
Repeating it here:
re.escape(string)
Return string with all non-alphanumerics backslashed; this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it.
As of Python 3.7, re.escape() was changed to escape only characters that are meaningful to regex operations.
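As a minimal sketch of why this matters when matching a literal string (the sample strings here are made up for illustration), compare a pattern built with and without re.escape():

```python
import re

# Hypothetical input: text that should be matched literally, not as a pattern.
literal = 'www.stackoverflow.com'

unescaped = re.compile(literal)             # '.' matches any character
escaped = re.compile(re.escape(literal))    # '.' matches only a literal dot

print(bool(unescaped.search('wwwXstackoverflowYcom')))     # True
print(bool(escaped.search('wwwXstackoverflowYcom')))       # False
print(bool(escaped.search('see www.stackoverflow.com')))   # True
```

Without escaping, each dot acts as a wildcard, so the pattern also matches strings that do not contain the literal text.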
Answer by poke
It's not that hard:
def escapeSpecialCharacters ( text, characters ):
    for character in characters:
        text = text.replace( character, '\\' + character )
    return text

>>> escapeSpecialCharacters( 'I\'m "stuck" :\\', '\'"' )
'I\\\'m \\"stuck\\" :\\'
>>> print( _ )
I\'m \"stuck\" :\
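One thing to watch with this replace-in-a-loop approach is the order of the characters when the backslash itself should also be escaped, as in the question's expected output. The sketch below reuses the same logic under a hypothetical snake_case name and shows both orderings; it assumes Python 3.

```python
def escape_special_characters(text, characters):
    """Prefix each character in `characters` with a backslash (same loop as above)."""
    for character in characters:
        text = text.replace(character, '\\' + character)
    return text

text = 'I\'m "stuck" :\\'    # the string  I'm "stuck" :\

# Backslash listed first: existing backslashes are doubled before new ones are added.
good = escape_special_characters(text, '\\\'"')
print(good)   # I\'m \"stuck\" :\\

# Backslash listed last: the backslashes just added for the quotes get doubled too.
bad = escape_special_characters(text, '\'"\\')
print(bad)    # I\\'m \\"stuck\\" :\\
```

Listing the backslash first gives the result the question asks for; listing it last double-escapes the quotes.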
Answer by dp_
Use repr()[1:-1]. In this case, the double quotes don't need to be escaped. The [1:-1] slice removes the single quotes from the beginning and the end.
>>> x = raw_input()
I'm "stuck" :\
>>> print x
I'm "stuck" :\
>>> print repr(x)[1:-1]
I\'m "stuck" :\\
Or maybe you just want to escape a phrase to paste into your program? If so, do this:
>>> raw_input()
I'm "stuck" :\
'I\'m "stuck" :\\'
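A caveat worth checking before relying on this trick: repr() chooses its quote style from the string's contents, so the single quote is only escaped when the string also contains a double quote. A small Python 3 sketch with made-up sample strings:

```python
with_both = 'I\'m "stuck" :\\'    # contains both quote types
only_single = "I'm stuck"         # contains only a single quote

# repr() wraps this one in single quotes, so the inner ' is escaped.
print(repr(with_both)[1:-1])      # I\'m "stuck" :\\

# repr() wraps this one in double quotes, so nothing is escaped.
print(repr(only_single)[1:-1])    # I'm stuck
```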
Answer by Tim Ruddick
I'm surprised no one has mentioned using regular expressions via re.sub():
import re
print(re.sub(r'([\"])',    r'\\\1', 'it\'s "this"'))  # it's \"this\"
print(re.sub(r"([\'])",    r'\\\1', 'it\'s "this"'))  # it\'s "this"
print(re.sub(r'([\" \'])', r'\\\1', 'it\'s "this"'))  # it\'s\ \"this\"
Important things to note:
- In the search pattern, include \ as well as the character(s) you're looking for. You're going to be using \ to escape your characters, so you need to escape that as well.
- Put parentheses around the search pattern, e.g. ([\"]), so that the substitution pattern can use the found character when it adds \ in front of it. (That's what \1 does: uses the value of the first parenthesized group.)
- The r in front of r'([\"])' means it's a raw string. Raw strings use different rules for escaping backslashes. To write ([\"]) as a plain string, you'd need to double all the backslashes and write '([\\"])'. Raw strings are friendlier when you're writing regular expressions.
- In the substitution pattern, you need to escape \ to distinguish it from a backslash that precedes a substitution group, e.g. \1, hence r'\\\1'. To write that as a plain string, you'd need '\\\\\\1', and nobody wants that.
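The raw-string and plain-string spellings in the notes above can be checked directly. The sketch below (Python 3) also shows \g<1>, an alternative way to reference the first group in re.sub() that some readers find easier to read next to an escaped backslash; the combined character class is my own variation, not part of the answer.

```python
import re

# The raw and plain spellings from the notes denote the same strings.
assert r'([\"])' == '([\\"])'
assert r'\\\1' == '\\\\\\1'

# r'\\' is an escaped backslash; \g<1> is the first parenthesized group.
result = re.sub(r'(["\'])', r'\\\g<1>', 'it\'s "this"')
print(result)   # it\'s \"this\"
```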
Answer by spatar
As mentioned above, the answer depends on your use case. If you want to escape a string for use in a regular expression, use re.escape(). But if you want to escape a specific set of characters, use this lambda function:
>>> escape = lambda s, escapechar, specialchars: "".join(escapechar + c if c in specialchars or c == escapechar else c for c in s)
>>> s = raw_input()
I'm "stuck" :\
>>> print s
I'm "stuck" :\
>>> print escape(s, "\\", ['"'])
I'm \"stuck\" :\\
Answer by Christoph Roeder
If you only want to replace some characters you could use this:
import re
print(re.sub(r'([\.\\+\*\?\[\^\]$\(\)\{\}\!\<\>\|\:\-])', r'\\\1', "example string."))  # example string\.
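If the set of characters to escape varies, the character class can be built at run time instead of written by hand. This is only a sketch: escape_chars is a hypothetical helper name, and it assumes a non-empty chars argument, since an empty class would not compile. re.escape() keeps characters such as ], ^ and - safe inside the class, and a function replacement avoids writing backslashes in a substitution template.

```python
import re

def escape_chars(text, chars):
    """Backslash-escape every occurrence in `text` of the characters in `chars`."""
    # Build a character class such as [\.\+\*\?] from the given characters.
    pattern = '[' + re.escape(chars) + ']'
    # The function receives each match and returns it prefixed with a backslash.
    return re.sub(pattern, lambda m: '\\' + m.group(0), text)

print(escape_chars('example string.', '.+*?'))   # example string\.
```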

