Original question: http://stackoverflow.com/questions/33312175/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
matching any character including newlines in a Python regex subexpression, not globally
Asked by Jason S
I want to use re.MULTILINE but NOT re.DOTALL, so that I can have a regex that includes both an "any character" wildcard and the normal . wildcard that doesn't match newlines.
Is there a way to do this? What should I use to match any character in those instances that I want to include newlines?
Accepted answer by Wiktor Stribiżew
To match a newline, or "any symbol", without re.S/re.DOTALL, you may use any of the following:
[\s\S]
[\w\W]
[\d\D]
The main idea is that the opposite shorthand classes inside a character class match any symbol there is in the input string.
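As a minimal sketch of how this looks in practice, the snippet below mixes the ordinary . wildcard with [\s\S] in a single pattern compiled with re.MULTILINE only. The sample text and the BEGIN/END markers are made up for illustration and are not part of the original question.

```python
import re

# Hypothetical sample input: two blocks delimited by BEGIN/END lines.
text = "BEGIN first\nline a\nline b\nEND\nBEGIN second\nline c\nEND\n"

# Only re.MULTILINE is set, so '.' still stops at newlines,
# while [\s\S] matches any character, newlines included.
block = re.compile(r"^BEGIN (.*)\n([\s\S]*?)^END$", re.MULTILINE)

# (.*) captures the rest of the BEGIN line; ([\s\S]*?) lazily captures
# everything up to the next line that consists of END.
blocks = block.findall(text)
print(blocks)

# The difference between the two wildcards in isolation:
dot_only = re.findall(r".+", "a\nb")         # one match per line
any_char = re.findall(r"[\s\S]+", "a\nb")    # a single match spanning the newline
print(dot_only, any_char)
```

Because re.DOTALL is never enabled, the choice between "stop at the line end" and "cross line boundaries" is made per subexpression rather than for the whole pattern.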
Compared with (.|\s) and other alternation-based variants, the character class solution is much more efficient because it involves far less backtracking when used with a * or + quantifier. In a small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2.
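Python's re module does not report step counts, so the comparison can only be approximated there. The sketch below checks that both patterns match the same text and then uses timeit as a rough proxy for the work done. The input string and repetition counts are arbitrary assumptions, and the timings will vary by machine and Python version.

```python
import re
import timeit

# Arbitrary multi-line input, chosen only for illustration.
text = "line one\nline two\nline three\n" * 200

alternation = re.compile(r"(?:.|\n)+")   # alternation-based "any character"
char_class = re.compile(r"[\s\S]+")      # character-class "any character"

# Both patterns consume the entire string, newlines included.
alt_match = alternation.fullmatch(text)
cls_match = char_class.fullmatch(text)

# Wall-clock time as a rough stand-in for regex engine steps.
alt_time = timeit.timeit(lambda: alternation.fullmatch(text), number=20)
cls_time = timeit.timeit(lambda: char_class.fullmatch(text), number=20)
print(f"alternation: {alt_time:.4f}s  character class: {cls_time:.4f}s")
```

The two patterns are interchangeable in what they match, so the character class can replace the alternation without changing results.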