Python: regex matching between two strings?
Source: http://stackoverflow.com/questions/12736074/
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Regex matching between two strings?
Asked by Hrvoje Špoljar
I can't seem to find a way to extract all the comments in the following example.
>>> import re
>>> string = '''
... <!-- one
... -->
... <!-- two -- -- -->
... <!-- three -->
... '''
>>> m = re.findall ( '<!--([^\(-->)]+)-->', string, re.MULTILINE)
>>> m
[' one \n', ' three ']
The block with "two -- --" is not matched, most likely because of a bad regex. Can someone please point me in the right direction on how to extract matches between two strings?
Hi, I've tested what you suggested in the comments. Here is a working solution with a small upgrade.
>>> m = re.findall ( '<!--(.*?)-->', string, re.MULTILINE)
>>> m
[' two -- -- ', ' three ']
>>> m = re.findall ( '<!--(.*\n?)-->', string, re.MULTILINE)
>>> m
[' one \n', ' two -- -- ', ' three ']
Thanks!
Accepted answer by iruvar
This should do the trick:
m = re.findall ( '<!--(.*?)-->', string, re.DOTALL)
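A minimal sketch of why the flag matters, assuming the same sample input as in the question (the names text, single_line and comments are only illustrative). re.MULTILINE changes only how ^ and $ behave, so it has no effect on this pattern, whereas re.DOTALL lets "." match newlines so that the lazy ".*?" can span a comment that breaks across lines:

```python
import re

# Same sample as in the question, written with explicit "\n" so whitespace is unambiguous.
text = "<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

# Without re.DOTALL, "." never matches "\n", so the two-line comment is skipped.
single_line = re.findall(r"<!--(.*?)-->", text)
print(single_line)

# With re.DOTALL, "." also matches "\n"; the lazy ".*?" still stops at the first "-->".
comments = re.findall(r"<!--(.*?)-->", text, re.DOTALL)
print(comments)
```

The lazy quantifier is what keeps each match from running on to the last "-->" in the input; a greedy ".*" combined with re.DOTALL would return everything between the first opener and the final closer as one match.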
Answer by Wilduck
In general, it is impossible to do arbitrary matching between two delimiters with a regular grammar.
Specifically, if you allow nesting,
<!-- how do you deal <!-- with nested --> comments? -->
you'll run into issues. So, while you may be able to solve this specific problem with a regular expression, any regular expression you write can be broken by some other strange nesting of comments.
To parse arbitrary comments, you'll need to move on to a method for parsing context-free grammars. A simple way to do so is to use a pushdown automaton.
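As a rough sketch of that idea, and not a full parser: with only one kind of delimiter pair, the automaton's stack can be reduced to a depth counter. The function name extract_comments and its parameters are hypothetical, chosen only for illustration, and the sketch assumes that comments are allowed to nest as in the example above (real HTML comments do not nest).

```python
def extract_comments(text, open_tok="<!--", close_tok="-->"):
    """Return the bodies of top-level comments, keeping nested ones inside them."""
    comments = []
    depth = 0   # stands in for the stack: number of currently open comments
    start = 0   # index just past the outermost opening delimiter
    i = 0
    while i < len(text):
        if text.startswith(open_tok, i):
            if depth == 0:
                start = i + len(open_tok)
            depth += 1
            i += len(open_tok)
        elif depth > 0 and text.startswith(close_tok, i):
            depth -= 1
            if depth == 0:
                # The outermost comment just closed; record its body.
                comments.append(text[start:i])
            i += len(close_tok)
        else:
            i += 1
    return comments


nested = "<!-- how do you deal <!-- with nested --> comments? -->"
print(extract_comments(nested))
```

In this sketch an unterminated comment is silently dropped and a stray closing delimiter outside any comment is ignored; a stricter version could raise an error in either case.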

