Python: regex matching between two strings?

Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/12736074/

Date: 2020-08-18 11:45:17  Source: igfitidea

Regex matching between two strings?

Tags: python, regex, python-3.x, regex-negation

Asked by Hrvoje Špoljar

I can't seem to find a way to extract all comments, as in the following example.

>>> import re
>>> string = '''
... <!-- one 
... -->
... <!-- two -- -- -->
... <!-- three -->
... '''
>>> m = re.findall ( '<!--([^\(-->)]+)-->', string, re.MULTILINE)
>>> m
[' one \n', ' three ']

The block containing `two -- --` is not matched, most likely because of a bad regex. Can someone please point me in the right direction on how to extract matches between two strings?



Hi, I've tested what you suggested in the comments. Here is a working solution with a small improvement:

>>> m = re.findall ( '<!--(.*?)-->', string, re.MULTILINE)
>>> m
[' two -- -- ', ' three ']
>>> m = re.findall ( '<!--(.*\n?)-->', string, re.MULTILINE)
>>> m
[' one \n', ' two -- -- ', ' three ']
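One caveat about the second pattern above: `(.*\n?)` only tolerates a single newline immediately before `-->`, and `re.MULTILINE` has no effect because the pattern contains no `^` or `$`. The following is a minimal sketch, using a made-up input (`sample` is not from the original question), of a comment that this pattern misses:

```python
import re

# Hypothetical input: the newline sits in the middle of the comment,
# not directly before the closing '-->'.
sample = "<!-- first line\nsecond line -->"

# '.*' cannot cross the newline, and '\n?' allows only one newline
# right before '-->', so nothing matches here.
single_newline = re.findall(r'<!--(.*\n?)-->', sample, re.MULTILINE)

print(single_newline)
```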

thanks!


Accepted answer by iruvar

This should do the trick:

 m = re.findall ( '<!--(.*?)-->', string, re.DOTALL)
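Why this works: a character class like the one in the question excludes individual characters (including `-`), not the sequence `-->`, so any comment containing a hyphen fails to match. The lazy `.*?` instead stops at the first `-->`, and `re.DOTALL` lets `.` match newlines as well. Here is a small self-contained sketch using the sample string from the question (the variable names `without_dotall` and `with_dotall` are only for illustration):

```python
import re

# Sample text from the question, written with explicit '\n' escapes.
string = "\n<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

# Without DOTALL, '.' stops at newlines, so the multi-line comment is missed.
without_dotall = re.findall(r'<!--(.*?)-->', string)

# With DOTALL, '.' also matches '\n'; the lazy '*?' stops at the first '-->'.
with_dotall = re.findall(r'<!--(.*?)-->', string, re.DOTALL)

print(without_dotall)  # [' two -- -- ', ' three ']
print(with_dotall)     # [' one \n', ' two -- -- ', ' three ']
```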

Answer by Wilduck

In general, it is impossible to match arbitrary content between two delimiters with a regular grammar.

Specifically, if you allow nesting,

<!-- how do you deal <!-- with nested --> comments? -->

you'll run into issues. So, while you may be able to solve this specific problem with a regular expression, any regular expression you write can be broken by some other strange nesting of comments.

To parse arbitrary comments, you'll need to move on to a method for parsing context-free grammars. A simple way to do so is to use a pushdown automaton.
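As a rough illustration of the pushdown idea, the sketch below scans the text once and keeps a nesting depth, which plays the role of a stack holding a single kind of symbol. It assumes nested comments are allowed, as in the hypothetical example above; the function name `extract_nested_comments` and its parameters are made up for this example and do not come from any library.

```python
def extract_nested_comments(text, open_tok="<!--", close_tok="-->"):
    """Return the bodies of the outermost comments, tracking nesting depth.

    Hypothetical helper: the depth counter stands in for the stack of a
    pushdown automaton (push on an opener, pop on a closer).
    """
    results = []
    depth = 0      # how many openers are currently unclosed
    start = None   # index where the current outermost body begins
    i = 0
    while i < len(text):
        if text.startswith(open_tok, i):
            if depth == 0:
                start = i + len(open_tok)
            depth += 1
            i += len(open_tok)
        elif depth > 0 and text.startswith(close_tok, i):
            depth -= 1
            if depth == 0:
                results.append(text[start:i])
            i += len(close_tok)
        else:
            i += 1
    return results


nested = "<!-- how do you deal <!-- with nested --> comments? -->"
print(extract_nested_comments(nested))
# [' how do you deal <!-- with nested --> comments? ']
```

On the flat sample from the question this returns the same three bodies as the `re.DOTALL` pattern; the difference only shows up once comments are nested.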