Python regex matching between two strings?

Question

Asked by Hrvoje Špoljar

I can't seem to find a way to extract all the comments in the following example.

>>> import re
>>> string = '''
... <!-- one 
... -->
... <!-- two -- -- -->
... <!-- three -->
... '''
>>> m = re.findall ( '<!--([^\(-->)]+)-->', string, re.MULTILINE)
>>> m
[' one \n', ' three ']

The block with two -- -- is not matched, most likely because of a bad regex. Can someone please point me in the right direction on how to extract the matches between two strings?

Hi, I've tested what you suggested in the comments. Here is a working solution with a small upgrade.

>>> m = re.findall ( '<!--(.*?)-->', string, re.MULTILINE)
>>> m
[' two -- -- ', ' three ']
>>> m = re.findall ( '<!--(.*\n?)-->', string, re.MULTILINE)
>>> m
[' one \n', ' two -- -- ', ' three ']

Thanks!

Answer 1

Accepted answer by iruvar

This should do the trick:

 m = re.findall ( '<!--(.*?)-->', string, re.DOTALL)
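
For reference, here is a minimal self-contained sketch of the call above, run against the sample text from the question. The variable names text and comments are illustrative only. The key point is that re.MULTILINE only changes how ^ and $ behave, while re.DOTALL is the flag that lets . match a newline, which is what a comment spanning two lines needs.

```python
import re

# Sample text from the question; the first comment spans two lines.
text = "\n<!-- one \n-->\n<!-- two -- -- -->\n<!-- three -->\n"

# re.DOTALL lets '.' match newlines; the lazy '.*?' stops at the first '-->'.
comments = re.findall(r'<!--(.*?)-->', text, re.DOTALL)
print(comments)  # [' one \n', ' two -- -- ', ' three ']
```

The lazy quantifier matters here: a greedy .* with re.DOTALL would run from the first <!-- to the last --> and return a single match.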

Answer 2

Answer by Wilduck

In general, it is impossible to do arbitrary matching between two delimiters with a regular grammar.

Specifically, if you allow nesting,

<!-- how do you deal <!-- with nested --> comments? -->

you'll run into issues. So, while you may be able to solve this specific problem with a regular expression, any regular expression you write can be broken by some other strange nesting of comments.

To parse arbitrary comments, you'll need to move on to a method for parsing context-free grammars. A simple way to do so is to use a pushdown automaton.
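
As a rough illustration of the idea, and not something from the original answer: with only one kind of opening and closing delimiter, the automaton's stack can be reduced to a depth counter. The sketch below assumes nested comments should be returned as part of the outermost comment body. The function name extract_nested_comments and its parameters are hypothetical.

```python
def extract_nested_comments(text, open_tag="<!--", close_tag="-->"):
    """Return the bodies of the outermost comments, tracking nesting depth."""
    results = []
    depth = 0      # stands in for the pushdown automaton's stack height
    start = None   # index where the current outermost body begins
    i = 0
    while i < len(text):
        if text.startswith(open_tag, i):
            if depth == 0:
                start = i + len(open_tag)
            depth += 1
            i += len(open_tag)
        elif depth > 0 and text.startswith(close_tag, i):
            depth -= 1
            if depth == 0:
                results.append(text[start:i])
            i += len(close_tag)
        else:
            i += 1
    return results


sample = "<!-- how do you deal <!-- with nested --> comments? -->"
print(extract_nested_comments(sample))
# [' how do you deal <!-- with nested --> comments? ']
```

On the non-nested sample from the question, this returns the same three comment bodies as the re.DOTALL pattern. Unclosed comments are silently dropped in this sketch.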
