Python: find the indexes of all regex matches?

Question

Asked by xitrium

I'm parsing strings that could have any number of quoted strings inside them (I'm parsing code, and trying to avoid PLY). I want to find out if a substring is quoted, and I have the substring's index. My initial thought was to use re to find all the matches and then figure out the range of indexes they represent.

It seems like I should use re with a regex like \"[^\"]+\"|'[^']+' (I'm avoiding triple-quoted and similar strings for now). When I use findall() I get a list of the matching strings, which is somewhat nice, but I need indexes.

My substring might be as simple as c, and I need to figure out if this particular c is actually quoted or not.

Answer 1

Accepted answer by Dave Kirby

This is what you want (quoted from the re module documentation):

re.finditer(pattern, string[, flags]) 
Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found. Empty matches are included in the result unless they touch the beginning of another match.

You can then get the start and end positions from the MatchObjects.

e.g.

[(m.start(0), m.end(0)) for m in re.finditer(pattern, string)]
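To tie this back to the question, here is a minimal sketch of how those spans could be used to check whether a given index is quoted. The helper names quoted_spans and is_quoted are made up for illustration, and the pattern is the one from the question without the unneeded backslashes, so it still ignores triple-quoted strings and escaped quotes.

```python
import re

# Pattern from the question: a double-quoted or single-quoted run.
# Assumption: no triple quotes and no escaped quotes inside the strings.
QUOTED = re.compile(r'"[^"]+"|\'[^\']+\'')

def quoted_spans(text):
    """Return (start, end) pairs, end exclusive, for each quoted string."""
    return [m.span() for m in QUOTED.finditer(text)]

def is_quoted(text, index):
    """True if the character at index lies inside a quoted string (quotes included)."""
    return any(start <= index < end for start, end in quoted_spans(text))

line = 'x = "abc" + c'
print(quoted_spans(line))   # [(4, 9)]
print(is_quoted(line, 7))   # True: the c inside "abc"
print(is_quoted(line, 12))  # False: the bare c at the end
```

Note that m.end() is exclusive, so the test is start <= index < end. For long inputs it would probably be worth computing the spans once rather than on every lookup.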

Answer 2

Answer by Omkar Rahane

This should solve your issue: pattern = r"(?=(\"[^\"]+\"|'[^']+'))"

Then use the following to get all overlapping indices (end index inclusive):

indicesTuple = [(mObj.start(1), mObj.end(1) - 1) for mObj in re.finditer(pattern, input)]
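As a rough illustration of how the lookahead version differs from a plain finditer call, the sketch below runs both on a made-up sample string. The variable names and the sample text are assumptions for the example only. Because the lookahead consumes no characters, a closing quote can be picked up again as the start of another match, which may or may not be what you want when deciding whether a position is quoted.

```python
import re

text = 'a "b" "c"'  # sample input, chosen for illustration

plain = r'"[^"]+"|\'[^\']+\''
lookahead = r'(?=("[^"]+"|\'[^\']+\'))'

# Plain pattern: non-overlapping matches, end index exclusive.
non_overlapping = [m.span() for m in re.finditer(plain, text)]

# Lookahead pattern: overlapping matches taken from group 1,
# with the end index made inclusive as in the answer above.
overlapping = [(m.start(1), m.end(1) - 1) for m in re.finditer(lookahead, text)]

print(non_overlapping)  # [(2, 5), (6, 9)]
print(overlapping)      # [(2, 4), (4, 6), (6, 8)]
```

The middle tuple (4, 6) covers the text between the closing quote of "b" and the opening quote of "c", so the overlapping form reports a span that is not a quoted string in the source.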
