Python: RegEx with multiple groups?

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4963691/

Time: 2020-08-18 18:22:14  Source: igfitidea

RegEx with multiple groups?

Tags: python, regex

Asked by joslinm

I'm getting confused returning multiple groups in Python. My RegEx is this:

lun_q = 'Lun:\s*(\d+\s?)*'

And my string is

s = '''Lun:                     0 1 2 3 295 296 297 298'''

I get back a match object and then want to look at the groups, but all it shows is the last number (298):

r.groups()  
(u'298',)

Why isn't it returning groups of 0, 1, 2, 3, 4, etc.?

Accepted answer by Ben Blank

Your regex only contains a single pair of parentheses (one capturing group), so you only get one group in your match. If you use a repetition operator (+ or *) on a capturing group, the group gets "overwritten" each time it is repeated, meaning that only the last match is captured.
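To see the overwriting in action, here is a minimal sketch using the question's pattern and sample string. The variable names `repeated`, `whole`, and `numbers` are illustrative only, and the second pattern is just one possible way to capture the full run of numbers.

```python
import re

s = "Lun:                     0 1 2 3 295 296 297 298"

# One capturing group, repeated: each repetition overwrites the previous
# capture, so only the final repetition is left in the group.
repeated = re.search(r"Lun:\s*(\d+\s?)*", s)
print(repeated.groups())

# Make the repeated part non-capturing and wrap the whole run in a single
# capturing group instead, then split that one capture on whitespace.
whole = re.search(r"Lun:\s*((?:\d+\s?)*)", s)
numbers = whole.group(1).split()
print(numbers)
```

The first pattern reports a single group no matter how many numbers follow the prefix, while the second hands back the entire run for further processing.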

In your example here, you're probably better off using .split(), in combination with a regex:

import re

# capture the whole run of numbers as a single group (raw string for the regex)
lun_q = r'Lun:\s*(\d+(?:\s+\d+)*)'
s = '''Lun: 0 1 2 3 295 296 297 298'''

r = re.search(lun_q, s)

if r:
    luns = r.group(1).split()

    # optionally, also convert luns from strings to integers
    luns = [int(lun) for lun in luns]

Answer by pokstad

Another approach would be to use the regex you already have to validate your data, and then use a more specific regex with a match iterator to extract each item you are interested in.

import re
s = '''Lun: 0 1 2 3 295 296 297 298'''
lun_validate_regex = re.compile(r'Lun:\s*((\d+)(\s\d+)*)')
match = lun_validate_regex.match(s)
if match:
    # \d+ rather than \d{1,3}, so numbers longer than three digits stay whole
    token_regex = re.compile(r"\d+")
    match_iterator = token_regex.finditer(match.group(1))
    for token_match in match_iterator:
        # do something brilliant with each number, e.g. print it
        print(token_match.group())
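The validate-then-extract idea can also be wrapped in a small function. The following is only a sketch: `parse_luns` is a hypothetical helper name, and using `re.fullmatch` (Python 3.4+) for the validation step is an assumption about how strict the check should be, not something the answer specifies.

```python
import re

def parse_luns(line):
    """Hypothetical helper: return the numbers after 'Lun:' as ints,
    or an empty list if the line does not have the expected shape."""
    # Step 1: validate the overall shape of the line.
    match = re.fullmatch(r"Lun:\s*(\d+(?:\s+\d+)*)\s*", line)
    if match is None:
        return []
    # Step 2: iterate over the individual numbers in the validated part.
    return [int(m.group()) for m in re.finditer(r"\d+", match.group(1))]

valid = parse_luns("Lun: 0 1 2 3 295 296 297 298")
invalid = parse_luns("Lun: none")
print(valid)
print(invalid)
```

Keeping validation and extraction as separate steps means malformed lines are rejected as a whole rather than partially parsed.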

Answer by kurumi

Sometimes it's easier without regex.

>>> s = '''Lun: 0 1 2 3 295 296 297 298'''
>>> if "Lun: " in s:
...     items = s.replace("Lun: ", "").split()
...     for n in items:
...         if n.isdigit():
...             print(n)
...
0
1
2
3
295
296
297
298

Answer by Rakesh kumar

If you are looking for output such as 0, 1, 2, 3, 4, etc., the answer is very simple; see the code below.

# \d+ matches whole numbers; a bare \d would return one string per digit
print(re.findall(r'\d+', s))
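The quantifier matters with `re.findall`. As a short sketch on the question's sample string (the names `digits` and `numbers` are illustrative), compare a bare `\d` with `\d+`:

```python
import re

s = "Lun: 0 1 2 3 295 296 297 298"

# \d matches one digit at a time, so multi-digit numbers are broken up.
digits = re.findall(r"\d", s)
print(digits)

# \d+ matches each maximal run of digits, so every number stays whole.
numbers = re.findall(r"\d+", s)
print(numbers)
```

Note that `findall` does not check for the `Lun:` prefix at all, so it fits best when the input is already known to have the right shape.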
