Python: return the first string that matches a regular expression

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/38579725/

Date: 2020-08-19 21:08:56  Source: igfitidea

return string with first match Regex

python, regex

Asked by Luis Ramon Ramirez Rodriguez

I want to get the first match of a regex.

In this case, I got a list:

import re

text = 'aa33bbb44'
re.findall(r'\d+', text)

['33', '44']

I could extract the first element of the list:

text = 'aa33bbb44'
re.findall(r'\d+', text)[0]

'33'

But that only works if there is at least one match, otherwise I'll get an error:

text = 'aazzzbbb'
re.findall(r'\d+', text)[0]

IndexError: list index out of range

In which case I could define a function:

def return_first_match(text):
    try:
        result = re.findall(r'\d+', text)[0]
    except IndexError:  # no match, so the list is empty
        result = ''
    return result

Is there a way of obtaining that result without defining a new function?

Answered by Stefan Pochmann

You could embed the '' default in your regex by adding |$:

>>> re.findall(r'\d+|$', 'aa33bbb44')[0]
'33'
>>> re.findall(r'\d+|$', 'aazzzbbb')[0]
''
>>> re.findall(r'\d+|$', '')[0]
''

This also works with re.search, as pointed out by others:

>>> re.search(r'\d+|$', 'aa33bbb44').group()
'33'
>>> re.search(r'\d+|$', 'aazzzbbb').group()
''
>>> re.search(r'\d+|$', '').group()
''
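
To spell out why the |$ trick works: when no digits are found, the $ alternative matches the empty string at the end of the input, so re.search always returns a match object and never None. A minimal sketch, assuming the hypothetical names PATTERN and first_number (neither appears in the original answer):

```python
import re

# Hypothetical names for illustration only.
# The "|$" alternative matches the empty string at the end of the input,
# so search() always finds something and never returns None.
PATTERN = re.compile(r'\d+|$')

def first_number(text):
    # Safe to call .group() directly because a match is guaranteed.
    return PATTERN.search(text).group()

print(repr(first_number('aa33bbb44')))  # '33'
print(repr(first_number('aazzzbbb')))   # ''
```

Note that this relies on the pattern having no capture groups; with groups, findall returns the groups instead of the whole match.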

Answered by Iron Fist

If you only need the first match, use re.search instead of re.findall:

>>> m = re.search(r'\d+', 'aa33bbb44')
>>> m.group()
'33'
>>> m = re.search(r'\d+', 'aazzzbbb')
>>> m.group()
Traceback (most recent call last):
  File "<pyshell#281>", line 1, in <module>
    m.group()
AttributeError: 'NoneType' object has no attribute 'group'

Then you can use m as the condition to check:

>>> m = re.search(r'\d+', 'aa33bbb44')
>>> if m:
...     print('First number found = {}'.format(m.group()))
... else:
...     print('Not Found')
...
First number found = 33

Answered by Bill

I'd go with:

r = re.search(r"\d+", ch)  # ch is the input string
result = r.group(0) if r else ""

re.search only looks for the first match in the string anyway, so I think it makes your intent slightly clearer than using findall.

Answered by Tim Peters

You shouldn't be using .findall() at all; .search() is what you want. It finds the leftmost match (or returns None if no match exists).

m = re.search(pattern, text)
result = m.group(0) if m else ""

Whether you want to put that in a function is up to you. It's unusual to want to return an empty string if no match is found, which is why nothing like that is built in. It's impossible to get confused about whether .search() on its own finds a match (it returns None if it didn't, or an SRE_Match object if it did).
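
If you are on Python 3.8 or later, the two lines above can be folded into one with an assignment expression. This is only a sketch of that variation, not part of the original answer; the variable names are illustrative:

```python
import re

text = 'aa33bbb44'

# The condition is evaluated first, so m is bound before m.group(0) runs.
result = m.group(0) if (m := re.search(r'\d+', text)) else ''

# Same expression on an input with no digits: falls back to ''.
fallback = m.group(0) if (m := re.search(r'\d+', 'aazzzbbb')) else ''

print(repr(result), repr(fallback))  # '33' ''
```

Whether this reads better than the two-line version is a matter of taste.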

Answered by ketan vijayvargiya

You can do:

x = re.findall(r'\d+', text)
result = x[0] if len(x) > 0 else ''

Note that your question isn't exactly about regex. Rather, it's about how to safely get the first element of a list that may be empty.
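
Seen as a "first element or default" problem, another option is next() with a default value. The sketch below pairs it with re.finditer, which yields match objects lazily, so scanning stops at the first match. The variable names are illustrative and not from the original answer:

```python
import re

text = 'aa33bbb44'

# next() returns its second argument when the generator is exhausted,
# which here means "no match was found".
first = next((m.group() for m in re.finditer(r'\d+', text)), '')
missing = next((m.group() for m in re.finditer(r'\d+', 'aazzzbbb')), '')

print(repr(first), repr(missing))  # '33' ''
```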

Answered by Marko Mackic

This might perform a bit better when a large share of the input data does not contain the piece you want, because raising and catching an exception costs more than a simple conditional check.

def return_first_match(text):
    result = re.findall(r'\d+', text)
    result = result[0] if result else ""
    return result
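
To check the performance claim on your own machine, a rough timing sketch with timeit could look like this. The function names and the sample input are assumptions made for illustration, and the numbers will vary by Python version and hardware, so no expected timings are shown:

```python
import re
import timeit

def first_match_try(text):
    # Index the list and catch the failure when it is empty.
    try:
        return re.findall(r'\d+', text)[0]
    except IndexError:
        return ''

def first_match_check(text):
    # Test the list before indexing it.
    found = re.findall(r'\d+', text)
    return found[0] if found else ''

# Input with no digits, so the exception path is taken on every call.
no_match = 'aazzzbbb'
for func in (first_match_try, first_match_check):
    seconds = timeit.timeit(lambda: func(no_match), number=100_000)
    print(f'{func.__name__}: {seconds:.3f} s')
```

Both functions return the same results; only the cost of the no-match path differs.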