Notice: This page is a Chinese/English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/16348538/

Date: 2020-08-18 22:21:45  Source: igfitidea

Python regex for int with at least 4 digits

Tags: python, regex, int, match

Asked by kramer65

I am just learning regex and I'm a bit confused here. I've got a string from which I want to extract an int with at least 4 digits and at most 7 digits. I tried it as follows:

>>> import re
>>> teststring = 'abcd123efg123456'
>>> re.match(r"[0-9]{4,7}$", teststring)

I was expecting 123456, but unfortunately this returns nothing at all. Could anybody help me out a little bit here?

Accepted answer by Andrew Cheong

@ExplosionPills is correct, but there would still be two problems with your regex.

First, $ matches the end of the string. I'm guessing you'd like to be able to extract an int in the middle of the string as well, e.g. abcd123456efg789 should return 123456. To fix that, you want this:

r"[0-9]{4,7}(?![0-9])"
            ^^^^^^^^^

The added portion is a negative lookahead assertion, meaning "...not followed by any more digits." Let me simplify that by using \d, though:

r"\d{4,7}(?!\d)"

That's better. Now, the second problem. You have no constraint on the left side of your regex, so given a string like abcd123efg123456789, you'd actually match 3456789. So, you need a negative lookbehind assertion as well:

r"(?<!\d)\d{4,7}(?!\d)"
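
To see what each step of the answer above changes, here is a minimal sketch that runs the lookahead-only pattern and the final pattern against the example strings used in this thread. The helper name first_match is only for illustration and is not part of the original answer.

```python
import re

# Patterns quoted from the answer above, in the order they are introduced.
LOOKAHEAD_ONLY = re.compile(r"\d{4,7}(?!\d)")
BOTH_SIDES = re.compile(r"(?<!\d)\d{4,7}(?!\d)")

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches."""
    m = pattern.search(text)
    return m.group() if m else None

for text in ("abcd123456efg789", "abcd123efg123456", "abcd123efg123456789"):
    print(text, first_match(LOOKAHEAD_ONLY, text), first_match(BOTH_SIDES, text))
# abcd123456efg789 123456 123456
# abcd123efg123456 123456 123456
# abcd123efg123456789 3456789 None
```

The last line shows the second problem: without the lookbehind, a 9-digit run still yields its last 7 digits, while the final pattern rejects the run entirely. One caveat: for str patterns in Python 3, \d also matches non-ASCII decimal digits unless the re.ASCII flag is passed, so [0-9] is the stricter spelling if that matters for your input.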

Answer by Explosion Pills

.match only succeeds if the pattern matches at the start of the string. Use .search instead.
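
A quick way to check this is to run both calls on the string and pattern from the question. This is only a small sketch of the difference, not a complete fix for the pattern itself:

```python
import re

teststring = "abcd123efg123456"  # string from the question
pattern = r"[0-9]{4,7}$"         # pattern from the question, unchanged

# re.match only tries position 0, where the string has 'a', so it fails.
print(re.match(pattern, teststring))           # None

# re.search scans forward and finds the digit run at the end of the string.
print(re.search(pattern, teststring).group())  # 123456
```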

Answer by galarant

You can also use:

re.findall(r"[0-9]{4,7}", teststring)

This returns a list of all substrings that match your regex, in your case ['123456'].

If you're interested in just the first matched substring, then you can write this as:

next(iter(re.findall(r"[0-9]{4,7}", teststring)), None)
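
For comparison, here is a small sketch that puts the one-liner above next to an re.search version. The function names are made up for illustration. re.search stops at the first match instead of building the whole list first, which may be preferable on long inputs.

```python
import re

PATTERN = r"[0-9]{4,7}"  # pattern from this answer

def first_via_findall(text):
    # The one-liner from the answer: builds the full list, then takes its head.
    return next(iter(re.findall(PATTERN, text)), None)

def first_via_search(text):
    # Alternative: re.search returns as soon as the first match is found.
    m = re.search(PATTERN, text)
    return m.group() if m else None

teststring = "abcd123efg123456"
print(first_via_findall(teststring), first_via_search(teststring))  # 123456 123456
print(first_via_findall("abc"), first_via_search("abc"))            # None None

# Without lookarounds, a longer digit run is cut down to its first 7 digits.
print(re.findall(PATTERN, "abcd123efg123456789"))                   # ['1234567']
```

If runs longer than 7 digits should be rejected rather than truncated, the lookaround pattern from the accepted answer can be used as PATTERN instead.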