Python: counting regex matches
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/3895646/
Number of regex matches
Asked by dutt
I'm using the finditer function in the re module to match some things, and everything is working.
Now I need to find out how many matches I've got. Is it possible without looping through the iterator twice (once to find the count, then again for the real iteration)?
Some code:
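import re
# Raw string so the backslashes reach the regex engine unchanged.
imageMatches = re.finditer(r'<img src="(?P<path>[-/\w.]+)"', response[2])
# <Here I need to get the number of matches>
for imageMatch in imageMatches:
    doStuff  # placeholder for the per-match work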
Everything works, I just need to get the number of matches before the loop.
Accepted answer by JoshD
If you know you will want all the matches, you could use the re.findall function. It will return a list of all the matches. Then you can just do len(result) for the number of matches.
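A minimal sketch of this approach, assuming a made-up HTML snippet (the html variable is hypothetical and stands in for response[2] from the question). One thing to keep in mind: when the pattern contains a group, re.findall returns that group's text rather than match objects, but the length of the list is still the number of matches.

```python
import re

# Hypothetical sample input standing in for response[2] from the question.
html = '<img src="/a/one.png"> text <img src="/b/two.jpg">'

# With one group in the pattern, findall returns a list of that group's strings.
paths = re.findall(r'<img src="(?P<path>[-/\w.]+)"', html)

count = len(paths)  # number of matches, known before any loop
for path in paths:
    print(path)
```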
Answer by intuited
If you always need to know the length, and you just need the content of the match rather than the other info, you might as well use re.findall. Otherwise, if you only need the length sometimes, you can use e.g.
matches = re.finditer(...)
...
matches = tuple(matches)
to store the iteration of the matches in a reusable tuple. Then just do len(matches).
Another option, if you just need to know the total count after doing whatever with the match objects, is to use
matches = enumerate(re.finditer(...))
which will return an (index, match) pair for each of the original matches. So then you can just store the first element of each tuple in some variable; after the loop, the last index plus one is the total count.
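As a rough sketch of the enumerate idea, assuming a made-up input string (text is hypothetical): the count is only known once the loop has finished, and it has to be initialised beforehand so that it is defined when there are no matches.

```python
import re

text = 'python jython dython'  # hypothetical sample input

count = 0  # stays 0 if the pattern never matches
for index, match in enumerate(re.finditer(r'.ython', text)):
    # Work with the match object here, e.g. match.group() or match.span().
    count = index + 1  # indices start at 0, so the running count is index + 1
```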
But if you need the length first of all, and you need match objects as opposed to just the strings, you should just do
matches = tuple(re.finditer(...))
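For illustration, a small sketch of the tuple approach with a made-up input string (text is hypothetical). Materialising the iterator costs memory proportional to the number of matches, but the length is available up front and the match objects can be iterated more than once.

```python
import re

text = 'i like python jython and dython'  # hypothetical sample input

# Consume the iterator once and keep every match object.
matches = tuple(re.finditer(r'.ython', text))
total = len(matches)  # available before the loop

# The tuple can be iterated as often as needed.
words = [m.group() for m in matches]
starts = [m.start() for m in matches]
```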
Answer by Rafe Kettler
If you find you need to stick with finditer(), you can simply use a counter while you iterate through the iterator.
Example:
>>> import re
>>> pattern = re.compile(r'.ython')
>>> string = 'i like python jython and dython (whatever that is)'
>>> iterator = pattern.finditer(string)
>>> count = 0
>>> for match in iterator:
...     count += 1
...
>>> count
3
If you need the features of finditer() (not matching overlapping instances), use this method.
Answer by Mods Vs Rockers
# An example of counting matched groups
import re
pattern = re.compile(r'(\w+).(\d+).(\w+).(\w+)', re.IGNORECASE)
search_str = "My 11 Char String"
res = re.match(pattern, search_str)
print(len(res.groups()))  # 4
print(res.group(1))  # My
print(res.group(2))  # 11
print(res.group(3))  # Char
print(res.group(4))  # String
Answer by Adam Gradzki
For those moments when you really want to avoid building lists:
import re
import operator
from functools import reduce
# The initial value 0 keeps reduce from raising TypeError when there are no matches.
count = reduce(operator.add, (1 for _ in re.finditer(my_pattern, my_string)), 0)
Sometimes you might need to operate on huge strings. This might help.
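A possibly simpler variant of the same idea, not taken from the original answer: the built-in sum over a generator also counts matches without building a list, and it returns 0 when nothing matches. The helper name count_matches is hypothetical.

```python
import re

def count_matches(pattern, string):
    """Count non-overlapping matches without storing them (hypothetical helper)."""
    # Each match contributes 1; the generator never holds more than one match at a time.
    return sum(1 for _ in re.finditer(pattern, string))
```

For example, count_matches(r'[0-9]', 'abc123') counts the three digits.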
Answer by Travis Jones
I know this is a little old, but here is a concise function for counting regex matches.
import re
def regex_cnt(string, pattern):
    # Count the non-overlapping matches of pattern in string.
    return len(re.findall(pattern, string))
string = 'abc123'
regex_cnt(string, '[0-9]')  # returns 3

