Python: How to get the length of itertools _grouper

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/13870962/

Time: 2020-08-18 09:48:09  Source: igfitidea

Tags: python, group-by, itertools

Asked by user1466679

I'm working with Python itertools and using groupby to sort a bunch of pairs by the last element. I've gotten it to sort and I can iterate through the groups just fine, but I would really love to be able to get the length of each group without having to iterate through each one, incrementing a counter.

The project clusters some data points. I'm working with pairs of (numpy.array, int), where the numpy array is a data point and the integer is a cluster label.

Here's my relevant code:

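import itertools

# Each item is a (point, cluster) pair; tuple-unpacking lambdas are Python 2 only, so index the pair instead.
data = sorted(data, key=lambda pair: pair[1])
for cluster, clusterList in itertools.groupby(data, key=lambda pair: pair[1]):
    if len(clusterList) < minLen:  # this is the line that raises the error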

On the last line, if len(clusterList) < minLen:, I get this error:

object of type 'itertools._grouper' has no len()

I've looked up the operations available on _grouper objects, but can't find anything that provides the length of a group.

Accepted answer by kindall

Just because you call it clusterList doesn't make it a list! It's basically a lazy iterator, returning each item as it's needed. You can convert it to a list like this, though:

clusterList = list(clusterList)

Or do that and get its length in one step:

length = len(list(clusterList))

If you don't want to spend the memory on building a list, you can do this instead:

length = sum(1 for x in clusterList)

Be aware that the original iterator will be consumed entirely, whether you convert it to a list or use the sum() formulation.
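To make the point about consumption concrete, here is a minimal sketch. The sample pairs and the names seen, length, and leftover are made up for illustration and are not from the question. It counts each group with sum() and then tries to read the same group again, which yields nothing:

```python
import itertools

# Hypothetical sample data: (point, cluster_label) pairs, already sorted by label.
data = [("a", 0), ("b", 0), ("c", 1)]

seen = []
for label, group in itertools.groupby(data, key=lambda pair: pair[1]):
    length = sum(1 for _ in group)  # walks the group once, exhausting it
    leftover = list(group)          # a second pass finds nothing left
    seen.append((label, length, leftover))

print(seen)
```

So if the items are still needed after counting, converting the group to a list first and calling len() on that list is probably the simpler route.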

Answer by Brian Cain

clusterList is iterable, but it is not a list. This can be a little confusing sometimes. You can do a for loop over clusterList, but you can't do other list things with it (slice, len, etc).

Fix: assign the result of list(clusterList) to clusterList.
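For reference, a rough Python 3 sketch of how the list() fix might sit inside the asker's loop. The function name small_clusters, the min_len parameter, and the sample pairs are assumptions for illustration; plain tuples stand in for the numpy arrays:

```python
import itertools
from operator import itemgetter

def small_clusters(pairs, min_len):
    """Return {label: points} for clusters with fewer than min_len points."""
    by_label = itemgetter(1)  # each pair is (point, cluster_label)
    result = {}
    # groupby only merges adjacent items, so sort by the same key first.
    for label, group in itertools.groupby(sorted(pairs, key=by_label), key=by_label):
        members = list(group)  # materialize once; len() now works
        if len(members) < min_len:
            result[label] = [point for point, _ in members]
    return result

# Hypothetical data, not taken from the question.
pairs = [((0.0, 0.1), 1), ((5.0, 5.2), 2), ((0.2, 0.0), 1), ((9.0, 9.1), 3)]
print(small_clusters(pairs, min_len=2))
```

Sorting with a key that looks only at the label means the points themselves are never compared, which matters when they are numpy arrays.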