Python: How to iterate over a Pandas Series generated from groupby().size()

Question

Asked by Reily Bourne

How do you iterate over a Pandas Series generated from a .groupby('...').size() command and get both the group name and the count?

As an example, if I have:

How can I loop over them so that on each iteration I have -1 & 7, 0 & 85, 1 & 14 and 2 & 5 in variables?

I tried the enumerate option but it doesn't quite work. Example:

for i, row in enumerate(df.groupby(['foo']).size()):
    print(i, row)

It doesn't return -1, 0, 1, and 2 for i but rather 0, 1, 2, 3.

Answer 1

Answered by Psidom

Update:

Given a pandas Series:

s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])

s
#a    1
#b    2
#c    3
#d    4
#dtype: int64

You can loop over it directly, which yields one value from the Series in each iteration:

for i in s:
    print(i)
#1
#2
#3
#4

If you want to access the index at the same time, you can use the items method (or iteritems, which was deprecated in pandas 1.5 and removed in 2.0); both produce a lazy iterator of (index, value) pairs:

for i, v in s.items():
    print('index: ', i, 'value: ', v)
#index:  a value:  1
#index:  b value:  2
#index:  c value:  3
#index:  d value:  4

for i, v in s.iteritems():
    print('index: ', i, 'value: ', v)
#index:  a value:  1
#index:  b value:  2
#index:  c value:  3
#index:  d value:  4
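
Because items() yields plain (index, value) tuples, the pairs can also be collected in one step instead of looping. A minimal sketch, reusing the Series s from above; the names pairs and as_dict are only illustrative:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# items() is lazy; wrap it in list() to materialize every (index, value) pair
pairs = list(s.items())

# The same pairs can be turned into a plain dict keyed by index label
as_dict = dict(s.items())

print(pairs)
print(as_dict)
```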

Old Answer:

旧答案：

You can call the iteritems() method on the Series (on pandas 2.0 and later, where iteritems() was removed, use items() instead):

for i, row in df.groupby('a').size().iteritems():
    print(i, row)

# 12 4
# 14 2

According to the documentation:

Series.iteritems()
Lazily iterate over (index, value) tuples

Note: This is not the same data as in the question, just a demo.
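
Applied to the counts from the question, the same approach might look like the sketch below. The DataFrame is invented for illustration, since the original data is not shown; only the column name foo and the group sizes 7, 85, 14 and 5 come from the question. It uses items() so that it runs on current pandas versions.

```python
import pandas as pd

# Hypothetical data: a 'foo' column whose labels repeat 7, 85, 14 and 5 times
df = pd.DataFrame({'foo': [-1] * 7 + [0] * 85 + [1] * 14 + [2] * 5})

sizes = df.groupby('foo').size()

# Each iteration yields the group label (from the index) and its count
results = []
for name, count in sizes.items():
    results.append((name, count))
    print(name, count)
```

The enumerate() attempt in the question produced 0, 1, 2, 3 because enumerate() counts positions; the group labels live in the index of the Series, which items() exposes.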

Answer 2

Answered by dbouz

To expand on Psidom's answer, there are three useful ways to unpack data from a pd.Series. Using the same Series as Psidom:

s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])

A direct loop over s yields the value of each row.
A loop over s.items() (or s.iteritems() on pandas versions before 2.0) yields an (index, value) tuple for each row.
Using enumerate() on either of them yields nested tuples of the form (rownum, (index, value)).

The last form is useful when the index carries information other than the row number itself (for example, a time series whose index is a timestamp).

s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])

for rownum,(indx,val) in enumerate(s.iteritems()):
    print('row number: ', rownum, 'index: ', indx, 'value: ', val)

This will output:

row number:  0 index:  a value:  1
row number:  1 index:  b value:  2
row number:  2 index:  c value:  3
row number:  3 index:  d value:  4
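
For the time-series case mentioned above, a sketch might look like the following. The dates and values are made up for illustration, and items() is used in place of iteritems() so that it runs on current pandas versions.

```python
import pandas as pd

# Hypothetical daily readings indexed by timestamp
ts = pd.Series([10.0, 12.5, 11.0],
               index=pd.date_range('2021-01-01', periods=3, freq='D'))

rows = []
for rownum, (when, val) in enumerate(ts.items()):
    # rownum is the position; when is the Timestamp taken from the index
    rows.append((rownum, when, val))
    print(rownum, when.date(), val)
```

Here the position and the index label are genuinely different pieces of information, which is why unpacking both is worthwhile.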

You can read more on unpacking nested tuples here.
