How to iterate over a Pandas Series generated from groupby().size() (Python)
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/38387529/
Asked by Reily Bourne
How do you iterate over a Pandas Series generated from a .groupby('...').size() call and get both the group name and the count?
As an example, if I have:
foo
-1 7
0 85
1 14
2 5
How can I loop over them so that in each iteration I would have -1 & 7, 0 & 85, 1 & 14, and 2 & 5 in variables?
I tried the enumerate option but it doesn't quite work. Example:
for i, row in enumerate(df.groupby(['foo']).size()):
    print(i, row)
It doesn't return -1, 0, 1, and 2 for i, but rather 0, 1, 2, 3.
Answered by Psidom
Update:
Given a pandas Series:
import pandas as pd
s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])
s
#a 1
#b 2
#c 3
#d 4
#dtype: int64
You can loop through it directly, which yields one value from the Series in each iteration:
for i in s:
    print(i)
#1
#2
#3
#4
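This also explains the result in the question: enumerate() only sees the values that a direct loop yields, so it pairs them with a positional counter rather than with the index labels. A minimal sketch, assuming a small hand-built Series that mirrors the counts in the question (the name `counts` is illustrative, not from the original post):

```python
import pandas as pd

# Hand-built stand-in for the groupby().size() result in the question.
counts = pd.Series([7, 85, 14, 5], index=[-1, 0, 1, 2])

# enumerate() numbers the values by position; the index labels are not involved.
positions = [i for i, value in enumerate(counts)]

# The labels live on the index, separately from the values.
labels = counts.index.tolist()

print(positions)  # [0, 1, 2, 3]
print(labels)     # [-1, 0, 1, 2]
```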
If you want to access the index at the same time, you can use either the items or the iteritems method, each of which yields both the index and the value on every iteration. Note that iteritems was deprecated in pandas 1.5 and removed in pandas 2.0, so only items works on current versions:
for i, v in s.items():
    print('index: ', i, 'value: ', v)
#index: a value: 1
#index: b value: 2
#index: c value: 3
#index: d value: 4
for i, v in s.iteritems():
    print('index: ', i, 'value: ', v)
#index: a value: 1
#index: b value: 2
#index: c value: 3
#index: d value: 4
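Because each iteration produces a plain (index, value) pair, the pairs can also be collected directly instead of printed. A minimal sketch, reusing the same Series (the names `pairs` and `as_dict` are illustrative):

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])

# Each element produced by items() is an (index, value) pair.
pairs = list(s.items())

# The same pairs can feed dict() when a label -> value lookup is handier.
as_dict = dict(s.items())

print(pairs[0])      # the first (index, value) pair
print(as_dict['c'])  # the value stored under label 'c'
```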
Old Answer:
You can call the iteritems() method on the Series:
for i, row in df.groupby('a').size().iteritems():
    print(i, row)
# 12 4
# 14 2
According to the docs:
Series.iteritems()
Lazily iterate over (index, value) tuples
Note: This is not the same data as in the question, just a demo.
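Applied to the shape of the data in the question, the same loop can be sketched as follows. The DataFrame here is made up so that its group sizes match the counts shown in the question, and items() is used here rather than iteritems():

```python
import pandas as pd

# Made-up data: column 'foo' holds -1 seven times, 0 eighty-five times, and so on.
df = pd.DataFrame({'foo': [-1] * 7 + [0] * 85 + [1] * 14 + [2] * 5})

sizes = df.groupby('foo').size()

# Each iteration gives the group name and its row count.
result = []
for name, count in sizes.items():
    result.append((name, count))
    print(name, count)
# -1 7
# 0 85
# 1 14
# 2 5
```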
Answered by dbouz
To expand upon the answer of Psidom, there are three useful ways to unpack data from pd.Series. Having the same Series as Psidom:
s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])
- A direct loop over s yields the value of each row.
- A loop over s.iteritems() or s.items() yields a tuple with the (index, value) pair of each row.
- Using enumerate() on s.iteritems() yields a nested tuple in the form of (rownum, (index, value)).
The last way is useful in case your index contains other information than the row number itself (e.g. in a case of a timeseries where the index is time).
s = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])
for rownum,(indx,val) in enumerate(s.iteritems()):
    print('row number: ', rownum, 'index: ', indx, 'value: ', val)
will output:
row number: 0 index: a value: 1
row number: 1 index: b value: 2
row number: 2 index: c value: 3
row number: 3 index: d value: 4
You can read more on unpacking nested tuples here.
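To make the time series case concrete, here is a small sketch with a DatetimeIndex. The dates and values are invented for illustration, and items() stands in for iteritems():

```python
import pandas as pd

# Invented daily readings; the index carries dates rather than row numbers.
dates = pd.date_range('2021-01-01', periods=3, freq='D')
s = pd.Series([10, 20, 30], index=dates)

rows = []
for rownum, (timestamp, val) in enumerate(s.items()):
    # rownum is the position, timestamp is the index label, val is the value.
    rows.append((rownum, timestamp.strftime('%Y-%m-%d'), val))

print(rows)
```

Here the row number and the index label carry different information, which is the situation where the nested unpacking is worth the extra parentheses.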