pandas: Iterating over groups in a dataframe
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/46230895/
Asked by Tolki
The issue I am having is that I want to group the dataframe and then use functions to manipulate the data after it has been grouped. For example, I want to group the data by Date and then iterate through each row in each date group to pass it to a function.
The issue is that groupby seems to create a tuple of the key and then a massive string consisting of all of the rows in the data, making it impossible to iterate through each row.
Answered by cs95
When you apply groupby on a dataframe, you don't get rows, you get groups of dataframes: iterating over the result yields pairs of (group key, sub-dataframe). For example, consider:
df
    ID        Date  Days  Volume/Day
0  111  2016-01-01    20          50
1  111  2016-02-01    25          40
2  111  2016-03-01    31          35
3  111  2016-04-01    30          30
4  111  2016-05-01    31          25
5  112  2016-01-01    31          55
6  112  2016-01-02    26          45
7  112  2016-01-03    31          40
8  112  2016-01-04    30          35
9  112  2016-01-05    31          30
for i, g in df.groupby('ID'):
    print(g, '\n')
    ID        Date  Days  Volume/Day
0  111  2016-01-01    20          50
1  111  2016-02-01    25          40
2  111  2016-03-01    31          35
3  111  2016-04-01    30          30
4  111  2016-05-01    31          25

    ID        Date  Days  Volume/Day
5  112  2016-01-01    31          55
6  112  2016-01-02    26          45
7  112  2016-01-03    31          40
8  112  2016-01-04    30          35
9  112  2016-01-05    31          30
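Since each group is itself a dataframe, the rows inside it can be walked like the rows of any other dataframe. Here is a minimal sketch of what the question describes, grouping by Date and passing each row to a function. The shortened sample data and the process_row helper are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data with the same columns as the example above.
df = pd.DataFrame({
    "ID": [111, 111, 112, 112],
    "Date": ["2016-01-01", "2016-02-01", "2016-01-01", "2016-01-02"],
    "Days": [20, 25, 31, 26],
    "Volume/Day": [50, 40, 55, 45],
})

def process_row(row):
    # Hypothetical per-row function: total volume for this row.
    return row["Days"] * row["Volume/Day"]

totals = {}
for date, group in df.groupby("Date"):
    # `date` is the group key; `group` is a DataFrame with only that date's rows.
    totals[date] = [process_row(row) for _, row in group.iterrows()]

print(totals)
```

Looping with iterrows is fine for small data or for functions that truly need one row at a time, but it is usually slower than the vectorized groupby methods.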
For your case, you should probably look into dfGroupby.apply if you want to apply some function to your groups, dfGroupby.transform if you want to produce a like-indexed dataframe (see the docs for an explanation), or dfGroupby.agg if you want to produce aggregated results.
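To make the difference between the three concrete, here is a small sketch on made-up data. The column choices and the total_volume helper are assumptions for illustration only. Columns are selected explicitly before apply because the way apply treats the grouping column has changed between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, a shortened version of the example above.
df = pd.DataFrame({
    "ID": [111, 111, 112, 112],
    "Days": [20, 25, 31, 26],
    "Volume/Day": [50, 40, 55, 45],
})

grouped = df.groupby("ID")

# agg: reduces each group to a single value, giving one row per group.
mean_days = grouped["Days"].agg("mean")

# transform: returns a result with the same index as the original dataframe,
# with each row holding the value computed for its group.
group_max = grouped["Volume/Day"].transform("max")

# apply: calls an arbitrary function on each group's sub-dataframe.
def total_volume(g):
    # Hypothetical helper: sum of Days * Volume/Day within one group.
    return (g["Days"] * g["Volume/Day"]).sum()

totals = grouped[["Days", "Volume/Day"]].apply(total_volume)

print(mean_days, group_max, totals, sep="\n\n")
```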
You'd do something like:
r = df.groupby('Date').apply(your_function)
You'd define your function as:
def your_function(df):
    ... # operation on df (the sub-dataframe for one group)
    return result
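As one possible way to fill in that template, the sketch below returns a small Series per date, so that apply assembles the per-group results into a dataframe with one row per Date. The sample data and the "rows" and "total_volume" labels are hypothetical. The column selection before apply is a choice made here to avoid depending on how a given pandas version handles the grouping column.

```python
import pandas as pd

# Hypothetical sample data with the same columns as the example above.
df = pd.DataFrame({
    "ID": [111, 112, 111, 112],
    "Date": ["2016-01-01", "2016-01-01", "2016-02-01", "2016-01-02"],
    "Days": [20, 31, 25, 26],
    "Volume/Day": [50, 55, 40, 45],
})

def your_function(group):
    # `group` is the sub-dataframe holding the rows for a single date.
    volume = group["Days"] * group["Volume/Day"]
    # Returning a Series yields one output column per label.
    return pd.Series({"rows": len(group), "total_volume": volume.sum()})

r = df.groupby("Date")[["Days", "Volume/Day"]].apply(your_function)
print(r)
```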
If you have problems with the implementation, please open a new question, post your data and your code, and any associated errors/tracebacks. Happy coding.

