pandas: iterating over groups in a dataframe

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/46230895/

Time: 2020-09-14 04:27:43  Source: igfitidea

Iterating over groups in a dataframe

python pandas dataframe group-by pandas-groupby

Asked by Tolki

The issue I am having is that I want to group the dataframe and then use functions to manipulate the data after it has been grouped. For example, I want to group the data by Date and then iterate through each row in the date groups to pass to a function.

The issue is that groupby seems to create a tuple of the key and then a massive string consisting of all of the rows in the data, which makes iterating through each row impossible.

Answered by cs95

When you apply groupby on a dataframe, you don't get rows; you get groups, each of which is itself a dataframe. Iterating over the result yields (key, dataframe) pairs. For example, consider:

df
    ID        Date  Days  Volume/Day
0  111  2016-01-01    20          50
1  111  2016-02-01    25          40
2  111  2016-03-01    31          35
3  111  2016-04-01    30          30
4  111  2016-05-01    31          25
5  112  2016-01-01    31          55
6  112  2016-01-02    26          45
7  112  2016-01-03    31          40
8  112  2016-01-04    30          35
9  112  2016-01-05    31          30

for i, g in df.groupby('ID'):
    print(g, '\n')


    ID        Date  Days  Volume/Day
0  111  2016-01-01    20          50
1  111  2016-02-01    25          40
2  111  2016-03-01    31          35
3  111  2016-04-01    30          30
4  111  2016-05-01    31          25 

    ID        Date  Days  Volume/Day
5  112  2016-01-01    31          55
6  112  2016-01-02    26          45
7  112  2016-01-03    31          40
8  112  2016-01-04    30          35
9  112  2016-01-05    31          30 
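
If you really do need to visit every row inside each group, one option is a nested loop: the outer loop unpacks the (key, dataframe) pair and the inner loop walks that dataframe's rows. The sketch below assumes a shortened copy of the data above and a hypothetical handle_row function standing in for whatever you want to call per row; it is an illustration, not the only way to do it.

```python
import pandas as pd

# Shortened, illustrative copy of the example data.
df = pd.DataFrame({
    "ID": [111, 111, 112, 112],
    "Date": ["2016-01-01", "2016-02-01", "2016-01-01", "2016-01-02"],
    "Days": [20, 25, 31, 26],
})

def handle_row(row):
    # Hypothetical per-row function; here it just picks out two fields.
    return (row.ID, row.Days)

results = []
for key, group in df.groupby("ID"):
    # key is the group label; group is a DataFrame with that group's rows.
    for row in group.itertuples(index=False):
        results.append(handle_row(row))
```

Row-by-row loops like this tend to be slow on large frames, so they are usually a fallback for logic that cannot be expressed per group.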


For your case, you should probably look into dfGroupby.apply if you want to apply some function to each group, dfGroupby.transform to produce a like-indexed dataframe (see the docs for an explanation), or dfGroupby.agg if you want to produce aggregated results. Here dfGroupby stands for the object returned by df.groupby(...).
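
To make the difference between the three concrete, here is a minimal sketch on a shortened, illustrative frame. The column selection and the lambda are assumptions chosen for the example, not part of the original question.

```python
import pandas as pd

# Shortened, illustrative data.
df = pd.DataFrame({
    "ID": [111, 111, 112, 112],
    "Days": [20, 25, 31, 26],
})

grouped = df.groupby("ID")["Days"]

# agg: one value per group, indexed by the group key.
total_days = grouped.agg("sum")

# transform: one value per original row, aligned to df's index.
mean_days = grouped.transform("mean")

# apply: an arbitrary function called once per group.
day_range = grouped.apply(lambda s: s.max() - s.min())
```

Because transform keeps the original index, its result can be assigned straight back as a new column, for example df["MeanDays"] = mean_days.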

You'd do something like:

r = df.groupby('Date').apply(your_function) 

You'd define your function as:

def your_function(df):
    ... # operation on df
    return result
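
As a filled-in illustration of that template, the sketch below groups by Date and computes a total volume per date. The function name total_volume and the formula (Days times Volume/Day, summed) are assumptions made up for the example; substitute whatever operation you actually need.

```python
import pandas as pd

# Shortened, illustrative data using the column names from the example.
df = pd.DataFrame({
    "ID": [111, 111, 112],
    "Date": ["2016-01-01", "2016-02-01", "2016-01-01"],
    "Days": [20, 25, 31],
    "Volume/Day": [50, 40, 55],
})

def total_volume(group):
    # group is the sub-DataFrame holding all rows for one date.
    return (group["Days"] * group["Volume/Day"]).sum()

# Selecting the value columns first keeps the grouping column out of
# what the function receives.
r = df.groupby("Date")[["Days", "Volume/Day"]].apply(total_volume)
```

Here r is a Series indexed by Date with one total per date.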

If you have problems with the implementation, please open a new question, post your data and your code, and any associated errors/tracebacks. Happy coding.
