How to flatten Pandas groupby DataFrame?

Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/46155173/

Date: 2020-09-14 04:25:57  Source: igfitidea

Tags: python, pandas, pandas-groupby

Asked by Brylie Christopher Oxley

I have a Pandas DataFrame that is grouped by date and 'outcome':

api_logs.groupby([api_logs.index.date, 'Outcome']).size()
            Outcome
2017-04-22  Success      7
2017-04-24  Failure     32
            Success     59
2017-04-25  Failure     23
            Success     91
2017-04-26  Failure      1
            Success     59
2017-04-27  Failure      3
            Success      1
2017-04-28  Failure      1
            Success      2
2017-04-29  Success      3
2017-05-03  Failure     38
2017-05-04  Failure      6
            Success    727

How can I flatten the nested data, so that it is structured as below?

            Failure    Success
2017-04-22                   7
2017-04-24       32         59
2017-04-25       23         91
2017-04-26        1         59
2017-04-27        3          1
2017-04-28        1          2
2017-04-29                   3
2017-05-03       38
2017-05-04        6        727

My end-goal is to plot the failures and successes together in a chart, so perhaps there is a different approach altogether?

Answered by jezrael

Use unstack to reshape the grouped result:

df = api_logs.groupby([api_logs.index.date, 'Outcome']).size().unstack()
print(df)
Outcome     Failure  Success
2017-04-22      NaN      7.0
2017-04-24     32.0     59.0
2017-04-25     23.0     91.0
2017-04-26      1.0     59.0
2017-04-27      3.0      1.0
2017-04-28      1.0      2.0
2017-04-29      NaN      3.0
2017-05-03     38.0      NaN
2017-05-04      6.0    727.0
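To try this without the original logs, here is a minimal sketch. It assumes a hand-built Series standing in for the output of size(); the names counts and wide and the index level name "date" are made up for illustration, and the values are copied from a few rows of the question.

```python
import pandas as pd

# Hypothetical stand-in for the result of groupby(...).size():
# a Series with a two-level index (date, Outcome).
index = pd.MultiIndex.from_tuples(
    [
        ("2017-04-22", "Success"),
        ("2017-04-24", "Failure"),
        ("2017-04-24", "Success"),
        ("2017-05-03", "Failure"),
    ],
    names=["date", "Outcome"],
)
counts = pd.Series([7, 32, 59, 38], index=index)

# unstack() moves the innermost index level ("Outcome") into the columns.
wide = counts.unstack()

# (date, Outcome) pairs that never occurred become NaN,
# which is why the integer counts are shown as floats.
print(wide)
```

The NaN entries are the reason the counts print as 7.0, 32.0 and so on: an integer column cannot hold NaN, so pandas upcasts it to float.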

It is also possible to replace the NaNs with 0 using the fill_value parameter:

df = api_logs.groupby([api_logs.index.date, 'Outcome']).size().unstack(fill_value=0)

print(df)
Outcome     Failure  Success
2017-04-22        0        7
2017-04-24       32       59
2017-04-25       23       91
2017-04-26        1       59
2017-04-27        3        1
2017-04-28        1        2
2017-04-29        0        3
2017-05-03       38        0
2017-05-04        6      727
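For an end-to-end check, the sketch below runs the same pipeline on a small made-up api_logs frame. The timestamps and outcomes are invented; the only assumptions carried over from the question are a DatetimeIndex and an 'Outcome' column.

```python
import pandas as pd

# Hypothetical log data: one row per API call, indexed by timestamp.
api_logs = pd.DataFrame(
    {"Outcome": ["Success", "Failure", "Success", "Failure", "Failure"]},
    index=pd.to_datetime(
        [
            "2017-04-22 09:15",
            "2017-04-24 10:00",
            "2017-04-24 18:30",
            "2017-05-03 08:45",
            "2017-05-03 23:59",
        ]
    ),
)

# Count rows per (date, Outcome), then pivot Outcome into columns,
# filling combinations that never occurred with 0 instead of NaN.
df = (
    api_logs.groupby([api_logs.index.date, "Outcome"])
    .size()
    .unstack(fill_value=0)
)
print(df)
```

Because no NaN is introduced, the counts stay integers. For the plotting goal mentioned in the question, a wide frame like this can be passed to df.plot (for example df.plot(kind="bar"), which needs matplotlib installed), and each outcome column is drawn as its own series.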