Python: groupby multiple values, and plotting results

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/34225839/

Time: 2020-08-19 14:38:33  Source: igfitidea

groupby multiple values, and plotting results

Tags: python, pandas, matplotlib, group-by, data-analysis

Asked by A. Chatfield

I'm using some data on fungicide usage which has the Year, Fungicide, and Amount used, along with some irrelevant columns, in a pandas DataFrame. It looks somewhat like:

Year, State,      Fungicide, Value
2011, California, A,         12879
2011, California, B,         29572
2011, Florida,    A,         8645
2011, Florida,    B,         19573
2009, California, A,         8764
2009, California, B,         98643
...

What I want from it is a single plot of total fungicide used over time, with a line plotted for each individual fungicide (in a different colour). I've used .groupby to get the total amount of each fungicide used each year:


apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()

This gives me the values I want to plot, something like:


Year, Fungicide, Value
...
2009, A,        128635
      B,        104765
2011, A,        154829
      B,        129865

Now I need to plot it so that each fungicide (A, B, ...) is a separate line on a single plot of Value over time.

Is there a way of doing this without separating it all out? Forgive my ignorance, I'm new to python and am still getting familiar with it.


Accepted answer by Stefan

For a clean solution that properly prints the legend and xticks, you could do:

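import pandas as pd

# Series indexed by (Year, Fungicide) holding the yearly totals
apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()
# One column per fungicide; the result has no 'Value' column, so no .loc lookup is needed
plot_df = apple_fplot.unstack('Fungicide')
plot_df.index = pd.PeriodIndex(plot_df.index.tolist(), freq='Y')  # 'A' is deprecated in recent pandas
plot_df.plot()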
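To see why unstacking gives one line per fungicide, here is a minimal sketch on just the six sample rows from the question. The real data has more rows, so the totals differ from those shown in the question, and the names `totals` and `wide` are illustrative rather than part of the answer.

```python
import pandas as pd

# The six sample rows shown in the question; the real dataset is larger.
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida", "Florida",
              "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Series with a (Year, Fungicide) MultiIndex holding the yearly totals.
totals = df.groupby(["Year", "Fungicide"])["Value"].sum()

# Move the Fungicide level into the columns: one row per year,
# one column per fungicide. DataFrame.plot() draws one line per column.
wide = totals.unstack("Fungicide")
print(wide)
```

The resulting frame has the years as its index and `A` and `B` as its columns, which is the shape `DataFrame.plot()` expects for a multi-line chart.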

For subplots, just set the respective keyword to True:

plot_df.plot(subplots=True)

This draws each fungicide in its own subplot.

Answer by Chris

Something along the lines of:

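import matplotlib.pyplot as plt

fig, ax = plt.subplots()  # shared axes for all lines; `ax` was not defined in the original
for key, group in df.groupby('Fungicide'):
    group.groupby('Year')['Value'].sum().plot(ax=ax, label=key)
ax.legend()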

A for loop over a groupby object iterates through each group, yielding the key (e.g. 'A' or 'B', the values of the column it was grouped by) and that group's DataFrame each time.

See here for an example:

http://pandas.pydata.org/pandas-docs/stable/groupby.html#iterating-through-groups
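As a small illustration of that iteration, the sketch below loops over the groups of a DataFrame built from the six sample rows in the question and collects each fungicide's yearly totals in a dict instead of plotting them. The dict name `yearly_totals` is hypothetical, and the `State` column is left out for brevity.

```python
import pandas as pd

# Sample rows from the question, without the State column.
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

yearly_totals = {}
for key, group in df.groupby("Fungicide"):
    # key is the fungicide label; group holds only that fungicide's rows.
    yearly_totals[key] = group.groupby("Year")["Value"].sum()

for key, series in yearly_totals.items():
    print(key, series.to_dict())
```

Each value in the dict is a Series indexed by year, which is what the loop in the answer passes to `.plot(ax=ax, label=key)`.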

Answer by Colonel Beauvel

You can do:


import matplotlib.pyplot as plt

plt.style.use('ggplot')
# Sum only the numeric 'Value' column, then pivot the fungicides into columns
df.groupby(['Year','Fungicide'])['Value'].sum().unstack().plot()
plt.show()


Data


   Year        State Fungicide  Value
0  2011   California         A  12879
1  2011   California         B  29572
2  2011      Florida         A   8645
3  2011      Florida         B  19573
4  2009   California         A   8764
5  2009   California         B  98643
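For readers who want to try this on the rows above, a self-contained sketch follows. It assumes matplotlib is installed and selects the non-interactive Agg backend, which is a choice made for this example and not part of the original answer. It also picks the `Value` column before summing so that the text column `State` is not aggregated.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen; remove for an interactive window
import pandas as pd

# Rebuild the sample data shown above.
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida", "Florida",
              "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Total per (Year, Fungicide), pivot fungicides into columns, one line each.
ax = df.groupby(["Year", "Fungicide"])["Value"].sum().unstack().plot(marker="o")
ax.set_ylabel("Value")
```

`DataFrame.plot()` returns the matplotlib Axes, so the figure can then be saved with `ax.figure.savefig(...)` or, with an interactive backend, displayed with `matplotlib.pyplot.show()`.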