Python: groupby multiple values and plot the results
Original question: http://stackoverflow.com/questions/34225839/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
groupby multiple values, and plotting results
Asked by A. Chatfield
I'm using some data on fungicide usage which has the Year, Fungicide and Amount used, along with some irrelevant columns, in a pandas DataFrame. It looks somewhat like:
Year, State, Fungicide, Value
2011, California, A, 12879
2011, California, B, 29572
2011, Florida, A, 8645
2011, Florida, B, 19573
2009, California, A, 8764
2009, California, B, 98643
...
What I want from it is a single plot of total fungicide used over time, with a line plotted for each individual fungicide (in a different colour). I've used .groupby to get the total amount of each fungicide used each year:
apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()
This gives me the values I want to plot, something like:
Year, Fungicide, Value
...
2009, A, 128635
B, 104765
2011, A, 154829
B, 129865
Now I need to plot it so that each fungicide (A, B, ...) is a separate line on a single plot of Value over time.
Is there a way of doing this without separating it all out? Forgive my ignorance, I'm new to python and am still getting familiar with it.
Accepted answer by Stefan
For a clean solution that properly prints the legend and xticks, you could do:
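import pandas as pd
# Selecting 'Value' before sum() gives a Series indexed by (Year, Fungicide)
apple_fplot = df.groupby(['Year','Fungicide'])['Value'].sum()
# unstack yields one column per fungicide, so no further column selection is needed
plot_df = apple_fplot.unstack('Fungicide')
# 'Y' is the yearly period alias ('A' is the older spelling, deprecated in recent pandas)
plot_df.index = pd.PeriodIndex(plot_df.index.tolist(), freq='Y')
plot_df.plot()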
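To see what unstack does to the grouped totals, here is a minimal sketch that uses only the six sample rows from the question, so the sums differ from the totals quoted there. The names `totals` and `wide` are made up for illustration.

```python
import pandas as pd

# The six sample rows shown in the question; the real data set has more.
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida", "Florida",
              "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Series with a two-level index: (Year, Fungicide)
totals = df.groupby(["Year", "Fungicide"])["Value"].sum()

# Move the 'Fungicide' level into the columns: one row per year,
# one column per fungicide. DataFrame.plot() draws one line per column.
wide = totals.unstack("Fungicide")
print(wide)
```

Because each fungicide ends up as its own column, a single call to plot() on the result gives one line and one legend entry per fungicide.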
For subplots, just set the respective keyword to True:
plot_df.plot(subplots=True)
to get a separate panel for each fungicide.
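With subplots=True, plot() returns an array of Axes rather than a single one, which is handy if each panel needs its own label. A small sketch, assuming a wide frame like the one the unstack step produces from the six sample rows; the `wide` name and the "Agg" backend choice are assumptions so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import pandas as pd

# Hypothetical wide frame: one row per year, one column per fungicide
wide = pd.DataFrame(
    {"A": [8764, 21524], "B": [98643, 49145]},
    index=pd.Index([2009, 2011], name="Year"),
)

# One Axes per column is returned when subplots=True
axes = wide.plot(subplots=True)
for ax in axes:
    ax.set_ylabel("Value")  # label every panel individually
print(len(axes))
```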
Answer by Chris
Something along the lines of:
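import matplotlib.pyplot as plt
fig, ax = plt.subplots()  # one shared Axes for all fungicides
df_grouped = df.groupby('Fungicide')
for key, group in df_grouped:
    group.groupby('Year')['Value'].sum().plot(ax=ax, label=key)
ax.legend()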
A for loop over a groupby object iterates through each group, yielding the key (e.g. 'A' or 'B', the values of the column it was grouped by) and that group's DataFrame each time.
See here for an example:
http://pandas.pydata.org/pandas-docs/stable/groupby.html#iterating-through-groups
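The iteration can be tried without any plotting. The sketch below uses the six sample rows from the question and collects the yearly totals per fungicide in a dictionary; `per_fungicide` is a name invented for this example.

```python
import pandas as pd

# The six sample rows shown in the question
df = pd.DataFrame({
    "Year": [2011, 2011, 2011, 2011, 2009, 2009],
    "State": ["California", "California", "Florida", "Florida",
              "California", "California"],
    "Fungicide": ["A", "B", "A", "B", "A", "B"],
    "Value": [12879, 29572, 8645, 19573, 8764, 98643],
})

# Iterating a groupby yields (key, sub-DataFrame) pairs
per_fungicide = {}
for key, group in df.groupby("Fungicide"):
    # Yearly totals for this one fungicide, indexed by Year
    per_fungicide[key] = group.groupby("Year")["Value"].sum()
    print(key, per_fungicide[key].to_dict())
```

Each value in the dictionary is a Series indexed by Year, which is what gets passed to plot(ax=ax, label=key) in the loop above.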
Answer by Colonel Beauvel
You can do:
import matplotlib.pyplot as plt
plt.style.use('ggplot')
# Sum only the numeric 'Value' column so the 'State' strings are not aggregated
df.groupby(['Year','Fungicide'])['Value'].sum().unstack().plot()
plt.show()
Data
Year State Fungicide Value
0 2011 California A 12879
1 2011 California B 29572
2 2011 Florida A 8645
3 2011 Florida B 19573
4 2009 California A 8764
5 2009 California B 98643
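To reproduce the answers, the printed table above can be read back into a DataFrame. This is only a sketch: it assumes the text is pasted into a string (here called `raw`, a made-up name) and relies on read_csv treating the unlabeled first column as the index, since each data row has one more field than the header.

```python
import io
import pandas as pd

# The printed DataFrame from above, pasted as whitespace-separated text
raw = """Year State Fungicide Value
0 2011 California A 12879
1 2011 California B 29572
2 2011 Florida A 8645
3 2011 Florida B 19573
4 2009 California A 8764
5 2009 California B 98643
"""

# Split on runs of whitespace; the extra leading field becomes the index
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df)
```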