Pandas: plot multiple time series DataFrame into a single plot
Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/38197964/
Asked by ShanZhengYang
I have the following pandas DataFrame:
      time  Group  blocks
0        1      A       4
1        2      A       7
2        3      A      12
3        4      A      17
4        5      A      21
5        6      A      26
6        7      A      33
7        8      A      39
8        9      A      48
9       10      A      59
      ....   ....    ....
36      35      A     231
37       1      B       1
38       2      B     1.5
39       3      B       3
40       4      B       5
41       5      B       6
      ....   ....    ....
911     35      Z     349
This is a dataframe holding multiple time-series-like data sets, with time running from min=1 to max=35. Each Group has a time series like this.
I would like to plot each individual time series A through Z against an x-axis of 1 to 35. The y-axis would be the blocks value at each time.
I was thinking of using something like an Andrews Curves plot, which would plot each series against one another. Each "hue" would be set to a different group. (Other ideas are welcome.)
My problem: how do you format this dataframe to plot multiple series? Should the columns be GroupA, GroupB, etc.?
How do you get the dataframe to be in the format:
time GroupA blocksA GroupsB blocksB GroupsC blocksC....
Is this the correct format for an Andrews plot as shown?
EDIT
If I try:
df.groupby('Group').plot(legend=False)
the x-axis is completely incorrect. All time series should be plotted from 0 to 35, all in one series.
How do I solve this?
Accepted answer by Serenity
Look at these variants. The first is Andrews' curves and the second is a multiline plot, both grouped by one column, Month. The dataframe data includes three columns, Temp, Day, and Month:
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
from pandas.plotting import andrews_curves  # pandas.tools.plotting in very old pandas

# Load R's airquality dataset (fetched over the network)
data = sm.datasets.get_rdataset('airquality').data
fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1)
data = data[data.columns.tolist()[3:]]  # use only Temp, Month, Day
# Andrews' curves, one color per month
andrews_curves(data, 'Month', ax=ax1)
# multiline plot with group by: one line per month
for key, grp in data.groupby('Month'):
    ax2.plot(grp['Day'], grp['Temp'], label="Temp in {0:02d}".format(key))
ax2.legend(loc='best')
plt.show()
When you plot Andrews' curves, each row of your data is mapped to a single function. Curves that lie close together therefore suggest that the corresponding data points are also close together.
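To make the "one function per row" idea concrete, here is a minimal sketch of the standard Andrews function, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ..., written with NumPy. The helper name andrews_function and the three sample rows (read as hypothetical Temp, Day pairs) are assumptions made for illustration and are not part of the answer.

```python
import numpy as np

def andrews_function(row, t):
    """Evaluate the Andrews curve of one observation `row` at angles `t`."""
    row = np.asarray(row, dtype=float)
    t = np.asarray(t, dtype=float)
    # The first feature contributes a constant offset x1 / sqrt(2).
    result = np.full_like(t, row[0] / np.sqrt(2.0))
    # Remaining features alternate sin/cos with harmonics 1, 1, 2, 2, 3, 3, ...
    for i, x in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2
        if i % 2 == 1:
            result += x * np.sin(harmonic * t)
        else:
            result += x * np.cos(harmonic * t)
    return result

t = np.linspace(-np.pi, np.pi, 200)

# Hypothetical rows: a and b are similar observations, c is far from both.
a = andrews_function([67, 1], t)
b = andrews_function([68, 1], t)
c = andrews_function([90, 30], t)

# Largest vertical distance between the curves over the plotted range.
gap_ab = np.max(np.abs(a - b))
gap_ac = np.max(np.abs(a - c))
print(gap_ab < gap_ac)
```

The similar rows give curves that stay close over the whole range, while the distant row gives a visibly different curve, which is the property the plot relies on.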
Answer by Michael Thomas
You can re-structure the data as a pivot table:
df.pivot_table(index='time',columns='Group',values='blocks',aggfunc='sum').plot()
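As a rough end-to-end sketch of that one-liner, the snippet below builds a small stand-in for the question's frame and pivots it from long to wide format before plotting. The synthetic data (three groups, random cumulative blocks values) is an assumption for illustration only; the column names time, Group and blocks follow the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the question's data: three groups, time 1..35.
rng = np.random.default_rng(0)
frames = []
for name in ["A", "B", "C"]:
    frames.append(pd.DataFrame({
        "time": np.arange(1, 36),
        "Group": name,
        "blocks": rng.integers(1, 10, size=35).cumsum(),
    }))
df = pd.concat(frames, ignore_index=True)

# Long -> wide: one row per time value, one column per group.
wide = df.pivot_table(index="time", columns="Group",
                      values="blocks", aggfunc="sum")

# DataFrame.plot draws one line per column and uses the index (time) as x,
# so every group shares the same 1..35 axis.
ax = wide.plot()
ax.set_xlabel("time")
ax.set_ylabel("blocks")
```

Call plt.show() afterwards when running this as a script. Because the pivoted frame is indexed by time, the x-axis no longer follows the original row index, which is what made the groupby(...).plot() attempt in the question look wrong.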