Seaborn timeseries plot with multiple series
Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/37168303/
Asked by Zhao Li
I'm trying to make a time series plot with seaborn from a dataframe that has multiple series.
From this post: seaborn time series from pandas dataframe
I gather that tsplot isn't going to work, as it is meant to plot uncertainty.
So is there another Seaborn method that is meant for line charts with multiple series?
My dataframe looks like this:
print(df.info())
print(df.describe())
print(df.values)
print(df.index)
Output:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 253 entries, 2013-01-03 to 2014-01-03
Data columns (total 5 columns):
Equity(24 [AAPL]) 253 non-null float64
Equity(3766 [IBM]) 253 non-null float64
Equity(5061 [MSFT]) 253 non-null float64
Equity(6683 [SBUX]) 253 non-null float64
Equity(8554 [SPY]) 253 non-null float64
dtypes: float64(5)
memory usage: 11.9 KB
None
       Equity(24 [AAPL])  Equity(3766 [IBM])  Equity(5061 [MSFT])  \
count         253.000000          253.000000           253.000000
mean           67.560593          194.075383            32.547436
std             6.435356           11.175226             3.457613
min            55.811000          172.820000            26.480000
25%            62.538000          184.690000            28.680000
50%            65.877000          193.880000            33.030000
75%            72.299000          203.490000            34.990000
max            81.463000          215.780000            38.970000
       Equity(6683 [SBUX])  Equity(8554 [SPY])
count           253.000000          253.000000
mean             33.773277          164.690180
std               4.597291           10.038221
min              26.610000          145.540000
25%              29.085000          156.130000
50%              33.650000          165.310000
75%              38.280000          170.310000
max              40.995000          184.560000
[[ 77.484 195.24 27.28 27.685 145.77 ]
[ 75.289 193.989 26.76 27.85 146.38 ]
[ 74.854 193.2 26.71 27.875 145.965]
...,
[ 80.167 187.51 37.43 39.195 184.56 ]
[ 79.034 185.52 37.145 38.595 182.95 ]
[ 77.284 186.66 36.92 38.475 182.8 ]]
DatetimeIndex(['2013-01-03', '2013-01-04', '2013-01-07', '2013-01-08',
'2013-01-09', '2013-01-10', '2013-01-11', '2013-01-14',
'2013-01-15', '2013-01-16',
...
'2013-12-19', '2013-12-20', '2013-12-23', '2013-12-24',
'2013-12-26', '2013-12-27', '2013-12-30', '2013-12-31',
'2014-01-02', '2014-01-03'],
dtype='datetime64[ns]', length=253, freq=None, tz='UTC')
This works (but I want to get my hands dirty with Seaborn):
df.plot()
Output: [plot image not preserved]
Thank you for your time!
Update1:
df.to_dict() returned:
https://gist.github.com/anonymous/2bdc1ce0f9d0b6ccd6675ab4f7313a5f
Update2:
Using @knagaev's sample code, I've narrowed it down to this difference:
Current dataframe (output of print(current_df)):
Equity(24 [AAPL]) Equity(3766 [IBM]) \
2013-01-03 00:00:00+00:00 77.484 195.2400
2013-01-04 00:00:00+00:00 75.289 193.9890
2013-01-07 00:00:00+00:00 74.854 193.2000
2013-01-08 00:00:00+00:00 75.029 192.8200
2013-01-09 00:00:00+00:00 73.873 192.3800
Desired dataframe (output of print(desired_df)):
Date Company Kind Price
0 2014-01-02 IBM Open 187.210007
1 2014-01-02 IBM High 187.399994
2 2014-01-02 IBM Low 185.199997
3 2014-01-02 IBM Close 185.529999
4 2014-01-02 IBM Volume 4546500.000000
5 2014-01-02 IBM Adj Close 171.971090
6 2014-01-02 MSFT Open 37.349998
7 2014-01-02 MSFT High 37.400002
8 2014-01-02 MSFT Low 37.099998
9 2014-01-02 MSFT Close 37.160000
10 2014-01-02 MSFT Volume 30632200.000000
11 2014-01-02 MSFT Adj Close 34.960000
12 2014-01-02 ORCL Open 37.779999
13 2014-01-02 ORCL High 38.029999
14 2014-01-02 ORCL Low 37.549999
15 2014-01-02 ORCL Close 37.840000
16 2014-01-02 ORCL Volume 18162100.000000
What's the best way to reorganize current_df into desired_df?
Update 3: I finally got it working with the help of @knagaev.
I had to add a dummy column as well as finesse the index:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
fig, ax = plt.subplots()
df['Datetime'] = df.index  # expose the DatetimeIndex as a regular column
melted_df = pd.melt(df, id_vars='Datetime', var_name='Security', value_name='Price')  # wide -> long
melted_df['Dummy'] = 0  # constant column: a single sampling unit per series
sns.tsplot(melted_df, time='Datetime', unit='Dummy', condition='Security', value='Price', ax=ax)
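The reshaping step in Update 3 can be tried in isolation. The following is only a minimal sketch: the names wide_df and long_df and the shortened column labels are made up for illustration, and the prices are the first three AAPL and IBM rows printed in the question.

```python
import pandas as pd

# Hypothetical miniature of the wide frame: one column per security,
# indexed by a timezone-aware DatetimeIndex as in the question.
wide_df = pd.DataFrame(
    {"AAPL": [77.484, 75.289, 74.854], "IBM": [195.24, 193.989, 193.2]},
    index=pd.to_datetime(["2013-01-03", "2013-01-04", "2013-01-07"], utc=True),
)

# rename_axis + reset_index turns the index into a 'Datetime' column,
# the same job df['Datetime'] = df.index does, without modifying wide_df.
long_df = (
    wide_df.rename_axis("Datetime")
    .reset_index()
    .melt(id_vars="Datetime", var_name="Security", value_name="Price")
)

print(long_df)
```

Each (date, security) pair becomes one row, which is the "desired dataframe" layout minus the Kind column.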
Accepted answer by knagaev
You can try to get your hands dirty with tsplot.
It draws your line charts with standard errors ("statistical additions").
I tried to simulate your dataset. Here are the results:
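import pandas as pd
import pandas.io.data as web  # legacy module; later moved to the separate pandas-datareader package
from datetime import datetime
import seaborn as sns
stocks = ['ORCL', 'TSLA', 'IBM', 'YELP', 'MSFT']
start = datetime(2014, 1, 1)
end = datetime(2014, 3, 28)
f = web.DataReader(stocks, 'yahoo', start, end)
# Flatten the downloaded data to one row per (date, company, price kind)
df = pd.DataFrame(f.to_frame().stack()).reset_index()
df.columns = ['Date', 'Company', 'Kind', 'Price']
# 'Kind' (Open, High, Low, ...) stands in for the sampling unit.
# Note: sns.tsplot is a legacy API (deprecated in seaborn 0.9, removed in later releases).
sns.tsplot(df, time='Date', unit='Kind', condition='Company', value='Price')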
By the way, this sample is quite artificial. The parameter "unit" is the "Field in the data DataFrame identifying the sampling unit (e.g. subject, neuron, etc.). The error representation will collapse over units at each time/condition observation." (from the documentation). So I used the 'Kind' field purely for illustrative purposes.
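To make "collapse over units" concrete, here is a rough pandas sketch of the idea. It is not tsplot's implementation (as far as I understand, tsplot bootstrapped a confidence interval by default, so the standard error below is only a stand-in), and the column names and numbers are invented.

```python
import pandas as pd

# Made-up long-format data: condition "A" is observed by two units,
# condition "B" only by a single dummy unit.
obs = pd.DataFrame({
    "time":      [0, 0, 1, 1, 0, 1],
    "unit":      ["u1", "u2", "u1", "u2", "dummy", "dummy"],
    "condition": ["A", "A", "A", "A", "B", "B"],
    "value":     [1.0, 3.0, 2.0, 6.0, 10.0, 12.0],
})

# Collapse over units: one central value and one spread estimate
# per (condition, time) observation.
summary = obs.groupby(["condition", "time"])["value"].agg(["mean", "sem", "count"])
print(summary)
```

With a single unit per condition there is nothing to average over, so the spread is undefined (NaN here) and only the line itself remains. That is the effect a constant dummy unit column relies on.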
OK, I made an example for your dataframe. It has a dummy field for "noise cleaning" :)
import pandas as pd
import pandas.io.data as web  # legacy module; later moved to the separate pandas-datareader package
from datetime import datetime
import seaborn as sns
stocks = ['ORCL', 'TSLA', 'IBM', 'YELP', 'MSFT']
start = datetime(2010, 1, 1)
end = datetime(2015, 12, 31)
f = web.DataReader(stocks, 'yahoo', start, end)
df = pd.DataFrame(f.to_frame().stack()).reset_index()
df.columns = ['Date', 'Company', 'Kind', 'Price']
# Keep only the opening prices so each company has one value per date
df_open = df[df['Kind'] == 'Open'].copy()
# Constant 'Dummy' column: a single sampling unit per company
df_open['Dummy'] = 0
sns.tsplot(df_open, time='Date', unit='Dummy', condition='Company', value='Price')
P.S. Thanks to @VanPeer: you can now use seaborn.lineplot for this problem.