Original question: http://stackoverflow.com/questions/11714768/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
concat pandas DataFrame along timeseries indexes
Asked by Matthew Brown
I have two largish pandas DataFrames (snippets provided) with unequal dates as indexes that I wish to concat into one:
            NAB.AX                                  CBA.AX
            Close    Volume                         Close    Volume
Date                                    Date
2009-06-05  36.51   4962900             2009-06-08  21.95         0
2009-06-04  36.79   5528800             2009-06-05  21.95   8917000
2009-06-03  36.80   5116500             2009-06-04  22.21  18723600
2009-06-02  36.33   5303700             2009-06-03  23.11  11643800
2009-06-01  36.16   5625500             2009-06-02  22.80  14249900
2009-05-29  35.14  13038600   --AND--   2009-06-01  22.52  11687200
2009-05-28  33.95   7917600             2009-05-29  22.02  22350700
2009-05-27  35.13   4701100             2009-05-28  21.63   9679800
2009-05-26  35.45   4572700             2009-05-27  21.74   9338200
2009-05-25  34.80   3652500             2009-05-26  21.64   8502900
The problem is, if I run this:
keys = ['CBA.AX','NAB.AX']
mv = pandas.concat([data['CBA.AX'][650:660], data['NAB.AX'][650:660]], axis=1, keys=keys)
the following DataFrame is produced:
CBA.AX NAB.AX
Close Volume Close Volume
Date
2200-08-16 04:24:21.460041 NaN NaN NaN NaN
2203-05-13 04:24:21.460041 NaN NaN NaN NaN
2206-02-06 04:24:21.460041 NaN NaN NaN NaN
2208-11-02 04:24:21.460041 NaN NaN NaN NaN
2211-07-30 04:24:21.460041 NaN NaN NaN NaN
2219-10-16 04:24:21.460041 NaN NaN NaN NaN
2222-07-12 04:24:21.460041 NaN NaN NaN NaN
2225-04-07 04:24:21.460041 NaN NaN NaN NaN
2228-01-02 04:24:21.460041 NaN NaN NaN NaN
2230-09-28 04:24:21.460041 NaN NaN NaN NaN
2238-12-15 04:24:21.460041 NaN NaN NaN NaN
Does anybody have any idea why this might be the case?
On another point: are there any Python libraries around that pull data from Yahoo and normalise it?
Cheers.
EDIT: For reference:
data = {
'CBA.AX': <class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2313 entries, 2011-12-29 00:00:00 to 2003-01-01 00:00:00
Data columns:
Close 2313 non-null values
Volume 2313 non-null values
dtypes: float64(1), int64(1),
'NAB.AX': <class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2329 entries, 2011-12-29 00:00:00 to 2003-01-01 00:00:00
Data columns:
Close 2329 non-null values
Volume 2329 non-null values
dtypes: float64(1), int64(1)
}
Accepted answer by bmu
It is possible to read the data with pandas and to concatenate it.
First, import the data:
In [449]: import pandas.io.data as web
In [450]: nab = web.get_data_yahoo('NAB.AX', start='2009-05-25',
end='2009-06-05')[['Close', 'Volume']]
In [451]: cba = web.get_data_yahoo('CBA.AX', start='2009-05-26',
end='2009-06-08')[['Close', 'Volume']]
In [453]: nab
Out[453]:
Close Volume
Date
2009-05-25 21.15 9685100
2009-05-26 21.64 8541900
2009-05-27 21.74 9042900
2009-05-28 21.63 9701000
2009-05-29 22.02 14665700
2009-06-01 22.52 6782000
2009-06-02 22.80 10473400
2009-06-03 23.11 9931400
2009-06-04 22.21 17869000
2009-06-05 21.95 8214300
In [454]: cba
Out[454]:
Close Volume
Date
2009-05-26 35.45 4529600
2009-05-27 35.13 4521500
2009-05-28 33.95 7945400
2009-05-29 35.14 12548500
2009-06-01 36.16 4509400
2009-06-02 36.33 4304900
2009-06-03 36.80 4845400
2009-06-04 36.79 4592300
2009-06-05 36.51 4417500
2009-06-08 36.51 0
Then concatenate them:
In [455]: keys = ['CBA.AX','NAB.AX']
In [456]: pd.concat([cba, nab], axis=1, keys=keys)
Out[456]:
CBA.AX NAB.AX
Close Volume Close Volume
Date
2009-05-25 NaN NaN 21.15 9685100
2009-05-26 35.45 4529600 21.64 8541900
2009-05-27 35.13 4521500 21.74 9042900
2009-05-28 33.95 7945400 21.63 9701000
2009-05-29 35.14 12548500 22.02 14665700
2009-06-01 36.16 4509400 22.52 6782000
2009-06-02 36.33 4304900 22.80 10473400
2009-06-03 36.80 4845400 23.11 9931400
2009-06-04 36.79 4592300 22.21 17869000
2009-06-05 36.51 4417500 21.95 8214300
2009-06-08 36.51 0 NaN NaN
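To try the concatenation above without fetching anything from Yahoo, here is a minimal sketch that builds two small frames by hand from a few of the rows shown above. The frame construction is illustrative scaffolding and not part of the original answer; only the `pd.concat` call mirrors it.

```python
import pandas as pd

# Hand-built stand-ins for the downloaded frames (a few rows copied from above).
cba = pd.DataFrame(
    {"Close": [36.79, 36.51, 36.51], "Volume": [4592300, 4417500, 0]},
    index=pd.to_datetime(["2009-06-04", "2009-06-05", "2009-06-08"]),
)
nab = pd.DataFrame(
    {"Close": [23.11, 22.21, 21.95], "Volume": [9931400, 17869000, 8214300]},
    index=pd.to_datetime(["2009-06-03", "2009-06-04", "2009-06-05"]),
)
cba.index.name = nab.index.name = "Date"

# axis=1 places the frames side by side; keys adds a ticker level to the columns.
# Rows are aligned on the union of both indexes; dates missing from one frame become NaN.
combined = pd.concat([cba, nab], axis=1, keys=["CBA.AX", "NAB.AX"])
print(combined)
```

Because the two indexes only partly overlap, the result has one row per distinct date, with NaN wherever a ticker has no data for that date.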
Answer by Michael WS
Try an outer join.
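As a minimal sketch of what an outer join on the index looks like, assuming two small hand-built frames (the data layout and the `_CBA`/`_NAB` suffixes are illustrative choices, not from the original answer):

```python
import pandas as pd

left = pd.DataFrame(
    {"Close": [36.51, 36.51]},
    index=pd.to_datetime(["2009-06-05", "2009-06-08"]),
)
right = pd.DataFrame(
    {"Close": [22.21, 21.95]},
    index=pd.to_datetime(["2009-06-04", "2009-06-05"]),
)

# how="outer" keeps every date that appears in either frame;
# the suffixes disambiguate the overlapping "Close" column names.
joined = left.join(right, how="outer", lsuffix="_CBA", rsuffix="_NAB")
print(joined)
```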
When I work with a number of stocks, I usually keep one frame per field (open, high, low, close, etc.), with one column per ticker. If you want a single data structure, I would use a Panel for this.
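One possible way to get a "one frame per field, one column per ticker" layout from per-ticker frames is to concatenate them with the tickers as keys and then take a cross-section of the resulting column MultiIndex. This is only a sketch: the `frames` dict and the variable names are made up for illustration, and the values are a few rows copied from the question. Since `Panel` has been removed from recent pandas releases, the sketch uses a column MultiIndex instead.

```python
import pandas as pd

dates = pd.to_datetime(["2009-06-04", "2009-06-05"])
frames = {
    "CBA.AX": pd.DataFrame(
        {"Close": [36.79, 36.51], "Volume": [4592300, 4417500]}, index=dates
    ),
    "NAB.AX": pd.DataFrame(
        {"Close": [22.21, 21.95], "Volume": [17869000, 8214300]}, index=dates
    ),
}

# Passing a dict to concat uses its keys as the outer column level (the ticker).
wide = pd.concat(frames, axis=1)

# Cross-section on the inner level: one frame per field, one column per ticker.
close = wide.xs("Close", axis=1, level=1)
volume = wide.xs("Volume", axis=1, level=1)
print(close)
```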
For Yahoo data, you can use pandas:
import pandas.io.data as data
spy = data.DataReader("SPY","yahoo","1991/1/1")

