pandas: How do you pull weekly historical data from Yahoo Finance?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/20584627/

Date: 2020-09-13 21:26:52  Source: igfitidea

How do you pull WEEKLY historical data from yahoo finance?

python pandas time-series yahoo-finance

Asked by mlo

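import datetime
import pandas as pd
import pandas.io.data  # in later pandas releases this module moved to the separate pandas-datareader package

sp = pd.io.data.get_data_yahoo('^IXIC', start=datetime.datetime(1972, 1, 3),
                               end=datetime.datetime(2010, 1, 3))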

I have used the example above, but it only pulls DAILY data into a DataFrame, whereas I would like weekly data. get_data_yahoo does not seem to have a parameter for choosing between daily, weekly, or monthly, like the options available on Yahoo itself. Do you know of any other packages or ideas that might facilitate this?

Accepted answer by unutbu

You can downsample using the asfreq method:

sp = sp.asfreq('W-FRI', method='pad')

The pad method will propagate the last valid observation forward.

Using resample (as @tshauck has shown) is another possibility. Use asfreq if you want to guarantee that the values in your downsample are values found in the original data set. Use resample if you wish to aggregate groups of rows from the original data set (for example, by taking a mean). reindex might introduce NaN values if the original data set does not have a value on a date specified by the reindex, though (as @behzad.nouri points out) you could use method='pad' to propagate the last observations there as well.
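
To make the difference between asfreq and resample concrete, here is a minimal sketch on synthetic data. The series, its dates, and the dropped "holiday" are made up for illustration and stand in for the Yahoo download; the resample call uses the method-chaining form found in newer pandas releases.

```python
import numpy as np
import pandas as pd

# Synthetic daily "prices" on business days (a stand-in for the Yahoo data).
idx = pd.bdate_range("2021-01-04", "2021-01-29")  # Mon 4 Jan .. Fri 29 Jan
daily = pd.Series(np.arange(len(idx), dtype=float), index=idx, name="Close")

# Pretend Friday 15 Jan was a market holiday by dropping that row.
daily = daily.drop(pd.Timestamp("2021-01-15"))

# asfreq: one value per Friday; 'pad' fills the missing Friday with the
# last earlier observation, so every result is a value from the original data.
weekly_pad = daily.asfreq("W-FRI", method="pad")

# resample: aggregates all rows in each week ending on Friday.
weekly_mean = daily.resample("W-FRI").mean()

print(weekly_pad)
print(weekly_mean)
```

In this sketch the padded value for 15 Jan is Thursday's observation, while the resampled value is the mean of the four days that traded that week.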

Answer by Antony

If you check the latest pandas source code on GitHub, you will see that an interval parameter is included in the latest master branch. You can manually modify your local copy by overwriting the corresponding data.py under your Site-Packages/pandas/io folder.

Answer by behzad.nouri

You can always reindex to your desired frequency:

sp.reindex( pd.date_range( start=sp.index.min( ),
                           end=sp.index.max( ),
                           freq='W-WED' ) )  # weekly, Wednesdays

Edit: you may add , method='ffill' to the reindex call to forward-fill NaN values.
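
As a rough illustration of the reindex approach with and without forward filling, the sketch below uses a small synthetic frame (the dates, values, and the dropped Wednesday are assumptions for the example, not Yahoo data).

```python
import pandas as pd

# Synthetic business-day data standing in for the downloaded prices.
idx = pd.bdate_range("2021-01-04", "2021-01-29")
daily = pd.DataFrame({"Close": range(len(idx))}, index=idx)

# Pretend Wednesday 13 Jan was a holiday.
daily = daily.drop(pd.Timestamp("2021-01-13"))

# Target index: every Wednesday between the first and last observation.
wednesdays = pd.date_range(daily.index.min(), daily.index.max(), freq="W-WED")

with_gaps = daily.reindex(wednesdays)               # NaN on the missing Wednesday
filled = daily.reindex(wednesdays, method="ffill")  # carries Tuesday's value forward

print(with_gaps)
print(filled)
```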

As a suggestion, take Wednesdays, because they tend to have the fewest missing values (i.e. fewer NYSE holidays fall on a Wednesday). I think Yahoo weekly data gives the stock price each Monday, which is the worst weekly frequency based on S&P data from 2000 onwards:

import datetime as dt
import pandas.io.data as web
sp = web.DataReader("^GSPC", "yahoo", start=dt.date( 2000, 1, 1 ) )

weekday = { 0:'MON', 1:'TUE', 2:'WED', 3:'THU', 4:'FRI' }
sp[ 'weekday' ] = list( map( weekday.get, sp.index.dayofweek ) )
sp.weekday.value_counts( )

Output:

WED    722
TUE    717
THU    707
FRI    705
MON    659

Answer by tshauck

One option would be to mask on the day of week you want.

sp[sp.index.dayofweek == 0]

Another option would be to resample.

sp.resample('W', how='mean')
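
Newer pandas releases no longer accept the how= keyword in resample; the aggregation is chained as a method instead. The sketch below shows both options from this answer in that newer style, on synthetic data (the frame and its values are made up for the example).

```python
import pandas as pd

# Synthetic business-day data standing in for the downloaded prices.
idx = pd.bdate_range("2021-01-04", "2021-01-29")
daily = pd.DataFrame({"Close": [float(i) for i in range(len(idx))]}, index=idx)

# Option 1: keep only the rows that fall on a Monday (dayofweek == 0).
mondays = daily[daily.index.dayofweek == 0]

# Option 2: average each calendar week; 'W' labels weeks by their ending Sunday.
weekly_mean = daily.resample("W").mean()

print(mondays)
print(weekly_mean)
```

Note that the mask silently drops any week whose Monday is missing from the data, whereas resample still produces a row for that week from the remaining days.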

Answer by Into Numbers

That's how I convert daily to weekly price data:

import datetime
import pandas as pd
import pandas_datareader.data as web

start = datetime.datetime(1972, 1, 3)
end = datetime.datetime(2010, 1, 3)

stock_d = web.DataReader('^IXIC', 'yahoo', start, end)

def week_open(array_like):
    return array_like[0]

def week_close(array_like):
    return array_like[-1]

stock_w = stock_d.resample('W',
                    how={'Open': week_open, 
                         'High': 'max',
                         'Low': 'min',
                         'Close': week_close,
                         'Volume': 'sum'}, 
                    loffset=pd.offsets.timedelta(days=-6))

stock_w = stock_w[['Open', 'High', 'Low', 'Close', 'Volume']]
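
The how= and loffset= arguments used above were removed from resample in newer pandas releases. A roughly equivalent sketch for those versions follows; it runs on synthetic OHLCV data (an assumption for the example, in place of the DataReader download), uses the built-in 'first' and 'last' aggregations instead of the helper functions, and shifts the index by hand in place of loffset.

```python
import numpy as np
import pandas as pd

# Synthetic daily OHLCV data for two full trading weeks (illustrative values).
idx = pd.bdate_range("2021-01-04", "2021-01-15")
n = len(idx)
stock_d = pd.DataFrame(
    {
        "Open": np.arange(n, dtype=float) + 100,
        "High": np.arange(n, dtype=float) + 102,
        "Low": np.arange(n, dtype=float) + 99,
        "Close": np.arange(n, dtype=float) + 101,
        "Volume": np.full(n, 1000),
    },
    index=idx,
)

# One aggregation per column: first open, highest high, lowest low,
# last close, and total volume of each week.
stock_w = stock_d.resample("W").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)

# 'W' labels each week by its ending Sunday; move the label back to Monday.
stock_w.index = stock_w.index - pd.Timedelta(days=6)

print(stock_w)
```

One difference to be aware of: 'first' and 'last' skip missing values, while indexing with [0] and [-1] returns whatever is stored in the first and last row.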

More info:

https://pandas-datareader.readthedocs.io/en/latest/remote_data.html#yahoo-finance
https://gist.github.com/prithwi/339f87bf9c3c37bb3188