Original question: http://stackoverflow.com/questions/37794095/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Pandas Dataframe - RemoteDataError - Python
Asked by RageAgainstheMachine
I'm trying to pull data from Yahoo Finance.
Here is the error I'm getting:
    File "banana.py", line 35, in <module>
      data = web.DataReader(ticker, "yahoo", datetime(2011,1,1), datetime(2015,12,31))
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\data.py", line 94, in DataReader
      session=session).read()
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\yahoo\daily.py", line 77, in read
      df = super(YahooDailyReader, self).read()
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 173, in read
      df = self._read_one_data(self.url, params=self._get_params(self.symbols))
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 80, in _read_one_data
      out = self._read_url_as_StringIO(url, params=params)
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 91, in _read_url_as_StringIO
      response = self._get_response(url, params=params)
    File "C:\Users\ll\Anaconda2\lib\site-packages\pandas_datareader\base.py", line 117, in _get_response
      raise RemoteDataError('Unable to read URL: {0}'.format(url))
    pandas_datareader._utils.RemoteDataError: Unable to read URL: http://ichart.finance.yahoo.com/table.csv
The error shows up when I read from a .csv file instead of a list of tickers:
This works:
    for ticker in ['MSFT']:
This doesn't:
    input_file = open("testlist.csv", 'r')
    for ticker in input_file:
I've even added exception handlers (see below), but it still doesn't work:
    except RemoteDataError:
        print("No information for ticker '%s'" % t)
        continue
    except IndexError:
        print("Something went wacko for ticker '%s', trying again..." % t)
        continue
    except Exception, e:
        continue
    except:
        print "Can't find ", ticker
My code:
    from datetime import datetime
    from pandas_datareader import data, wb
    import pandas_datareader.data as web
    import pandas as pd
    from pandas_datareader._utils import RemoteDataError
    import csv
    import sys
    import os
    import time  # needed for the time.sleep() calls below

    class MonthlyChange(object):
        months = { 0:'JAN', 1:'FEB', 2:'MAR', 3:'APR', 4:'MAY',5:'JUN', 6:'JUL', 7:'AUG', 8:'SEP', 9:'OCT',10:'NOV', 11:'DEC' }

        def __init__(self,month):
            self.month = MonthlyChange.months[month-1]
            self.sum_of_pos_changes=0
            self.sum_of_neg_changes=0
            self.total_neg=0
            self.total_pos=0

        def add_change(self,change):
            if change < 0:
                self.sum_of_neg_changes+=change
                self.total_neg+=1
            elif change > 0:
                self.sum_of_pos_changes+=change
                self.total_pos+=1

        def get_data(self):
            if self.total_pos == 0:
                return (self.month,0.0,0,self.sum_of_neg_changes/self.total_neg,self.total_neg)
            elif self.total_neg == 0:
                return (self.month,self.sum_of_pos_changes/self.total_pos,self.total_pos,0.0,0)
            else:
                return (self.month,self.sum_of_pos_changes/self.total_pos,self.total_pos,self.sum_of_neg_changes/self.total_neg,self.total_neg)

    input_file = open("Companylistnysenasdaq.csv", 'r')

    for ticker in input_file: #for ticker in input_file:
        print(ticker)
        data = web.DataReader(ticker, "yahoo", datetime(2011,1,1), datetime(2015,12,31))
        data['ymd'] = data.index
        year_month = data.index.to_period('M')
        data['year_month'] = year_month
        first_day_of_months = data.groupby(["year_month"])["ymd"].min()
        first_day_of_months = first_day_of_months.to_frame().reset_index(level=0)
        last_day_of_months = data.groupby(["year_month"])["ymd"].max()
        last_day_of_months = last_day_of_months.to_frame().reset_index(level=0)
        fday_open = data.merge(first_day_of_months,on=['ymd'])
        fday_open = fday_open[['year_month_x','Open']]
        lday_open = data.merge(last_day_of_months,on=['ymd'])
        lday_open = lday_open[['year_month_x','Open']]
        fday_lday = fday_open.merge(lday_open,on=['year_month_x'])
        monthly_changes = {i:MonthlyChange(i) for i in range(1,13)}
        for index,ym, openf,openl in fday_lday.itertuples():
            month = ym.strftime('%m')
            month = int(month)
            diff = (openf-openl)/openf
            monthly_changes[month].add_change(diff)
        changes_df = pd.DataFrame([monthly_changes[i].get_data() for i in monthly_changes],columns=["Month","Avg Inc.","Inc","Avg.Dec","Dec"])
        t = ticker.strip()
        j = 0
        while j < 13:
            try:
                if len(changes_df.loc[changes_df.Inc > 2,'Month']) != 0:
                    print ticker
                    print ("Increase Months: ")
                    print (changes_df.loc[changes_df.Inc > 2,'Month'])
                if len(changes_df.loc[changes_df.Dec > 2,'Month']) != 0:
                    print ticker
                    print ("Decrease Months: ")
                    print (changes_df.loc[changes_df.Dec > 2,'Month'])
                j += 13
            except RemoteDataError:
                print("No information for ticker '%s'" % t)
                j += 13
                continue
            except IndexError:
                print("Something went googoo for ticker '%s', trying again..." % t)
                j += 1
                time.sleep(30)
                continue
            except Exception, e:
                j+=13
                time.sleep(30)
                continue
            except:
                print "Can't find ", ticker

    input_file.close()
Answered by Stefan
pandas_datareader throws this error when Yahoo does not make data for the ticker in question available through its API.
When reading your .csv file line by line, you are including the trailing newline characters, so pandas_datareader doesn't recognize the tickers.
    data = web.DataReader(ticker.strip('\n'), "yahoo", datetime(2011, 1, 1), datetime(2015, 12, 31))
works when I create a file that lists tickers in the first column.
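The newline issue can be reproduced without any network access. The following is a minimal sketch in Python 3 (the question itself uses Python 2); the sample file contents and the helper name read_tickers are made up for illustration and are not part of the original code.

```python
import os
import tempfile


def read_tickers(path):
    """Return the tickers from a one-per-line file, without surrounding whitespace."""
    with open(path, "r") as handle:
        # strip() drops the trailing "\n" (and any "\r"); blank lines are skipped.
        return [line.strip() for line in handle if line.strip()]


# Write a small sample file (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("MSFT\nAAPL\n\nGOOG\n")
    sample_path = tmp.name

with open(sample_path, "r") as handle:
    raw_lines = list(handle)  # what `for ticker in input_file` iterates over

clean_tickers = read_tickers(sample_path)
os.remove(sample_path)

print(raw_lines)      # every entry still ends with a newline
print(clean_tickers)  # newline-free tickers, blank line dropped
```

Using strip() with no argument rather than strip('\n') also removes a stray carriage return or trailing space, which may matter for files saved on Windows.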
Might be easier to do:
    tickers = pd.read_csv('Companylistnysenasdaq.csv')
    for ticker in tickers.iloc[:, 0].tolist():
assuming your file is a simple list with tickers in the first column. You might need header=None in read_csv, depending on your file formatting.
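To illustrate why header=None can matter, here is a small sketch that reads the same made-up CSV text twice. The file contents are an assumption (no header row, ticker in the first column), and io.StringIO stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical file contents: no header row, ticker in the first column.
csv_text = "MSFT,Microsoft\nAAPL,Apple\nGOOG,Alphabet\n"

# By default read_csv treats the first row as column names,
# so "MSFT" ends up as a column label instead of a data value.
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None keeps the first row as data.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

tickers_missing_first = with_header.iloc[:, 0].tolist()
tickers = no_header.iloc[:, 0].tolist()

print(tickers_missing_first)
print(tickers)
```

If the file does have a header row, the default call is the right one and header=None would instead turn the column name into a bogus ticker.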
To handle errors, you can:
    from pandas_datareader._utils import RemoteDataError

    try:
        stockData = web.DataReader(ticker, 'yahoo', datetime(2015, 1, 1), datetime.today())
    except RemoteDataError:
        # handle error, e.g. report the ticker and move on
        print("No information for ticker '%s'" % ticker)
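Note that in the question's code the web.DataReader call comes before the try block, so its except RemoteDataError clause never sees the failure. A rough sketch of a loop that wraps the download itself is shown below in Python 3. To keep it runnable offline, fake_fetch is a hypothetical stand-in for web.DataReader, and the local RemoteDataError class is a stand-in for pandas_datareader._utils.RemoteDataError; neither is the real library object.

```python
class RemoteDataError(IOError):
    """Stand-in for pandas_datareader._utils.RemoteDataError (assumption)."""


def fetch_all(tickers, fetch):
    """Call fetch(ticker) for each ticker; collect results and failures separately."""
    results = {}
    failed = []
    for raw in tickers:
        ticker = raw.strip()  # drop the trailing newline before the request
        try:
            results[ticker] = fetch(ticker)
        except RemoteDataError:
            # No data for this ticker: record it and continue with the next one.
            failed.append(ticker)
    return results, failed


def fake_fetch(ticker):
    """Hypothetical replacement for web.DataReader, for demonstration only."""
    known = {"MSFT": [1.0, 2.0], "AAPL": [3.0]}
    if ticker not in known:
        raise RemoteDataError("Unable to read URL for %s" % ticker)
    return known[ticker]


results, failed = fetch_all(["MSFT\n", "NOPE\n", "AAPL\n"], fake_fetch)
print(sorted(results))
print(failed)
```

Catching only RemoteDataError, rather than a bare except, keeps unrelated bugs such as a NameError visible instead of silently skipping the ticker.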
Answered by Merlin
Try this: you opened the file but didn't call .read() on it.
    input_file = open("testlist.csv", 'r').read()
Please run these:
    input_file = open("Companylistnysenasdaq.csv", 'r').read()
    for x in input_file: print x

    input_file = open("Companylistnysenasdaq.csv", 'r')
    for x in input_file: print x
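For reference, here is a minimal Python 3 sketch of what those two loops iterate over. The contents are made up, and io.StringIO stands in for the opened file so the example runs without the real CSV.

```python
import io

# Hypothetical file contents; io.StringIO stands in for the opened file.
contents = "MSFT\nAAPL\n"

# Iterating over a file object yields one line at a time, newline included.
lines = [x for x in io.StringIO(contents)]

# .read() returns a single string; iterating over a string yields characters.
chars = [x for x in io.StringIO(contents).read()]

print(lines)
print(chars[:5])
```

Calling .read().splitlines() would give whole lines without the trailing newline, if a single read of the file is preferred.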