Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/15588713/

Date: 2020-09-13 20:44:12  Source: igfitidea

Sheets of an Excel workbook from a URL into a `pandas.DataFrame`

python url pandas xlrd

Asked by benjaminmgross

After looking at different ways to read a URL pointing to an .xls file, I decided to go with xlrd.

I am having a difficult time converting an 'xlrd.book.Book' object to a 'pandas.DataFrame'.

I have the following:

# Python 2 code: urllib2 was replaced by urllib.request in Python 3
import pandas
import xlrd
import urllib2

link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
socket = urllib2.urlopen(link)

# this line gets me the Excel workbook (an xlrd.book.Book)
xlfile = xlrd.open_workbook(file_contents=socket.read())

# storing the sheets (a list of xlrd sheet objects)
sheets = xlfile.sheets()

I want to take the last sheet in sheets and import it as a pandas.DataFrame. Any ideas as to how I can accomplish this? I've tried pandas.ExcelFile.parse(), but it wants a path to an Excel file. I can certainly save the file first and then parse it (using tempfile or something), but I'm trying to follow Pythonic guidelines and use functionality that is likely already written into pandas.

Any guidance is greatly appreciated as always.

Answered by DSM

You can pass your socket to ExcelFile:

>>> import pandas as pd
>>> import urllib2
>>> link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
>>> socket = urllib2.urlopen(link)
>>> xd = pd.ExcelFile(socket)
NOTE *** Ignoring non-worksheet data named u'PDVPlot' (type 0x02 = Chart)
NOTE *** Ignoring non-worksheet data named u'ConsumptionPlot' (type 0x02 = Chart)
>>> xd.sheet_names
[u'Data', u'Consumption', u'Calculations']
>>> df = xd.parse(xd.sheet_names[-1], header=None)
>>> df
                                   0   1   2   3         4
0        Average Real Interest Rate: NaN NaN NaN  1.028826
1    Geometric Average Stock Return: NaN NaN NaN  0.065533
2              exp(geo. Avg. return) NaN NaN NaN  0.067728
3  Geometric Average Dividend Growth NaN NaN NaN  0.012025
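
On Python 3, where urllib2 no longer exists, the same idea can be sketched by reading the response into memory and handing pandas a file-like object. This is only a minimal sketch and not part of the answer above: the helper names last_sheet_from_bytes and last_sheet_from_url are hypothetical, and it assumes a pandas version whose ExcelFile accepts file-like objects, with an engine installed for the workbook format (for example xlrd for .xls files).

```python
import io
import urllib.request

import pandas as pd


def last_sheet_from_bytes(data, **parse_kwargs):
    """Parse the last worksheet of an Excel workbook held in memory.

    `data` is the raw bytes of the workbook; extra keyword arguments
    are forwarded to ExcelFile.parse (e.g. header=None).
    """
    # Wrap the bytes in a file-like object so nothing is written to disk.
    with pd.ExcelFile(io.BytesIO(data)) as workbook:
        return workbook.parse(workbook.sheet_names[-1], **parse_kwargs)


def last_sheet_from_url(url, **parse_kwargs):
    """Download a workbook (hypothetical helper) and parse its last sheet."""
    with urllib.request.urlopen(url) as response:
        return last_sheet_from_bytes(response.read(), **parse_kwargs)
```

With the link from the question, this would be called as df = last_sheet_from_url(link, header=None). No temporary file is needed because io.BytesIO keeps the downloaded bytes in memory.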

Answered by aghazaly

You can pass a URL to pandas.read_excel():

import pandas as pd

link = 'http://www.econ.yale.edu/~shiller/data/chapt26.xls'
# 'sheetname' is a placeholder for the name of the sheet to load
data = pd.read_excel(link, 'sheetname')
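
In current pandas releases the second argument of read_excel is named sheet_name, and passing sheet_name=None returns a dict of DataFrames keyed by sheet name. Assuming that behavior, and assuming the dict keeps the sheets in workbook order, the last sheet can be picked without knowing its name in advance. The helper name read_last_sheet is hypothetical, and this is a sketch rather than the answerer's code:

```python
import pandas as pd


def read_last_sheet(source, **read_kwargs):
    """Return (sheet name, DataFrame) for the last sheet of a workbook.

    `source` may be anything read_excel accepts, such as a URL or path.
    """
    # sheet_name=None asks pandas to load every sheet into a dict.
    sheets = pd.read_excel(source, sheet_name=None, **read_kwargs)
    last_name = list(sheets)[-1]
    return last_name, sheets[last_name]
```

For the workbook in the question this would be called as name, df = read_last_sheet(link, header=None). Loading every sheet costs more than loading one, so for large workbooks it may be preferable to look up the name first and request only that sheet.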