pandas: load an xlsx file from Drive in Colaboratory
Original question: http://stackoverflow.com/questions/47430544/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Load xlsx file from drive in colaboratory
Asked by dd_rookie
How can I import an MS Excel (.xlsx) file from Google Drive into Colaboratory?
excel_file = drive.CreateFile({'id':'some id'})
does work (drive is a pydrive.drive.GoogleDrive object). But
print excel_file.FetchContent()
returns None. And
excel_file.content()
throws:
TypeError    Traceback (most recent call last)
in ()
----> 1 excel_file.content()
TypeError: '_io.BytesIO' object is not callable
My intent is (given a valid file 'id') to import it as an io object that pandas read_excel() can read, and finally get a pandas DataFrame out of it.
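The two symptoms in the question point to the same thing: judging by the traceback, content is an attribute holding a BytesIO buffer, not a method, and FetchContent() fills that buffer rather than returning it. The sketch below reproduces that behaviour with a hypothetical stand-in class (FakeDriveFile is made up for illustration and is not the pydrive class), so it runs without Drive credentials.

```python
import io


class FakeDriveFile:
    """Hypothetical stand-in for a pydrive file object (not the real class)."""

    def __init__(self, payload):
        self._payload = payload
        self.content = None  # filled in by FetchContent()

    def FetchContent(self):
        # Stores the bytes on the object and returns nothing,
        # which matches the None seen in the question.
        self.content = io.BytesIO(self._payload)


# Placeholder bytes only; this is not a real workbook.
excel_file = FakeDriveFile(b"PK\x03\x04 placeholder bytes")

result = excel_file.FetchContent()  # no return value
buffer = excel_file.content         # attribute access, no parentheses
raw = buffer.read()                 # the buffer is an ordinary file-like object
```

With a real file object, a buffer obtained this way could presumably be passed straight to pd.read_excel, which accepts file-like objects as well as paths.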
Answered by Bob Smith
You'll want to use excel_file.GetContentFile to save the file locally. Then, you can use the Pandas read_excel method after you !pip install -q xlrd.
Here's a full example: https://colab.research.google.com/notebook#fileId=1SU176zTQvhflodEzuiacNrzxFQ6fWeWC
What I did in more detail:
I created a new spreadsheet in Sheets to be exported as an .xlsx file.
Next, I exported it as an .xlsx file and uploaded it again to Drive. The URL is: https://drive.google.com/open?id=1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM
Note the file ID. In my case it's 1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM.
Then, in Colab, I tweaked the Drive download snippet to download the file. The key bits are:
file_id = '1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('exported.xlsx')
Finally, to create a Pandas DataFrame:
!pip install -q xlrd
import pandas as pd
df = pd.read_excel('exported.xlsx')
df
The !pip install... line installs the xlrd library, which is needed to read Excel files.
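The steps above can be wrapped in two small helpers. This is only a sketch: it assumes drive is an already authenticated pydrive GoogleDrive object, and the names download_xlsx and load_xlsx are illustrative. One caveat on the reader library: xlrd 2.0 and later reads only .xls files, so on a current environment pandas needs openpyxl for .xlsx. The engine="openpyxl" default below reflects that assumption and is not part of the original answer.

```python
def download_xlsx(drive, file_id, local_path="exported.xlsx"):
    """Save the Drive file with the given ID to local_path and return the path."""
    downloaded = drive.CreateFile({"id": file_id})
    downloaded.GetContentFile(local_path)
    return local_path


def load_xlsx(local_path, **read_excel_kwargs):
    """Read a local .xlsx file into a pandas DataFrame."""
    import pandas as pd  # imported here so download_xlsx works without pandas

    # Assumption: openpyxl is installed (for example via `pip install openpyxl`).
    read_excel_kwargs.setdefault("engine", "openpyxl")
    return pd.read_excel(local_path, **read_excel_kwargs)
```

In a notebook this would be used as df = load_xlsx(download_xlsx(drive, file_id)), with file_id set to the ID noted earlier.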
Answered by Nagesh Rupnar
Here is how to import any file (.csv, .xlsx, etc.) from Google Drive into Google Colab.
Solution:
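# Mount Google Drive into the Colab filesystem (prompts for authorization)
from google.colab import drive
drive.mount('/content/gdrive')
import pandas as pd
# Use the absolute path so the read does not depend on the working directory
df = pd.read_csv('/content/gdrive/My Drive/HDPrice.csv')
df.shape
df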
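The snippet above reads a CSV, while the question is about .xlsx. Once Drive is mounted, the same approach should carry over by swapping read_csv for read_excel. The sketch below assumes the mount point used in this answer; the file name exported.xlsx and the helper names drive_path and read_drive_excel are illustrative, and the Drive root folder may appear as MyDrive rather than My Drive on some runtimes.

```python
import posixpath

MOUNT_POINT = "/content/gdrive"  # same mount point as in the answer


def drive_path(relative, mount_point=MOUNT_POINT, root="My Drive"):
    """Build the absolute path of a file stored under the mounted Drive root."""
    return posixpath.join(mount_point, root, relative)


def read_drive_excel(relative, **read_excel_kwargs):
    """Read an Excel file from the mounted Drive into a DataFrame."""
    import pandas as pd  # imported here so drive_path works without pandas

    return pd.read_excel(drive_path(relative), **read_excel_kwargs)


# Hypothetical file name, used only to show the resulting path.
xlsx_path = drive_path("exported.xlsx")
```

Keeping the path construction separate makes it easy to change the mount point or root folder name in one place.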
# Alternative: read a Google Sheets spreadsheet with gspread
!pip install --upgrade --quiet gspread
from google.colab import auth
auth.authenticate_user()
import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
# Open the spreadsheet named 'SampleData' and take its first worksheet
worksheet = gc.open('SampleData').sheet1
# get_all_values() returns the sheet contents as a list of rows
rows = worksheet.get_all_values()
print(rows)
import pandas as pd
pd.DataFrame.from_records(rows)
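Building the DataFrame directly from rows treats the header row as ordinary data. If the first row of the sheet holds column names, it can be split off first. The sketch below assumes rows is a list of lists of strings, as get_all_values() appears to return; the sample data and the helper names split_header and rows_to_dataframe are made up for illustration.

```python
def split_header(rows):
    """Split sheet rows into (header, data), treating the first row as the header."""
    if not rows:
        return [], []
    header, *data = rows
    return header, data


def rows_to_dataframe(rows):
    """Build a DataFrame whose column names come from the first row."""
    import pandas as pd  # imported here so split_header works without pandas

    header, data = split_header(rows)
    return pd.DataFrame(data, columns=header)


# Made-up sample standing in for worksheet.get_all_values()
sample_rows = [["Name", "Price"], ["Widget", "9.99"], ["Gadget", "19.50"]]
header, data = split_header(sample_rows)
```

The cell values stay as strings here, so numeric columns would still need converting, for example with pd.to_numeric.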