pandas: load an xlsx file from Drive in Colaboratory

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/47430544/

Date: 2020-09-14 04:48:47  Source: igfitidea

Load xlsx file from drive in colaboratory

python excel pandas pydrive google-colaboratory

Asked by dd_rookie

How can I import an MS Excel (.xlsx) file from Google Drive into Colaboratory?

excel_file = drive.CreateFile({'id':'some id'})

does work (drive is a pydrive.drive.GoogleDrive object). But,

print(excel_file.FetchContent())

returns None. And

excel_file.content()

throws:

TypeError                                 Traceback (most recent call last)
in ()
----> 1 excel_file.content()

TypeError: '_io.BytesIO' object is not callable

My intent is, given a valid file 'id', to import the file as an io object that pandas read_excel() can read, and finally to get a pandas DataFrame out of it.

Answered by Bob Smith

You'll want to use excel_file.GetContentFile to save the file locally. Then, you can use the Pandas read_excel method after you run !pip install -q xlrd.

Here's a full example: https://colab.research.google.com/notebook#fileId=1SU176zTQvhflodEzuiacNrzxFQ6fWeWC

What I did in more detail:

I created a new spreadsheet in Sheets to be exported as an .xlsx file.

Next, I exported it as an .xlsx file and uploaded again to Drive. The URL is: https://drive.google.com/open?id=1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM

Note the file ID. In my case it's 1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM.
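
If you would rather not copy the ID by hand, it can be pulled out of the share link. This is a minimal sketch that assumes the link has the `open?id=...` form shown above; `extract_file_id` is a hypothetical helper name, not part of PyDrive or Colab, and other Drive link formats are not handled.

```python
from urllib.parse import urlparse, parse_qs


def extract_file_id(share_url):
    """Return the 'id' query parameter of a Drive 'open?id=...' link."""
    query = parse_qs(urlparse(share_url).query)
    ids = query.get('id')
    if not ids:
        # Links without an id= parameter are outside the scope of this sketch.
        raise ValueError('no id parameter found in: ' + share_url)
    return ids[0]


file_id = extract_file_id(
    'https://drive.google.com/open?id=1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM')
print(file_id)
```

The resulting string is what goes into the {'id': ...} dictionary passed to drive.CreateFile.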

Then, in Colab, I tweaked the Drive download snippet to download the file. The key bits are:

file_id = '1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM'
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('exported.xlsx')
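
The same three lines can be wrapped in a small function so that the file ID and the local file name become parameters. This is only a sketch: `download_drive_file` is a hypothetical name, and it assumes `drive` is an already authenticated pydrive.drive.GoogleDrive object, as in the question.

```python
def download_drive_file(drive, file_id, local_path):
    """Save the Drive file with the given ID to local_path and return the path.

    `drive` is assumed to be a pydrive.drive.GoogleDrive instance, or any
    object that exposes the same CreateFile / GetContentFile calls.
    """
    downloaded = drive.CreateFile({'id': file_id})
    downloaded.GetContentFile(local_path)
    return local_path


# In Colab, once `drive` has been created by the Drive download snippet:
# path = download_drive_file(drive, '1Sv4ib5i7CKWhAHZkKg-uitIkS3xwxtXM',
#                            'exported.xlsx')
```

Passing the client in as an argument keeps the helper free of any authentication logic, so it can be reused for several files in one notebook.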

Finally, to create a Pandas DataFrame:

!pip install -q xlrd
import pandas as pd
df = pd.read_excel('exported.xlsx')
df

The !pip install... line installs the xlrd library, which is needed to read Excel files.
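
As for the errors in the question: the traceback suggests that content is a BytesIO attribute rather than a method, and that FetchContent() returns None because it fills that attribute as a side effect. The sketch below reproduces that behaviour with a hypothetical stand-in class (`FakeDriveFile` is made up for illustration and is not part of PyDrive), so it runs without Drive access.

```python
import io


class FakeDriveFile:
    """Hypothetical stand-in mimicking the behaviour reported in the question."""

    def __init__(self, payload):
        self._payload = payload
        self.content = None

    def FetchContent(self):
        # Fills self.content as a side effect and implicitly returns None.
        self.content = io.BytesIO(self._payload)


excel_file = FakeDriveFile(b'raw xlsx bytes would go here')
result = excel_file.FetchContent()   # None, as observed in the question
buffer = excel_file.content          # attribute access, no parentheses

try:
    excel_file.content()             # calling the buffer raises TypeError
    call_error = None
except TypeError as exc:
    call_error = str(exc)

first_bytes = buffer.read(3)
print(result, call_error, first_bytes)
```

With a real PyDrive file, passing excel_file.content (without parentheses) to pd.read_excel() after FetchContent() should be the in-memory equivalent of the download shown above, though that path is not exercised here. Note also that recent pandas versions read .xlsx files through openpyxl rather than xlrd, so the install line may need adjusting.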

Answered by Nagesh Rupnar

Here is how to solve this problem. You can import any file (.csv, .xlsx, etc.) from Google Drive into Google Colab.

Solution:

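# Mount Google Drive; Colab asks for authorization on the first run.
from google.colab import drive
drive.mount('/content/gdrive')

import pandas as pd
# Use the absolute path so the read does not depend on the working directory.
df = pd.read_csv('/content/gdrive/My Drive/HDPrice.csv')

df.shape

df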
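
The example above reads a .csv file, while the question asks about .xlsx. One way to cover both on a mounted Drive is to pick the pandas reader from the file extension. This is a rough sketch: `READER_BY_EXTENSION`, `reader_name_for`, and `load_table` are hypothetical names, and only the extensions listed are handled.

```python
import os

# Map file extensions to the name of the pandas reader that handles them.
READER_BY_EXTENSION = {
    '.csv': 'read_csv',
    '.xls': 'read_excel',
    '.xlsx': 'read_excel',
}


def reader_name_for(path):
    """Return the name of the pandas reader matching the file extension."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in READER_BY_EXTENSION:
        raise ValueError('unsupported file type: ' + repr(extension))
    return READER_BY_EXTENSION[extension]


def load_table(path):
    """Load a table from a mounted Drive path into a DataFrame."""
    import pandas as pd  # imported here so reader_name_for works without pandas
    return getattr(pd, reader_name_for(path))(path)


print(reader_name_for('/content/gdrive/My Drive/HDPrice.csv'))
```

In the notebook, load_table('/content/gdrive/My Drive/exported.xlsx') would then call pd.read_excel on the mounted file, assuming a file with that name exists in the Drive.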

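!pip install --upgrade --quiet gspread

# Authenticate the Colab user so gspread can access Google Sheets.
from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())

# Open the spreadsheet named 'SampleData' and take its first worksheet.
worksheet = gc.open('SampleData').sheet1

# get_all_values() returns the sheet as a list of rows (lists of strings).
rows = worksheet.get_all_values()
print(rows)

import pandas as pd
pd.DataFrame.from_records(rows)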
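
Because get_all_values() returns every row, the header row ends up as the first data row of the DataFrame. If the first row holds column names, it can be split off first. The sketch below uses made-up sample rows in the same list-of-lists shape; `rows_to_records` is a hypothetical helper.

```python
def rows_to_records(rows):
    """Turn [[header...], [values...], ...] into a list of dicts keyed by header."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


# Made-up sample in the shape returned by worksheet.get_all_values().
sample_rows = [['Item', 'Units'], ['Pencil', '95'], ['Binder', '50']]
records = rows_to_records(sample_rows)
print(records)

# In the notebook: pd.DataFrame.from_records(rows_to_records(rows))
```

All values stay strings, as they do in the original rows. gspread also offers worksheet.get_all_records(), which should give a similar list of dicts directly.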