How to read csv to dataframe in Google Colab
Original question: http://stackoverflow.com/questions/48340341/
This page is taken from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by PagMax
I am trying to read a csv file that is stored locally on my machine. (For additional reference, it is the Titanic data from Kaggle.)
From this question and its answers I learnt that you can import data using this code, which works well for me.
from google.colab import files
uploaded = files.upload()
Where I am lost is how to convert it to a dataframe from here. The sample Google notebook page listed in the answer above does not talk about it.
I am trying to convert the dictionary uploaded to a dataframe using the from_dict command, but I am not able to make it work. There is some discussion on converting a dict to a dataframe here, but the solutions are not applicable to me (I think).
So summarizing, my question is:
How do I convert a csv file stored locally on my machine to a pandas dataframe in Google Colaboratory?
Answered by Bob Smith
Pandas read_csv should do the trick. You'll want to decode your uploaded bytes and wrap them in an io.StringIO, since read_csv expects a file-like object.
Here's a full example: https://colab.research.google.com/notebook#fileId=1JmwtF5OmSghC-y3-BkvxLan0zYXqCJJf
The key snippet is:
import pandas as pd
import io
df = pd.read_csv(io.StringIO(uploaded['train.csv'].decode('utf-8')))
df
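The snippet above can also be tried outside Colab. The following is only a sketch: the `uploaded` dictionary is a hand-made stand-in for what `files.upload()` returns (file names mapped to raw bytes), the three sample rows are made up for illustration, and the helper name `uploads_to_dataframes` is hypothetical.

```python
import io

import pandas as pd

# Stand-in for the dict returned by files.upload() in Colab:
# keys are file names, values are the file contents as bytes.
# The csv content here is invented sample data, not the Kaggle file.
uploaded = {
    "train.csv": b"PassengerId,Survived,Name\n1,0,Braund\n2,1,Cumings\n",
}


def uploads_to_dataframes(uploaded, encoding="utf-8"):
    """Return {file name: DataFrame} for every uploaded csv file."""
    frames = {}
    for name, content in uploaded.items():
        # Decode the bytes to text, then wrap the text in a file-like object.
        text = content.decode(encoding)
        frames[name] = pd.read_csv(io.StringIO(text))
    return frames


frames = uploads_to_dataframes(uploaded)
df = frames["train.csv"]
print(df.shape)
```

Looping over `uploaded.items()` avoids hard-coding the file name, which helps when several files are selected in the upload dialog.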
Answered by Garima Jain
Step 1 - Mount your Google Drive in Colaboratory
from google.colab import drive
drive.mount('/content/gdrive')
Step 2 - Now you will see your Google Drive files in the left pane (file explorer). Right click on the file that you need to import and select Copy path. Then import as usual in pandas, using this copied path.
import pandas as pd
df = pd.read_csv('/content/gdrive/My Drive/data.csv')
Done!
Answered by Yasser Mustafa
Colab google: uploading csv from your PC
I had the same problem with an Excel file (*.xlsx). I solved it as follows, and I think you could do the same with csv files. If you have a file on your PC drive called file.xlsx, then:
1- Upload it from your hard drive by using this simple code:
from google.colab import files
uploaded = files.upload()
Click Choose Files and select the file. It is uploaded to the Colab session (not to your Google Drive).
2- Then:
import io
# The key must match the uploaded file name exactly, including case.
data = io.BytesIO(uploaded['file.xlsx'])
3- Finally, read your file:
import pandas as pd
df = pd.read_excel(data, sheet_name='1min', header=0, skiprows=2)
df.head()
4- Please change the parameter values to read your own file. I think this could be generalized to read other types of files!
Enjoy it!
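As a rough illustration of how the same steps might carry over to a csv file, here is a sketch that passes the bytes to read_csv through io.BytesIO. The `uploaded` dictionary is a hand-made stand-in for the result of `files.upload()`, and the file name, the two leading note lines, and the data values are all invented for the example.

```python
import io

import pandas as pd

# Stand-in for files.upload(): file name -> raw bytes.
# The first two lines are notes that should be skipped (made-up content).
uploaded = {
    "file.csv": (
        b"exported by some tool\n"
        b"units: seconds\n"
        b"time,value\n"
        b"09:00,1.5\n"
        b"09:01,2.5\n"
    ),
}

# read_csv accepts a binary file-like object, so BytesIO works here too.
data = io.BytesIO(uploaded["file.csv"])

# skiprows=2 drops the two note lines; header=0 then takes the next
# line ("time,value") as the column names.
df = pd.read_csv(data, header=0, skiprows=2)
print(df.head())
```

The `header` and `skiprows` arguments play the same role as in the read_excel call above; `sheet_name` has no csv equivalent.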
Answered by JARS
This worked for me:
from google.colab import auth
auth.authenticate_user()
from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth
from oauth2client.client import GoogleCredentials
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
myfile = drive.CreateFile({'id': '!!!YOUR FILE ID!!!'})
myfile.GetContentFile('file.csv')
Replace !!!YOUR FILE ID!!! with the id of the file in Google Drive (this is the long alphanumeric string that appears when you click on "obtain link to share"). Then you can access file.csv with pandas' read_csv:
import pandas as pd
frm = pd.read_csv('file.csv', header=None)
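If you only have the share link, the id can be pulled out of it with a regular expression. This is a sketch, not part of PyDrive: the function name `extract_drive_file_id` is hypothetical, and it assumes the link looks like `.../file/d/<id>/view` or `...?id=<id>`, which are the two link shapes I am aware of.

```python
import re


def extract_drive_file_id(link):
    """Return the file id from a Drive share link, or None if not found."""
    patterns = [
        r"/file/d/([A-Za-z0-9_-]+)",  # https://drive.google.com/file/d/<id>/view
        r"[?&]id=([A-Za-z0-9_-]+)",   # https://drive.google.com/open?id=<id>
    ]
    for pattern in patterns:
        match = re.search(pattern, link)
        if match:
            return match.group(1)
    return None


link = "https://drive.google.com/file/d/1D6ViUx8_ledfBqcxHCrFPcqBvNZitwCs/view?usp=sharing"
file_id = extract_drive_file_id(link)
print(file_id)
```

The returned string is what goes in place of !!!YOUR FILE ID!!! in the CreateFile call.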
Answered by Diwakar
Alternatively, you can also use GitHub to import files. You can take this as an example: https://drive.google.com/file/d/1D6ViUx8_ledfBqcxHCrFPcqBvNZitwCs/view?usp=sharing
Also, Google does not persist the files for long, so you may have to run the GitHub snippets time and again.
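One way the GitHub route might look, as a sketch only: read_csv can read directly from a URL, so it is enough to turn the page address of a file in a public repository into its raw address. The helper names below are hypothetical, and the repository URL in the usage line is a placeholder, not a real data set.

```python
import pandas as pd


def github_blob_to_raw(url):
    """Convert a github.com file page URL into its raw-content URL."""
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        raise ValueError("expected a URL like https://github.com/<user>/<repo>/blob/<branch>/<path>")
    # Drop the host prefix and the first "/blob/" path segment.
    path = url[len(prefix):].replace("/blob/", "/", 1)
    return "https://raw.githubusercontent.com/" + path


def read_csv_from_github(url):
    """Read a csv file from a public GitHub repository (needs network access)."""
    return pd.read_csv(github_blob_to_raw(url))


# Placeholder URL for illustration; substitute your own repository and file.
raw_url = github_blob_to_raw("https://github.com/someuser/somerepo/blob/main/data/train.csv")
print(raw_url)
```

Because the file is fetched again on each call, rerunning the cell after the Colab session resets is enough to get the data back.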