Python: Extract a Google Drive zip from a Google Colab notebook

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/49685924/

Date: 2020-08-19 19:11:49  Source: igfitidea

Extract Google Drive zip from Google colab notebook

python google-drive-api google-colaboratory zipfile

Asked by Laxmikant

I already have a zip of a dataset (2K images) on Google Drive, and I have to use it in an ML training algorithm. The code below extracts the content in string format:

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import io
import zipfile
# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download a file based on its file ID.
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz
file_id = '1T80o3Jh3tHPO7hI5FBxcX-jFnxEuUE9K' #-- Updated File ID for my zip
downloaded = drive.CreateFile({'id': file_id})
#print('Downloaded content "{}"'.format(downloaded.GetContentString(encoding='cp862')))

But I have to extract it and store it in a separate directory, as that would make the dataset easier to process (and to understand).

I tried to extract it further, but I get a "Not a zipfile" error:

dataset = io.BytesIO(downloaded.encode('cp862'))
zip_ref = zipfile.ZipFile(dataset, "r")
zip_ref.extractall()
zip_ref.close()

Google Drive Dataset

Note: The dataset link is just for reference. I have already downloaded this zip to my Google Drive, and I am referring only to the file in my Drive.

Answered by Harsh Gupta

You can simply use this:

!unzip file_location

Answered by giapnh

To unzip a file to a directory:

!unzip path_to_file.zip -d path_to_directory

Answered by Alon Lavian

To extract a Google Drive zip from a Google Colab notebook:

import zipfile
from google.colab import drive

drive.mount('/content/drive/')

zip_ref = zipfile.ZipFile("/content/drive/My Drive/ML/DataSet.zip", 'r')
zip_ref.extractall("/tmp")
zip_ref.close()
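
The same extraction can be written with a context manager, so the archive is closed even if extraction fails partway. This is a minimal sketch and not part of the original answer: the helper name extract_zip is hypothetical, and the Drive path in the usage comment is the one from the answer above, assuming the drive is already mounted.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target directory if missing
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(dest)
        return zip_ref.namelist()


# Usage in Colab (assumes drive.mount('/content/drive/') has already run):
# names = extract_zip("/content/drive/My Drive/ML/DataSet.zip", "/tmp")
```

Returning the member names makes it easy to check that the expected image files are present before training starts.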

Answered by Plo_Koon

Mount GDrive:

from google.colab import drive
drive.mount('/content/gdrive')

Open the link, copy the authorization code, paste it into the prompt, and press "Enter".

Check GDrive access:

!ls "/content/gdrive/My Drive"

Unzip the file from GDrive (-q stands for "quiet"):

!unzip -q "/content/gdrive/My Drive/dataset.zip"

Answered by Omid Ghahroodi

First, install unzip on Colab:

!apt install unzip

Then use unzip to extract your files:

!unzip source.zip -d destination_directory

Answered by Kripalu Sar

First create a new directory:

!mkdir file_destination

Now extract the zipped files into that directory:

!unzip file_location -d file_destination

Answered by korakot

Use GetContentFile() instead of GetContentString(). It saves the content to a file instead of returning it as a string.

downloaded.GetContentFile('images.zip') 

Then you can extract it later with unzip.
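
Before extracting, it can help to confirm that the saved file really is a zip archive, since a corrupted download can produce the "not a zip file" error from the question. This is a minimal sketch, assuming the file has already been saved locally as in the answer above; the helper name safe_extract and the destination directory in the usage comment are hypothetical.

```python
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, failing early if it is not a usable zip archive."""
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")
    with zipfile.ZipFile(zip_path) as archive:
        bad_member = archive.testzip()  # name of the first corrupt member, or None
        if bad_member is not None:
            raise ValueError(f"corrupt member in archive: {bad_member}")
        archive.extractall(dest_dir)
        return len(archive.namelist())  # number of entries in the archive


# Usage after downloaded.GetContentFile('images.zip'):
# count = safe_extract("images.zip", "images")
```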

Answered by Vaishnavi Bala

SIMPLE WAY TO CONNECT

1) First, authenticate:

from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()

2) Install google-drive-ocamlfuse to mount Google Drive via FUSE:

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

3) Verify credentials:

import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

4) Create a mount point named 'gdrive' to use in Colab, mount the drive there, and check that it works:

!mkdir gdrive
!google-drive-ocamlfuse gdrive
!ls gdrive
%cd gdrive

Answered by abdul

For Python

Connect to the drive:

from google.colab import drive
drive.mount('/content/drive')

Check the directory with !ls and !pwd.

To unzip:

!unzip drive/"My Drive"/images.zip

Answered by Md. Hishamur Rahman

After mounting the drive, use shutil.unpack_archive. It works with many common archive formats (e.g., "zip", "tar", "gztar", "bztar", "xztar") and it's simple:

import shutil
shutil.unpack_archive("filename", "path_to_extract")
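
A slightly fuller sketch of the same call. unpack_archive infers the format from the file extension, so the archive name needs a recognized suffix such as .zip unless the format argument is passed explicitly. The helper name unpack and the paths in the usage comment are hypothetical, and the drive is assumed to be mounted already.

```python
import os
import shutil


def unpack(archive_path, extract_dir, archive_format=None):
    """Unpack archive_path into extract_dir and return the top-level entries."""
    # archive_format=None lets shutil choose the unpacker from the file extension;
    # pass e.g. "zip" when the file name has no recognizable suffix.
    shutil.unpack_archive(archive_path, extract_dir, format=archive_format)
    return sorted(os.listdir(extract_dir))


# Usage in Colab (hypothetical paths):
# entries = unpack("/content/drive/My Drive/dataset.zip", "/content/dataset")
```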