Extract files from a zip without keeping the structure using Python ZipFile?

Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/4917284/

Time: 2020-08-18 18:08:55  Source: igfitidea

Tags: python, extract, unzip, zipfile

Asked by Thammas

I am trying to extract all files from a .zip that contains subfolders into a single folder. I want all the files from the subfolders extracted into one folder, without keeping the original structure. At the moment, I extract everything, move the files to one folder, then remove the leftover subfolders. Files with the same name are overwritten.

Is it possible to do this before the files are written?

Here is an example structure:

my_zip/file1.txt
my_zip/dir1/file2.txt
my_zip/dir1/dir2/file3.txt
my_zip/dir3/file4.txt

In the end I want this:

my_dir/file1.txt
my_dir/file2.txt
my_dir/file3.txt
my_dir/file4.txt

What can I add to this code?

import zipfile
my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

zip_file = zipfile.ZipFile(my_zip, 'r')
for files in zip_file.namelist():
    zip_file.extract(files, my_dir)
zip_file.close()

If I rename the file paths returned by zip_file.namelist(), I get this error:

KeyError: "There is no item named 'file2.txt' in the archive"
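
The error arises because extract() looks members up by their full archive name, so a shortened name such as 'file2.txt' no longer matches any entry. A minimal sketch that reproduces this with an in-memory archive (the archive contents are made up for illustration and are not from the original post):

```python
import io
import tempfile
import zipfile

# Build a small in-memory archive (hypothetical contents for illustration).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my_zip/dir1/file2.txt", "hello")

error_message = None
with zipfile.ZipFile(buffer) as zf, tempfile.TemporaryDirectory() as tmp:
    try:
        # Only the full member name "my_zip/dir1/file2.txt" exists in the
        # archive, so looking up the bare file name fails.
        zf.extract("file2.txt", tmp)
    except KeyError as exc:
        error_message = str(exc)

print(error_message)
```

So the member name passed to the archive has to stay intact; only the path written to disk should change.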

Accepted answer by Reiner Gerecke

This opens file handles for the members of the zip archive, extracts the filename and copies the contents to a target file (that's how ZipFile.extract works, but without taking care of subdirectories).

import os
import shutil
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip) as zip_file:
    for member in zip_file.namelist():
        filename = os.path.basename(member)
        # skip directories
        if not filename:
            continue

        # copy file (taken from zipfile's extract)
        source = zip_file.open(member)
        target = open(os.path.join(my_dir, filename), "wb")
        with source, target:
            shutil.copyfileobj(source, target)
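
The question notes that files with the same name get overwritten once the structure is flattened. As a rough sketch only, the same approach could be wrapped in a function that renames on collision. The function name extract_flat and the "_1", "_2" suffix scheme are assumptions for illustration, not part of the answer above, and the demo archive is invented:

```python
import io
import os
import shutil
import tempfile
import zipfile


def extract_flat(zip_source, target_dir):
    """Hypothetical helper: write every file of the archive directly into target_dir."""
    written = []
    with zipfile.ZipFile(zip_source) as zip_file:
        for member in zip_file.namelist():
            filename = os.path.basename(member)
            if not filename:
                continue  # directory entry, nothing to write
            # Assumed collision policy: append _1, _2, ... when the name is taken.
            stem, ext = os.path.splitext(filename)
            candidate = filename
            counter = 1
            while os.path.exists(os.path.join(target_dir, candidate)):
                candidate = f"{stem}_{counter}{ext}"
                counter += 1
            target_path = os.path.join(target_dir, candidate)
            with zip_file.open(member) as source, open(target_path, "wb") as target:
                shutil.copyfileobj(source, target)
            written.append(candidate)
    return written


# Demo with an in-memory archive similar to the question's layout,
# plus one duplicate file name to show the renaming.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my_zip/file1.txt", "one")
    zf.writestr("my_zip/dir1/file2.txt", "two")
    zf.writestr("my_zip/dir1/dir2/file3.txt", "three")
    zf.writestr("my_zip/dir3/file1.txt", "duplicate name")

with tempfile.TemporaryDirectory() as tmp:
    names = extract_flat(buffer, tmp)
    flat_listing = sorted(os.listdir(tmp))

print(flat_listing)
```

Whether to rename, skip or overwrite duplicates depends on the use case; renaming is just one option.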

Answer by jsbueno

Just extract to bytes in memory, compute the filename, and write the file yourself instead of letting the library do it. In short, use the read() method instead of extract():

import os
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip, 'r') as zip_file:
    for member in zip_file.namelist():
        # zip archives use "/" as the directory separator regardless of OS
        filename = member.split("/")[-1]
        # directory entries end with "/" and leave an empty name: skip them
        if not filename:
            continue
        # read the member into memory, then write it under its bare name
        data = zip_file.read(member)
        with open(os.path.join(my_dir, filename), "wb") as myfile:
            myfile.write(data)

Answer by Gerhard Götz

It is possible to iterate over ZipFile.infolist(). On the returned ZipInfo objects you can then modify the filename attribute to remove the directory part, and finally extract each one to a specified directory.

import os
import zipfile

my_dir = r"D:\Download"
my_zip = r"D:\Download\my_file.zip"

with zipfile.ZipFile(my_zip) as zip_file:
    for zip_info in zip_file.infolist():
        # skip directory entries
        if zip_info.filename.endswith('/'):
            continue
        # drop the directory part so the file lands directly in my_dir
        zip_info.filename = os.path.basename(zip_info.filename)
        zip_file.extract(zip_info, my_dir)
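
To try this technique without touching real files, here is a small self-contained sketch that runs it against an in-memory archive and a temporary directory. The archive contents are invented for illustration, and ZipInfo.is_dir() is used in place of the trailing-slash check, which assumes Python 3.6 or later:

```python
import io
import os
import tempfile
import zipfile

# Hypothetical archive with nested folders, built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("my_zip/dir1/dir2/file3.txt", "three")
    zf.writestr("my_zip/dir3/file4.txt", "four")

with zipfile.ZipFile(buffer) as archive, tempfile.TemporaryDirectory() as tmp:
    for zip_info in archive.infolist():
        if zip_info.is_dir():
            continue  # skip directory entries
        # Rewrite only the output name; the archive itself is not modified.
        zip_info.filename = os.path.basename(zip_info.filename)
        archive.extract(zip_info, tmp)
    extracted = sorted(os.listdir(tmp))
    with open(os.path.join(tmp, "file3.txt")) as fh:
        file3_text = fh.read()

print(extracted)
```

As with the other approaches, files that share a name in different subfolders would overwrite each other here.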

Answer by vsnahar

In case you are getting a BadZipFile error, you can unzip the archive by running 7-Zip as a subprocess. Assuming you have 7-Zip installed, use the following code.

import subprocess

# destFolder is a placeholder: set it to your destination folder first
my_dir = destFolder  # destination folder
my_zip = destFolder + "/" + "filename.zip"  # file you want to extract
ziploc = "C:/Program Files/7-Zip/7z.exe"  # location where 7-Zip is installed
# 'e' extracts without directory paths; here only .txt files, from all subdirectories
cmd = [ziploc, 'e', my_zip, '-o' + my_dir, '*.txt', '-r']
sp = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
output, _ = sp.communicate()  # wait for 7-Zip to finish

Answer by L0laapk3

A similar concept to Gerhard Götz's solution, but adapted for extracting a single file instead of the entire zip:

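import os
from zipfile import ZipFile

# zipPath, path_in_zip and destination are placeholders to define beforehand
with ZipFile(zipPath, 'r') as zipObj:
    zipInfo = zipObj.getinfo(path_in_zip)
    # write the member under the destination's file name, in its directory
    zipInfo.filename = os.path.basename(destination)
    zipObj.extract(zipInfo, os.path.dirname(os.path.realpath(destination)))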