Python: How do you unzip very large files in Python?

Question

Asked by Marc Novakowski

Using Python 2.4 and the built-in ZipFile library, I cannot read very large zip files (greater than 1 or 2 GB) because it wants to store the entire contents of the uncompressed file in memory. Is there another way to do this (either with a third-party library or some other hack), or must I "shell out" and unzip it that way (which obviously isn't as cross-platform)?

Answer 1

Accepted answer by S.Lott

Here's an outline for decompressing large files.

import struct
import zipfile
import zlib

doc = "archive.zip"  # placeholder: path to the zip file

src = open(doc, "rb")
zf = zipfile.ZipFile(src)
for m in zf.infolist():
    if m.filename.endswith("/"):
        continue  # directory entry, nothing to decompress

    # Examine the local file header: 30 fixed bytes, then the name and extra field
    print("%s %d %d" % (m.filename, m.header_offset, m.compress_size))
    src.seek(m.header_offset)
    header = struct.unpack("<4s5H3L2H", src.read(30))
    src.seek(header[9] + header[10], 1)  # skip the file name and extra field

    # Build a decompression object (-15: raw deflate, no zlib header).
    # This assumes the member was stored with the deflate method.
    decomp = zlib.decompressobj(-15)

    # Read and decompress in blocks so the member is never held in memory at once
    out = open(m.filename, "wb")
    remaining = m.compress_size
    while remaining > 0:
        block = src.read(min(remaining, 64 * 1024))
        if not block:
            break
        remaining -= len(block)
        out.write(decomp.decompress(block))
    out.write(decomp.flush())
    out.close()

zf.close()
src.close()
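
The key step in the outline above is feeding a raw-deflate stream to zlib.decompressobj(-15) one piece at a time. A minimal, self-contained sketch of just that step follows; the helper name inflate_in_chunks and the demo data are hypothetical, made up for illustration, and are not part of the original answer.

```python
import zlib

def inflate_in_chunks(chunks):
    """Yield decompressed pieces from an iterable of raw-deflate byte chunks."""
    decomp = zlib.decompressobj(-15)  # negative wbits: raw deflate, no zlib header
    for chunk in chunks:
        piece = decomp.decompress(chunk)
        if piece:
            yield piece
    tail = decomp.flush()
    if tail:
        yield tail

# Hypothetical demo data: build a raw-deflate stream like the one a zip member holds.
original = b"large file contents " * 50000
comp = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = comp.compress(original) + comp.flush()

# Feed the compressed bytes 4 KiB at a time and reassemble the result.
pieces = (raw[i:i + 4096] for i in range(0, len(raw), 4096))
restored = b"".join(inflate_in_chunks(pieces))
```

Note that a small compressed chunk can expand to a much larger output piece. If that matters, decompress() also accepts a max_length argument, with the unprocessed input left in the object's unconsumed_tail attribute.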

Answer 2

Answered by Martijn Pieters

As of Python 2.6, you can use ZipFile.open() to open a file handle on an archive member, and copy its contents efficiently to a target file of your choosing:

import errno
import os
import shutil
import zipfile

TARGETDIR = '/foo/bar/baz'
doc = 'archive.zip'  # placeholder: path to the zip file to extract

with open(doc, "rb") as zipsrc:
    zfile = zipfile.ZipFile(zipsrc)
    for member in zfile.infolist():
        target_path = os.path.join(TARGETDIR, member.filename)
        if target_path.endswith('/'):  # folder entry, create
            try:
                os.makedirs(target_path)
            except (OSError, IOError) as err:
                # Ignore the error if the folder already exists
                if err.errno != errno.EEXIST:
                    raise
            continue
        # Stream the member to disk without loading it into memory
        with open(target_path, 'wb') as outfile, zfile.open(member) as infile:
            shutil.copyfileobj(infile, outfile)

This uses shutil.copyfileobj() to read data from the open zipfile member in chunks and copy it to the output file, so the whole member is never held in memory.
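
To make the chunked copy explicit, here is a rough sketch of the same idea with a hand-written read/write loop in place of shutil.copyfileobj(). The function name extract_member, the chunk_size default, and the demo archive are assumptions for illustration only, and the loop is a simplified stand-in rather than the library's exact implementation.

```python
import os
import tempfile
import zipfile

def extract_member(zip_path, member_name, dest_path, chunk_size=64 * 1024):
    """Stream one archive member to dest_path, chunk_size bytes at a time."""
    copied = 0
    with zipfile.ZipFile(zip_path) as zf:
        with zf.open(member_name) as infile, open(dest_path, "wb") as outfile:
            while True:
                chunk = infile.read(chunk_size)  # decompressed on the fly
                if not chunk:
                    break
                outfile.write(chunk)
                copied += len(chunk)
    return copied

# Hypothetical demo: build a small archive in a temporary directory, then extract it.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "demo.zip")
payload = b"0123456789" * 100000
with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("data.bin", payload)

dest_path = os.path.join(workdir, "data.bin")
copied = extract_member(zip_path, "data.bin", dest_path)
```

Only one chunk of decompressed data is held at a time, so memory use stays roughly constant regardless of the member's size.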
