Decompress bz2 files in Python

Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/16963352/

Date: 2020-08-19 00:06:19  Source: igfitidea

Decompress bz2 files

Tags: python, compression

Asked by MY_1129

I would like to decompress files that sit in different directories under different paths. My code is below, and the error I get is "invalid data stream". Please help me out. Thank you so much.

import sys
import os
import bz2
from bz2 import decompress

path = "Dir"
for(dirpath,dirnames,files)in os.walk(path):
   for file in files:
       filepath = os.path.join(dirpath,filename)
       newfile = bz2.decompress(file)
       newfilepath = os.path.join(dirpath,newfile)

Answered by Jochen Ritzel

bz2.decompress takes compressed data and inflates it. You passed a filename, not the data in the file!
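
To make the mistake concrete, here is a minimal sketch of what happens when the bytes of a file name are handed to bz2.decompress. The name example.txt.bz2 is hypothetical, and the sketch assumes Python 3, where the failure surfaces as an OSError (passing a str instead of bytes would fail earlier with a TypeError).

```python
import bz2

# Hypothetical file name, used only to illustrate the mistake.
name = b"example.txt.bz2"

error = None
try:
    # The bytes of the *name* are not a bz2 stream, so this cannot succeed.
    bz2.decompress(name)
except OSError as exc:
    error = exc
    print("decompress failed:", exc)
```

The function never touches the file system; it only sees whatever bytes it is given.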

Do this instead:

with bz2.BZ2File(filepath) as archive:  # open the compressed file
    data = archive.read()  # read the decompressed data
newfilepath = filepath[:-4]  # assuming the filepath ends with .bz2
with open(newfilepath, 'wb') as out:
    out.write(data)  # write the uncompressed file
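
Placed back into the directory walk from the question, the same idea might look like the sketch below. The function name decompress_tree and the rule of skipping files that do not end in .bz2 are assumptions added for illustration; they are not part of the original answer.

```python
import bz2
import os


def decompress_tree(root):
    """Decompress every *.bz2 file under root, next to its archive."""
    written = []
    for dirpath, dirnames, files in os.walk(root):
        for filename in files:
            if not filename.endswith(".bz2"):
                continue  # assumption: anything else is not a bz2 archive
            filepath = os.path.join(dirpath, filename)
            newfilepath = filepath[:-4]  # strip the ".bz2" suffix
            with bz2.BZ2File(filepath) as archive, open(newfilepath, "wb") as out:
                out.write(archive.read())  # whole file is held in memory
            written.append(newfilepath)
    return written
```

Calling decompress_tree("Dir") would then cover the directory used in the question and return the paths it wrote.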

Answered by michaelmeyer

This should work:

for filename in files:
    archive_path = os.path.join(dirpath,filename)
    outfile_path = os.path.join(dirpath, filename[:-4])
    with open(archive_path, 'rb') as source, open(outfile_path, 'wb') as dest:
        dest.write(bz2.decompress(source.read()))

Answered by Juraj Ivan?i?

bz2.compress/decompress work with binary data:

>>> import bz2
>>> compressed = bz2.compress(b'test_string')
>>> compressed
b'BZh91AY&SYJ|i\x05\x00\x00\x04\x83\x80\x00\x00\x82\xa1\x1c\x00 \x00"\x03h\x840"
P\xdf\x04\x99\xe2\xeeH\xa7\n\x12\tO\x8d \xa0'
>>> bz2.decompress(compressed)
b'test_string'

In short, you need to process the file contents yourself. If you have very large files, prefer bz2.BZ2Decompressor over bz2.decompress, because the latter requires holding the entire file in memory as a byte array.

for filename in files:
    filepath = os.path.join(dirpath, filename)
    newfilepath = os.path.join(dirpath,filename + '.decompressed')
    with open(newfilepath, 'wb') as new_file, open(filepath, 'rb') as file:
        decompressor = bz2.BZ2Decompressor()
        for data in iter(lambda : file.read(100 * 1024), b''):
            new_file.write(decompressor.decompress(data))
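
One thing the chunked loop does not report is whether the archive was complete. A possible variation, sketched below, checks the decompressor's eof attribute after feeding the data. The function name stream_decompress and its chunk_size parameter are assumptions for illustration, and the sketch handles a single bz2 stream per file.

```python
import bz2


def stream_decompress(filepath, newfilepath, chunk_size=100 * 1024):
    """Decompress in chunks; return True if the bz2 stream ended cleanly."""
    decompressor = bz2.BZ2Decompressor()
    with open(filepath, "rb") as source, open(newfilepath, "wb") as dest:
        for data in iter(lambda: source.read(chunk_size), b""):
            dest.write(decompressor.decompress(data))
            if decompressor.eof:
                break  # end-of-stream marker seen; stop feeding data
    return decompressor.eof
```

A return value of False suggests the input ended before the end-of-stream marker, for example because the archive was truncated.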

You can also use bz2.BZ2File to make this even simpler:

for filename in files:
    filepath = os.path.join(dirpath, filename)
    newfilepath = os.path.join(dirpath, filename + '.decompressed')
    with open(newfilepath, 'wb') as new_file, bz2.BZ2File(filepath, 'rb') as file:
        for data in iter(lambda : file.read(100 * 1024), b''):
            new_file.write(data)
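
The manual read loop can arguably be shortened further with the standard library's shutil.copyfileobj, which copies between file objects in fixed-size chunks. This is a swapped-in alternative rather than part of the original answer; it assumes Python 3.3 or later for bz2.open, and the function name decompress_file is hypothetical.

```python
import bz2
import shutil


def decompress_file(filepath, newfilepath):
    """Stream-decompress filepath into newfilepath without loading it all."""
    with bz2.open(filepath, "rb") as source, open(newfilepath, "wb") as dest:
        shutil.copyfileobj(source, dest)  # copies chunk by chunk
```

Inside the loop over files, this would be called as decompress_file(filepath, newfilepath).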