Decompressing bz2 files in Python

Question

Asked by MY_1129

I would like to decompress files that live in several directories under different paths. My code is below; it fails with the error "invalid data stream". Please help me out. Thank you so much.

import sys
import os
import bz2
from bz2 import decompress

path = "Dir"
for(dirpath,dirnames,files)in os.walk(path):
   for file in files:
       filepath = os.path.join(dirpath,filename)
       newfile = bz2.decompress(file)
       newfilepath = os.path.join(dirpath,newfile)

Answer 1

Answered by Jochen Ritzel

bz2.decompress takes compressed data and inflates it. You passed it a filename, not the data in the file!
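The failure can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 (where bad input surfaces as OSError); the helper name is_bz2_data and the file name are made up for illustration.

```python
import bz2

def is_bz2_data(payload):
    """Return True if payload decompresses as bz2 data (hypothetical helper)."""
    try:
        bz2.decompress(payload)
        return True
    except (OSError, ValueError):
        # OSError: not a bz2 stream; ValueError: the stream is truncated
        return False

# A file *name* is just text, not a compressed stream.
print(is_bz2_data(b"report.txt.bz2"))           # the name only
print(is_bz2_data(bz2.compress(b"some data")))  # real compressed content
```

The first call fails because the bytes of the name do not start with a bz2 header, which is the same situation as in the question.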

Do this instead:

with bz2.BZ2File(filepath) as archive:  # open the compressed file
    data = archive.read()  # get the decompressed data
newfilepath = filepath[:-4]  # assuming the filepath ends with .bz2
with open(newfilepath, 'wb') as out:
    out.write(data)  # write the uncompressed file
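Combined with the directory walk from the question, the whole job might look like the sketch below. The function name decompress_tree is invented here, and the code assumes that every archive carries a .bz2 suffix and is small enough to read into memory in one go.

```python
import bz2
import os

def decompress_tree(root):
    """Decompress every *.bz2 file under root; return the new file paths."""
    written = []
    for dirpath, dirnames, files in os.walk(root):
        for filename in files:
            if not filename.endswith(".bz2"):
                continue  # skip files that are not bz2 archives
            filepath = os.path.join(dirpath, filename)
            newfilepath = filepath[:-4]  # drop the ".bz2" suffix
            with bz2.BZ2File(filepath) as archive:
                data = archive.read()  # decompressed bytes
            with open(newfilepath, "wb") as out:
                out.write(data)
            written.append(newfilepath)
    return written
```

Calling decompress_tree("Dir") would then write each decompressed file next to its archive and leave the archives in place.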

Answer 2

Answered by michaelmeyer

This should work:

for filename in files:
    archive_path = os.path.join(dirpath,filename)
    outfile_path = os.path.join(dirpath, filename[:-4])
    with open(archive_path, 'rb') as source, open(outfile_path, 'wb') as dest:
        dest.write(bz2.decompress(source.read()))

Answer 3

Answered by Juraj Ivančić

bz2.compress/decompress work with binary data:

>>> import bz2
>>> compressed = bz2.compress(b'test_string')
>>> compressed
b'BZh91AY&SYJ|i\x05\x00\x00\x04\x83\x80\x00\x00\x82\xa1\x1c\x00 \x00"\x03h\x840"P\xdf\x04\x99\xe2\xeeH\xa7\n\x12\tO\x8d \xa0'
>>> bz2.decompress(compressed)
b'test_string'

In short: you need to read the file contents yourself and hand them to bz2. If you have very large files, prefer bz2.BZ2Decompressor to bz2.decompress, because the latter requires holding the entire file in memory as a single bytes object.

for filename in files:
    filepath = os.path.join(dirpath, filename)
    newfilepath = os.path.join(dirpath,filename + '.decompressed')
    with open(newfilepath, 'wb') as new_file, open(filepath, 'rb') as file:
        decompressor = bz2.BZ2Decompressor()
        for data in iter(lambda : file.read(100 * 1024), b''):
            new_file.write(decompressor.decompress(data))
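To check that feeding the decompressor piece by piece yields the same bytes as a one-shot call, here is a small in-memory sketch. The helper name decompress_in_chunks and the chunk size are arbitrary choices for illustration.

```python
import bz2

def decompress_in_chunks(compressed, chunk_size=16):
    """Feed compressed bytes to BZ2Decompressor chunk_size bytes at a time."""
    decompressor = bz2.BZ2Decompressor()
    pieces = []
    for start in range(0, len(compressed), chunk_size):
        chunk = compressed[start:start + chunk_size]
        # decompress() may return b'' until enough input has arrived
        pieces.append(decompressor.decompress(chunk))
    return b"".join(pieces)

original = b"test_string" * 1000
compressed = bz2.compress(original)
assert decompress_in_chunks(compressed) == original
```

Note that a single BZ2Decompressor object handles one compressed stream only; a file made of several concatenated streams needs a fresh decompressor per stream.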

You can also use bz2.BZ2File to make this even simpler:

for filename in files:
    filepath = os.path.join(dirpath, filename)
    newfilepath = os.path.join(dirpath, filename + '.decompressed')
    with open(newfilepath, 'wb') as new_file, bz2.BZ2File(filepath, 'rb') as file:
        for data in iter(lambda : file.read(100 * 1024), b''):
            new_file.write(data)
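The chunked copy loop can also be delegated to the standard library. This is a sketch rather than part of the original answer: it assumes Python 3.3 or later (for bz2.open), and the name decompress_file is invented here.

```python
import bz2
import shutil

def decompress_file(filepath, newfilepath):
    """Stream-decompress filepath into newfilepath without loading it whole."""
    with bz2.open(filepath, "rb") as source, open(newfilepath, "wb") as dest:
        shutil.copyfileobj(source, dest)  # copies in fixed-size chunks
```

Inside the loop above this would be called as decompress_file(filepath, newfilepath).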
