Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/20004859/

Time: 2020-08-18 19:14:44  Source: igfitidea

Python how to read raw binary from a file? (audio/video/text)

python python-2.7 binary

Asked by user2803250

I want to read the raw binary of a file and put it into a string. Currently I am opening the file with the "rb" flag and printing each byte, but it comes out as ASCII characters (for text, that is; for video and audio files it gives symbols and gibberish). I'd like to get the raw 0's and 1's if possible. This needs to work for audio and video files as well, so simply converting the ASCII to binary isn't an option.

with open(filePath, "rb") as file:
    byte = file.read(1)  # read a single byte from the file
    print(byte)          # shows the byte as a character, not as 0's and 1's

Answered by bruno desthuilliers

What you are reading really IS the "raw binary" content of your "binary" file. Strange as it may seem, binary data are not "0's and 1's" but binary words (aka bytes, cf. http://en.wikipedia.org/wiki/Byte), which have an integer (base 10) value and can be interpreted as ASCII chars. Or as integers (which is how one usually does binary operations). Or as hexadecimal. For what it's worth, "text" is actually "raw binary data" too.
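
As a rough illustration of the point above, the sketch below takes a single byte and shows it as an integer, as hexadecimal, as an ASCII character, and as a string of bits. It is written for Python 3 (the question is tagged python-2.7, where indexing a byte string behaves differently), and the sample byte and variable names are only illustrative assumptions.

```python
# One byte, four views of the same value (Python 3 sketch).
raw = b"d"                      # a single byte, as returned by file.read(1) in "rb" mode

value = raw[0]                  # integer view: indexing bytes yields an int in Python 3
as_hex = format(value, "02x")   # hexadecimal view
as_char = chr(value)            # ASCII character view
as_bits = format(value, "08b")  # "0's and 1's" view, zero-padded to 8 digits

print(value, as_hex, as_char, as_bits)  # 100 64 d 01100100
```

All four outputs describe the same stored value; only the way it is displayed changes.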

To get a "binary" representation you can have a look here: "Convert binary to ASCII and vice versa", but that's not going to give you more "raw binary data" than what you actually have...
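
One way to see that a bit string carries no extra information is to convert bytes to bits and back again. The following is a minimal sketch, not the method from the linked question; the helper names `bytes_to_bits` and `bits_to_bytes` are hypothetical and the sample bytes are arbitrary.

```python
def bytes_to_bits(data):
    """Render each byte as eight '0'/'1' characters."""
    return "".join(format(b, "08b") for b in bytearray(data))


def bits_to_bytes(bits):
    """Inverse of bytes_to_bits: pack every 8 characters back into one byte."""
    return bytes(bytearray(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)))


sample = b"\x00\xffhi"            # arbitrary bytes, including non-printable ones
bits = bytes_to_bits(sample)
restored = bits_to_bytes(bits)

print(bits)                       # 00000000111111110110100001101001
print(restored == sample)         # True
```

The round trip returns exactly the original bytes, so the bit string is just a longer spelling of the same data.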

Now the question: why do you want these data as "0's and 1's" exactly?

Answered by Holy Mackerel

To get the binary representation, I think you will need to import binascii, then:

byte = f.read(1)
binary_string = bin(int(binascii.hexlify(byte), 16))[2:].zfill(8)

Or, broken down:

import binascii

filePath = "mysong.mp3"
with open(filePath, "rb") as file:
    byte = file.read(1)                                   # one raw byte
    hexadecimal = binascii.hexlify(byte).decode("ascii")  # two hex digits as text
    decimal = int(hexadecimal, 16)                        # integer value, 0-255
    binary = bin(decimal)[2:].zfill(8)                    # drop "0b", pad to 8 bits
    print("hex: %s, decimal: %s, binary: %s" % (hexadecimal, decimal, binary))

This will output (here for a file whose first byte is 0x64, the ASCII character "d"):

hex: 64, decimal: 100, binary: 01100100
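
The same idea might be extended from one byte to a whole file, which is closer to what the question asks for. The sketch below is one possible approach under stated assumptions: it targets Python 3, the function name `file_to_bit_string` and the chunk size are hypothetical choices, and a small temporary file stands in for a real audio or video file.

```python
import os
import tempfile


def file_to_bit_string(path, chunk_size=4096):
    """Return the file's contents as one string of '0' and '1' characters."""
    pieces = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:          # an empty read means end of file
                break
            pieces.append("".join(format(b, "08b") for b in bytearray(chunk)))
    return "".join(pieces)


# Demo on a small temporary file standing in for an audio or video file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"d\x00\xff")
    demo_path = tmp.name

bit_string = file_to_bit_string(demo_path)
os.remove(demo_path)

print(bit_string)  # 011001000000000011111111
```

Note that the resulting string has eight characters per byte of input, so for large media files it will be many times bigger than the file itself.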