Original question: http://stackoverflow.com/questions/25168616/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Writing binary data to a file in Python
Asked by Schafer
I am trying to write data (text, floating point data) to a file in binary, which is to be read by another program later. The problem is that this program (in Fort95) is incredibly particular; each byte has to be in exactly the right place in order for the file to be read correctly. I've tried using Bytes objects and .encode() to write, but haven't had much luck (I can tell from the file size that it is writing extra bytes of data). Some code I've tried:
mgcnmbr='42'
bts=bytes(mgcnmbr)
test_file=open('PATH_HERE/test_file.dat','ab')
test_file.write(bts)
test_file.close()
I've also tried:
mgcnmbr='42'
bts=mgcnmbr.encode('utf_32_le')
test_file=open('PATH_HERE/test_file.dat','ab')
test_file.write(bts)
test_file.close()
To clarify, what I need is the integer value 42, written as a 4-byte binary value. Next, I would write the numbers 1 and 0 as 4-byte binary values. At that point, I should have exactly 12 bytes. Each is a 4-byte signed integer, written in binary. I'm pretty new to Python and can't seem to get it to work. Any suggestions? Something like this? I need complete control over how many bytes each integer (and later, each 4-byte floating-point value) takes.
Thanks
Answered by Algorithmic Canary
Assuming that you want it in little-endian, you could do something like this to write 42 in a four byte binary.
test_file=open('PATH_HERE/test_file.dat','ab')
test_file.write(b'\x2A\0\0\0')
test_file.close()
2A is 42 in hexadecimal, and the bytes literal b'\x2A\0\0\0' makes the first byte equal to 42, followed by three zero bytes. This code writes the bytes: 42, 0, 0, 0.
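As a quick check of the byte values 42, 0, 0, 0, here is a minimal sketch that builds the same four bytes without hand-writing escapes. It assumes little-endian order, as the answer does; the variable names are only illustrative.

```python
# Build the 4-byte little-endian encoding of the integer 42.
value = 42

# int.to_bytes is a built-in method (Python 3.2+); signed=True matches
# a signed 4-byte integer.
encoded = value.to_bytes(4, byteorder='little', signed=True)

# The same bytes written as a literal: hexadecimal 2A is decimal 42.
literal = b'\x2A\0\0\0'

print(list(encoded))        # [42, 0, 0, 0]
print(encoded == literal)   # True
```

Printing `list(encoded)` shows the numeric value of each byte, which is usually easier to compare against a file format specification than the escaped literal.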
Your code writes the bytes that represent the character '4' in UTF-32, followed by the bytes that represent the character '2' in UTF-32. This means it writes the bytes 52, 0, 0, 0, 50, 0, 0, 0, because each character takes four bytes when encoded in UTF-32.
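The difference between encoding text and encoding a number can be seen directly in a short sketch. It uses the same 'utf_32_le' codec as the question; the variable names are illustrative.

```python
# Encoding the *text* '42' yields one 4-byte code unit per character,
# not the integer 42.
text = '42'
encoded_text = text.encode('utf_32_le')

print(len(encoded_text))    # 8
print(list(encoded_text))   # [52, 0, 0, 0, 50, 0, 0, 0]
```

Here 52 and 50 are the code points of the characters '4' and '2', which is why the file ends up twice as large as expected.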
A hex editor could also be useful for debugging: it lets you see the actual bytes your program is outputting, not just the file size.
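If no hex editor is at hand, a similar view can be had from Python itself. This is only a rough sketch: the scratch file path is hypothetical and created just for the illustration, and little-endian order is assumed as in the answer.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'test_file.dat')

with open(path, 'wb') as f:
    f.write(b'\x2A\0\0\0')

# Read the file back and show its size and contents in hexadecimal.
with open(path, 'rb') as f:
    data = f.read()

print(len(data), data.hex())   # 4 2a000000
```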
Answered by Green Carpet
In my question, "Write binary string in binary file Python 3.4", I did it like this:
file.write(bytes(chr(int(mgcnmbr)), 'iso8859-1'))
Answered by M.J. Rayburn
You need the struct module.
import struct
fout = open('test.dat', 'wb')
fout.write(struct.pack('>i', 42))
fout.write(struct.pack('>f', 2.71828182846))
fout.close()
The first argument in struct.pack is the format string.
The first character in the format string dictates the byte order, or endianness, of the data (whether the most significant or the least significant byte is stored first: big-endian or little-endian). Endianness varies from system to system. If ">" doesn't work, try "<".
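To make the effect of the byte-order character concrete, this small sketch packs the same integer both ways and prints the individual byte values. The variable names are illustrative.

```python
import struct

big = struct.pack('>i', 42)      # most significant byte first
little = struct.pack('<i', 42)   # least significant byte first

print(list(big))      # [0, 0, 0, 42]
print(list(little))   # [42, 0, 0, 0]
```

Which of the two the reading program expects depends on the machine and compiler it was built for, so comparing against a known-good file is the safest way to decide.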
The second character in the format string is the data type. Unsurprisingly, "i" stands for integer and "f" stands for float. The number of bytes is determined by the type. Shorts ("h"), for example, are two bytes long. There are also codes for unsigned types; "H", for instance, corresponds to an unsigned short.
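The size of each type code can be checked with struct.calcsize. The sketch below uses an explicit "<" prefix so that the sizes reported are the struct module's standard ones; the selection of codes is just a sample.

```python
import struct

# Map a few type codes to their size in bytes under an explicit byte order.
sizes = {code: struct.calcsize('<' + code) for code in 'hHiIfd'}

for code, size in sizes.items():
    print(code, size)

# h and H (short) take 2 bytes, i and I (int) and f (float) take 4,
# and d (double) takes 8.
```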
The second argument in struct.pack is of course the value to be packed into the bytes object.
Here's the part where I tell you that I lied about a couple of things. First, I said that the number of bytes is determined by the type. This is only partially true. The size of a given type is technically platform dependent, as the C/C++ standard (which the struct module is based on) merely specifies minimum sizes. This leads me to the second lie. The first character in the format string also encodes whether the standard (minimum) number of bytes or the native (platform-dependent) number of bytes is to be used. (Both ">" and "<" guarantee that the standard, minimum number of bytes is used, which is in fact four in the case of an integer "i" or a float "f".) It additionally encodes the alignment of the data.
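The difference between standard and native layout shows up as soon as two types of different sizes are combined. The following is a sketch only; the native result depends on the platform, so no specific value should be relied upon for it.

```python
import struct

# Standard sizes, no padding: 1 byte for 'c' plus 4 bytes for 'i'.
standard_size = struct.calcsize('<ci')

# Native sizes and alignment ('@' is also what struct uses when no prefix
# is given). Padding may be inserted before the int, so the total is
# platform dependent.
native_size = struct.calcsize('@ci')

print(standard_size)   # 5
print(native_size)     # platform dependent
```

For a file that another program reads byte for byte, an explicit "<" or ">" avoids this hidden padding.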
The documentation on the struct module has tables for the format string parameters.
You can also pack multiple primitives into a single bytes object and achieve the same result.
import struct
fout = open('test.dat', 'wb')
fout.write(struct.pack('>if', 42, 2.71828182846))
fout.close()
And you can of course parse binary data with struct.unpack.
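As a rough round-trip sketch, the snippet below packs the three integers from the question (42, 1 and 0) and reads them back with struct.unpack. Little-endian order is an assumption here; swap "<" for ">" if the reading program expects big-endian.

```python
import struct

# Three 4-byte signed integers, little-endian (assumed byte order).
header = struct.pack('<3i', 42, 1, 0)
print(len(header))                     # 12

# struct.unpack always returns a tuple, even for a single value.
print(struct.unpack('<3i', header))    # (42, 1, 0)

# A 4-byte float holds only single precision, so the value read back
# is an approximation of the value that was packed.
packed = struct.pack('<if', 42, 2.71828182846)
number, value = struct.unpack('<if', packed)
print(number, value)
```

The length check on `header` is a cheap way to confirm that exactly 12 bytes will be written before the file ever reaches the other program.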

