python Python二进制数据读取
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/1591920/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Python binary data reading
提问by Hyman
A urllib2 request receives binary response as below:
一个 urllib2 请求接收二进制响应如下:
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
Its structure is:
它的结构是:
DATA, TYPE, DESCRIPTION
00 00 00 01, 4 bytes, Symbol Count =1
00 04, 2 bytes, Symbol Length = 4
41 4D 54 44, 6 bytes, Symbol = AMTD
00, 1 byte, Error code = 0 (OK)
00 00 00 02, 4 bytes, Bar Count = 2
FIRST BAR
41 97 33 33, 4 bytes, Close = 18.90
41 99 5C 29, 4 bytes, High = 19.17
41 90 3D 71, 4 bytes, Low = 18.03
41 91 D7 0A, 4 bytes, Open = 18.23
47 0F C6 14, 4 bytes, Volume = 3,680,608
00 00 01 16 6A E0 68 80, 8 bytes, Timestamp = November 23,2007
SECOND BAR
41 93 B4 05, 4 bytes, Close = 18.4629
41 97 1E B8, 4 bytes, High = 18.89
41 90 7A E1, 4 bytes, Low = 18.06
41 96 8F 57, 4 bytes, Open = 18.82
46 E6 2E 80, 4 bytes, Volume = 2,946,325
00 00 01 16 7A 53 7C 80, 8 bytes, Timestamp = November 26,2007
TERMINATOR
FF FF, 2 bytes,
How to read binary data like this?
如何读取这样的二进制数据?
Thanks in advance.
提前致谢。
Update:
更新:
I tried struct module on first 6 bytes with following code:
我使用以下代码在前 6 个字节上尝试了 struct 模块:
struct.unpack('ih', response.read(6))
(16777216, 1024)
(16777216, 1024)
But it should output (1, 4). I take a look at the manual but have no clue what was wrong.
但它应该输出 (1, 4)。我看了一下手册,但不知道出了什么问题。
回答by Alex Martelli
So here's my best shot at interpreting the data you're giving...:
所以这是我解释你提供的数据的最好方法......:
import datetime
import struct
class Printable(object):
specials = ()
def __str__(self):
resultlines = []
for pair in self.__dict__.items():
if pair[0] in self.specials: continue
resultlines.append('%10s %s' % pair)
return '\n'.join(resultlines)
head_fmt = '>IH6sBH'
head_struct = struct.Struct(head_fmt)
class Header(Printable):
specials = ('bars',)
def __init__(self, symbol_count, symbol_length,
symbol, error_code, bar_count):
self.__dict__.update(locals())
self.bars = []
del self.self
bar_fmt = '>5fQ'
bar_struct = struct.Struct(bar_fmt)
class Bar(Printable):
specials = ('header',)
def __init__(self, header, close, high, low,
open, volume, timestamp):
self.__dict__.update(locals())
self.header.bars.append(self)
del self.self
self.timestamp /= 1000.0
self.timestamp = datetime.date.fromtimestamp(self.timestamp)
def showdata(data):
terminator = '\xff' * 2
assert data[-2:] == terminator
head_data = head_struct.unpack(data[:head_struct.size])
try:
assert head_data[4] * bar_struct.size + head_struct.size == \
len(data) - len(terminator)
except AssertionError:
print 'data length is %d' % len(data)
print 'head struct size is %d' % head_struct.size
print 'bar struct size is %d' % bar_struct.size
print 'number of bars is %d' % head_data[4]
print 'head data:', head_data
print 'terminator:', terminator
print 'so, something is wrong, since',
print head_data[4] * bar_struct.size + head_struct.size, '!=',
print len(data) - len(terminator)
raise
head = Header(*head_data)
for i in range(head.bar_count):
bar_substr = data[head_struct.size + i * bar_struct.size:
head_struct.size + (i+1) * bar_struct.size]
bar_data = bar_struct.unpack(bar_substr)
Bar(head, *bar_data)
assert len(head.bars) == head.bar_count
print head
for i, x in enumerate(head.bars):
print 'Bar #%s' % i
print x
datas = '''
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
'''
data = ''.join(chr(int(x, 16)) for x in datas.split())
showdata(data)
this emits:
这发出:
symbol_count 1
bar_count 2
symbol AMTD
error_code 0
symbol_length 4
Bar #0
volume 36806.078125
timestamp 2007-11-22
high 19.1700000763
low 18.0300006866
close 18.8999996185
open 18.2299995422
Bar #1
volume 29463.25
timestamp 2007-11-25
high 18.8899993896
low 18.0599994659
close 18.4629001617
open 18.8199901581
...which seems to be pretty close to what you want, net of some output formatting details. Hope this helps!-)
...这似乎与您想要的非常接近,除去一些输出格式的细节。希望这可以帮助!-)
回答by jfs
>>> data
'\x00\x00\x00\x01\x00\x04AMTD\x00\x00\x00\x00\x02A\x9733A\x99\)A\x90=qA\x91\xd7\nG\x0f\xc6\x14\x00\x00\x01\x16j\xe0h\x80A\x93\xb4\x05A\x97\x1e\xb8A\x90z\xe1A\x96\x8fWF\xe6.\x80\x00\x00\x01\x16zS|\x80\xff\xff'
>>> from struct import unpack, calcsize
>>> scount, slength = unpack("!IH", data[:6])
>>> assert scount == 1
>>> symbol, error_code = unpack("!%dsb" % slength, data[6:6+slength+1])
>>> assert error_code == 0
>>> symbol
'AMTD'
>>> bar_count = unpack("!I", data[6+slength+1:6+slength+1+4])
>>> bar_count
(2,)
>>> bar_format = "!5fQ"
>>> from collections import namedtuple
>>> Bar = namedtuple("Bar", "Close High Low Open Volume Timestamp")
>>> b = Bar(*unpack(bar_format, data[6+slength+1+4:6+slength+1+4+calcsize(bar_format)]))
>>> b
Bar(Close=18.899999618530273, High=19.170000076293945, Low=18.030000686645508, Open=18.229999542236328, Volume=36806.078125, Timestamp=1195794000000L)
>>> import time
>>> time.ctime(b.Timestamp//1000)
'Fri Nov 23 08:00:00 2007'
>>> int(b.Volume*100 + 0.5)
3680608
回答by mhawke
>>> struct.unpack('ih', response.read(6)) (16777216, 1024)
>>> struct.unpack('ih', response.read(6)) (16777216, 1024)
You are unpacking big-endian data on a little-endian machine. Try this instead:
您正在小端机器上解包大端数据。试试这个:
>>> struct.unpack('!IH', response.read(6))
(1L, 4)
This tells unpack to consider the data in network-order (big-endian). Also, the values of counts and lengths can not be negative, so you should should use the unsigned variants in your format string.
这告诉解包以网络顺序(大端)考虑数据。此外,计数和长度的值不能为负,因此您应该在格式字符串中使用无符号变体。
回答by monkut
Take a look at the struct.unpackin the struct module.
看一下struct 模块中的struct.unpack。
回答by monkut
Use pack/unpack functions from "struct" package. More info here http://docs.python.org/library/struct.html
使用“struct”包中的打包/解包函数。更多信息在这里http://docs.python.org/library/struct.html
Bye!
再见!
回答by Andrey Vlasovskikh
As it was already mentioned, struct
is the module you need to use.
正如已经提到的,struct
是您需要使用的模块。
Please read its documentation to learn about byte ordering, etc.
请阅读其文档以了解字节顺序等。
In your example you need to do the following (as your data is big-endian and unsigned):
在您的示例中,您需要执行以下操作(因为您的数据是 big-endian 且未签名):
>>> import struct
>>> x = '\x00\x00\x00\x01\x00\x04'
>>> struct.unpack('>IH', x)
(1, 4)