Python: reading binary data

Question

Asked by Hyman

A urllib2 request receives binary response as below:

00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF

Its structure is:

DATA, TYPE, DESCRIPTION

00 00 00 01, 4 bytes, Symbol Count = 1

00 04, 2 bytes, Symbol Length = 4

41 4D 54 44, 4 bytes, Symbol = AMTD

00, 1 byte, Error code = 0 (OK)

00 00 00 02, 4 bytes, Bar Count = 2

FIRST BAR

41 97 33 33, 4 bytes, Close = 18.90

41 99 5C 29, 4 bytes, High = 19.17

41 90 3D 71, 4 bytes, Low = 18.03

41 91 D7 0A, 4 bytes, Open = 18.23

47 0F C6 14, 4 bytes, Volume = 3,680,608

00 00 01 16 6A E0 68 80, 8 bytes, Timestamp = November 23, 2007

SECOND BAR

41 93 B4 05, 4 bytes, Close = 18.4629

41 97 1E B8, 4 bytes, High = 18.89

41 90 7A E1, 4 bytes, Low = 18.06

41 96 8F 57, 4 bytes, Open = 18.82

46 E6 2E 80, 4 bytes, Volume = 2,946,325

00 00 01 16 7A 53 7C 80, 8 bytes, Timestamp = November 26, 2007

TERMINATOR

FF FF, 2 bytes,

How can I read binary data like this?

Thanks in advance.

Update:

I tried the struct module on the first 6 bytes with the following code:

>>> struct.unpack('ih', response.read(6))
(16777216, 1024)

But it should output (1, 4). I took a look at the manual but have no clue what went wrong.

Answer 1

Answered by Alex Martelli

So here's my best shot at interpreting the data you're giving...:

import datetime
import struct

class Printable(object):
  specials = ()
  def __str__(self):
    resultlines = []
    for pair in self.__dict__.items():
      if pair[0] in self.specials: continue
      resultlines.append('%10s %s' % pair)
    return '\n'.join(resultlines)

# 4-byte symbol and 4-byte bar count, matching the hex dump in the question
head_fmt = '>IH4sBI'
head_struct = struct.Struct(head_fmt)
class Header(Printable):
  specials = ('bars',)
  def __init__(self, symbol_count, symbol_length,
               symbol, error_code, bar_count):
    self.__dict__.update(locals())
    self.bars = []
    del self.self

bar_fmt = '>5fQ'
bar_struct = struct.Struct(bar_fmt)
class Bar(Printable):
  specials = ('header',)
  def __init__(self, header, close, high, low,
               open, volume, timestamp):
    self.__dict__.update(locals())
    self.header.bars.append(self)
    del self.self
    self.timestamp /= 1000.0
    self.timestamp = datetime.date.fromtimestamp(self.timestamp)

def showdata(data):
  terminator = '\xff' * 2
  assert data[-2:] == terminator
  head_data = head_struct.unpack(data[:head_struct.size])
  try:
    assert head_data[4] * bar_struct.size + head_struct.size == \
           len(data) - len(terminator)
  except AssertionError:
    print 'data length is %d' % len(data)
    print 'head struct size is %d' % head_struct.size
    print 'bar struct size is %d' % bar_struct.size
    print 'number of bars is %d' % head_data[4]
    print 'head data:', head_data
    print 'terminator:', terminator
    print 'so, something is wrong, since',
    print head_data[4] * bar_struct.size + head_struct.size, '!=',
    print len(data) - len(terminator)
    raise

  head = Header(*head_data)
  for i in range(head.bar_count):
    bar_substr = data[head_struct.size + i * bar_struct.size:
                      head_struct.size + (i+1) * bar_struct.size]
    bar_data = bar_struct.unpack(bar_substr)
    Bar(head, *bar_data)
  assert len(head.bars) == head.bar_count
  print head
  for i, x in enumerate(head.bars):
    print 'Bar #%s' % i
    print x

datas = '''
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
'''

data = ''.join(chr(int(x, 16)) for x in datas.split())
showdata(data)

This emits:

symbol_count 1
 bar_count 2
    symbol AMTD
error_code 0
symbol_length 4
Bar #0
    volume 36806.078125
 timestamp 2007-11-22
      high 19.1700000763
       low 18.0300006866
     close 18.8999996185
      open 18.2299995422
Bar #1
    volume 29463.25
 timestamp 2007-11-25
      high 18.8899993896
       low 18.0599994659
     close 18.4629001617
      open 18.8199901581

...which seems to be pretty close to what you want, net of some output formatting details. Hope this helps!-)
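
The two differences from the expected values (volume smaller by a factor of 100, dates one day early) can probably be reconciled in post-processing. A minimal sketch in Python 3, assuming the volume field is stored in hundreds of shares and the timestamp is milliseconds since the Unix epoch; both are inferences from the values in the question, not documented facts about the feed. `date.fromtimestamp` converts to the machine's local time zone, which is presumably why the dates above shift, so this sketch converts in UTC instead:

```python
import datetime
import struct

# First bar from the question: five big-endian floats, then an unsigned 64-bit int.
first_bar = bytes.fromhex(
    "41973333 41995C29 41903D71 4191D70A 470FC614 000001166AE06880"
)
close, high, low, open_, volume, timestamp_ms = struct.unpack(">5fQ", first_bar)

# Assumption: volume is expressed in hundreds of shares.
shares = int(round(volume * 100))

# Assumption: timestamp is milliseconds since 1970-01-01 UTC.
when_utc = datetime.datetime.fromtimestamp(
    timestamp_ms / 1000, tz=datetime.timezone.utc
)

print(shares)    # 3680608
print(when_utc)  # 2007-11-23 05:00:00+00:00
```

05:00 UTC corresponds to midnight at UTC-5, so a machine in a zone further west would report the previous calendar date when converting to local time.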

Answer 2

Answered by jfs

>>> data
'\x00\x00\x00\x01\x00\x04AMTD\x00\x00\x00\x00\x02A\x9733A\x99\\)A\x90=qA\x91\xd7\nG\x0f\xc6\x14\x00\x00\x01\x16j\xe0h\x80A\x93\xb4\x05A\x97\x1e\xb8A\x90z\xe1A\x96\x8fWF\xe6.\x80\x00\x00\x01\x16zS|\x80\xff\xff'
>>> from struct import unpack, calcsize
>>> scount, slength = unpack("!IH", data[:6])
>>> assert scount == 1
>>> symbol, error_code = unpack("!%dsb" % slength, data[6:6+slength+1])
>>> assert error_code == 0
>>> symbol
'AMTD'
>>> bar_count = unpack("!I", data[6+slength+1:6+slength+1+4])
>>> bar_count
(2,)
>>> bar_format = "!5fQ"
>>> from collections import namedtuple
>>> Bar = namedtuple("Bar", "Close High Low Open Volume Timestamp")
>>> b = Bar(*unpack(bar_format, data[6+slength+1+4:6+slength+1+4+calcsize(bar_format)]))
>>> b
Bar(Close=18.899999618530273, High=19.170000076293945, Low=18.030000686645508, Open=18.229999542236328, Volume=36806.078125, Timestamp=1195794000000L)
>>> import time
>>> time.ctime(b.Timestamp//1000)
'Fri Nov 23 08:00:00 2007'
>>> int(b.Volume*100 + 0.5)
3680608
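
The session above decodes only the first bar. The same steps can be folded into one function that walks the whole buffer with `struct.unpack_from` and an offset. This is a sketch in Python 3 under the assumption that the response carries exactly one symbol, as in the question; `parse_response` and the field names are illustrative, and the layout for several symbols is not described in the thread:

```python
import struct
from collections import namedtuple

Bar = namedtuple("Bar", "close high low open volume timestamp_ms")
BAR = struct.Struct(">5fQ")  # five floats + unsigned 64-bit int, big-endian


def parse_response(data):
    """Parse a single-symbol response laid out as in the question."""
    offset = 0
    symbol_count, symbol_length = struct.unpack_from(">IH", data, offset)
    if symbol_count != 1:
        raise ValueError("this sketch only handles one symbol")
    offset += 6
    symbol = data[offset:offset + symbol_length].decode("ascii")
    offset += symbol_length
    error_code, bar_count = struct.unpack_from(">BI", data, offset)
    offset += 5
    bars = []
    for _ in range(bar_count):
        bars.append(Bar(*BAR.unpack_from(data, offset)))
        offset += BAR.size
    if data[offset:offset + 2] != b"\xff\xff":
        raise ValueError("missing FF FF terminator")
    return symbol, error_code, bars


hexdump = """
00 00 00 01 00 04 41 4D 54 44 00 00 00 00 02 41
97 33 33 41 99 5C 29 41 90 3D 71 41 91 D7 0A 47
0F C6 14 00 00 01 16 6A E0 68 80 41 93 B4 05 41
97 1E B8 41 90 7A E1 41 96 8F 57 46 E6 2E 80 00
00 01 16 7A 53 7C 80 FF FF
"""
raw = bytes.fromhex("".join(hexdump.split()))

symbol, error_code, bars = parse_response(raw)
print(symbol, error_code, len(bars))  # AMTD 0 2
```

Reading the symbol length from the data, rather than hard-coding it in the format string, keeps the parser valid for symbols that are not four characters long.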

Answer 3

Answered by mhawke

>>> struct.unpack('ih', response.read(6))
(16777216, 1024)

You are unpacking big-endian data on a little-endian machine. Try this instead:

>>> struct.unpack('!IH', response.read(6))
(1L, 4)

This tells unpack to interpret the data in network order (big-endian). Also, counts and lengths cannot be negative, so you should use the unsigned variants in your format string.
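
The effect of the byte-order prefix can be seen by unpacking the same six bytes in different ways. A small Python 3 sketch; it uses an explicit `<` prefix to reproduce what the unprefixed `'ih'` does on a little-endian machine, so the result does not depend on the hardware it runs on:

```python
import struct

first_six = bytes([0x00, 0x00, 0x00, 0x01, 0x00, 0x04])

# Little-endian, signed: the bytes are read least-significant first.
little = struct.unpack("<ih", first_six)

# Big-endian, unsigned: the interpretation the data format calls for.
big = struct.unpack(">IH", first_six)

# "!" (network order) is an alias for big-endian.
network = struct.unpack("!IH", first_six)

print(little)   # (16777216, 1024)
print(big)      # (1, 4)
print(network)  # (1, 4)
```

16777216 is 0x01000000 and 1024 is 0x0400, that is, the intended values 1 and 4 with their bytes reversed.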

Answer 4

Answered by monkut

Take a look at struct.unpack in the struct module.

Answer 5

Answered by monkut

Use the pack/unpack functions from the "struct" module. More info here: http://docs.python.org/library/struct.html

Bye!

Answer 6

Answered by Andrey Vlasovskikh

As was already mentioned, struct is the module you need to use.

Please read its documentation to learn about byte ordering, etc.

In your example you need to do the following (as your data is big-endian and unsigned):

>>> import struct
>>> x = '\x00\x00\x00\x01\x00\x04'
>>> struct.unpack('>IH', x)
(1, 4)
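
Since the question reads from a response object rather than a string, the same format codes can be applied to a stream, reading exactly as many bytes as each field needs. A hedged sketch in Python 3: `io.BytesIO` stands in for the HTTP response, and `read_exact` and `read_header` are illustrative helper names, not part of any library:

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or fail, since read() may return fewer."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("expected %d bytes, got %d" % (n, len(data)))
    return data


def read_header(stream):
    """Read the header fields in the order given in the question."""
    symbol_count, symbol_length = struct.unpack(">IH", read_exact(stream, 6))
    symbol = read_exact(stream, symbol_length).decode("ascii")
    error_code, bar_count = struct.unpack(">BI", read_exact(stream, 5))
    return symbol_count, symbol, error_code, bar_count


# Header bytes from the question: count, length, symbol, error code, bar count.
stream = io.BytesIO(bytes.fromhex("00000001 0004 414D5444 00 00000002"))
header = read_header(stream)
print(header)  # (1, 'AMTD', 0, 2)
```

After the header, each bar could be read the same way with `read_exact(stream, struct.calcsize(">5fQ"))`.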
