Converting binary to UTF-8 in Python
Original question: http://stackoverflow.com/questions/19255832/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Aidin.T
I have a binary string like this:
1101100110000110110110011000001011011000101001111101100010101000
and I want to convert it to UTF-8. How can I do this in Python?
Accepted answer by Igonato
Cleaner version:
>>> test_string = '1101100110000110110110011000001011011000101001111101100010101000'
>>> print ('%x' % int(test_string, 2)).decode('hex').decode('utf-8')
نقاب
Inverse (from @Robᵩ's comment):
>>> '{:b}'.format(int(u'نقاب'.encode('utf-8').encode('hex'), 16))
'1101100110000110110110011000001011011000101001111101100010101000'
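The snippets above are Python 2: `str.decode('hex')` and `str.encode('hex')` no longer exist in Python 3. A rough Python 3 sketch of the same idea can go through `int.to_bytes` and `int.from_bytes` instead. The helper names `bits_to_text` and `text_to_bits` are made up for illustration, and the code assumes a non-empty bit string whose length is a multiple of 8.

```python
def bits_to_text(bits, encoding="utf-8"):
    """Decode a string of '0'/'1' characters into text (Python 3)."""
    # Asking for len(bits) // 8 bytes keeps any leading zero bytes,
    # which a plain '%x' conversion would drop.
    raw = int(bits, 2).to_bytes(len(bits) // 8, byteorder="big")
    return raw.decode(encoding)


def text_to_bits(text, encoding="utf-8"):
    """Encode text and render its bytes as one zero-padded bit string."""
    raw = text.encode(encoding)
    width = len(raw) * 8  # pad to a whole number of octets
    return format(int.from_bytes(raw, byteorder="big"), "0{}b".format(width))


bits = "1101100110000110110110011000001011011000101001111101100010101000"
text = bits_to_text(bits)           # the four-character string from the question
print(text_to_bits(text) == bits)   # round trip: True
```

Fixing the byte count and the bit width explicitly is a design choice: it makes the round trip work for ASCII text too, where the first byte starts with a zero bit.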
Answer by Nacib Neme
Use:
def bin2text(s): return "".join([chr(int(s[i:i+8],2)) for i in xrange(0,len(s),8)])
>>> print bin2text("01110100011001010111001101110100")
test
Answer by Paulo Bu
Well, the idea I have is:
1. Split the string into octets
2. Convert each octet to an integer using int, and then to a one-byte character using chr
3. Join them and decode the UTF-8 string into Unicode
This code works for me, but I'm not sure what it prints because my console doesn't support UTF-8 (Windows :P).
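s = '1101100110000110110110011000001011011000101001111101100010101000'
# Split into 8-bit chunks, then turn each chunk into a one-byte str (Python 2)
octets = [s[i:i+8] for i in range(0, len(s), 8)]
u = "".join([chr(int(x, 2)) for x in octets])
# Decode the UTF-8 byte string into a unicode object
d = u.decode('utf-8')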
Hope this helps!
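In Python 3, `chr` returns a Unicode character rather than a byte and `str` has no `decode` method, so the three steps need a `bytes` object in the middle. A minimal sketch under that assumption follows. The function name `decode_octets` is hypothetical, and the input length is assumed to be a multiple of 8.

```python
def decode_octets(bits, encoding="utf-8"):
    # 1. Split the bit string into 8-character chunks (octets)
    octets = [bits[i:i + 8] for i in range(0, len(bits), 8)]
    # 2. Convert each octet to an integer in the range 0..255
    values = [int(octet, 2) for octet in octets]
    # 3. Pack the integers into a bytes object and decode it
    return bytes(values).decode(encoding)


s = "1101100110000110110110011000001011011000101001111101100010101000"
d = decode_octets(s)
print(len(d))  # 4 characters, each encoded as two UTF-8 bytes
```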
Answer by Robᵩ
>>> import re
>>> s='1101100110000110110110011000001011011000101001111101100010101000'
>>> print (''.join([chr(int(x,2)) for x in re.split('(........)', s) if x ])).decode('utf-8')
نقاب
>>>
Or, the inverse:
>>> s=u'نقاب'
>>> ''.join(['{:08b}'.format(ord(x)) for x in s.encode('utf-8')])
'1101100110000110110110011000001011011000101001111101100010101000'
>>>
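One detail worth noting for the per-byte inverse: each byte needs a fixed width of eight bits. Without it, any byte below 0x80 (every ASCII character) loses its leading zeros and the result can no longer be split back into octets. The Python 3 sketch below illustrates the difference. `text_to_octets` is a hypothetical helper name, not something from the answers.

```python
def text_to_octets(text, encoding="utf-8"):
    # Iterating over bytes yields ints in Python 3, so ord() is not needed.
    # "08b" pads every byte to exactly eight binary digits.
    return "".join(format(byte, "08b") for byte in text.encode(encoding))


print(format(ord("t"), "b"))    # 1110100  (only 7 digits, leading zero lost)
print(format(ord("t"), "08b"))  # 01110100
print(text_to_octets("test"))   # 01110100011001010111001101110100
```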