Convert string to binary in Python
Original URL: http://stackoverflow.com/questions/18815820/
Warning: this page is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by user1090614
I need a way to get the binary representation of a string in Python, e.g.
st = "hello world"
toBinary(st)
Is there a module or some neat way of doing this?
Accepted answer by Ashwini Chaudhary
Something like this?
>>> st = "hello world"
>>> ' '.join(format(ord(x), 'b') for x in st)
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'
#using `bytearray`
>>> ' '.join(format(x, 'b') for x in bytearray(st, 'utf-8'))
'1101000 1100101 1101100 1101100 1101111 100000 1110111 1101111 1110010 1101100 1100100'
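The question asks for a toBinary(st) helper, so here is a minimal sketch of how the one-liners above could be wrapped in a function. It assumes Python 3; the name to_binary and its encoding and width parameters are my own choices for illustration, not part of the original answer.

```python
def to_binary(text, encoding="utf-8", width=8):
    """Return the bytes of `text` as space-separated binary strings."""
    # Encode first so that non-ASCII characters become one or more bytes.
    data = text.encode(encoding)
    # Zero-pad each byte to `width` digits so every group has the same length.
    return " ".join(format(byte, "0{}b".format(width)) for byte in data)


print(to_binary("hello world"))
```

Padding to 8 digits is optional, but it keeps the groups aligned (the space character would otherwise print as 100000).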
Answer by Mark R. Wilkins
You can access the code values for the characters in your string using the ord() built-in function. If you then need to format this in binary, the format() built-in or the str.format() method will do the job.
a = "test"
print(' '.join(format(ord(x), 'b') for x in a))
(Thanks to Ashwini Chaudhary for posting that code snippet.)
While the above code works in Python 3, this matter gets more complicated if you're assuming any encoding other than UTF-8. In Python 2, strings are byte sequences, and ASCII encoding is assumed by default. In Python 3, strings are assumed to be Unicode, and there's a separate bytes type that acts more like a Python 2 string. If you wish to assume any encoding other than UTF-8, you'll need to specify the encoding.
In Python 3, then, you can do something like this:
a = "test"
a_bytes = bytes(a, "ascii")
print(' '.join(["{0:b}".format(x) for x in a_bytes]))
The differences between UTF-8 and ASCII encoding won't be obvious for simple alphanumeric strings, but will become important if you're processing text that includes characters not in the ASCII character set.
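To make that difference concrete, here is a small sketch (assuming Python 3) using a single accented character. The sample character and variable names are only for illustration.

```python
text = "\u00e9"  # "e" with an acute accent, outside the ASCII range

# ord() works on code points: one number per character.
code_points = [format(ord(ch), "b") for ch in text]

# Encoding to UTF-8 works on bytes: this character becomes two bytes.
utf8_bytes = [format(b, "b") for b in text.encode("utf-8")]

print(code_points)
print(utf8_bytes)

# The ASCII codec cannot represent the character at all.
ascii_failed = False
try:
    bytes(text, "ascii")
except UnicodeEncodeError:
    ascii_failed = True
print("ascii failed:", ascii_failed)
```

So the ord() approach and the bytes approach only agree while every character fits in a single byte of the chosen encoding.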
Answer by Kasramvd
As a more Pythonic way, you can first convert your string to a byte array and then use the bin function within map:
>>> st = "hello world"
>>> map(bin,bytearray(st))
['0b1101000', '0b1100101', '0b1101100', '0b1101100', '0b1101111', '0b100000', '0b1110111', '0b1101111', '0b1110010', '0b1101100', '0b1100100']
Or you can join it:
>>> ' '.join(map(bin,bytearray(st)))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'
Note that in Python 3 you need to specify an encoding for the bytearray function:
>>> ' '.join(map(bin,bytearray(st,'utf8')))
'0b1101000 0b1100101 0b1101100 0b1101100 0b1101111 0b100000 0b1110111 0b1101111 0b1110010 0b1101100 0b1100100'
You can also use the binascii module in Python 2:
>>> import binascii
>>> bin(int(binascii.hexlify(st),16))
'0b110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'
hexlify returns the hexadecimal representation of the binary data. You can then convert it to an int by specifying 16 as its base, and convert that to binary with bin.
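For Python 3, a rough equivalent might look like the sketch below. The main assumption is that the string has to be encoded to bytes first, since binascii.hexlify does not accept str there. The variable names are mine.

```python
import binascii

st = "hello world"
data = st.encode("utf-8")  # hexlify needs bytes in Python 3

as_int = int(binascii.hexlify(data), 16)
bits = bin(as_int)  # one long string with a '0b' prefix

# bin() drops leading zeros, so pad back to 8 bits per byte if needed.
padded = format(as_int, "0{}b".format(8 * len(data)))

# int.from_bytes gives the same integer without going through hex.
assert as_int == int.from_bytes(data, "big")

print(bits)
print(padded)
```

Note that this produces one continuous run of digits rather than space-separated groups, so the padded form is the one to use if you want to split it back into bytes.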
Answer by Billal Begueradj
This is an update for the existing answers which used bytearray() and no longer work that way in Python 3:
>>> st = "hello world"
>>> map(bin, bytearray(st))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding
Because, as explained in the link above, if the source is a string, you must also give the encoding:
>>> map(bin, bytearray(st, encoding='utf-8'))
<map object at 0x7f14dfb1ff28>
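In Python 3, map returns a lazy iterator, which is why only the map object is shown. A short sketch of two ways to get at the values (variable names are just for illustration):

```python
st = "hello world"

# map() is lazy in Python 3; list() forces it to produce the values.
as_list = list(map(bin, bytearray(st, encoding="utf-8")))
print(as_list)

# str.join consumes the iterator directly, so no list() is needed here.
joined = " ".join(map(bin, bytearray(st, encoding="utf-8")))
print(joined)
```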
Answer by Ben
def method_a(sample_string):
    # Format each character's code point as binary digits.
    binary = ' '.join(format(ord(x), 'b') for x in sample_string)
    return binary

def method_b(sample_string):
    # Encode to bytes, then let bin() convert each byte (keeps the '0b' prefix).
    binary = ' '.join(map(bin, bytearray(sample_string, encoding='utf-8')))
    return binary

if __name__ == '__main__':
    from timeit import timeit

    sample_string = 'Convert this ascii string to binary.'

    print(
        timeit(f'method_a("{sample_string}")', setup='from __main__ import method_a'),
        timeit(f'method_b("{sample_string}")', setup='from __main__ import method_b')
    )

# 9.564299999998184 2.943955828988692
method_b is substantially more efficient at converting to a byte array because it makes low-level function calls instead of manually transforming every character to an integer and then converting that integer into its binary value. Note that the two outputs are not identical: method_b keeps the 0b prefix that bin adds to each value.
Answer by Tao
We just need to encode it.
'string'.encode('ascii')
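Worth noting that encode gives you a bytes object rather than a string of 0s and 1s. If the digits themselves are needed, one possible follow-up (a sketch assuming Python 3, with names chosen for illustration) is:

```python
encoded = "string".encode("ascii")
print(encoded)        # a bytes object, not a string of 0s and 1s
print(list(encoded))  # iterating over bytes yields integers

# Format each byte as 8 binary digits.
bits = " ".join(format(byte, "08b") for byte in encoded)
print(bits)
```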
Answer by Vlad Bezden
In Python version 3.6 and above you can use an f-string to format the result.
str = "hello world"
print(" ".join(f"{ord(i):08b}" for i in str))
01101000 01100101 01101100 01101100 01101111 00100000 01110111 01101111 01110010 01101100 01100100
The left side of the colon, ord(i), is the actual object whose value will be formatted and inserted into the output. Using ord() gives you the integer code point for a single str character.
The right-hand side of the colon is the format specifier. 08 means width 8, zero-padded, and b is the presentation type that outputs the number in base 2 (binary).
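A few variations on that format specifier, as a sketch (Python 3.6+). The sample values are arbitrary and only meant to show how width and padding behave.

```python
value = ord("h")  # 104

plain = f"{value:b}"        # no padding
padded = f"{value:08b}"     # zero-padded to width 8
prefixed = f"{value:#010b}" # '0b' prefix; width 10 includes the prefix

# Width is a minimum, not a limit: larger code points are not truncated.
euro = ord("\u20ac")        # the euro sign, code point 8364
wide = f"{euro:08b}"

print(plain, padded, prefixed, wide)
```

So with :08b the groups only stay 8 digits wide as long as every code point is below 256; for other text, encoding to bytes first may be the safer route.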
Answer by Solo Ship
a = list(input("Enter a string\t: "))

def fun(a):
    # Left-pad each character's binary digits with zeros to 8 places.
    c = ' '.join(['0'*(8-len(bin(ord(i))[2:]))+(bin(ord(i))[2:]) for i in a])
    return c

print(fun(a))