Converting int to bytes in Python 3

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/21017698/

Time: 2020-08-18 21:48:38  Source: igfitidea

Tags: python, python-3.x

Asked by astrojuanlu

I was trying to build this bytes object in Python 3:

b'3\r\n'

so I tried the obvious (for me), and found a weird behaviour:

>>> bytes(3) + b'\r\n'
b'\x00\x00\x00\r\n'

Apparently:

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

I've been unable to find any pointers in the documentation on why the bytes conversion works this way. However, I did find some surprised messages in this Python issue about adding format to bytes (see also Python 3 bytes formatting):

http://bugs.python.org/issue3982

This interacts even more poorly with oddities like bytes(int) returning zeroes now

and:

It would be much more convenient for me if bytes(int) returned the ASCIIfication of that int; but honestly, even an error would be better than this behavior. (If I wanted this behavior - which I never have - I'd rather it be a classmethod, invoked like "bytes.zeroes(n)".)

Can someone explain to me where this behaviour comes from?

Accepted answer by Tim Pietzcker

That's the way it was designed, and it makes sense because usually you would call bytes on an iterable instead of a single integer:

>>> bytes([3])
b'\x03'

The docs state this, as does the docstring for bytes:

 >>> help(bytes)
 ...
 bytes(int) -> bytes object of size given by the parameter initialized with null bytes
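
As a minimal sketch of the two constructor forms discussed in this answer (the variable names are illustrative and not from the original), an int argument is read as a length while an iterable of ints is read as byte values:

```python
n = 3

# An int argument is treated as a length: n zero bytes.
zero_filled = bytes(n)

# An iterable of ints is treated as a sequence of byte values.
single_byte = bytes([n])

print(zero_filled)  # b'\x00\x00\x00'
print(single_byte)  # b'\x03'

# Each element of the iterable must fit in one byte (0..255).
try:
    bytes([256])
except ValueError as exc:
    print('rejected:', exc)
```

Neither form yields b'3'; that requires encoding the decimal digits, which later answers cover.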

Answer by alko

From the bytes docs:

Accordingly, constructor arguments are interpreted as for bytearray().

Then, from the bytearray docs:

The optional source parameter can be used to initialize the array in a few different ways:

  • If it is an integer, the array will have that size and will be initialized with null bytes.

Note that this differs from the 2.x (where x >= 6) behavior, where bytes is simply str:

>>> bytes is str
True

PEP 3112:

The 2.6 str differs from 3.0's bytes type in various ways; most notably, the constructor is completely different.

Answer by freakish

The behaviour comes from the fact that in Python prior to version 3, bytes was just an alias for str. In Python 3.x, bytes is an immutable version of bytearray: a completely new type, not backwards compatible.

Answer by Schcriher

The documentation says:

bytes(int) -> bytes object of size given by the parameter
              initialized with null bytes

The sequence:

b'3\r\n'

It is the character '3' (decimal 51), the character '\r' (13) and the character '\n' (10).

Therefore, the way to build it is to treat it as such, for example:

>>> bytes([51, 13, 10])
b'3\r\n'

>>> bytes('3', 'utf8') + b'\r\n'
b'3\r\n'

>>> n = 3
>>> bytes(str(n), 'ascii') + b'\r\n'
b'3\r\n'

Tested on IPython 1.1.0 & Python 3.2.3
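
The last variant above can be wrapped in a small helper. This is only a sketch: ascii_line is a hypothetical name that does not appear in the original answer.

```python
def ascii_line(n: int) -> bytes:
    """Return the decimal digits of n as ASCII bytes, followed by CRLF.

    ascii_line is a hypothetical helper name used for illustration.
    """
    return str(n).encode('ascii') + b'\r\n'


line = ascii_line(3)
print(line)        # b'3\r\n'
print(list(line))  # [51, 13, 10]
```

Going through str(n) also handles multi-digit and negative numbers, since the sign and every digit are plain ASCII characters.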

Answer by Andy Hayden

You can use the struct module's pack:

In [11]: struct.pack(">I", 1)
Out[11]: '\x00\x00\x00\x01'

The ">" is the byte order (big-endian) and the "I" is the format character. So you can be specific if you want to do something else:

In [12]: struct.pack("<H", 1)
Out[12]: '\x01\x00'

In [13]: struct.pack("B", 1)
Out[13]: '\x01'

This works the same on both Python 2 and Python 3.

Note: the inverse operation (bytes to int) can be done with unpack.
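
A short round-trip sketch of pack and unpack on Python 3 follows. On Python 3 the packed result is a bytes object, so its repr carries a b prefix, unlike the outputs shown above, which appear to come from Python 2. The variable names are illustrative.

```python
import struct

packed = struct.pack('>I', 1)            # 4-byte unsigned int, big-endian
(value,) = struct.unpack('>I', packed)   # unpack always returns a tuple

print(packed)  # b'\x00\x00\x00\x01'
print(value)   # 1

# A value that does not fit the chosen format raises struct.error.
try:
    struct.pack('B', 256)
except struct.error as exc:
    print('rejected:', exc)
```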

Answer by brunsgaard

From Python 3.2 you can do:

>>> (1024).to_bytes(2, byteorder='big')
b'\x04\x00'

https://docs.python.org/3/library/stdtypes.html#int.to_bytes

def int_to_bytes(x: int) -> bytes:
    return x.to_bytes((x.bit_length() + 7) // 8, 'big')

def int_from_bytes(xbytes: bytes) -> int:
    return int.from_bytes(xbytes, 'big')

Accordingly, x == int_from_bytes(int_to_bytes(x)). Note that this encoding works only for unsigned (non-negative) integers.
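
If negative values are needed as well, one possible variant (a sketch, not part of the original answer; the function names are hypothetical) reserves room for a sign bit and passes signed=True, which int.to_bytes and int.from_bytes both accept:

```python
def signed_int_to_bytes(x: int) -> bytes:
    # Reserve one extra bit for the sign. This also guarantees at least
    # one byte, so 0 becomes b'\x00' rather than b''.
    length = (x.bit_length() + 8) // 8
    return x.to_bytes(length, 'big', signed=True)


def signed_int_from_bytes(xbytes: bytes) -> int:
    return int.from_bytes(xbytes, 'big', signed=True)


for x in (-129, -128, -1, 0, 1, 127, 128, 1024):
    assert signed_int_from_bytes(signed_int_to_bytes(x)) == x

print(signed_int_to_bytes(-1))   # b'\xff'
print(signed_int_to_bytes(128))  # b'\x00\x80'
```

With the unsigned int_to_bytes above, 0 encodes to the empty bytes object b'' and a negative argument raises OverflowError.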

Answer by Bachsau

The ASCIIfication of 3 is "\x33", not "\x03"!

That is what Python does for str(3), but it would be totally wrong for bytes, as they should be considered arrays of binary data and not be abused as strings.

The easiest way to achieve what you want is bytes((3,)), which is better than bytes([3]) because initializing a list is much more expensive, so never use lists when you can use tuples. You can convert bigger integers by using int.to_bytes(3, 1, "little"), where the second argument is the number of bytes to produce.

Initializing bytes with a given length makes sense and is the most useful, as they are often used to create some type of buffer for which you need some memory of given size allocated. I often use this when initializing arrays or expanding some file by writing zeros to it.
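
As an illustration of the buffer use case described above (a sketch with made-up sizes and offsets), bytearray accepts the same int argument as bytes but stays mutable, so it can be filled in place:

```python
import struct

# A zero-filled, fixed-size buffer of 8 bytes.
buffer = bytearray(8)

# Write a 4-byte big-endian unsigned int at offset 4, in place.
struct.pack_into('>I', buffer, 4, 1024)

print(bytes(buffer))  # b'\x00\x00\x00\x00\x00\x00\x04\x00'
```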

Answer by jfs

Python 3.5+ introduces %-interpolation (printf-style formatting) for bytes:

>>> b'%d\r\n' % 3
b'3\r\n'

See PEP 0461 -- Adding % formatting to bytes and bytearray.
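
A few more %-formatting forms on bytes are sketched below, assuming Python 3.5 or later; the header name and values are made up for illustration.

```python
n = 3

line = b'%d\r\n' % n                                # decimal digits as ASCII
header = b'%s: %d\r\n' % (b'Content-Length', 1024)  # %s takes bytes-like objects
padded = b'%04d' % n                                # zero-padded to width 4

print(line)    # b'3\r\n'
print(header)  # b'Content-Length: 1024\r\n'
print(padded)  # b'0003'
```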

On earlier versions, you could use str and then .encode('ascii') the result:

>>> s = '%d\r\n' % 3
>>> s.encode('ascii')
b'3\r\n'

Note: this is different from what int.to_bytes produces:

>>> n = 3
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big') or b'\0'
b'\x03'
>>> b'3' == b'\x33' != b'\x03'
True

Answer by renskiy

int (including Python 2's long) can be converted to bytes using the following function:

import codecs

def int2bytes(i):
    hex_value = '{0:x}'.format(i)
    # make length of hex_value a multiple of two
    hex_value = '0' * (len(hex_value) % 2) + hex_value
    return codecs.decode(hex_value, 'hex_codec')

The reverse conversion can be done by another one:

import codecs
import six  # should be installed via 'pip install six'

long = six.integer_types[-1]

def bytes2int(b):
    return long(codecs.encode(b, 'hex_codec'), 16)

Both functions work on both Python 2 and Python 3.

Answer by Graham

I was curious about the performance of various methods for a single int in the range [0, 255], so I decided to do some timing tests.

Based on the timings below, and on the general trend I observed from trying many different values and configurations, struct.pack seems to be the fastest, followed by int.to_bytes, bytes, and with str.encode (unsurprisingly) being the slowest. Note that the results show some more variation than is represented, and int.to_bytes and bytes sometimes switched speed ranking during testing, but struct.pack is clearly the fastest.

Results in CPython 3.7 on Windows:

Testing with 63:
bytes_: 100000 loops, best of 5: 3.3 usec per loop
to_bytes: 100000 loops, best of 5: 2.72 usec per loop
struct_pack: 100000 loops, best of 5: 2.32 usec per loop
chr_encode: 50000 loops, best of 5: 3.66 usec per loop

Test module (named int_to_byte.py):

"""Functions for converting a single int to a bytes object with that int's value."""

import random
import shlex
import struct
import timeit

def bytes_(i):
    """From Tim Pietzcker's answer:
    https://stackoverflow.com/a/21017834/8117067
    """
    return bytes([i])

def to_bytes(i):
    """From brunsgaard's answer:
    https://stackoverflow.com/a/30375198/8117067
    """
    return i.to_bytes(1, byteorder='big')

def struct_pack(i):
    """From Andy Hayden's answer:
    https://stackoverflow.com/a/26920966/8117067
    """
    return struct.pack('B', i)

# Originally, jfs's answer was considered for testing,
# but the result is not identical to the other methods
# https://stackoverflow.com/a/31761722/8117067

def chr_encode(i):
    """Another method, from Quuxplusone's answer here:
    https://codereview.stackexchange.com/a/210789/140921

    Similar to g10guang's answer:
    https://stackoverflow.com/a/51558790/8117067
    """
    return chr(i).encode('latin1')

converters = [bytes_, to_bytes, struct_pack, chr_encode]

def one_byte_equality_test():
    """Test that results are identical for ints in the range [0, 255]."""
    for i in range(256):
        results = [c(i) for c in converters]
        # Test that all results are equal
        start = results[0]
        if any(start != b for b in results):
            raise ValueError(results)

def timing_tests(value=None):
    """Test each of the functions with a random int."""
    if value is None:
        # random.randint takes more time than int to byte conversion
        # so it can't be a part of the timeit call
        value = random.randint(0, 255)
    print(f'Testing with {value}:')
    for c in converters:
        print(f'{c.__name__}: ', end='')
        # Uses technique borrowed from https://stackoverflow.com/q/19062202/8117067
        timeit.main(args=shlex.split(
            f"-s 'from int_to_byte import {c.__name__}; value = {value}' " +
            f"'{c.__name__}(value)'"
        ))
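
For a quick check without a separate module file, the same four conversions can be timed in-process with timeit.repeat. This is a rough sketch rather than the author's setup: the lambdas add call overhead of their own, the loop count is an arbitrary choice, and absolute numbers will differ by machine and Python version.

```python
import struct
import timeit

converters = {
    'bytes_': lambda i: bytes([i]),
    'to_bytes': lambda i: i.to_bytes(1, byteorder='big'),
    'struct_pack': lambda i: struct.pack('B', i),
    'chr_encode': lambda i: chr(i).encode('latin1'),
}

value = 63
loops = 10_000  # arbitrary; raise it for steadier numbers
timings = {}
for name, convert in converters.items():
    # Best of 5 repeats, reported per call in microseconds.
    best = min(timeit.repeat(lambda: convert(value), number=loops, repeat=5))
    timings[name] = best / loops * 1e6
    print(f'{name}: {timings[name]:.3f} usec per loop')
```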