Python: Convert an int value to unicode
Original source: http://stackoverflow.com/questions/17627834/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow
Convert an int value to unicode
Asked by user2578666
I am using pyserial and need to send some values less than 255. If I send the int itself, the ASCII value of the int gets sent. So now I am converting the int into a unicode value and sending it through the serial port.
unichr(numlessthan255);
However it throws this error:
'ascii' codec can't encode character u'\x9a' in position 24: ordinal not in range(128)
What's the best way to convert an int to unicode?
Accepted answer by Steve Barnes
Just use chr(somenumber) to get a 1-byte value of an int, as long as it is less than 256. pySerial will then send it fine.
If you are sending things over pySerial, it is a very good idea to look at the struct module in the standard library: it handles endianness and packing issues, as well as encoding, for just about every data type you are likely to need that is 1 byte or larger.
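As a minimal sketch of what the struct module does with packing and byte order (written for Python 3, where the result is a bytes object; the message layout, values, and variable names are made up for illustration):

```python
import struct

# Hypothetical message: a 1-byte command followed by a 2-byte reading.
command = 7
reading = 1000  # 0x03E8

# '<' = little-endian, '>' = big-endian; both use standard sizes, no padding.
# 'B' = unsigned 1-byte integer, 'H' = unsigned 2-byte integer.
little = struct.pack('<BH', command, reading)
big = struct.pack('>BH', command, reading)

# calcsize reports how many bytes a format string occupies.
size = struct.calcsize('<BH')

print(little, big, size)
```

The only difference between the two results is the order of the two bytes that make up the reading; the command byte is unaffected by byte order.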
Answer by Martijn Pieters
Use the chr() function instead; you are sending a value of less than 256 but more than 128, yet you are creating a Unicode character.
The Unicode character then has to be encoded to get a byte character, and that encoding fails because you are using a value outside the ASCII range (0-127):
>>> str(unichr(169))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 0: ordinal not in range(128)
This is normal Python 2 behaviour; when trying to convert a unicode string to a byte string, an implicit encoding has to take place and the default encoding is ASCII.
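The same failure can be reproduced in Python 3 by making the encoding step explicit. This is only a sketch, and it assumes Python 3, where chr() plays the role of Python 2's unichr():

```python
# Python 3 sketch: str is a Unicode string, so chr(169) here is the
# counterpart of Python 2's unichr(169).
char = chr(169)

try:
    char.encode('ascii')      # the step Python 2 attempted implicitly
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True       # 169 is outside the ASCII range 0-127

# Latin-1 maps each code point 0-255 to the single byte with the same value.
as_byte = char.encode('latin-1')

print(ascii_failed, as_byte)
```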
If you were to use chr() instead, you would create a byte string of one character, and that implicit encoding would not have to take place:
>>> str(chr(169))
'\xa9'
Another method you may want to look into is the struct module, especially if you need to send integer values greater than 255:
>>> import struct
>>> struct.pack('!H', 1000)
'\x03\xe8'
The above example packs an integer into an unsigned short in network byte order.
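On the receiving side, the same format string can be used to recover the integer. A small Python 3 sketch (the value 70000 is picked only to show what happens when a number does not fit the format):

```python
import struct

packed = struct.pack('!H', 1000)          # network (big-endian) unsigned short
(value,) = struct.unpack('!H', packed)    # unpack always returns a tuple

# 'H' holds 0..65535; anything larger is rejected rather than truncated.
try:
    struct.pack('!H', 70000)
    out_of_range = False
except struct.error:
    out_of_range = True

print(packed, value, out_of_range)
```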
Answer by Eric O Lebigot
I think that the best solution is to be explicit and say that you want to represent a number as a byte (and not as a character):
>>> import struct
>>> struct.pack('B', 128)
'\x80'
This makes your code work in both Python 2 and Python 3 (in Python 3, the result is, as it should be, a bytes object). An alternative, in Python 3, would be to use bytes([128]) to create a single byte of value 128.
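A short Python 3 sketch comparing the two spellings. Note the square brackets: bytes() called with a bare integer creates that many zero bytes instead.

```python
import struct

one_byte = bytes([128])          # a bytes object of length 1
same = struct.pack('B', 128)     # the struct equivalent
several = bytes([1, 2, 255])     # one byte per list element
zeros = bytes(3)                 # bare int: three zero bytes, not the value 3

try:
    bytes([256])                 # each element must be in range(0, 256)
    rejected = False
except ValueError:
    rejected = True

print(one_byte, same, several, zeros, rejected)
```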
I am not a big fan of the chr() solutions: in Python 3, they produce a (character, not byte) string that needs to be encoded before it is sent anywhere (file, socket, terminal, ...); chr() in Python 3 is equivalent to the problematic Python 2 unichr() of the question. The struct solution has the advantage of correctly producing a byte whatever the version of Python. If you want to send data over the serial port with chr(), you need to control the encoding that must subsequently take place. When the encoding used is UTF-8 (which I think is the Python 3 default), the code only works for values below 128, because UTF-8 encodes code points from 128 to 255 as two bytes; mapping every code point below 256 to a single byte requires Latin-1. This adds an unnecessary layer of subtlety and complexity that I do not recommend (it makes the code harder to understand and, if necessary, debug).
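The role of the encoding can be checked directly in Python 3. In this sketch the value 169 is just an example; what comes out of chr() depends entirely on which codec is applied afterwards, whereas struct involves no encoding step:

```python
import struct

n = 169
as_text = chr(n)                      # Python 3: a str, not bytes
utf8 = as_text.encode('utf-8')        # two bytes for 128 <= n <= 255
latin1 = as_text.encode('latin-1')    # one byte for n <= 255
packed = struct.pack('B', n)          # one byte, no encoding step

print(type(as_text).__name__, utf8, latin1, packed)
```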
So, I strongly suggest that you use the approach above (which was also hinted at by Steve Barnes and Martijn Pieters): it makes it clear that you want to produce a byte (and not characters). It will not give you any surprises even if you run your code with Python 3, and it makes your intent clearer and more obvious.
Answer by chasmani
In Python 2: turn it into a string first, then into unicode.
str(integer).decode("utf-8")
This is the best way, I think. It works with any integer, and it still works if you pass a string as the input.
Updated edit due to a comment: for Python 2 and 3, this works on both but is a bit messy:
str(integer).encode("utf-8").decode("utf-8")
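Note that this approach yields the decimal digits of the number as text rather than a single byte with that value, which may not be what the serial-port use case in the question needs. A quick Python 3 check (the value 154 is only an example):

```python
integer = 154

text = str(integer).encode("utf-8").decode("utf-8")  # the text '154'
wire = text.encode("utf-8")    # three bytes: the digits '1', '5', '4'
single = bytes([integer])      # one byte with the value 154

print(repr(text), wire, single)
```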