Python UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-5: ordinal not in range(128)

Question

Asked by Serhii Matrunchyk

I'm simply trying to decode a string like \uXXXX\uXXXX\uXXXX, but I get an error:

$ python
Python 2.7.6 (default, Sep  9 2014, 15:04:36) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'\u041e\u043b\u044c\u0433\u0430'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

I'm a Python newbie. What's the problem? Thanks!

Answer 1

Accepted answer by Martijn Pieters

Python is trying to be helpful. You cannot decode Unicode data; it is already decoded. So Python first tries to encode the data (using the ASCII codec) to get bytes that it can then decode. It is this implicit encoding that fails.
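
The implicit step can be made visible by writing it out by hand. The following is a minimal sketch of what Python 2 effectively does when .decode('utf-8') is called on a Unicode object: an ASCII encode first, which is the step that raises. The names text and message are illustrative only, and the snippet sticks to u'...' literals so it should behave the same on Python 2.7 and Python 3.3+.

```python
text = u'\u041e\u043b\u044c\u0433\u0430'

try:
    # Spelled-out version of the implicit conversion:
    # encode to bytes with ASCII first, then decode those bytes as UTF-8.
    text.encode('ascii').decode('utf-8')
except UnicodeEncodeError as exc:
    # The ASCII encode fails before any UTF-8 decoding happens.
    message = str(exc)

print(message)
```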

If you have Unicode data, it only makes sense to encode it to UTF-8, not decode it:

>>> print u'\u041e\u043b\u044c\u0433\u0430'
Ольга
>>> u'\u041e\u043b\u044c\u0433\u0430'.encode('utf8')
'\xd0\x9e\xd0\xbb\xd1\x8c\xd0\xb3\xd0\xb0'

If you wanted a Unicode value, then using a Unicode literal (u'...') is all you needed to do. No further decoding is necessary.
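
A different situation, and an assumption that goes beyond what the question shows, is when the \uXXXX sequences arrive as literal backslash text (for example, bytes read from a file) rather than as a u'...' literal. In that case there really are bytes to decode, and the standard unicode_escape codec is one way to do it. The name raw is hypothetical.

```python
# Assumption: raw holds literal backslash-u escapes as bytes,
# i.e. 30 ASCII characters, not 5 Cyrillic ones.
raw = b'\\u041e\\u043b\\u044c\\u0433\\u0430'

# unicode_escape interprets the escape sequences and returns Unicode text.
text = raw.decode('unicode_escape')

print(len(raw), len(text))
```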

The same implicit conversion takes place in the other direction; if you tried to encode a bytestring you'd trigger an implicit decoding:

>>> u'\u041e\u043b\u044c\u0433\u0430'.encode('utf8').encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128)
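
The reverse case can be spelled out the same way. As a rough sketch, encoding a bytestring in Python 2 amounts to decoding it with the ASCII codec first, and that decode is what fails on the non-ASCII bytes. The names data and message are illustrative only.

```python
# UTF-8 bytes for the same name; every byte is outside the ASCII range.
data = u'\u041e\u043b\u044c\u0433\u0430'.encode('utf8')

try:
    # Explicit form of the implicit step Python 2 takes before encoding bytes.
    data.decode('ascii')
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```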

Answer 2

Answer by Ranvijay Sachan

You can set the default encoding to UTF-8:

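import sys
# Python 2 only: site.py removes sys.setdefaultencoding at startup,
# so reload(sys) is needed to make it available again.
reload(sys)
sys.setdefaultencoding('utf-8')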
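
An alternative sketch, which is not part of this answer, is to leave the process-wide default alone and encode explicitly at the point where bytes are needed. Here buffer is a hypothetical stand-in for a file or socket opened in binary mode.

```python
import io

text = u'\u041e\u043b\u044c\u0433\u0430'

# Stand-in for any binary output stream (assumption for illustration).
buffer = io.BytesIO()

# Encode explicitly at the boundary instead of relying on a default codec.
buffer.write(text.encode('utf-8'))
written = buffer.getvalue()
```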
