Python 'UCS-2' codec can't encode characters in position 1050-1050

Question

Asked by Andi

When I run my Python code, I get the following errors:

  File "E:\python343\crawler.py", line 31, in <module>
    print (x1)
  File "E:\python343\lib\idlelib\PyShell.py", line 1347, in write
    return self.shell.write(s, self.tags)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 1050-1050: Non-BMP character not supported in Tk

Here is my code:

x = g.request('search', {'q' : 'TaylorSwift', 'type' : 'page', 'limit' : 100})['data'][0]['id']

# GET ALL STATUS POST ON PARTICULAR PAGE(X=PAGE ID)
for x1 in g.get_connections(x, 'feed')['data']:
    print (x1)
    for x2 in x1:
        print (x2)
        if(x2[1]=='status'):
            x2['message']

How can I fix this?

Answer 1

Accepted answer by Martijn Pieters

Your data contains characters outside of the Basic Multilingual Plane (BMP). Emoji, for example, are outside the BMP, and Tk, the windowing toolkit used by IDLE, cannot handle such characters.

You could use a translation table to map everything outside of the BMP to the replacement character:

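import sys
# Map every code point above U+FFFF to U+FFFD (the replacement character).
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
# x must be a str here; str.translate() swaps each mapped code point.
print(x.translate(non_bmp_map))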

The non_bmp_map maps all codepoints outside the BMP (any codepoint higher than 0xFFFF, all the way up to the highest Unicode codepoint your Python version can handle) to U+FFFD REPLACEMENT CHARACTER:

>>> print('This works outside IDLE! \U0001F44D')
This works outside IDLE! 👍
>>> print('This works in IDLE too! \U0001F44D'.translate(non_bmp_map))
This works in IDLE too! �
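
The loop in the question prints feed items that appear to be dicts rather than strings, so they would need converting to text before `translate` can be applied. A minimal sketch of a reusable wrapper, assuming a hypothetical helper name `bmp_safe` and made-up sample data (neither is part of the original answer):

```python
import sys

# Map every code point above the BMP (U+10000 and up) to U+FFFD.
NON_BMP_MAP = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xFFFD)


def bmp_safe(value):
    """Return str(value) with all non-BMP characters replaced by U+FFFD.

    Hypothetical helper for illustration; accepts any object, not just str.
    """
    return str(value).translate(NON_BMP_MAP)


# Made-up stand-in for one item yielded by the feed loop in the question.
post = {"message": "Great show \U0001F44D", "type": "status"}
print(bmp_safe(post))
```

Calling `print(bmp_safe(x1))` in place of `print(x1)` would then keep Tk from seeing any character it cannot render, at the cost of losing the original emoji in the output.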

Answer 2

Answered by Keith Student

None of these worked for me, but the following does. It assumes that public_tweets was pulled from tweepy's api.search:

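for tweet in public_tweets:
    # Escape non-ASCII characters (including emoji) before printing,
    # so that IDLE only ever receives plain ASCII text.
    u = tweet.text.encode('unicode-escape').decode('utf-8')
    print(u)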
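
To illustrate what the `unicode-escape` round trip does, here is a small sketch on a made-up string (the sample text is an assumption, not real tweepy output). Every non-ASCII character turns into a backslash escape, so the result is pure ASCII:

```python
# Made-up sample containing an emoji (non-BMP) and an accented letter.
text = "Nice \U0001F44D caf\u00e9"

# Encode to backslash escapes, then decode the resulting ASCII bytes to str.
escaped = text.encode("unicode-escape").decode("ascii")

print(escaped)
# Confirm that nothing outside ASCII is left for Tk to choke on.
print(all(ord(ch) < 128 for ch in escaped))
```

Note the trade-off: unlike the translation table, this approach keeps the information about which character was there, but it also escapes ordinary accented letters and newlines, so the printed text is harder to read.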

Answer 3

Answered by Parika Pandey

This Unicode issue occurs in Python 3.6 and older versions. To resolve it, upgrade to Python 3.8 and run your code again; the error will no longer appear.
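
If the same script has to run on both older and newer interpreters, one option is to apply the replacement only where it is needed. A rough sketch, which takes the 3.8 cutoff from this answer as an assumption rather than a verified boundary; `for_display` and `NEEDS_BMP_FILTER` are hypothetical names:

```python
import sys

# Assumption taken from the answer above: IDLE on Python 3.8+ copes with
# non-BMP text, while older versions raise UnicodeEncodeError.
NEEDS_BMP_FILTER = sys.version_info < (3, 8)


def for_display(text, needs_filter=NEEDS_BMP_FILTER):
    """Replace non-BMP characters with U+FFFD only when the filter is needed."""
    if not needs_filter:
        return text
    return "".join(ch if ord(ch) <= 0xFFFF else "\ufffd" for ch in text)


print(for_display("Thumbs up \U0001F44D"))
```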
