Python UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 34: ordinal not in range(128)

Question

Asked by Aaron Misquith

I have been working on a program to retrieve questions from Stack Overflow. Until yesterday the program was working fine, but since today I'm getting this error:

Traceback
  File "C:\Users\DPT\Desktop\questions.py", line 13, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 34: ordinal not in range(128)

Currently the questions are displayed, but I seem to be unable to write the output to a new text file.

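import re
import sys
import time

sys.path.append('.')
import stackexchange

so = stackexchange.Site(stackexchange.StackOverflow)
term = raw_input("Enter the keyword for Stack Exchange")
print 'Searching for %s...' % term,
sys.stdout.flush()
qs = so.search(intitle=term)
print '\r--- questions with "%s" in title ---' % term

# Print every matching question and append its title to a text file.
for q in qs:
    print '%8d %s' % (q.id, q.title)
    with open(r'E:\questi.txt', 'a+') as question:
        # Line 13 of the original script: q.title is a Unicode string, so this
        # write raises UnicodeEncodeError as soon as a title contains a
        # non-ASCII character such as u'\u201c'.
        question.write(q.title)

time.sleep(10)
with open(r'E:\questi.txt') as intxt:
    data = intxt.read()

# Extract runs of ASCII letters. The original class [aA-zZ] also matched the
# punctuation between 'Z' and 'a', such as '[' and '_'.
regular = re.findall('[A-Za-z]+', data)
print(regular)

tokens = set(regular)

with open(r'D:\Dictionary.txt', 'r') as keywords:
    keyset = set(keywords.read().split())

# Write every dictionary word that also occurs in the collected titles.
with open(r'D:\Questionmatches.txt', 'w') as matches:
    for word in keyset:
        if word in tokens:
            matches.write(word + '\n')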

Answer 1

Accepted answer by Tim Pietzcker

q.title is a Unicode string. When writing it to a file, you need to encode it first, preferably with a fully Unicode-capable encoding such as UTF-8 (if you don't, Python will default to the ASCII codec, which doesn't support any code point above 127).

question.write(q.title.encode("utf-8"))

should fix the problem.
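
An alternative that avoids calling .encode() on every write is to open the file with an explicit encoding. The following is a minimal sketch, assuming io.open (part of the standard library since Python 2.6, and the same function as the built-in open in Python 3); the output path and the sample titles are made up for illustration and do not come from the question.

```python
import io
import os
import tempfile

# Hypothetical titles standing in for the q.title values returned by the API.
titles = [u'How do I use \u201csmart quotes\u201d?', u'Plain ASCII title']

# Hypothetical output path; the question wrote to E:\questi.txt instead.
path = os.path.join(tempfile.gettempdir(), 'questi_demo.txt')

# io.open encodes text as UTF-8 on write, so no manual .encode() is needed.
with io.open(path, 'w', encoding='utf-8') as question:
    for title in titles:
        question.write(title + u'\n')

# Reading with the same encoding gives back the original text.
with io.open(path, 'r', encoding='utf-8') as intxt:
    data = intxt.read()
```

With the file opened this way, question.write(q.title) can stay as it was in the question, because q.title is already a Unicode string.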

By the way, the program tripped up on the character “ (U+201C, LEFT DOUBLE QUOTATION MARK).
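
The offending character can also be found programmatically, since UnicodeEncodeError records where encoding failed. A small sketch with a made-up title (not one from the question):

```python
import unicodedata

title = u'Why does \u201cthis\u201d fail?'  # made-up sample title

bad_index = None
bad_char = None
try:
    title.encode('ascii')
except UnicodeEncodeError as exc:
    # exc.start is the index of the first character the codec rejected.
    bad_index = exc.start
    bad_char = exc.object[exc.start]

# Look up the official Unicode name of the rejected character.
char_name = unicodedata.name(bad_char)

# The same string encodes without error under UTF-8.
utf8_bytes = title.encode('utf-8')
```

Here bad_index plays the role of the "position 34" reported in the traceback, counted from zero.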

Answer 2

Answered by Vinnie James

I ran into this as well when using the Transifex API:

response['source_string']

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 3: ordinal not in range(128)

Fixed with response['source_string'].encode("utf-8")

import requests

username = "api"
password = "PASSWORD"

AUTH = (username, password)

url = 'https://www.transifex.com/api/2/project/project-site/resource/name-of-resource/translation/en/strings/?details'

response = requests.get(url, auth=AUTH).json()

print response['key'], response['context']
print response['source_string'].encode("utf-8")
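
When several fields of the response need the same treatment, the fix can be wrapped in a small helper. This is only a sketch: encode_text_fields is a hypothetical name, and the dictionary below merely imitates the shape of the response above; it is not real Transifex output. It targets Python 2, where printing a Unicode string can fall back to the ASCII codec; in Python 3, print handles text directly and encoding by hand is usually unnecessary.

```python
def encode_text_fields(record, encoding='utf-8'):
    """Return a copy of record with every text value encoded to bytes."""
    text_type = type(u'')  # unicode on Python 2, str on Python 3
    return {
        key: value.encode(encoding) if isinstance(value, text_type) else value
        for key, value in record.items()
    }


# Made-up data imitating the response; not real API output.
response = {
    u'key': u'greeting',
    u'context': u'',
    u'source_string': u'It\u2019s here',
}
encoded = encode_text_fields(response)
```

Non-text values such as numbers or None pass through unchanged, so the helper can be applied to the whole dictionary before printing.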
