Python UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)

Question

Asked by speedyrazor

I have this code:

    printinfo = title + "\t" + old_vendor_id + "\t" + apple_id + '\n'
    # Write file
    f.write (printinfo + '\n')

But I get this error when running it:

    f.write(printinfo + '\n')
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 7: ordinal not in range(128)

It's having trouble writing out this:

Identité secrète (Abduction) [VF]

Any ideas, please? I'm not sure how to fix this.

Cheers.

UPDATE: This is the bulk of my code, so you can see what I am doing:

def runLookupEdit(self, event):
    newpath1 = pathindir + "/"
    errorFileOut = newpath1 + "REPORT.csv"
    f = open(errorFileOut, 'w')

    global old_vendor_id

    for old_vendor_id in vendorIdsIn.splitlines():
        writeErrorFile = 0
        from lxml import etree
        parser = etree.XMLParser(remove_blank_text=True) # makes pretty print work

        path1 = os.path.join(pathindir, old_vendor_id)
        path2 = path1 + ".itmsp"
        path3 = os.path.join(path2, 'metadata.xml')

        # Open and parse the xml file
        cantFindError = 0
        try:
            with open(path3): pass
        except IOError:
            cantFindError = 1
            errorMessage = old_vendor_id
            self.Error(errorMessage)
            break
        tree = etree.parse(path3, parser)
        root = tree.getroot()

        for element in tree.xpath('//video/title'):
            title = element.text
            while '\n' in title:
                title = title.replace('\n', ' ')
            while '\t' in title:
                title = title.replace('\t', ' ')
            while '  ' in title:
                title = title.replace('  ', ' ')
            title = title.strip()
            element.text = title
        print title

        #########################################
        ######## REMOVE UNWANTED TAGS ########
        #########################################

        # Remove the comment tags
        comments = tree.xpath('//comment()')
        q = 1
        for c in comments:
            p = c.getparent()
            if q == 3:
                apple_id = c.text
            p.remove(c)
            q = q+1

        apple_id = apple_id.split(':',1)[1]
        apple_id = apple_id.strip()
        printinfo = title + "\t" + old_vendor_id + "\t" + apple_id

        # Write file
        # f.write (printinfo + '\n')
        f.write(printinfo.encode('utf8') + '\n')
    f.close()

Answer 1

Answered by Martijn Pieters

You need to encode Unicode explicitly before writing to a file; otherwise Python does it for you with the default ASCII codec, which fails on any character outside the ASCII range.
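
To illustrate the point, here is a minimal sketch that reproduces the failure without touching a file. It assumes the title from the question and calls .encode('ascii') by hand, which is the same conversion Python 2 attempts implicitly when a unicode object is written to a byte-mode file. The variable names are made up for the example.

```python
# The title from the question, written with escapes so the source stays ASCII.
title = u'Identit\xe9 secr\xe8te (Abduction) [VF]'

# Encoding with the ASCII codec fails on the first non-ASCII character.
try:
    title.encode('ascii')
    failed = False
    error_position = None
except UnicodeEncodeError as exc:
    failed = True
    error_position = exc.start  # index of the offending character

# Encoding explicitly with UTF-8 succeeds and yields a byte string.
encoded = title.encode('utf8')

print(failed, error_position)
print(repr(encoded))
```

The reported position is the index of the first character the ASCII codec cannot represent, which is why the traceback in the question points at position 7.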

Pick an encoding and stick with it:

f.write(printinfo.encode('utf8') + '\n')

or use io.open() to create a file object that'll encode for you as you write to the file:

import io

f = io.open(filename, 'w', encoding='utf8')
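
As a rough sketch of the io.open() route, the following writes one tab-separated report line and reads it back. The vendor ID, Apple ID, and temporary file location are placeholders invented for the example, not values from the question.

```python
import io
import os
import tempfile

title = u'Identit\xe9 secr\xe8te (Abduction) [VF]'
old_vendor_id = u'EXAMPLE_VENDOR_ID'  # hypothetical value
apple_id = u'000000000'               # hypothetical value

printinfo = title + u'\t' + old_vendor_id + u'\t' + apple_id

# Write to a throwaway directory rather than a real report location.
report_path = os.path.join(tempfile.mkdtemp(), 'REPORT.csv')

# The file object encodes to UTF-8 on write, so no .encode() call is needed.
with io.open(report_path, 'w', encoding='utf8') as f:
    f.write(printinfo + u'\n')

# Reading with the same encoding gives back the original text.
with io.open(report_path, 'r', encoding='utf8') as f:
    line = f.readline()

print(line == printinfo + u'\n')
```

Note that in Python 2 a file opened this way accepts only unicode objects, so everything passed to f.write() has to be unicode (hence the u'' literals) and must not be encoded beforehand.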

You may want to read:

The Python Unicode HOWTO
Pragmatic Unicode by Ned Batchelder
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky

before continuing.
