Original question: http://stackoverflow.com/questions/16346914/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
python 3.2 UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 9629: character maps to <undefined>
Asked by Mattias
I'm trying to make a script that gets data out of an sqlite3 database, but I have run into a problem.
The field in the database is of type text and contains HTML-formatted text. See the text below:
<html>
<head>
<title>Yahoo!</title>
</head>
<body>
<style type="text/css">
html {}
.yshortcuts {border-bottom:none !important;}
.ReadMsgBody {width:100%;}
.ExternalClass{width:100%;}
</style>
<table cellpadding="0" cellspacing="0" bgcolor="#ffffff">
<tr>
<td width="550" valign="top" align="left">
<table cellpadding="0" cellspacing="0" width="500">
<tr>
<td colspan="3"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/us/logo.gif" width="292" height="51" style="display:block;" border="0" alt="Yahoo! Mail"></td>
</tr>
<tr>
<td rowspan="3" width="1" bgcolor="#c7c4ca"></td>
<td width="498" height="1" bgcolor="#c7c4ca"></td>
<td rowspan="3" width="1" bgcolor="#c7c4ca"></td>
</tr>
<tr>
<td width="498" valign="top" align="left">
<table cellpadding="0" cellspacing="0">
<tr>
<td width="498" bgcolor="#61399d" align="left" valign="top">
<table cellspacing="0" cellpadding="0"><tr><td height="24"></td></tr></table>
<div style="font-family:Arial, Helvetica, sans-serif;font-size:23px;line-height:27px;margin-bottom:10px;color:#ffffff;margin-left:15px;"><span style="color:#ffffff;text-decoration:none;font-weight:bold;line-height:27px;">V?lkommen till Yahoo! Mail.</span></div>
<div style="font-family:Arial, Helvetica, sans-serif;font-size:22px;line-height:26px;margin-bottom:1px;color:#ffffff;margin-left:15px;margin-bottom:7px;margin-right:15px;">Ansluta och dela g?r snabbt och enkelt och ?r tillg?ngligt ?verallt.</div>
</td>
</tr>
<tr>
<td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b1.gif" width="498" height="18" style="display:block;" border="0"></td>
</tr>
</table>
<table cellpadding="0" cellspacing="0" width="498">
<tr>
<td width="292" valign="top">
<table cellpadding="0" cellspacing="0">
<tr>
<td><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/grad.gif" width="292" height="9" style="display:block;"></td>
</tr>
<tr>
<td width="292" bgcolor="#ffffff" align="left" valign="top">
<table cellspacing="0" cellpadding="0"><tr><td height="11"></td></tr></table>
<div style="margin-left:15px;">
<div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:18px;color:#333333;margin-bottom:11px;font-weight:bold;">Det ?r l?tt som en pl?tt att komma ig?ng.</div>
<table cellpadding="0" cellspacing="0" width="267">
<tr>
<td width="16" align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">1. </div></td>
<td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://us-mg999.mail.yahoo.com/neo/launch?action=contacts" style="text-decoration:underline;color:#61399d;"><span>L?gg till alla dina kontakter p? en plats</span></a>.</div></td>
</tr>
<tr>
<td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">2. </div></td>
<td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;margin-bottom:9px;"><a rel="nofollow" target="_blank" href="http://mrd.mail.yahoo.com/themes" style="text-decoration:underline;color:#61399d;"><span>Anpassa din nya inkorg</span></a>.</div></td>
</tr>
<tr>
<td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:14px;line-height:16px;color:#61399d;margin-bottom:9px;font-weight:bold;">3. </div></td>
<td align="left" valign="top"><div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#61399d;"><a rel="nofollow" target="_blank" href="http://se.overview.mail.yahoo.com/mobile" style="text-decoration:underline;color:#61399d;"><span>Anslut ?verallt p? dina mobila enheter</span></a>.</div></td>
</tr>
</table>
</div>
</td>
</tr>
<tr><td height="13"></td></tr>
</table>
</td>
<td width="196" valign="top">
<table cellpadding="0" cellspacing="0">
<tr>
<td width="1" bgcolor="#fbfbfd" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g1.gif" width="1" height="21" style="display:block;"></td>
<td width="1" bgcolor="#f5f6fa" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g2.gif" width="1" height="21" style="display:block;"></td>
<td width="1" bgcolor="#e8eaf1" valign="top"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/g3.gif" width="1" height="21" style="display:block;"></td>
<td width="1" bgcolor="#d4d4d4"></td>
<td width="186" bgcolor="#f0f0f0" align="left" valign="top">
<table cellspacing="0" cellpadding="0"><tr><td height="3"> </td></tr></table>
<div style="margin-left:11px;">
<div style="font-family:Arial, Helvetica, sans-serif;font-size:13px;line-height:16px;color:#333333;margin-bottom:9px;"><b>Info f?r dig:</b></div>
<div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;margin-bottom:10px;">
Yahoo!-ID och e-postadress:<br />
<div style="font-family:Arial, Helvetica, sans-serif;font-size:12px;color:#43494e;line-height:18px;">
H?ll ditt konto och inst?llningar aktuella. <br><a rel="nofollow" target="_blank" href="https://edit.yahoo.com/config/eval_profile" style="text-decoration:underline;color:#61399d;"><span>Mitt konto</span></a>
</div>
</div>
<table cellspacing="0" cellpadding="0"><tr><td height="20"></td></tr></table>
</td>
<td width="1" bgcolor="#dbdbdb"></td>
<td width="1" bgcolor="#ced2de"></td>
<td width="1" bgcolor="#dbdfed"></td>
<td width="1" bgcolor="#e8ebf3"></td>
<td width="1" bgcolor="#f3f4f9"></td>
<td width="1" bgcolor="#fafbfc"></td>
</tr>
<tr>
<td colspan="11"><img src="http://mail.yimg.com/nq/assets/sharedmessages/v1/all/b2.gif" width="196" height="8" style="display:block;" border="0"></td>
</tr>
<tr><td height="13"></td></tr>
</table>
</td>
<td width="10"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td width="498" height="1" bgcolor="#c7c4ca"></td>
</tr>
</table>
<table cellpadding="0" cellspacing="0" width="500">
<tr>
<td align="center" valign="top">
<table cellspacing="0" cellpadding="0"><tr><td height="10"></td></tr></table>
<div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:18px;margin-bottom:10px;">
<a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/utos.html" style="text-decoration:underline;color:#61399d;">Yahoo! Villkor f?r anv?ndning</a> | <a rel="nofollow" target="_blank" href="http://info.yahoo.com/legal/se/yahoo/mail/atos.html" style="text-decoration:underline;color:#61399d;">Yahoo! Mail –Villkor f?r anv?ndning</a> | <a rel="nofollow" target="_blank" href="http://info.yahoo.com/privacy/se/yahoo/details.html" style="text-decoration:underline;color:#61399d;">Yahoo! Sekretesspolicy</a>
</div>
</td>
</tr>
<tr>
<td align="left" valign="top">
<div style="font-family:Arial, Helvetica, sans-serif;font-size:11px;line-height:14px;color:#545454;margin-left:16px;margin-right:14px;">Var god svara inte p? detta meddelande. Detta ?r ett servicemeddelande som r?r din anv?ndning av Yahoo! Mail. Om du vill veta mer om Yahoo!s anv?ndning av personlig information, inklusive anv?ndning av webb-beacons i HTML-baserad e-post, kan du l?sa v?r Yahoo! Sekretesspolicy. Yahoo!s adress ?r 701 First Avenue, Sunnyvale, CA 94089, USA.<br /><br />RefID: lp-1037111</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
<img width="1" height="1" src="http://pclick.internal.yahoo.com/p/s=2143684696">
</body>
</html>
The Python code that tries to extract the data is as follows:
>>> import sqlite3
>>> conn = sqlite3.connect('C:/temp/Mobils/export/com.yahoo.mobile.client.android.mail/databases/mail.db')
>>> c = conn.cursor()
>>> conn.row_factory=sqlite3.Row
>>> c.execute('select body from messages_1 where _id=7')
<sqlite3.Cursor object at 0x0000000001FB78F0>
>>> r = c.fetchone()
>>> r.keys()
['body']
>>> print(r['body'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python32\lib\encodings\cp850.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 9629: character maps to <undefined>
>>>
Does anybody have any idea how to print this or write it to a file? Yes, I know that this is printed to stdout, but I get the same UnicodeEncodeError when I try to write to a file. I tried both the write method of a file object and print(r['body'], file=f).
Accepted answer by Mark Ransom
When you open the file you want to write to, open it with a specific encoding that can handle all the characters.
with open('filename', 'w', encoding='utf-8') as f:
    print(r['body'], file=f)
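For anyone who wants to try this without the database, here is a minimal sketch of the same idea. The string `body` is a stand-in for r['body'], and the file name body.html is hypothetical; neither comes from the original question.

```python
import os
import tempfile

# Stand-in for r['body']; the en dash (U+2013) is the character cp850 cannot encode.
body = "Yahoo! Mail \u2013 Villkor"

# Hypothetical output path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "body.html")

# UTF-8 can encode any valid Unicode text, so this write does not fail on the en dash.
with open(path, "w", encoding="utf-8") as f:
    print(body, file=f)

# Read the file back with the same encoding to confirm nothing was lost.
with open(path, encoding="utf-8") as f:
    restored = f.read()

print(restored == body + "\n")
```

The key point is that the encoding is stated explicitly on both the write and the read, instead of relying on the platform default.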
Answer by abarnert
While Python 3 deals in Unicode, the Windows console or POSIX tty that you're running inside does not. So, whenever you print, or otherwise send Unicode strings to stdout, and it's attached to a console/tty, Python has to encode it.
The error message indirectly tells you what character set Python was trying to use:
File "C:\Python32\lib\encodings\cp850.py", line 19, in encode
This means the charset is cp850.
You can test for yourself that this charset doesn't have the appropriate character just by doing '\u2013'.encode('cp850'). Or you can look up cp850 online (e.g., at Wikipedia).
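The same check can be applied to a whole string to find which characters a codec rejects and where they sit. This is only a sketch: the helper name `unencodable` and the sample text are made up for illustration.

```python
def unencodable(text, encoding):
    """Return (position, character) pairs that `encoding` cannot represent."""
    bad = []
    for position, char in enumerate(text):
        try:
            char.encode(encoding)
        except UnicodeEncodeError:
            bad.append((position, char))
    return bad

# Sample text with Swedish letters (present in cp850) and an en dash (absent).
sample = "Yahoo! Mail \u2013 Villkor för användning"

bad_cp850 = unencodable(sample, "cp850")
bad_utf8 = unencodable(sample, "utf-8")

# Print code points rather than the characters themselves, so this line
# cannot itself trigger a UnicodeEncodeError on a limited console.
print([(pos, hex(ord(ch))) for pos, ch in bad_cp850])
print(bad_utf8)
```

Run against the real r['body'], a helper like this should report the position that the traceback mentions.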
It's possible that Python is guessing wrong, and your console is really set for, say, UTF-8. (In that case, override the guess, for example by setting the PYTHONIOENCODING environment variable to utf-8 before starting Python; sys.stdout.encoding is a read-only attribute, so it cannot simply be assigned.) It's also possible that you intended your console to be set for UTF-8 but did something wrong. (In that case, you probably want to follow up at superuser.com.)
But if nothing is wrong, you just can't print that character. You will have to manually encode it with one of the non-strict error-handlers. For example:
>>> '\u2013'.encode('cp850')
UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 0: character maps to <undefined>
>>> '\u2013'.encode('cp850', errors='replace')
b'?'
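'replace' is only one of the non-strict handlers. A short sketch comparing the standard ones on a made-up sample string:

```python
# Sample text containing the en dash that cp850 lacks.
text = "Mail \u2013 Villkor"

# Encode the same text with each built-in non-strict error handler.
results = {}
for handler in ("replace", "ignore", "backslashreplace", "xmlcharrefreplace"):
    results[handler] = text.encode("cp850", errors=handler)
    print(handler, results[handler])
```

Since the body here is HTML, 'xmlcharrefreplace' may be the most useful choice: it emits a numeric character reference that a browser can still render as the original character.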
So, how do you print a string that won't print on your console?
You can replace every print call with something like this:
>>> print(r['body'].encode('cp850', errors='replace').decode('cp850'))
?
… but that's going to get pretty tedious pretty fast.
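The simple thing to do is to set the error handler on sys.stdout. Its errors attribute is read-only, so re-wrap the underlying byte stream instead (on Python 3.7+, sys.stdout.reconfigure(errors='replace') does the same):
>>> import io, sys
>>> sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding=sys.stdout.encoding, errors='replace', line_buffering=True)
>>> print(r['body'])
?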
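If you would rather not touch sys.stdout globally, a small wrapper function is another option. This is a sketch, not part of the original answer: `safe_print` is a hypothetical helper, and the cp850 console is simulated with an in-memory stream so the example runs anywhere.

```python
import io
import sys

def safe_print(text, stream=None):
    """Print text, substituting '?' for characters the stream cannot encode."""
    stream = sys.stdout if stream is None else stream
    # Fall back to UTF-8 if the stream does not report an encoding.
    encoding = getattr(stream, "encoding", None) or "utf-8"
    printable = text.encode(encoding, errors="replace").decode(encoding)
    print(printable, file=stream)

# Simulate a cp850 console; newline="\n" disables newline translation on write.
fake_console = io.TextIOWrapper(io.BytesIO(), encoding="cp850", newline="\n")
safe_print("Mail \u2013 Villkor", stream=fake_console)
fake_console.flush()

console_bytes = fake_console.buffer.getvalue()
print(console_bytes)
```

In the session above you would call safe_print(r['body']) in place of print(r['body']).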
For printing to a file, things are pretty much the same, except that you don't have to set f.errors after the fact; you can set it at construction time. Instead of this:
with open('path', 'w', encoding='cp850') as f:
Do this:
with open('path', 'w', encoding='cp850', errors='replace') as f:
… Or, of course, if you can use UTF-8 files, just do that, as Mark Ransom's answer shows:
with open('path', 'w', encoding='utf-8') as f:
Answer by Kattern
Maybe a little late to reply, but I happened to run into the same problem today. I found that on Windows you can change the console encoding to utf-8 or another encoding that can represent your data. Then you can print it to sys.stdout.
First, run the following commands in the console:
chcp 65001
set PYTHONIOENCODING=utf-8
Then start python and do anything you want.
Answer by Raj
For me, running export PYTHONIOENCODING=UTF-8 before executing the python command worked.
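To see what PYTHONIOENCODING does without changing your shell, you can set it for a child interpreter only. This is a sketch assuming Python 3.5 or later (for subprocess.run); the variable names are illustrative.

```python
import os
import subprocess
import sys

# Copy the current environment and add the override for the child process only.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# The child reports its stdout encoding, then prints the problematic en dash.
child_code = "import sys; print(sys.stdout.encoding); print('\\u2013')"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    check=True,
)

# The child wrote UTF-8 bytes, so decode them as UTF-8 here.
lines = result.stdout.decode("utf-8").split()
print(lines[0])
```

With the variable set, the child encodes the en dash as UTF-8 instead of raising UnicodeEncodeError. Whether the console then displays it correctly still depends on the console's own code page, which is what chcp 65001 addresses on Windows.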

