Python UTF encoding problem on the Windows command line
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/7138052/
Python utf-coding problem with command line
Asked by Mathias
For the past few days I've been learning programming with Python, and I'm still only a beginner. Recently, I've used the book 'Code in the Cloud' for that purpose. The thing is, while all those textbooks cover a wide range of topics thoroughly, they merely touch upon the issue of UTF-8 encoding in languages other than English. Hence my question for you: how do I make the following code display UTF-8 characters correctly in my mother tongue?
# -*- coding: utf-8 -*-
import datetime
import sys

class ChatError(Exception):
    """ Wyjątki obsługujące wszelkiego rodzaju błędy w czacie."""
    def __init__(self, msg):
        self.message = msg

# START: ChatMessage
class ChatMessage(object):
    """Pojedyncza wiadomość wysłana przez użytkownika czatu"""
    def __init__(self, user, text):
        self.sender = user
        self.msg = text
        self.time = datetime.datetime.now()

    def __str__(self):
        return "Od: %s o godzinie %s: %s" % (self.sender.username,
                                             self.time,
                                             self.msg)
# END: ChatMessage

# START: ChatUser
class ChatUser(object):
    """Użytkownik biorący udział w czacie"""
    def __init__(self, username):
        self.username = username
        self.rooms = {}

    def subscribe(self, roomname):
        if roomname in ChatRoom.rooms:
            room = ChatRoom.rooms[roomname]
            self.rooms[roomname] = room
            room.addSubscriber(self)
        else:
            raise ChatError("Nie znaleziono pokoju %s" % roomname)

    def sendMessage(self, roomname, text):
        if roomname in self.rooms:
            room = self.rooms[roomname]
            cm = ChatMessage(self, text)
            room.addMessage(cm)
        else:
            raise ChatError("Użytkownik %s nie jest zarejestrowany w pokoju %s" %
                            (self.username, roomname))

    def displayChat(self, roomname, out):
        if roomname in self.rooms:
            room = self.rooms[roomname]
            room.printMessages(out)
        else:
            raise ChatError("Użytkownik %s nie jest zarejestrowany w pokoju %s" %
                            (self.username, roomname))
# END: ChatUser

# START: ChatRoom
class ChatRoom(object):
    """A chatroom"""
    rooms = {}

    def __init__(self, name):
        self.name = name
        self.users = []
        self.messages = []
        ChatRoom.rooms[name] = self

    def addSubscriber(self, subscriber):
        self.users.append(subscriber)
        subscriber.sendMessage(self.name, 'Użytkownik %s dołączył do dyskusji.' %
                               subscriber.username)

    def removeSubscriber(self, subscriber):
        if subscriber in self.users:
            subscriber.sendMessage(self.name,
                                   "Użytkownik %s opóścił pokój." %
                                   subscriber.username)
            self.users.remove(subscriber)

    def addMessage(self, msg):
        self.messages.append(msg)

    def printMessages(self, out):
        print >>out, "Lista wiadomości: %s" % self.name
        for i in self.messages:
            print >>out, i
# END: ChatRoom

# START: ChatMain
def main():
    room = ChatRoom("Main")
    markcc = ChatUser("MarkCC")
    markcc.subscribe("Main")
    prag = ChatUser("Prag")
    prag.subscribe("Main")
    markcc.sendMessage("Main", "Hej! Jest tu kto?")
    prag.sendMessage("Main", "Tak, ja tu jestem.")
    markcc.displayChat("Main", sys.stdout)

if __name__ == "__main__":
    main()
# END: ChatMain
It was taken from the aforementioned book, but I cannot make it display non-English characters correctly in the Windows command line (even though it supports them). As you can see, I added an encoding declaration (# -*- coding: utf-8 -*-) at the beginning, thanks to which the code works at all. I also tried using the u"string" syntax, but to no avail: it returns the following message:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017c' in position 51: ordinal not in range(128)
What should I do to make those characters display correctly? Yes, I will often work with strings formatted in UTF. I would be very grateful for your help.
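The error in the question can be reproduced without the chat program. The following is a minimal sketch, assuming only that the failing character is the ż (U+017C) named in the traceback; the sample text and variable names are made up for illustration. It suggests why the message appears: the ASCII codec covers only code points 0 to 127, while UTF-8 can encode the character.

```python
# -*- coding: utf-8 -*-
# Hypothetical sample text containing U+017C, the character from the traceback.
text = u"U\u017cytkownik"

try:
    text.encode("ascii")              # ASCII only covers code points 0..127
    ascii_failed = False
except UnicodeEncodeError as exc:
    ascii_failed = True
    # exc.start and exc.end delimit the part of the text that could not be encoded
    failing_char = text[exc.start:exc.end]

utf8_bytes = text.encode("utf-8")     # UTF-8 can encode every Unicode code point

print(ascii_failed)
print(repr(utf8_bytes))
```

The same exception is raised whenever Python has to convert such text implicitly with the ASCII codec, which is the situation the answers below try to avoid.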
Answer by Keith
Try invoking the Python interpreter this way:
#!/usr/bin/python -S
import sys
sys.setdefaultencoding("utf-8")
import site
This will set the global default encoding to UTF-8. The usual default encoding is ASCII. It is used when writing strings to some output, for example with built-ins like print.
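Before changing any defaults, it can help to see which encodings the interpreter is actually using. The sketch below relies only on standard `sys` attributes; the helper name `describe_encodings` is invented for this example. Two caveats: `sys.setdefaultencoding` from the answer above exists only in Python 2, and in Python 2 `sys.stdout.encoding` can be `None` when output is redirected to a file or pipe.

```python
import sys

def describe_encodings():
    """Return the encodings Python uses for implicit conversion and for stdout."""
    return {
        "default": sys.getdefaultencoding(),
        # getattr guards against replaced stdout objects that lack .encoding
        "stdout": getattr(sys.stdout, "encoding", None),
    }

info = describe_encodings()
print("default encoding: %s" % info["default"])
print("stdout encoding:  %s" % info["stdout"])
```

If the two values differ, or the second one is not a Unicode-capable encoding, that is a hint that printing non-ASCII text may fail or be displayed incorrectly.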
Answer by gkuzmin
This works for me currently:
#!/usr/bin/env python
# -*-coding=utf-8 -*-
Answer by Robin Winslow
Okay, I know nothing about Python, and little about the Windows command line, but after a little Googling:
I think the problem is that the Windows cmd shell doesn't support UTF-8. If I'm not wrong, this should give you a better understanding of the error:
http://wiki.python.org/moin/PrintFails
(Got that link from this question: Unicode characters in Windows command line - how?)
It looks like you can force Python into thinking it can print UTF-8 by using the PYTHONIOENCODING environment variable.
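As a rough illustration of the PYTHONIOENCODING idea, the sketch below starts a child interpreter with that variable set and captures the bytes it writes. The child program and variable names are assumptions made for the example; only `PYTHONIOENCODING` itself comes from the answer.

```python
import os
import subprocess
import sys

# Child program: prints one non-ASCII character (U+017C).
child_code = r"print(u'\u017c')"

env = dict(os.environ)
env["PYTHONIOENCODING"] = "utf-8"   # force UTF-8 for the child's standard streams

# Run the same interpreter as a child process and capture its raw output bytes.
raw = subprocess.check_output([sys.executable, "-c", child_code], env=env)
print(repr(raw.strip()))
```

This only controls the bytes Python emits. Whether the console shows them correctly still depends on the console's own code page and font, so on cmd it would probably have to be combined with a UTF-8 code page (for example `chcp 65001`).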
This question is about finding UTF-8-enabled Windows shells:
Is there a Windows command shell that will display Unicode characters?
It may be helpful. Hope you solve your problem.
Answer by Eric O Lebigot
The Windows terminal sometimes uses a non-UTF-8 encoding (see the question "python: unicode in Windows terminal, encoding used?"). You therefore might want to try the following:
stdout_encoding = sys.stdout.encoding  # May be None when output is redirected

def printMessages(self, out):
    print >>out, ("Lista wiadomości: %s" % self.name).decode('utf-8').encode(stdout_encoding)
    for i in self.messages:
        # str(i) calls ChatMessage.__str__, which returns a UTF-8 byte string
        print >>out, str(i).decode('utf-8').encode(stdout_encoding)
This takes your byte strings, turns them into character strings (your file indicates that they are encoded in UTF-8), and then encodes them for your terminal.
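The decode-then-encode step can be tried in isolation. The sketch below is an illustration under stated assumptions, not the answer's own code: the helper `to_terminal_bytes` is hypothetical, and `cp852` is assumed here as a typical code page for a Polish Windows console (the real value would come from `sys.stdout.encoding`). It also passes `"replace"` so that characters the target encoding lacks become `?` instead of raising an exception.

```python
# -*- coding: utf-8 -*-
# Bytes as they would be stored in a UTF-8 encoded source file.
utf8_bytes = u"Lista wiadomo\u015bci".encode("utf-8")

def to_terminal_bytes(data, terminal_encoding):
    """Decode UTF-8 bytes to text, then re-encode them for the terminal.

    Characters the terminal encoding cannot represent are replaced with '?'.
    """
    text = data.decode("utf-8")
    return text.encode(terminal_encoding, "replace")

cp852_bytes = to_terminal_bytes(utf8_bytes, "cp852")   # assumed console code page
ascii_bytes = to_terminal_bytes(utf8_bytes, "ascii")   # a terminal limited to ASCII

print(repr(cp852_bytes))
print(repr(ascii_bytes))
```

With a code page that contains the Polish letters, the text survives the round trip; with ASCII, the accented letter is lost and shows up as a question mark.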
You can find useful information about the general issue of encoding and decoding on StackOverflow.