Weird problem with input encoding in IPython on Windows
Note: this page is a bilingual (Chinese/English) rendering of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/2260815/
Asked by Andrey Balaguta
I'm running Python 2.6 with the latest IPython on Windows XP SP3, and I have two problems. The first is that under IPython I cannot input Unicode strings directly and, as a result, cannot open files with non-Latin names. Let me demonstrate. Under the regular Python interpreter this works:
>>> sys.getdefaultencoding()
'ascii'
>>> sys.getfilesystemencoding()
'mbcs'
>>> fd = open(u'm:/Блокнот/home.tdl')
>>> print u'm:/Блокнот/home.tdl'
m:/Блокнот/home.tdl
>>>
It's Cyrillic in there, by the way. Under IPython I get:
In [49]: sys.getdefaultencoding()
Out[49]: 'ascii'
In [50]: sys.getfilesystemencoding()
Out[50]: 'mbcs'
In [52]: fd = open(u'm:/Блокнот/home.tdl')
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
C:\Documents and Settings\andrey\<ipython console> in <module>()
IOError: [Errno 2] No such file or directory: u'm:/\x81\xab\xae\xaa\xad\xae\xe2/home.tdl'
In [53]: print u'm:/Блокнот/home.tdl'
-------------->print(u'm:/Блокнот/home.tdl')
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (15, 0))
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
C:\Documents and Settings\andrey\<ipython console> in <module>()
C:\Program Files\Python26\lib\encodings\cp866.pyc in encode(self, input, errors)
10
11 def encode(self,input,errors='strict'):
---> 12 return codecs.charmap_encode(input,errors,encoding_map)
13
14 def decode(self,input,errors='strict'):
UnicodeEncodeError: 'charmap' codec can't encode characters in position 3-9: character maps to <und
In [54]:
The second problem is less frustrating, but still. When I try to open a file and specify the file name argument as a non-Unicode string, it does not open. I have to forcibly decode the string from the OEM charset before I can open files, which is pretty inconvenient:
>>> fd2 = open('m:/Блокнот/home.tdl'.decode('cp866'))
>>>
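The decode step above can be sketched in isolation. This is a minimal illustration written for Python 3 (the session above is Python 2.6), and `decode_console_bytes` is a hypothetical helper name, not something provided by Python or IPython. It assumes the console code page is cp866, as the traceback earlier suggests.

```python
# Python 3 sketch; the session in the question used Python 2.6.
# `decode_console_bytes` is a hypothetical helper, not a Python or IPython API.

def decode_console_bytes(raw, console_encoding="cp866"):
    """Turn bytes typed at an OEM-code-page console into text."""
    return raw.decode(console_encoding)

# Bytes that a cp866 console would produce for the Cyrillic path.
raw_path = "m:/Блокнот/home.tdl".encode("cp866")
path = decode_console_bytes(raw_path)

# Show the raw bytes in a form that is safe to print on any console.
print(repr(raw_path))
```

The resulting `path` is ordinary text and could be passed to `open()`; the file itself is of course only present on the asker's machine.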
Maybe it has something to do with my regional settings, I don't know, because I can't even cut and paste Cyrillic text from the console. I've put "Russian" everywhere in the regional settings, but it does not seem to work.
Accepted answer by bobince
Yes. Typing Unicode at the console is always problematic and generally best avoided, but IPython is particularly broken. It converts characters you type on its console as if they were encoded in ISO-8859-1, regardless of the actual encoding you're giving it.
For now, you'll have to say u'm:/\u0411\u043b\u043e\u043a\u043d\u043e\u0442/home.tdl'.
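The mix-up described in this answer can be reproduced outside IPython. The following is a rough sketch in Python 3 (the thread itself concerns Python 2.6), assuming the console sends cp866 bytes, which is what the traceback in the question points to. The variable names are illustrative only. It shows that reading cp866 bytes as ISO-8859-1 yields the path from the IOError, that the misread text cannot be encoded back to cp866, and that the pure-ASCII escaped spelling can be generated rather than typed by hand.

```python
# Python 3 sketch of the mix-up; the thread itself used Python 2.6.
typed = "m:/Блокнот/home.tdl"

# What a cp866 console sends when this path is typed.
console_bytes = typed.encode("cp866")

# Reading those bytes as ISO-8859-1 maps every byte to the code point with
# the same number, which gives the path shown in the IOError.
misread = console_bytes.decode("iso-8859-1")
print(ascii(misread))

# Echoing the misread text through the cp866 codec fails, because those
# code points have no cp866 equivalent.
bad_span = None
try:
    misread.encode("cp866")
except UnicodeEncodeError as exc:
    bad_span = (exc.start, exc.end - 1)  # first and last offending index
print(bad_span)

# The damage is reversible: undo the wrong decode, then apply the right one.
repaired = misread.encode("iso-8859-1").decode("cp866")

# Pure-ASCII spelling of the path that survives any console encoding.
escaped = typed.encode("unicode_escape").decode("ascii")
print(escaped)
```

The escaped form is the body of the `u'...'` literal suggested above, so pasting it into a Unicode literal sidesteps the console encoding entirely.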
Answer by David Eyk
Perversely enough, this will work:
fd = open('m:/Блокнот/home.tdl')
Or:
fd = open('m:/Блокнот/home.tdl'.encode('utf-8'))
This gets around ipython's bug by inputting the string as a raw UTF-8 encoded byte-string. ipython doesn't try any funny business with it. You're then free to decode it into a unicode string if you like, and get on with your life.
Answer by stephane k.
I had the same problem with Greek input; this patch from Launchpad works for me too.
Thanks.