IPython Notebook: What is the default encoding?

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/15420672/

Date: 2020-09-09 00:08:18  Source: igfitidea

Tags: pandas, ipython, ipython-notebook

Asked by Adriano Almeida

I have created a package using the encoding utf-8.

When calling a function, it returns a DataFrame, with a column coded in utf-8.

When using IPython at the command line, I don't have any problems showing the content of this table. When using the Notebook, it crashes with the error 'utf8' codec can't decode byte 0xe7. I've attached a full traceback below.

What is the proper encoding to work with Notebook?

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-13-92c0011919e7> in <module>()
      3 ver = verif.VerificacaoNA()
      4 comp, total = ver.executarCompRealFisica(DT_INI, DT_FIN)
----> 5 comp

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\core\displayhook.pyc in __call__(self, result)
    240             self.update_user_ns(result)
    241             self.log_output(format_dict)
--> 242             self.finish_displayhook()
    243 
    244     def flush(self):

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\displayhook.pyc in finish_displayhook(self)
     59         sys.stdout.flush()
     60         sys.stderr.flush()
---> 61         self.session.send(self.pub_socket, self.msg, ident=self.topic)
     62         self.msg = None
     63 

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in send(self, stream, msg_or_type, content, parent, ident, buffers, subheader, track, header)
    557 
    558         buffers = [] if buffers is None else buffers
--> 559         to_send = self.serialize(msg, ident)
    560         flag = 0
    561         if buffers:

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in serialize(self, msg, ident)
    461             content = self.none
    462         elif isinstance(content, dict):
--> 463             content = self.pack(content)
    464         elif isinstance(content, bytes):
    465             # content is already packed, as in a relayed message

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in <lambda>(obj)
     76 
     77 # ISO8601-ify datetime objects
---> 78 json_packer = lambda obj: jsonapi.dumps(obj, default=date_default)
     79 json_unpacker = lambda s: extract_dates(jsonapi.loads(s))
     80 

c:\Python27-32\lib\site-packages\pyzmq-13.0.0-py2.7-win32.egg\zmq\utils\jsonapi.pyc in dumps(o, **kwargs)
     70         kwargs['separators'] = (',', ':')
     71 
---> 72     return _squash_unicode(jsonmod.dumps(o, **kwargs))
     73 
     74 def loads(s, **kwargs):

c:\Python27-32\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, encoding=encoding, default=default,
--> 238         **kw).encode(obj)
    239 
    240 

c:\Python27-32\lib\json\encoder.pyc in encode(self, o)
    199         # exceptions aren't as detailed.  The list call should be roughly
    200         # equivalent to the PySequence_Fast that ''.join() would do.
--> 201         chunks = self.iterencode(o, _one_shot=True)
    202         if not isinstance(chunks, (list, tuple)):
    203             chunks = list(chunks)

c:\Python27-32\lib\json\encoder.pyc in iterencode(self, o, _one_shot)
    262                 self.key_separator, self.item_separator, self.sort_keys,
    263                 self.skipkeys, _one_shot)
--> 264         return _iterencode(o, 0)
    265 
    266 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 199: invalid continuation byte
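
The last line of the traceback can be reproduced outside the Notebook. The following is a minimal sketch, not taken from the original post: it assumes the column actually holds Latin-1 (or cp1252) bytes rather than UTF-8, which is only one plausible reason for byte 0xe7 being rejected. The sample word and the try_decode helper are hypothetical, and the code is written for Python 3.

```python
# Hypothetical sample: "Verificação" encoded as Latin-1,
# so the character "ç" is stored as the single byte 0xe7.
raw = b"Verifica\xe7\xe3o"


def try_decode(data, encoding):
    """Return (text, None) on success, or (None, reason) if decoding fails."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc.reason


# Decoding the same bytes with two different codecs.
utf8_text, utf8_error = try_decode(raw, "utf-8")
latin1_text, latin1_error = try_decode(raw, "latin-1")

print("utf-8   ->", utf8_text, "| error:", utf8_error)
print("latin-1 ->", latin1_text, "| error:", latin1_error)
```

In Latin-1, 0xe7 is the single byte for "ç". UTF-8 instead reads 0xe7 as the first byte of a three-byte sequence, so the byte that follows must be a continuation byte; when it is not, decoding fails with the same "invalid continuation byte" reason shown in the traceback.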

Answered by olebebo

I had the same problem recently, and indeed setting the default encoding to UTF-8 did the trick:

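import sys

# Python 2 only: site.py deletes sys.setdefaultencoding at startup,
# and reload(sys) brings it back.
reload(sys)
sys.setdefaultencoding("utf-8")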

Running sys.getdefaultencoding() yielded 'ascii' on my environment (Python 2.7.3), so I guess that's the default.
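
As an alternative to changing the process-wide default (a different technique from the one in this answer), byte strings can be decoded explicitly before they are displayed. This is a rough sketch, assuming the values really are UTF-8 bytes; to_text and the sample column are hypothetical names. Note that sys.setdefaultencoding does not exist in Python 3, where sys.getdefaultencoding() reports 'utf-8'.

```python
import sys


def to_text(value, encoding="utf-8"):
    """Decode byte strings explicitly; return any other value unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical column values: one UTF-8 byte string and one text string.
column = [b"Verifica\xc3\xa7\xc3\xa3o", "total"]
decoded = [to_text(v) for v in column]

print("default encoding:", sys.getdefaultencoding())
print(decoded)
```

Decoding at the boundary keeps the choice of encoding visible at the point where the data enters the program, instead of relying on an interpreter-wide setting.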

Also see this related question and Ian Bicking's blog post on the subject.
