pandas IPython Notebook: what is the default encoding?

Question

Asked by Adriano Almeida

I have created a package using the encoding utf-8.

When calling a function, it returns a DataFrame, with a column coded in utf-8.

When I use IPython at the command line, I have no problem displaying the contents of this table. When I use the Notebook, it crashes with the error "'utf8' codec can't decode byte 0xe7". The full traceback is below.

What is the proper encoding to work with Notebook?

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-13-92c0011919e7> in <module>()
      3 ver = verif.VerificacaoNA()
      4 comp, total = ver.executarCompRealFisica(DT_INI, DT_FIN)
----> 5 comp

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\core\displayhook.pyc in __call__(self, result)
    240             self.update_user_ns(result)
    241             self.log_output(format_dict)
--> 242             self.finish_displayhook()
    243 
    244     def flush(self):

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\displayhook.pyc in finish_displayhook(self)
     59         sys.stdout.flush()
     60         sys.stderr.flush()
---> 61         self.session.send(self.pub_socket, self.msg, ident=self.topic)
     62         self.msg = None
     63 

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in send(self, stream, msg_or_type, content, parent, ident, buffers, subheader, track, header)
    557 
    558         buffers = [] if buffers is None else buffers
--> 559         to_send = self.serialize(msg, ident)
    560         flag = 0
    561         if buffers:

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in serialize(self, msg, ident)
    461             content = self.none
    462         elif isinstance(content, dict):
--> 463             content = self.pack(content)
    464         elif isinstance(content, bytes):
    465             # content is already packed, as in a relayed message

c:\Python27-32\lib\site-packages\ipython-0.13.1-py2.7.egg\IPython\zmq\session.pyc in <lambda>(obj)
     76 
     77 # ISO8601-ify datetime objects
---> 78 json_packer = lambda obj: jsonapi.dumps(obj, default=date_default)
     79 json_unpacker = lambda s: extract_dates(jsonapi.loads(s))
     80 

c:\Python27-32\lib\site-packages\pyzmq-13.0.0-py2.7-win32.egg\zmq\utils\jsonapi.pyc in dumps(o, **kwargs)
     70         kwargs['separators'] = (',', ':')
     71 
---> 72     return _squash_unicode(jsonmod.dumps(o, **kwargs))
     73 
     74 def loads(s, **kwargs):

c:\Python27-32\lib\json\__init__.pyc in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, encoding, default, **kw)
    236         check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237         separators=separators, encoding=encoding, default=default,
--> 238         **kw).encode(obj)
    239 
    240 

c:\Python27-32\lib\json\encoder.pyc in encode(self, o)
    199         # exceptions aren't as detailed.  The list call should be roughly
    200         # equivalent to the PySequence_Fast that ''.join() would do.
--> 201         chunks = self.iterencode(o, _one_shot=True)
    202         if not isinstance(chunks, (list, tuple)):
    203             chunks = list(chunks)

c:\Python27-32\lib\json\encoder.pyc in iterencode(self, o, _one_shot)
    262                 self.key_separator, self.item_separator, self.sort_keys,
    263                 self.skipkeys, _one_shot)
--> 264         return _iterencode(o, 0)
    265 
    266 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 199: invalid continuation byte
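The last line of the traceback hints that the column may not actually hold UTF-8 bytes: 0xe7 is "ç" in Latin-1 and cp1252, and on its own it is not a valid UTF-8 sequence. Below is a minimal sketch of how one might check this. The sample bytes and the `try_decode` helper are made up for illustration and are not taken from the original package.

```python
# Hypothetical sample: the word "Verificação" encoded as Latin-1, not UTF-8.
raw = b"Verifica\xe7\xe3o"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding as UTF-8 fails on byte 0xe7, as in the traceback.
utf8_text, utf8_error = try_decode(raw, "utf-8")

# Decoding as Latin-1 succeeds and yields the accented characters.
latin1_text, latin1_error = try_decode(raw, "latin-1")

print(utf8_error)
print(latin1_text)
```

If the UTF-8 attempt fails while Latin-1 (or cp1252) produces readable text, the data was most likely written in that encoding rather than in UTF-8.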

Answer 1

Answered by olebebo

I had the same problem recently, and indeed setting the default encoding to UTF-8 did the trick:

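import sys
reload(sys)  # Python 2 only: brings back sys.setdefaultencoding, which site.py removes at startup
sys.setdefaultencoding("utf-8")  # implicit str/unicode conversions now use UTF-8 instead of ASCII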

Running sys.getdefaultencoding() yielded 'ascii' in my environment (Python 2.7.3), so I guess that's the default.

Also see this related question and Ian Bicking's blog post on the subject.
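As a possible alternative to changing the interpreter-wide default, the byte strings could be decoded explicitly before the table is displayed. The sketch below is only an illustration under assumptions: the `to_text` helper, the sample values, and the choice of cp1252 as the source encoding are all hypothetical, so substitute whatever encoding the data was really written in.

```python
def to_text(value, encoding="cp1252"):
    """Decode byte strings to text; leave every other value untouched."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical column contents: raw bytes, already-decoded text, and a missing value.
column = [b"Verifica\xe7\xe3o", b"total", u"j\u00e1 decodificado", None]

cleaned = [to_text(v) for v in column]
print(cleaned)
```

With a DataFrame, the same helper could be applied to a single column with something like df["name"] = df["name"].map(to_text), where "name" stands in for the real column label.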
