Unicode Encode Error when writing pandas df to csv
Warning: this is a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/31331358/
Asked by I am not George
I cleaned 400 excel files and read them into python using pandas and appended all the raw data into one big df.
Then when I try to export it to a csv:
df.to_csv("path",header=True,index=False)
I get this error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)
Can someone suggest a way to fix this and explain what it means?
Thanks
Accepted answer by unutbu
You have unicode values in your DataFrame. Files store bytes, which means all unicode values have to be encoded into bytes before they can be stored in a file. You have to specify an encoding, such as utf-8. For example,
df.to_csv('path', header=True, index=False, encoding='utf-8')
If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python2, or utf-8 in Python3.
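To make the error message concrete, here is a minimal sketch of the encoding step on its own, without pandas. It uses the character from the traceback in the question (u'\xc7'); the variable names are only for illustration and are not from the original answer.

```python
# The character from the error message: U+00C7 (capital C with cedilla).
text = u"\xc7"

# ASCII only covers code points 0-127, so this character cannot be encoded.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# UTF-8 can represent every Unicode code point; this one takes two bytes.
utf8_bytes = text.encode("utf-8")

print(ascii_ok)    # False
print(utf8_bytes)  # b'\xc3\x87' on Python 3

# Decoding with the same encoding gives the original text back.
round_trip = utf8_bytes.decode("utf-8")
```

This is the same conversion that has to happen when the DataFrame is written to a file, which is why passing encoding='utf-8' makes the error go away. Whatever reads the CSV later has to use the same encoding.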
Answer by tangfucius
Adding an answer to help myself google it later:
One trick that helped me is to encode a problematic series with unicode-escape first, then decode the resulting bytes as utf-8. Like:
df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))
This would get the dataframe to print correctly too.
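It may help to see what this trick does to a single value. The sketch below applies the same encode/decode pair to a plain string; escape_non_ascii and the sample value are hypothetical names chosen for illustration, not part of the original answer.

```python
def escape_non_ascii(value):
    # Hypothetical helper: same encode/decode pair as the lambda above.
    # unicode-escape turns non-ASCII characters into ASCII escape sequences.
    return value.encode("unicode-escape").decode("utf-8")

original = u"Fran\xe7ois"
escaped = escape_non_ascii(original)

# The result is pure ASCII: the cedilla is now the four characters \ x e 7.
print(escaped)       # Fran\xe7ois
print(len(original)) # 8
print(len(escaped))  # 11

# The escaped text can be turned back into the original if needed.
restored = escaped.encode("ascii").decode("unicode-escape")
```

Note the trade-off: the CSV will then contain the escape sequences rather than the original characters, so this works around the ascii codec instead of storing the text as written. If the real characters should end up in the file, specifying encoding='utf-8' as in the accepted answer is probably the better fit.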