Python: Unicode encode error when writing a pandas df to csv

Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/31331358/

Time: 2020-08-19 09:49:48  Source: igfitidea

Unicode Encode Error when writing pandas df to csv

python pandas export-to-csv python-unicode

Asked by I am not George

I cleaned 400 Excel files, read them into Python using pandas, and appended all the raw data into one big DataFrame (df).

Then when I try to export it to a csv:

df.to_csv("path",header=True,index=False)

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)

Can someone suggest a way to fix this and explain what it means?

Thanks

Accepted answer by unutbu

You have unicode values in your DataFrame. Files store bytes, which means all unicode has to be encoded into bytes before it can be stored in a file. You have to specify an encoding, such as utf-8. For example,

df.to_csv('path', header=True, index=False, encoding='utf-8')

If you don't specify an encoding, then the encoding used by df.to_csv defaults to ascii in Python 2, or utf-8 in Python 3.
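
To make the point about encodings concrete, here is a minimal sketch, assuming Python 3 with pandas installed. The sample data, the column names, and the temporary file name out.csv are made up for illustration; only the to_csv call mirrors the one above. It shows that ascii cannot encode the character from the error message, while writing with utf-8 stores it as two bytes and reads back unchanged.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data containing U+00C7, the character from the error message.
df = pd.DataFrame({"name": ["\u00c7a va", "plain"], "value": [1, 2]})

# ASCII only covers code points 0-127, so encoding U+00C7 raises UnicodeEncodeError.
try:
    "\u00c7".encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "out.csv")  # hypothetical file name

    # Same call as in the answer, with the encoding stated explicitly.
    df.to_csv(csv_path, header=True, index=False, encoding="utf-8")

    # Inspect the raw bytes on disk: U+00C7 is stored as the two bytes C3 87.
    with open(csv_path, "rb") as f:
        raw = f.read()

    # Reading back with the same encoding recovers the original text.
    round_tripped = pd.read_csv(csv_path, encoding="utf-8")

print(ascii_failed)
print(round_tripped)
```

Whatever reads the file later has to use the same encoding that was used to write it, otherwise the non-ASCII characters will come back garbled.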

Answer by tangfucius

Adding an answer to help myself google it later:

One trick that helped me is to encode a problematic Series with unicode-escape first, then decode the resulting bytes as utf-8. Like:

df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))

This would get the dataframe to print correctly too.
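
To see what that expression does to a single value, here is a small sketch, assuming Python 3. The helper name escape_non_ascii and the sample string are hypothetical; the body of the helper is the same expression as the lambda above. Note that the result is not the original text: each non-ASCII character is replaced by a literal backslash escape, which is why it then fits in ascii.

```python
def escape_non_ascii(text):
    # Same expression as the lambda above: unicode-escape produces ASCII-only
    # bytes, and decoding those bytes as utf-8 turns them back into a str.
    return text.encode("unicode-escape").decode("utf-8")


original = "\u00c7a va"  # hypothetical sample value containing U+00C7
escaped = escape_non_ascii(original)

# The escaped string holds a literal backslash, 'x', 'c', '7' in place of U+00C7.
print(escaped)

# The transformation can be undone by reversing the two steps.
restored = escaped.encode("ascii").decode("unicode-escape")
print(restored == original)
```

Since the CSV then contains the escape sequences rather than the characters themselves, anything reading the file has to reverse the step to recover the original text. Writing with encoding='utf-8' avoids that extra step.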
