Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/39612240/

Writing pandas DataFrame to JSON in unicode

Tags: python, json, pandas, unicode

Asked by Swier

I'm trying to write a pandas DataFrame containing unicode characters to JSON, but the built-in .to_json function escapes the characters. How do I fix this?

Example:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json')

This gives:

{"0":{"0":"\u03c4","1":"\u03c0"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}

Which differs from the desired result:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}



I have tried adding the force_ascii=False argument:

import pandas as pd
df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])
df.to_json('df.json', force_ascii=False)

But this gives the following error:

UnicodeEncodeError: 'charmap' codec can't encode character '\u03c4' in position 11: character maps to <undefined>



I'm using WinPython 3.4.4.2 64-bit and pandas 0.18.0.

Answered by Swier

Opening a file with the encoding set to utf-8, and then passing that file object to the .to_json function, fixes the problem:

with open('df.json', 'w', encoding='utf-8') as file:
    df.to_json(file, force_ascii=False)

This gives the correct output:

{"0":{"0":"τ","1":"π"},"1":{"0":"a","1":"b"},"2":{"0":1,"1":2}}
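
A possible explanation for the original error, offered as a sketch rather than a confirmed diagnosis: the 'charmap' codec named in the traceback is what Python uses for legacy single-byte code pages such as cp1252, a common default on Windows. The example below assumes cp1252 was the encoding in effect (the question does not say which one it was) and uses a hypothetical helper, try_encode, to show that it cannot represent 'τ' while utf-8 can.

```python
# Assumption: the failing write used a legacy Windows code page such as cp1252.
# The question only shows the 'charmap' error, not the actual encoding in use.
text = 'τ'


def try_encode(s, encoding):
    """Hypothetical helper: return the encoded bytes, or None if the codec fails."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


cp1252_result = try_encode(text, 'cp1252')  # cp1252 has no Greek letters
utf8_result = try_encode(text, 'utf-8')     # utf-8 can encode any code point

print('cp1252:', cp1252_result)
print('utf-8: ', utf8_result)
```

Opening the file explicitly with encoding='utf-8' removes the dependence on whatever the platform default happens to be.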

Note: it does still require the force_ascii=False argument.
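
To check the result, the file can be read back and inspected. The following is a minimal sketch that assumes a reasonably recent pandas version; the temporary directory and the names written and parsed are illustrative choices, not part of the original answer.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([['τ', 'a', 1], ['π', 'b', 2]])

# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'df.json')

    # Same approach as the answer: open the file as utf-8 and pass the handle.
    with open(path, 'w', encoding='utf-8') as file:
        df.to_json(file, force_ascii=False)

    # Read the raw text back with the same encoding.
    with open(path, 'r', encoding='utf-8') as file:
        written = file.read()

# The raw text should contain the literal characters, not \uXXXX escapes.
print(written)

# Parsing the text confirms it is still valid JSON.
parsed = json.loads(written)
print(parsed['0'])
```

Reading the file with encoding='utf-8' matters for the same reason as writing it that way: without it, Python falls back to the platform default encoding.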