Notice: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/28976546/

Write Pandas DataFrame to newline-delimited JSON

Tags: python, json, pandas

Asked by uspowpow

I started by reading a CSV into a Pandas DataFrame via the pandas read_csv() function. Now that the data is in a DataFrame, I tried to write something like this:

for row in df.iterrows():
    row[1].to_json(path_to_file)

This runs, but only the last row is saved to disk, because every call to row[1].to_json(path_to_file) rewrites the file. I've tried a few other file handling options, but to no avail. Can anyone shed some light on how to proceed?

Thank you!

Answered by conradlee

To create newline-delimited JSON from a DataFrame df, run the following:

df.to_json("path/to/filename.json",
           orient="records",
           lines=True)

Pay close attention to those optional keyword args! The lines option was added in pandas 0.19.0.
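
As a quick check of the call above, here is a minimal sketch that writes a small frame and reads it back. The sample data and the temporary file name are made up for illustration; pd.read_json(..., lines=True) is the matching reader.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"a": [1, 3, 5], "b": [1.1, 1.2, 1.2]})

# Write to a throwaway directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "example.jsonl")

# One JSON object per line; lines=True is only valid with orient="records".
df.to_json(path, orient="records", lines=True)

# Each line of the file is a complete JSON document on its own.
with open(path) as fh:
    lines = fh.read().splitlines()
parsed = [json.loads(line) for line in lines]

# Read the file back into a DataFrame with the matching reader option.
round_trip = pd.read_json(path, lines=True)
print(round_trip)
```

Because every line parses independently, a file like this can be streamed or appended to without re-reading the whole document.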

Answered by Noah

You can pass a buffer in to df.to_json():

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"a":[1,3,5], "b":[1.1,1.2,1.2]})

In [3]: df
Out[3]: 
   a    b
0  1  1.1
1  3  1.2
2  5  1.2

In [4]: f = open("temp.txt", "w")

In [5]: for row in df.iterrows():
   ...:     row[1].to_json(f)
   ...:     f.write("\n")
   ...:

In [6]: f.close()

In [7]: open("temp.txt").read()
Out[7]: '{"a":1.0,"b":1.1}\n{"a":3.0,"b":1.2}\n{"a":5.0,"b":1.2}\n'
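
Note that the output above shows "a" as 1.0 rather than 1: iterrows() returns each row as a Series, and a row mixing integers and floats is upcast to float. The sketch below reproduces that with an in-memory buffer and then shows one possible way to keep the integers, by serializing per-row dicts with the standard json module. The second loop is an illustration of my own, not part of the answer.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({"a": [1, 3, 5], "b": [1.1, 1.2, 1.2]})

# Same approach as the session above, using an in-memory buffer.
# A real file opened in a "with" block works the same way and is
# closed automatically when the block exits.
with io.StringIO() as buf:
    for _, row in df.iterrows():
        row.to_json(buf)   # row is a float Series, so "a" becomes 1.0
        buf.write("\n")
    via_iterrows = buf.getvalue()

# Alternative sketch (an assumption, not from the answer): build one
# dict per row, which keeps each column's own type, then dump it.
with io.StringIO() as buf:
    for record in df.to_dict(orient="records"):
        buf.write(json.dumps(record) + "\n")
    via_records = buf.getvalue()

print(via_iterrows)
print(via_records)
```

If the dtypes matter downstream, df.to_json(..., orient="records", lines=True) avoids the row-by-row loop altogether.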

Answered by Jon Clements

If you're trying to write a DataFrame using iterrows, I suspect you should be looking at:

df.to_json(orient='values')  # List of lists of values
# '[[1,2],[3,4]]'

Or:

df.to_json(orient='records')  # List of dicts with col->val
# '[{"A":1,"B":2},{"A":3,"B":4}]'

Or writing a dict of {index:col value}:

df.A.to_json()
# '{"0":1,"1":3}'
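
To compare these layouts side by side, here is a small self-contained sketch. The two-column frame is an assumption chosen to match the values in the comments above. Each call returns a JSON string when no path is given.

```python
import json

import pandas as pd

# Assumed sample frame matching the values shown in the comments above.
df = pd.DataFrame({"A": [1, 3], "B": [2, 4]})

as_values = df.to_json(orient="values")    # rows as lists, no labels
as_records = df.to_json(orient="records")  # one dict per row
as_column = df["A"].to_json()              # Series default: index -> value

print(as_values)   # [[1,2],[3,4]]
print(as_records)  # [{"A":1,"B":2},{"A":3,"B":4}]
print(as_column)   # {"0":1,"1":3}

# The results are plain strings; json.loads turns them into Python objects.
records = json.loads(as_records)
```

Index labels come back as strings ("0", "1") because JSON object keys are always strings.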