Note: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must attribute it to the original authors on StackOverflow.
Original source: http://stackoverflow.com/questions/25734079/
JSON dictionaries to Pandas DataFrame in Python
Asked by MauricioRoman
I get JSON data from an API service, and I would like to load it into a DataFrame and then write the data out as CSV.
So I am trying to convert a list of about 100,000 dictionaries, each with about 100 key-value pairs nested up to 4 levels deep, into a Pandas DataFrame.
I am using the following code, but it is painfully slow:
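# pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize
from pandas import concat, json_normalize

# `data` is the list of dictionaries returned by the API
try:
    # Convert each JSON data event to a Pandas DataFrame
    df_i = []
    for d in data:
        df_i.append(json_normalize(d))
    # Concatenate all DataFrames into a single one
    df = concat(df_i, axis=0)
except AttributeError:
    print("Error: Expected a list of dictionaries to parse JSON data")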
Does anyone know of a better and faster way to do this?
Accepted answer by Andy Hayden
There's a whole section in the io docs on reading JSON (as strings or files) directly using pd.read_json.
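To make the pd.read_json mention concrete, here is a minimal sketch of parsing JSON text straight into a DataFrame. The sample string `raw` is hypothetical and simply stands in for one API response body. Wrapping the text in StringIO is a choice made here, not something the answer states: recent pandas releases deprecate passing a literal JSON string, while a file-like object is accepted by old and new versions alike.

```python
from io import StringIO

import pandas as pd

# Hypothetical JSON text standing in for one API response body
raw = '[{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]'

# read_json parses the text directly, with no intermediate Python dicts;
# StringIO makes the string look like a file to pandas
df = pd.read_json(StringIO(raw))
```

Each object in the JSON array becomes one row, and each key becomes a column.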
You ought to be able to do something like:
pd.concat((pd.read_json(d) for d in data), axis=0)
This will often be much faster than creating a temporary dict.
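As a hedged aside that is not part of the answer above: when the records are already parsed into Python dictionaries, as in the question, json_normalize also accepts the whole list in one call, which avoids building and concatenating one DataFrame per record. A minimal sketch, assuming pandas 1.0 or later; the sample records are hypothetical, and whether this is fast enough for 100,000 records would need to be measured.

```python
import pandas as pd

# Hypothetical nested records standing in for the API response
data = [
    {"id": 1, "user": {"name": "ann", "geo": {"city": "Lima"}}},
    {"id": 2, "user": {"name": "bob", "geo": {"city": "Quito"}}},
]

# One call over the whole list; nested keys are flattened into
# dot-separated column names such as "user.geo.city"
df = pd.json_normalize(data)

# With no path argument, to_csv returns the CSV text as a string
csv_text = df.to_csv(index=False)
```

Passing a file path to to_csv instead would write the result to disk, which matches the CSV goal stated in the question.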

