Importing JSON into Pandas
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/44980845/
Asked by MoreScratch
I have the following JSON that is coming from an API (e.g. my_json). The array of entities is stored under a key called entities:
{
  "action" : "get",
  "application" : "4d97323f-ac0f-11e6-b1d4-0eec2415f3df",
  "params" : {
    "limit" : [ "2" ]
  },
  "path" : "/businesses",
  "entities" : [
    {
      "uuid" : "508d56f1-636b-11e7-9928-122e0737977d",
      "type" : "business",
      "size" : 730
    },
    {
      "uuid" : "2f3bd4dc-636b-11e7-b937-0ad881f403bf",
      "type" : "business",
      "size" : 730
    }
  ],
  "timestamp" : 1499469891059,
  "duration" : 244,
  "count" : 2
}
I am trying to load them into a data frame as follows:
import pandas as pd
pd.read_json(my_json['entities'], orient='split')
I get the following error:
ValueError: Invalid file path or buffer object type: <type 'list'>
I have tried the records orientation and it still doesn't work.
Answered by piRSquared
If my_json is a dictionary, as I suspect, then you can skip pd.read_json and just do
pd.DataFrame(my_json['entities'])
   size      type                                  uuid
0   730  business  508d56f1-636b-11e7-9928-122e0737977d
1   730  business  2f3bd4dc-636b-11e7-b937-0ad881f403bf
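A minimal sketch of this approach, assuming the API response has already been parsed into a Python dict. The my_json dict below is built by hand from a trimmed copy of the question's sample rather than fetched from a real API:

```python
import pandas as pd

# Assumption: my_json is already a parsed dict (for example the result of
# json.loads on the response body). Values are copied from the question.
my_json = {
    "path": "/businesses",
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
        {"uuid": "2f3bd4dc-636b-11e7-b937-0ad881f403bf", "type": "business", "size": 730},
    ],
    "count": 2,
}

# A list of dicts maps directly to rows; the dict keys become the columns.
df = pd.DataFrame(my_json["entities"])
print(df)
```

No JSON parsing happens here at all, which is why read_json is unnecessary once the data is a dict.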
Answered by Daniel Corin
The way you are using my_json['entities'] makes it look like it is a Python dict.
According to the pandas documentation, read_json takes in "a valid JSON string or file-like". You can convert a dict into a JSON string with the following:
import json
json_str = json.dumps(my_json["entities"])
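As a rough illustration of what that conversion produces, the sketch below uses a hand-built my_json dict as a stand-in for the parsed API response (an assumption, since the real response is not available here). It shows that json.dumps returns a plain string and that json.loads recovers the original list:

```python
import json

# Assumption: a hand-built stand-in for the parsed API response.
my_json = {
    "entities": [
        {"uuid": "508d56f1-636b-11e7-9928-122e0737977d", "type": "business", "size": 730},
    ]
}

# Serialize only the list stored under "entities" into a JSON string.
json_str = json.dumps(my_json["entities"])
print(type(json_str))

# Parsing the string again gives back an equal list of dicts.
round_tripped = json.loads(json_str)
print(round_tripped == my_json["entities"])
```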
The data under the key "entities" as you have described it does not fit the formatting strategy for orient="split". It looks like you will need to use orient="list":
import pandas as pd
my_json = """{
  "entities": [
    {
      "type": "business",
      "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
      "size": 918
    },
    {
      "type": "business",
      "uuid": "054a7650-b36a-11e6-a734-122e0737977d",
      "size": 984
    }
  ]
}"""
print(pd.read_json(my_json, orient='list'))
yielding:
                                            entities
0  {u'type': u'business', u'uuid': u'199bca3e-baf...
1  {u'type': u'business', u'uuid': u'054a7650-b36...
or
import pandas as pd
my_json = """[
  {
    "type": "business",
    "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf",
    "size": 918
  },
  {
    "type": "business",
    "uuid": "054a7650-b36a-11e6-a734-122e0737977d",
    "size": 984
  }
]"""
print(pd.read_json(my_json, orient='list'))
yielding:
   size      type                                  uuid
0   918  business  199bca3e-baf6-11e6-861b-0ad881f403bf
1   984  business  054a7650-b36a-11e6-a734-122e0737977d
Answered by Avi Gaur
According to the official documentation, orient is supposed to be 'records':
df = pd.read_json(json.dumps(b_j['entities']), orient='records')
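A self-contained sketch of this call, assuming b_j is the parsed response dict (built by hand here from the sample values used earlier in the thread). 'records' is the orient value the pandas documentation lists for a JSON array of objects. Wrapping the string in io.StringIO is an addition to the original answer: newer pandas releases emit a warning when read_json is given a literal JSON string, and a file-like object avoids that.

```python
import io
import json

import pandas as pd

# Assumption: b_j stands in for the dict parsed from the API response.
b_j = {
    "entities": [
        {"type": "business", "uuid": "199bca3e-baf6-11e6-861b-0ad881f403bf", "size": 918},
        {"type": "business", "uuid": "054a7650-b36a-11e6-a734-122e0737977d", "size": 984},
    ]
}

# Dump only the list of entities to a JSON string ...
json_str = json.dumps(b_j["entities"])

# ... and read it back as one row per object. StringIO makes the string
# file-like, which read_json accepts across pandas versions.
df = pd.read_json(io.StringIO(json_str), orient="records")
print(df)
```

For data that is already a Python list of dicts, this round trip through a string does the same job as calling pd.DataFrame on the list directly, so it is mainly useful when the input really arrives as JSON text.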
Answered by MoreScratch
danielcorin pointed me in the right direction. I ended up having to do:
pd.read_json(json.dumps(b_j['entities']), orient='list')
The read_json method takes a string, so I dump the entities collection and use that.