使用 Pandas 库将 JSON 转换为 CSV
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/45061902/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Converting JSON to CSV w/ Pandas Library
提问by tseivad359
I'm having trouble converting a JSON file to CSV in Python and I'm not sure what's going wrong. The conversion completes but it is not correct. I think there's an issue due to the formatting of the JSON file; however, it's a valid JSON.
我在 Python 中将 JSON 文件转换为 CSV 时遇到问题,我不确定出了什么问题。转换完成,但不正确。我认为由于 JSON 文件的格式存在问题;但是,它是一个有效的 JSON。
Here's the content of my JSON file:
这是我的 JSON 文件的内容:
{
"tags": [{
"name": "ACDTestData",
"results": [{
"groups": [{
"name": "type",
"type": "number"
}],
"values": [
[
1409154300000,
1.16003418,
3
],
[
1409154240000,
0.024047852,
3
],
[
1409153280000,
10.25598145,
3
],
[
1409152200000,
10.73193359,
3
],
[
1409151240000,
0.024047852,
3
],
[
1409080200000,
14.34393311,
3
],
[
1409039580000,
4.883850098,
3
],
[
1408977480000,
5.520019531,
3
],
[
1408977360000,
0.00793457,
3
],
[
1408974300000,
2.695922852,
3
],
[
1408968480000,
0.011962891,
3
],
[
1408965720000,
0.427978516,
3
],
[
1408965660000,
0.011962891,
3
]
]
}]
}]
}
Here's what I tried:
这是我尝试过的:
import pandas as pd
json_file = pd.read_json("QueryExportTest2.json")
json_file.to_csv()
Here's my output:
这是我的输出:
,tags\n0,"{u\'name\': u\'ACDTestData\', u\'results\': [{u\'values\': [[1409154300000L, 1.16003418, 3], [1409154240000L, 0.024047852, 3], [1409153280000L, 10.25598145, 3], [1409152200000L, 10.73193359, 3], [1409151240000L, 0.024047852, 3], [1409080200000L, 14.34393311, 3], [1409039580000L, 4.883850098, 3], [1408977480000L, 5.520019531, 3], [1408977360000L, 0.00793457, 3], [1408974300000L, 2.695922852, 3], [1408968480000L, 0.011962891000000002, 3], [1408965720000L, 0.42797851600000003, 3], [1408965660000L, 0.011962891000000002, 3]], u\'groups\': [{u\'type\': u\'number\', u\'name\': u\'type\'}]}]}"\n
,tags\n0,"{u\'name\': u\'ACDTestData\', u\'results\': [{u\'values\': [[1409154300000L, 1.16003418, 3], [1409154240000L, 0.024047852, 3], [1409153280000L, 10.25598145, 3], [1409152200000L, 10.73193359, 3], [1409151240000L, 0.024047852, 3], [1409080200000L, 14.34393311, 3], [1409039580000L, 4.883850098, 3], [1408977480000L, 5.520019531, 3], [1408977360000L, 0.00793457, 3], [1408974300000L, 2.695922852, 3], [1408968480000L, 0.011962891000000002, 3], [1408965720000L, 0.42797851600000003, 3], [1408965660000L, 0.011962891000000002, 3]], u\'groups\': [{u\'type\': u\'number\', u\'name\': u\'type\'}]}]}"\n
This isn't right, because when I put it into a new Excel CSV doc instead of just printing it, the CSV is all in one cell.
这是不对的,因为当我将它放入一个新的 Excel CSV 文档而不是仅仅打印它时,CSV 都在一个单元格中。
If it helps, when I try this:
如果有帮助,当我尝试这个时:
import json
with open('QueryExportTest2.json') as json_data:
d = json.load(json_data)
print(d)
I get this:
我明白了:
{u'tags': [{u'name': u'ACDTestData', u'results': [{u'values': [[1409154300000L, 1.16003418, 3], [1409154240000L, 0.024047852, 3], [1409153280000L, 10.25598145, 3], [1409152200000L, 10.73193359, 3], [1409151240000L, 0.024047852, 3], [1409080200000L, 14.34393311, 3], [1409039580000L, 4.883850098, 3], [1408977480000L, 5.520019531, 3], [1408977360000L, 0.00793457, 3], [1408974300000L, 2.695922852, 3], [1408968480000L, 0.011962891, 3], [1408965720000L, 0.427978516, 3], [1408965660000L, 0.011962891, 3]], u'groups': [{u'type': u'number', u'name': u'type'}]}]}]}
{u'tags': [{u'name': u'ACDTestData', u'results': [{u'values': [[1409154300000L, 1.16003418, 3], [1409154240000L, 0.024047852, 3], [1409153280000L, 10.25598145, 3], [1409152200000L, 10.73193359, 3], [1409151240000L, 0.024047852, 3], [1409080200000L, 14.34393311, 3], [1409039580000L, 4.883850098, 3], [1408977480000L, 5.520019531, 3], [1408977360000L, 0.00793457, 3], [1408974300000L, 2.695922852, 3], [1408968480000L, 0.011962891, 3], [1408965720000L, 0.427978516, 3], [1408965660000L, 0.011962891, 3]], u'groups': [{u'type': u'number', u'name': u'type'}]}]}]}
How can I convert this nested JSON to CSV properly?
如何将此嵌套的 JSON 正确转换为 CSV?
采纳答案by Diego Aguado
Your json is a nested dict (with lists and other dictionaries). I guess that you are interested in the values
section of the json
. If my assumption is correct, since this is a single entry json, try the following
您的 json 是一个嵌套的字典(带有列表和其他字典)。我猜你values
对json
. 如果我的假设是正确的,因为这是单条目 json,请尝试以下操作
df = pd.DataFrame.from_dict(json_str['tags'][0]['results'][0]['values'])
df.columns = ['var1','var2', 'var3']
df.to_csv(filename)
If you will have more records you will have to iterate over the lists of values, namely you could append them.
如果您将有更多记录,则必须遍历值列表,即可以附加它们。
all_results = json['tags'][0]['results']
for i in range(0, len(all_results))
if i == 0:
my_df = pd.DataFrame(all_results[i]['values'])
else:
my_df.append(pd.DataFrame(all_results[i]['values']))
回答by Diego Aguado
What are you trying to get in the end? The problem here is that your json is nested, if you for example try doing something like this:
你最终想得到什么?这里的问题是您的 json 是嵌套的,例如,如果您尝试执行以下操作:
pandas.DataFrame.from_dict(jour_json['tags'])
you will get dataframe with two columns - name and result.
您将获得包含两列的数据框 - 名称和结果。