ValueError errors while reading JSON file with pd.read_json
Original question: http://stackoverflow.com/questions/33559660/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by shantanuo
I am trying to read a JSON file using pandas:
import pandas as pd
df = pd.read_json('https://data.gov.in/node/305681/datastore/export/json')
I get ValueError: arrays must all be same length
Some other JSON URLs raise this error instead:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
How can I read the values anyway? I am not particular about data validity.
Accepted answer by Andy Hayden
Looking at the JSON, it is valid, but the content is nested under two keys, data and fields:
import json
import requests
In [11]: d = json.loads(requests.get('https://data.gov.in/node/305681/datastore/export/json').text)
In [12]: list(d.keys())
Out[12]: ['data', 'fields']
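A likely reason for the ValueError can be sketched with a small made-up payload of the same shape. This assumes that the top-level keys end up being treated as columns; the sample values below are hypothetical and not taken from the real response. Because the two lists have different lengths, building a frame directly from the dict fails:

```python
import pandas as pd

# Hypothetical payload with the same two top-level keys as the real response;
# the rows and labels are made up for illustration.
payload = {
    "data": [["1", "Goa", "148.61"], ["2", "Delhi", "1.17"]],
    "fields": [{"label": "S. No."}, {"label": "States/UTs"}, {"label": "2008-09"}],
}

# Treating each top-level key as a column fails because the lists differ
# in length (2 rows versus 3 field descriptions).
try:
    pd.DataFrame(payload)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```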
You want data as the content and fields as the column names:
In [13]: pd.DataFrame(d["data"], columns=[x["label"] for x in d["fields"]])
Out[13]:
    S. No.                   States/UTs    2008-09    2009-10    2010-11    2011-12    2012-13
 0       1               Andhra Pradesh  183446.36  193958.45  201277.09  212103.27  222973.83
 1       2            Arunachal Pradesh      360.5     380.15     407.42        419     438.69
 2       3                        Assam    4658.93    4671.22    4707.31       4705    4709.58
 3       4                        Bihar   10740.43   11001.77    7446.08       7552    8371.86
 4       5                 Chhattisgarh    9737.92   10520.01   12454.34   12984.44   13704.06
 5       6                          Goa     148.61        148        149     149.45     457.87
 6       7                      Gujarat   12675.35   12761.98   13269.23   14269.19   14558.39
 7       8                      Haryana   38149.81   38453.06   39644.17   41141.91   42342.66
 8       9             Himachal Pradesh      977.3    1000.26    1020.62    1049.66    1069.39
 9      10            Jammu and Kashmir    7208.26    7242.01    7725.19     6519.8    6715.41
10      11                    Jharkhand    3994.77    3924.73    4153.16    4313.22    4238.95
11      12                    Karnataka   23687.61    29094.3   30674.18   34698.77   36773.33
12      13                       Kerala   15094.54   16329.52   16856.02   17048.89   22375.28
13      14               Madhya Pradesh     6712.6    7075.48    7577.23    7971.53    8710.78
14      15                  Maharashtra   35502.28   38640.12    42245.1   43860.99   45661.07
15      16                      Manipur    1105.25       1119    1137.05    1149.17    1162.19
16      17                    Meghalaya     994.52     999.47    1010.77    1021.14    1028.18
17      18                      Mizoram     411.14     370.92     387.32     349.33     352.02
18      19                     Nagaland     831.92      833.5     802.03     703.65     617.98
19      20                       Odisha   19940.15   23193.01   23570.78   23006.87   23229.84
20      21                       Punjab    36789.7   32828.13   35449.01      36030   37911.01
21      22                    Rajasthan    6449.17    6713.38    6696.92    9605.43    10334.9
22      23                       Sikkim     136.51     136.07     139.83     146.24        146
23      24                   Tamil Nadu   88097.59  108475.73  115137.14  118518.45  119333.55
24      25                      Tripura    1388.41    1442.39    1569.45       1650    1565.17
25      26                Uttar Pradesh    10139.8   10596.17   10990.72   16075.42   17073.67
26      27                  Uttarakhand    1961.81    2535.77    2613.81    2711.96    3079.14
27      28                  West Bengal    33055.7   36977.96   39939.32   43432.71   47114.91
28      29  Andaman and Nicobar Islands     617.58     657.44     671.78        780     741.32
29      30                   Chandigarh     272.88     248.53     180.06     180.56     170.27
30      31       Dadra and Nagar Haveli      70.66      70.71      70.28         73         73
31      32                Daman and Diu      18.83       18.9      18.81      19.67         20
32      33                        Delhi       1.17       1.17       1.17       1.23         NA
33      34                  Lakshadweep     134.64     138.22     137.98     139.86     139.99
34      35                   Puducherry     111.69     112.84     113.53        116     112.89
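The same approach can be tried offline with a small inline payload. This is only a sketch: the dict below copies two rows from the output above and assumes the values arrive as strings (suggested by entries such as NA, but not confirmed). If that assumption holds, pd.to_numeric with errors="coerce" is one way to get numeric columns, turning NA into NaN:

```python
import pandas as pd

# Assumed shape of the response: "fields" describes the columns, "data" holds the rows.
payload = {
    "fields": [
        {"label": "S. No."},
        {"label": "States/UTs"},
        {"label": "2011-12"},
        {"label": "2012-13"},
    ],
    "data": [
        ["6", "Goa", "149.45", "457.87"],
        ["33", "Delhi", "1.23", "NA"],
    ],
}

# Use the field labels as column names and the data rows as content.
columns = [field["label"] for field in payload["fields"]]
df = pd.DataFrame(payload["data"], columns=columns)

# Convert the year columns to numbers; unparseable entries such as "NA" become NaN.
year_columns = ["2011-12", "2012-13"]
df[year_columns] = df[year_columns].apply(pd.to_numeric, errors="coerce")

print(df)
```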
See also json_normalize for extracting a DataFrame from more complex JSON.
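As a rough illustration of what json_normalize does, here is a minimal sketch using pd.json_normalize (the top-level name in pandas 1.0 and later). The records and the key names "state" and "yearly" are hypothetical, invented for this example. Nested dicts are flattened into dotted column names:

```python
import pandas as pd

# Hypothetical nested records, invented for illustration.
records = [
    {"state": "Goa", "yearly": {"2011-12": 149.45, "2012-13": 457.87}},
    {"state": "Sikkim", "yearly": {"2011-12": 146.24, "2012-13": 146.0}},
]

# Each nested key becomes its own column, named "<parent>.<child>".
flat = pd.json_normalize(records)

print(flat.columns.tolist())
```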
Answer by Akhil Gupta
The following listed each key-value pair for me:
import json
import pandas as pd
import requests
# Fetch the JSON document and parse it into a Python dict
repo_info = json.loads(requests.get('https://api.github.com/repos/akkhil2012/MachineLearning').text)
# orient='index' uses the dict keys as row labels
data = pd.DataFrame.from_dict(repo_info, orient='index')
print(data)
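To see what orient='index' produces without a network call, here is a small sketch with a flat, made-up dict. The keys and values are hypothetical stand-ins, not the actual API response. Each key becomes a row label and its value lands in a single column:

```python
import pandas as pd

# Hypothetical flat record standing in for a parsed JSON object.
repo_info = {"name": "MachineLearning", "fork": False, "stargazers_count": 0}

# Keys become the index; the optional columns argument names the value column.
table = pd.DataFrame.from_dict(repo_info, orient="index", columns=["value"])

print(table)
```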
Answer by AlexG
For this case, we can build the DataFrame with:
import pandas as pd
# data is assumed to be the parsed JSON dict (e.g. from json.loads) with a "data" key
df = pd.DataFrame(data["data"])