Python json.loads shows ValueError: Extra data
Original question: http://stackoverflow.com/questions/21058935/
Warning: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverFlow
Asked by Apoorv Ashutosh
I am reading data from a JSON file, "new.json", and I want to filter some of it and store the result in a new JSON file. Here is my code:
import json
with open('new.json') as infile:
    data = json.load(infile)
    for item in data:
        # dict.get is a method, so it is called with parentheses
        iden = item.get("id")
        a = item.get("a")
        b = item.get("b")
        c = item.get("c")
        # check the current item's text, not the whole list
        if c == 'XYZ' or "XYZ" in item["text"]:
            filename = 'abc.json'
            try:
                outfile = open(filename,'ab')
            except:
                outfile = open(filename,'wb')
            obj_json={}
            obj_json["ID"] = iden
            obj_json["VAL_A"] = a
            obj_json["VAL_B"] = b
I am getting an error. The traceback is:
  File "rtfav.py", line 3, in <module>
    data = json.load(infile)
  File "/usr/lib64/python2.7/json/__init__.py", line 278, in load
    **kw)
  File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 88 column 2 - line 50607 column 2 (char 3077 - 1868399)
Can someone help me?
Here is a sample of the data in new.json. There are about 1500 more such dictionaries in the file:
{
"contributors": null,
"truncated": false,
"text": "@HomeShop18 #DreamJob to professional rafter",
"in_reply_to_status_id": null,
"id": 421584490452893696,
"favorite_count": 0,
"source": "<a href=\"https://mobile.twitter.com\" rel=\"nofollow\">Mobile Web (M2)</a>",
"retweeted": false,
"coordinates": null,
"entities": {
"symbols": [],
"user_mentions": [
{
"id": 183093247,
"indices": [
0,
11
],
"id_str": "183093247",
"screen_name": "HomeShop18",
"name": "HomeShop18"
}
],
"hashtags": [
{
"indices": [
12,
21
],
"text": "DreamJob"
}
],
"urls": []
},
"in_reply_to_screen_name": "HomeShop18",
"id_str": "421584490452893696",
"retweet_count": 0,
"in_reply_to_user_id": 183093247,
"favorited": false,
"user": {
"follow_request_sent": null,
"profile_use_background_image": true,
"default_profile_image": false,
"id": 2254546045,
"verified": false,
"profile_image_url_https": "https://pbs.twimg.com/profile_images/413952088880594944/rcdr59OY_normal.jpeg",
"profile_sidebar_fill_color": "171106",
"profile_text_color": "8A7302",
"followers_count": 87,
"profile_sidebar_border_color": "BCB302",
"id_str": "2254546045",
"profile_background_color": "0F0A02",
"listed_count": 1,
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
"utc_offset": null,
"statuses_count": 9793,
"description": "Rafter. Rafting is what I do. Me aur mera Tablet. Technocrat of Future",
"friends_count": 231,
"location": "",
"profile_link_color": "473623",
"profile_image_url": "http://pbs.twimg.com/profile_images/413952088880594944/rcdr59OY_normal.jpeg",
"following": null,
"geo_enabled": false,
"profile_banner_url": "https://pbs.twimg.com/profile_banners/2254546045/1388065343",
"profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
"name": "Jayy",
"lang": "en",
"profile_background_tile": false,
"favourites_count": 41,
"screen_name": "JzayyPsingh",
"notifications": null,
"url": null,
"created_at": "Fri Dec 20 05:46:00 +0000 2013",
"contributors_enabled": false,
"time_zone": null,
"protected": false,
"default_profile": false,
"is_translator": false
},
"geo": null,
"in_reply_to_user_id_str": "183093247",
"lang": "en",
"created_at": "Fri Jan 10 10:09:09 +0000 2014",
"filter_level": "medium",
"in_reply_to_status_id_str": null,
"place": null
}
Accepted answer by falsetru
As you can see in the following example, json.loads (and json.load) does not decode multiple JSON objects.
>>> json.loads('{}')
{}
>>> json.loads('{}{}') # == json.loads(json.dumps({}) + json.dumps({}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 368, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3 - line 1 column 5 (char 2 - 4)
If you want to dump multiple dictionaries, wrap them in a list and dump the list (instead of dumping each dictionary separately):
>>> dict1 = {}
>>> dict2 = {}
>>> json.dumps([dict1, dict2])
'[{}, {}]'
>>> json.loads(json.dumps([dict1, dict2]))
[{}, {}]
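If the file cannot be regenerated as a single list, one possible workaround (not part of the answer above) is to decode the concatenated objects one at a time with json.JSONDecoder.raw_decode, which returns the decoded value together with the index where it ended. This is only a sketch: the helper name iter_json_objects and the sample string are made up for illustration.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values.

    Hypothetical helper for illustration; not part of the json module.
    """
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so skip it by hand
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # returns the decoded value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Two objects back to back, the second spread over several lines
sample = '{"id": 1, "c": "XYZ"}\n{\n  "id": 2,\n  "c": "other"\n}\n'
objects = list(iter_json_objects(sample))
```

Unlike reading line by line, this also copes with pretty-printed objects that span many lines, as in the sample from the question. It still needs the whole file contents in memory as one string.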
Answer by Adam Hughes
May I suggest that you don't have to package all of the tweets into a list and then call json.dumps. You can write to the file as you go, and then load the tweets back with:
import json

tweets = []
with open('tweets.json', 'r') as f:
    for line in f:
        tweets.append(json.loads(line))
That way you don't have to store intermediate Python objects. As long as you write one full tweet, on its own line, per append() call, this should work.
Answer by VISQL
This may also happen if your JSON file contains more than one JSON record. A JSON record looks like this:
[{"some data": value, "next key": "another value"}]
It opens and closes with brackets [ ], and inside the brackets are braces { }. There can be many pairs of braces, but the record always ends with a closing bracket ]. If your JSON file contains more than one of those:
[{"some data": value, "next key": "another value"}]
[{"2nd record data": value, "2nd record key": "another value"}]
then loads() will fail.
I verified this with my own file that was failing.
import json
guestFile = open("1_guests.json",'r')
guestData = guestFile.read()
guestFile.close()
gdfJson = json.loads(guestData)
This works because 1_guests.json has one record []. The original file I was using, all_guests.json, had 6 records separated by newlines. I deleted 5 records (after checking that each one was bookended by brackets) and saved the file under a new name. Then the loads statement worked.
The error was:
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 2 column 1 - line 10 column 1 (char 261900 - 6964758)
PS. I use the word "record", but that's not the official name. Also, if the records in your file are separated by newline characters like mine, you can loop through the file and loads() one record at a time into a variable.
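The loop mentioned in the PS could look roughly like the sketch below. It assumes each record sits entirely on one line; the helper name load_records and the sample guest data are invented for illustration.

```python
import json

def load_records(lines):
    """Decode one JSON value per non-blank line and return them as a list.

    Hypothetical helper; 'lines' can be an open file or any iterable of strings.
    """
    records = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            records.append(json.loads(line))
        except ValueError as exc:
            # json.JSONDecodeError is a subclass of ValueError;
            # re-raise with the line number of the bad record
            raise ValueError("line {}: {}".format(lineno, exc))
    return records

# Made-up stand-in for a file with newline-separated records
sample_lines = [
    '[{"guest": "A", "nights": 2}]\n',
    '\n',
    '[{"guest": "B", "nights": 1}]\n',
]
records = load_records(sample_lines)
```

With a real file, passing the open file object (for example inside a with open(...) as f: block) should work the same way, since iterating over a file yields its lines.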
Answer by Nic Scozzaro
I came across this because I was trying to load a JSON file dumped from MongoDB. It was giving me this error:
JSONDecodeError: Extra data: line 2 column 1
The MongoDB JSON dump has one object per line, so what worked for me is:
import json
data = [json.loads(line) for line in open('data.json', 'r')]
Answer by coreehi
If you want to solve it in a two-liner, you can do it like this:
with open('data.json') as f:
    data = [json.loads(line) for line in f]
Answer by Nihal
A one-liner for your problem:
data = [json.loads(line) for line in open('tweets.json', 'r')]
Answer by murat yalçın
I think saving the dicts in a list, as proposed by @falsetru, is not an ideal solution here.
A better way is to iterate through the dicts and write each one to the .json file on its own line.
Our two dictionaries are:
d1 = {'a':1}
d2 = {'b':2}
You can write them to a .json file:
import json
with open('sample.json','a') as sample:
    for d in [d1,d2]:  # 'd' avoids shadowing the built-in dict
        sample.write('{}\n'.format(json.dumps(d)))
and you can read the .json file back without any issues:
with open('sample.json','r') as sample:
    for line in sample:
        line = json.loads(line.strip())
Simple and efficient.
Answer by Akbar Noto
Well, this might help someone. I got the same error when my JSON file looked like this:
{"id":"1101010","city_id":"1101","name":"TEUPAH SELATAN"}
{"id":"1101020","city_id":"1101","name":"SIMEULUE TIMUR"}
I found it to be malformed, so I changed it into something like this:
{
    "datas":[
        {"id":"1101010","city_id":"1101","name":"TEUPAH SELATAN"},
        {"id":"1101020","city_id":"1101","name":"SIMEULUE TIMUR"}
    ]
}
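Rather than editing the file by hand, the same conversion can be scripted. The sketch below assumes one object per line in the input; the function name wrap_lines is hypothetical, and the "datas" key simply mirrors the example above.

```python
import json

def wrap_lines(lines, key="datas"):
    """Build one JSON document from an iterable of one-object-per-line text.

    Hypothetical helper for illustration only.
    """
    # decode each non-blank line on its own, then collect under a single key
    items = [json.loads(line) for line in lines if line.strip()]
    return json.dumps({key: items}, indent=2)

lines = [
    '{"id":"1101010","city_id":"1101","name":"TEUPAH SELATAN"}\n',
    '{"id":"1101020","city_id":"1101","name":"SIMEULUE TIMUR"}\n',
]
wrapped = wrap_lines(lines)
data = json.loads(wrapped)  # the wrapped text now parses as a single document
```

Writing wrapped back to a file gives a document that json.load can read in one call, at the cost of holding every record in memory at once.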

