Generating dynamically nested JSON objects and arrays - python

Question

Asked by Asif Ali

As the title says, I've been trying to generate a nested JSON object. I have for loops pulling the data out of a dictionary, dic. Below is the code:

f = open("test_json.txt", 'w')
flag = False
temp = ""
start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
end = "\n\t]" +"\n}"
f.write(start)
for i, (key,value) in enumerate(dic.iteritems()):
    f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
    f.write("\"term_freq\":"+str(len(value))+",\n")
    f.write("\"lists\":[\n\t")
    for item in value:
        f.write("{\n")
        f.write("\t\t\"occurance\" :"+str(item)+"\n")
        #Check last object
        if value.index(item)+1 == len(value):
            f.write("}\n")
            f.write("]\n")
        else:
            f.write("},") # close occurrence object
    # Check last item in dic
    if i == len(dic)-1:
        flag = True
    if(flag):
        f.write("}")
    else:
        f.write("},") #close lists object
        flag = False 

#check for flag
f.write("]") #close lists array 
f.write("}")

Expected output is:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }]
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    }],
    "term_freq": 5
}]
}

But currently I'm getting an output like below:

{
"filename": "abc.pdf",
"data": [{
    "keyword": "irritation",
    "term_freq": 5,
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },]                // Here lies the problem "," before array(last element)
}, {
    "keyword": "bomber",
    "lists": [{
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 1
    }, {
        "occurance": 2
    },],                  // Here lies the problem "," before array(last element)
    "term_freq": 5
}]
}

Please help; I've been trying to solve this but have failed. Please don't mark it as a duplicate: I have already checked the other answers and they didn't help at all.

Edit 1: The input is taken from a dictionary dic whose mapping type is <String, List>, for example "irritation" => [1,3,5,7,8], where irritation is the key and is mapped to a list of page numbers. It is read in the outer for loop, where key is the keyword and value is the list of pages on which that keyword occurs.

Edit 2:

import collections

dic = collections.defaultdict(list)  # declaring the dictionary
dic[key].append(value)  # inserting the values (details omitted here)
for key in dic:
    # dic[key] is the list stored under each key
    print key, ":", dic[key], "\n"  # prints the data in the dictionary
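
The snippet above leaves out how the dictionary is filled. As a minimal sketch only, assuming Python 3 and a hypothetical list of (keyword, page) pairs called hits (neither appears in the original post), it might be built like this:

```python
import collections

# Hypothetical input: (keyword, page number) pairs; not from the original post.
hits = [("irritation", 1), ("bomber", 2), ("irritation", 3), ("irritation", 5)]

# defaultdict(list) creates an empty list the first time a key is accessed.
dic = collections.defaultdict(list)
for keyword, page in hits:
    dic[keyword].append(page)

for key in dic:
    print(key, ":", dic[key])
```

Each keyword ends up mapped to the list of pages in the order they were seen, which is the <String, List> shape described in Edit 1.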

Answer 1

Accepted answer by Kruup?s

What @andrea-f posted looks good to me; here is another solution.

Feel free to pick from either :)

import json

dic = {
        "bomber": [1, 2, 3, 4, 5],
        "irritation": [1, 3, 5, 7, 8]
      }

filename = "abc.pdf"

json_dict = {}
data = []

for k, v in dic.iteritems():
  tmp_dict = {}
  tmp_dict["keyword"] = k
  tmp_dict["term_freq"] = len(v)
  tmp_dict["lists"] = [{"occurrance": i} for i in v]
  data.append(tmp_dict)

json_dict["filename"] = filename
json_dict["data"] = data

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

It's the same idea: I first build one big json_dict and then save it directly as JSON. I use the with statement to write the file, which avoids having to catch exceptions just to close it.

Also, have a look at the documentation for json.dumps() if you need to refine your JSON output further.
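
To give a rough idea of what those options do, here is a small sketch, assuming Python 3. The payload dictionary is a made-up example; indent, sort_keys and separators are standard json.dumps() parameters.

```python
import json

# Made-up example data, only to show the effect of the formatting options.
payload = {"filename": "abc.pdf", "data": [{"keyword": "bomber", "term_freq": 2}]}

# Compact form: no whitespace after the item and key separators.
compact = json.dumps(payload, separators=(",", ":"))

# Pretty form: 4-space indentation, keys sorted alphabetically.
pretty = json.dumps(payload, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same object; only the layout and the key order differ.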

EDIT

And just for fun, if you don't like the tmp variable, you can do the whole data for loop in a one-liner :)

json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.iteritems()]

The final solution could then look like this, which is not entirely readable:

import json

json_dict = {
              "filename": "abc.pdf",
              "data": [{
                        "keyword": k,
                        "term_freq": len(v),
                        "lists": [{"occurrance": i} for i in v]
                       } for k, v in dic.iteritems()]
            }

with open("abc.json", "w") as outfile:
    json.dump(json_dict, outfile, indent=4, sort_keys=True)

EDIT 2

It looks like you don't need the saved JSON file itself to match the desired output, but you do want to be able to read it.

In fact, you can also use json.dumps() to print your JSON.

with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle)
    print json.dumps(new_json_dict, indent=4, sort_keys=True)

There is still one problem here though: "filename": is printed at the end of the object, because sort_keys=True sorts the keys alphabetically and the d of data comes before the f of filename.

To force the order, you will have to use an OrderedDict when generating the dict. Be careful: the syntax is ugly (imo) with Python 2.X.

Here is the new complete solution ;)

import json
from collections import OrderedDict

dic = {
        'bomber': [1, 2, 3, 4, 5],
        'irritation': [1, 3, 5, 7, 8]
      }

json_dict = OrderedDict([
              ('filename', 'abc.pdf'),
              ('data', [ OrderedDict([
                                        ('keyword', k),
                                        ('term_freq', len(v)),
                                        ('lists', [{'occurrance': i} for i in v])
                                     ]) for k, v in dic.iteritems()])
            ])

with open('abc.json', 'w') as outfile:
    json.dump(json_dict, outfile)


# Now read the ordered JSON file back

with open('abc.json', 'r') as handle:
    new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
    print json.dumps(new_json_dict, indent=4)

Will output:

{
    "filename": "abc.pdf", 
    "data": [
        {
            "keyword": "bomber", 
            "term_freq": 5, 
            "lists": [
                {
                    "occurrance": 1
                }, 
                {
                    "occurrance": 2
                }, 
                {
                    "occurrance": 3
                }, 
                {
                    "occurrance": 4
                }, 
                {
                    "occurrance": 5
                }
            ]
        }, 
        {
            "keyword": "irritation", 
            "term_freq": 5, 
            "lists": [
                {
                    "occurrance": 1
                }, 
                {
                    "occurrance": 3
                }, 
                {
                    "occurrance": 5
                }, 
                {
                    "occurrance": 7
                }, 
                {
                    "occurrance": 8
                }
            ]
        }
    ]
}

But be careful: most of the time it is better to save a regular .json file so that it stays portable across languages.
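
As a side note, the OrderedDict workaround is mainly a Python 2 concern. A minimal sketch, assuming Python 3.7 or later (where a plain dict keeps insertion order) and using dict.items() in place of iteritems(), could look like this:

```python
import json

dic = {"bomber": [1, 2, 3, 4, 5], "irritation": [1, 3, 5, 7, 8]}

# In Python 3.7+ a plain dict keeps insertion order, so "filename" stays
# first as long as sort_keys is left at its default (False).
json_dict = {
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v],
        }
        for k, v in dic.items()
    ],
}

text = json.dumps(json_dict, indent=4)
print(text)
```

This is only an equivalent under that version assumption; on older interpreters the OrderedDict version above remains the safer choice.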

Answer 2

Answer by andrea-f

Your current code does not work because, when the loop reaches the second-to-last item, it writes }, on the assumption that another element will follow. When the loop runs again it sets the flag to false, but the , written on the previous pass is already in the file.
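
To make the trailing-comma problem concrete, here is a small sketch, assuming Python 3. The pages list and the variable names are illustrative, not taken from the question's code. It also sidesteps value.index(item), which returns the first position of a value and so is an unreliable "last item" test when the list contains duplicates.

```python
import json

pages = [1, 1, 1, 1, 2]  # illustrative values with duplicates

# Writing "," after every element leaves a trailing comma, which JSON forbids.
broken = "[" + "".join('{"occurance": %d},' % p for p in pages) + "]"
try:
    json.loads(broken)
    broken_is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    broken_is_valid = False

# str.join puts the separator only between elements, never after the last one.
fixed = "[" + ",".join('{"occurance": %d}' % p for p in pages) + "]"
parsed = json.loads(fixed)

print(broken_is_valid, parsed)
```

If the JSON has to be assembled by hand, joining the pieces is one way to avoid tracking the last element; building plain Python objects and letting the json module serialize them avoids the problem altogether.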

If this is your dict: a = {"bomber":[1,2,3,4,5]} then you can do:

import json
file_name = "a_file.json"
file_name_input = "abc.pdf"
new_output = {}
new_output["filename"] = file_name_input

new_data = []
i = 0
for key, val in a.iteritems():
   new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
   for p in val:
       new_data[i]["lists"].append({"occurrance":p})
   i += 1

new_output['data'] = new_data

Then save the data by:

f = open(file_name, 'w+')
f.write(json.dumps(new_output, indent=4, sort_keys=True, default=unicode))
f.close()
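
The default=unicode argument above is specific to Python 2. As a hedged sketch of what default does, assuming Python 3 (where str takes the place of unicode) and a made-up payload containing a date, which json cannot serialize on its own:

```python
import datetime
import json

# Made-up payload: the "created" value is not natively JSON-serializable.
new_output = {"filename": "abc.pdf", "created": datetime.date(2020, 1, 2)}

# default=str is called only for objects the encoder does not know how to
# handle; here it turns the date into its string form.
text = json.dumps(new_output, indent=4, sort_keys=True, default=str)
print(text)
```

For data made only of strings, numbers, lists and dicts, as in the question, default is never invoked and can be omitted.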
