Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverflow address: http://stackoverflow.com/questions/17098553/

Time: 2020-08-19 00:26:57  Source: igfitidea

How to remove whitespaces and newlines from every value in a JSON file?

Tags: python, json, strip

Asked by John West

I have a JSON file that has the following structure:

{
    "name":[
        {
            "someKey": "\n\n   some Value   "
        },
        {
            "someKey": "another value    "
        }
    ],
    "anotherName":[
        {
            "anArray": [
                {
                    "key": "    value\n\n",
                    "anotherKey": "  value"
                },
                {
                    "key": "    value\n",
                    "anotherKey": "value"
                }
            ]
        }
    ]
}

Now I want to strip off all the whitespace and newlines from every value in the JSON file. Is there some way to iterate over each element of the dictionary and the nested dictionaries and lists?

Accepted answer by jfs

Now I want to strip off all the whitespaces and newlines for every value in the JSON file

Use pkgutil.simplegeneric() to create a helper function, get_items():

import json
import sys
from pkgutil import simplegeneric

@simplegeneric
def get_items(obj):
    while False: # no items, a scalar object
        yield None

@get_items.register(dict)
def _(obj):
    return obj.items() # json object. Edit: iteritems() was removed in Python 3

@get_items.register(list)
def _(obj):
    return enumerate(obj) # json array

def strip_whitespace(json_data):
    for key, value in get_items(json_data):
        if hasattr(value, 'strip'): # json string
            json_data[key] = value.strip()
        else:
            strip_whitespace(value) # recursive call


data = json.load(sys.stdin) # read json data from standard input
strip_whitespace(data)
json.dump(data, sys.stdout, indent=2)

Note: the functools.singledispatch() function (Python 3.4+) would allow using MutableMapping/MutableSequence from collections.abc instead of dict/list here.

Output

{
  "anotherName": [
    {
      "anArray": [
        {
          "anotherKey": "value", 
          "key": "value"
        }, 
        {
          "anotherKey": "value", 
          "key": "value"
        }
      ]
    }
  ], 
  "name": [
    {
      "someKey": "some Value"
    }, 
    {
      "someKey": "another value"
    }
  ]
}
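
The note about functools.singledispatch() in this answer can be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes Python 3, swaps the hasattr(value, 'strip') check for isinstance(value, str), and uses a small inline sample instead of reading from standard input.

```python
import json
from collections.abc import MutableMapping, MutableSequence
from functools import singledispatch


@singledispatch
def get_items(obj):
    """Default case: a scalar (number, bool, None, str) has no items."""
    return iter(())


@get_items.register(MutableMapping)
def _(obj):
    return obj.items()  # JSON object: (key, value) pairs


@get_items.register(MutableSequence)
def _(obj):
    return enumerate(obj)  # JSON array: (index, value) pairs


def strip_whitespace(json_data):
    """Strip every string value in place, recursing into containers."""
    for key, value in get_items(json_data):
        if isinstance(value, str):
            json_data[key] = value.strip()
        else:
            strip_whitespace(value)


# Inline sample (an assumption for the demo, shaped like the question's data).
sample = json.loads('{"name": [{"someKey": "\\n\\n   some Value   "}, {"n": 1}]}')
strip_whitespace(sample)
print(json.dumps(sample))
```

Because dispatch goes through the abstract base classes, any mapping or sequence type registered with them is handled, not only dict and list.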

Answer by Brent Washburne

Parse the file using the json module:

import json

with open('json.txt') as f:       # placeholder filename
    text = f.read()
text = text.replace('\n', '')     # do your cleanup here
data = json.loads(text)

Then walk through the resulting data structure.
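
One possible way to walk the parsed structure is a small recursive function that builds a cleaned copy rather than modifying the data in place. The name stripped() and the inline sample are hypothetical, chosen only for this sketch, and Python 3 is assumed.

```python
import json


def stripped(node):
    """Return a copy of node with every string value stripped."""
    if isinstance(node, dict):
        return {key: stripped(value) for key, value in node.items()}
    if isinstance(node, list):
        return [stripped(item) for item in node]
    if isinstance(node, str):
        return node.strip()
    return node  # numbers, booleans and None pass through unchanged


# Inline sample standing in for the contents of the JSON file.
text = '{"key": "    value\\n\\n", "items": ["  a ", 1, null]}'
data = stripped(json.loads(text))
print(data)
```

Returning a new structure leaves the original data untouched, which can be convenient when the unstripped values are still needed for comparison.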

Answer by Justin S Barrett

This may not be the most efficient process, but it works. I copied that sample into a file named json.txt, then read it, deserialized it with json.load(), and used a pair of functions to recursively clean it and everything inside it.

import json

def clean_dict(d):
    for key, value in d.iteritems():
        if isinstance(value, list):
            clean_list(value)
        elif isinstance(value, dict):
            clean_dict(value)
        else:
            newvalue = value.strip()
            d[key] = newvalue

def clean_list(l):
    for index, item in enumerate(l):
        if isinstance(item, dict):
            clean_dict(item)
        elif isinstance(item, list):
            clean_list(item)
        else:
            l[index] = item.strip()

# Read the file and send it to the dict cleaner
with open("json.txt") as f:
    data = json.load(f)

print "before..."
print data, "\n"

clean_dict(data)

print "after..."
print data

The result...

before...
{u'anotherName': [{u'anArray': [{u'anotherKey': u'  value', u'key': u'    value\n\n'}, {u'anotherKey': u'value', u'key': u'    value\n'}]}], u'name': [{u'someKey': u'\n\n   some Value   '}, {u'someKey': u'another value    '}]} 

after...
{u'anotherName': [{u'anArray': [{u'anotherKey': u'value', u'key': u'value'}, {u'anotherKey': u'value', u'key': u'value'}]}], u'name': [{u'someKey': u'some Value'}, {u'someKey': u'another value'}]}
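
The code in this answer uses Python 2 syntax (iteritems() and print statements) and calls strip() on every non-container value, which would fail on numbers or null. A rough Python 3 adaptation, offered as a sketch rather than the answerer's own code, might fold the two functions into one hypothetical clean() and skip non-string values. The inline sample replaces the json.txt file for brevity.

```python
def clean(container):
    """Strip string values in place inside nested dicts and lists."""
    if isinstance(container, dict):
        pairs = container.items()
    else:
        pairs = enumerate(container)
    for key, value in pairs:
        if isinstance(value, (dict, list)):
            clean(value)  # recurse into nested containers
        elif isinstance(value, str):
            container[key] = value.strip()
        # anything else (numbers, booleans, None) is left untouched


# Inline sample (an assumption for the demo) with one non-string value.
data = {"name": [{"someKey": "\n\n   some Value   "}, {"count": 3}]}

print("before...")
print(data)

clean(data)

print("after...")
print(data)
```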