Python: Making an object JSON serializable with the regular encoder

Question

Asked by leonsas

The regular way of JSON-serializing custom non-serializable objects is to subclass json.JSONEncoder and then pass the custom encoder to json.dumps().

It usually looks like this:


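import json

class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, foo):  # foo: the custom class to serialize
            return obj.to_json()

        # Let the base class raise TypeError for unsupported types.
        return json.JSONEncoder.default(self, obj)

print(json.dumps(obj, cls=CustomEncoder))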

What I'm trying to do is make something serializable with the default encoder. I looked around but couldn't find anything. My thought is that there would be some field the encoder looks at to determine the JSON encoding, something similar to __str__, perhaps a __json__ field. Is there something like this in Python?

I want to make one class of a module I'm writing JSON serializable for everyone who uses the package, without them having to worry about implementing their own [trivial] custom encoders.

Answer 1

Accepted answer by martineau

As I said in a comment to your question, after looking at the json module's source code, it does not appear to lend itself to doing what you want. However, the goal could be achieved by what is known as monkey-patching (see the question "What is a monkey patch?"). This could be done in your package's __init__.py initialization script and would affect all subsequent json module serialization, since modules are generally only loaded once and the result is cached in sys.modules.

The patch changes the default json encoder's default method, that is, the default default().

Here's an example implemented as a standalone module for simplicity's sake:


Module: make_json_serializable.py


""" Module that monkey-patches json module when it's imported so
JSONEncoder.default() automatically checks for a special "to_json()"
method and uses it to encode the object if found.
"""
from json import JSONEncoder

def _default(self, obj):
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default # Replace it.

Using it is trivial since the patch is applied by simply importing the module.


Sample client script:


import json
import make_json_serializable  # apply monkey-patch

class Foo(object):
    def __init__(self, name):
        self.name = name
    def to_json(self):  # New special method.
        """ Convert to JSON format string representation. """
        return '{"name": "%s"}' % self.name

foo = Foo('sazpaz')
print(json.dumps(foo))  # -> "{\"name\": \"sazpaz\"}"
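
One thing worth noticing in the output above: because to_json() returns a string, json.dumps() encodes that string as a JSON string, so the result is JSON text wrapped in quotes with escaped inner quotes. A minimal sketch of a variant, assuming you would rather get a nested JSON object, is to have to_json() return a dict instead. The patch is repeated inline only to keep the snippet self-contained; returning a dict is an assumption of this sketch, not part of the original answer.

```python
import json
from json import JSONEncoder

def _default(self, obj):
    # Use the class's to_json() if it has one, else fall back to the saved default.
    return getattr(obj.__class__, "to_json", _default.default)(obj)

_default.default = JSONEncoder.default  # Save unmodified default.
JSONEncoder.default = _default          # Replace it.

class Foo(object):
    def __init__(self, name):
        self.name = name
    def to_json(self):
        # Return a JSON-compatible dict rather than a pre-formatted string.
        return {"name": self.name}

encoded = json.dumps(Foo('sazpaz'))
print(encoded)  # -> {"name": "sazpaz"}
```

With this variant the encoder handles quoting and escaping of the name itself, which the string-formatting version leaves to the caller.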

To retain the object type information, the special method can also include it in the string returned:


        return ('{"type": "%s", "name": "%s"}' %
                 (self.__class__.__name__, self.name))

Which produces the following JSON that now includes the class name:


"{\"type\": \"Foo\", \"name\": \"sazpaz\"}"

Magick Lies Here


Even better than having the replacement default() look for a specially named method would be for it to be able to serialize most Python objects automatically, including user-defined class instances, without needing to add a special method. After researching a number of alternatives, the following, which uses the pickle module, seemed closest to that ideal to me:

Module: make_json_serializable2.py


""" Module that imports the json module and monkey-patches it so
JSONEncoder.default() automatically pickles any Python objects
encountered that aren't standard JSON data types.
"""
from json import JSONEncoder
import pickle

def _default(self, obj):
    return {'_python_object': pickle.dumps(obj)}

JSONEncoder.default = _default  # Replace with the above.

Of course, not everything can be pickled (extension types, for example). However, there are ways defined to handle them via the pickle protocol by writing special methods, similar to what you suggested and I described earlier, but doing that would likely be necessary in far fewer cases.
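
As a rough illustration of those pickle-protocol special methods, here is a minimal sketch using __getstate__ and __setstate__. The Counter class and its _lock attribute are hypothetical; a threading.Lock is used simply because it is a common object that pickle refuses to handle.

```python
import pickle
import threading

class Counter(object):
    """Hypothetical class holding a lock, which pickle cannot handle."""
    def __init__(self, count=0):
        self.count = count
        self._lock = threading.Lock()

    def __getstate__(self):
        # Copy the instance dict and drop the unpicklable lock.
        state = self.__dict__.copy()
        del state['_lock']
        return state

    def __setstate__(self, state):
        # Restore the saved attributes and create a fresh lock.
        self.__dict__.update(state)
        self._lock = threading.Lock()

restored = pickle.loads(pickle.dumps(Counter(3)))
print(restored.count)  # -> 3
```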

Deserializing


Regardless, using the pickle protocol also means it would be fairly easy to reconstruct the original Python object by providing a custom object_hook function argument on any json.loads() call. The hook checks each decoded dictionary for a '_python_object' key and unpickles its value whenever one is present. Something like:

def as_python_object(dct):
    try:
        return pickle.loads(str(dct['_python_object']))
    except KeyError:
        return dct

pyobj = json.loads(json_str, object_hook=as_python_object)

If this has to be done in many places, it might be worthwhile to define a wrapper function that automatically supplies the extra keyword argument:

import functools

json_pkloads = functools.partial(json.loads, object_hook=as_python_object)

pyobj = json_pkloads(json_str)

Naturally, this could be monkey-patched into the json module as well, making the function the default object_hook (instead of None).

I got the idea for using pickle from an answer by Raymond Hettinger to another JSON serialization question. I consider him exceptionally credible as well as an official source (he is a Python core developer).

Portability to Python 3

The code above does not work as shown in Python 3 because pickle.dumps() returns a bytes object, which the JSONEncoder can't handle. However, the approach is still valid. A simple way to work around the issue is to latin1 "decode" the value returned from pickle.dumps() and then "encode" it back to latin1 before passing it on to pickle.loads() in the as_python_object() function. This works because arbitrary binary strings are valid latin1, which can always be decoded to Unicode and then encoded back to the original string again (as pointed out in this answer by Sven Marnach).

(Although the following works fine in Python 2, the latin1 decoding and encoding it does is superfluous there.)

import json
import pickle
from decimal import Decimal

class PythonObjectEncoder(json.JSONEncoder):
    def default(self, obj):
        return {'_python_object': pickle.dumps(obj).decode('latin1')}


def as_python_object(dct):
    try:
        return pickle.loads(dct['_python_object'].encode('latin1'))
    except KeyError:
        return dct


class Foo(object):  # Some user-defined class.
    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, type(self)):
            # Don't attempt to compare against unrelated types.
            return NotImplemented
        return self.name == other.name


data = [1,2,3, set(['knights', 'who', 'say', 'ni']), {'key':'value'},
        Foo('Bar'), Decimal('3.141592653589793238462643383279502884197169')]
j = json.dumps(data, cls=PythonObjectEncoder, indent=4)
data2 = json.loads(j, object_hook=as_python_object)
assert data == data2  # both should be same
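
The latin1 trick relies on latin1 mapping every byte value to the code point with the same number. A small sketch that checks this claim for all 256 byte values and for real pickle output (the sample set is arbitrary):

```python
import pickle

# Every possible byte value, 0 through 255.
all_bytes = bytes(range(256))

# latin1 maps each byte to the code point with the same number,
# so decoding never fails and encoding restores the original bytes.
as_text = all_bytes.decode('latin1')
assert len(as_text) == 256
assert as_text.encode('latin1') == all_bytes

# The same holds for real pickle output.
pickled = pickle.dumps({'knights', 'who', 'say', 'ni'})
assert pickled.decode('latin1').encode('latin1') == pickled

# By contrast, UTF-8 rejects many byte sequences.
try:
    all_bytes.decode('utf-8')
except UnicodeDecodeError:
    print('utf-8 cannot decode arbitrary bytes')
```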

Answer 2

Answered by blakev

I don't understand why you can't write a serialize function for your own class. You implement the custom encoder inside the class itself and allow "people" to call the serialize function, which will essentially return self.__dict__ with functions stripped out.

edit:


This question agrees with me that the simplest way is to write your own method and return the JSON-serialized data that you want. The answers there also recommend trying jsonpickle, but then you're adding an additional dependency for beauty when the correct solution comes built in.
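
A minimal sketch of what such a method might look like. The Point class, its on_move attribute, and the exact filtering rule are hypothetical, chosen only to show self.__dict__ being returned with callables stripped out.

```python
import json

class Point(object):
    """Hypothetical class used only for illustration."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.on_move = lambda: None  # a callable stored on the instance

    def serialize(self):
        # Keep only the non-callable instance attributes, then encode them.
        data = {key: value for key, value in self.__dict__.items()
                if not callable(value)}
        return json.dumps(data)

print(Point(1, 2).serialize())  # -> {"x": 1, "y": 2}
```

Callers then write obj.serialize() rather than json.dumps(obj), which is the trade-off this answer accepts.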

Answer 3

Answered by Yoav Kleinberger

I suggest putting the hack into the class definition. This way, once the class is defined, it supports JSON. Example:


import json

class MyClass( object ):

    def _jsonSupport( *args ):
        def default( self, xObject ):
            return { 'type': 'MyClass', 'name': xObject.name() }

        def objectHook( obj ):
            if 'type' not in obj:
                return obj
            if obj[ 'type' ] != 'MyClass':
                return obj
            return MyClass( obj[ 'name' ] )
        json.JSONEncoder.default = default
        json._default_decoder = json.JSONDecoder( object_hook = objectHook )

    _jsonSupport()

    def __init__( self, name ):
        self._name = name

    def name( self ):
        return self._name

    def __repr__( self ):
        return '<MyClass(name=%s)>' % self._name

myObject = MyClass( 'Magneto' )
jsonString = json.dumps( [ myObject, 'some', { 'other': 'objects' } ] )
print( "json representation:", jsonString )

decoded = json.loads( jsonString )
print( "after decoding, our object is the first in the list", decoded[ 0 ] )

Answer 4

Answered by Aravindan Ve

You can extend the dict class like so:


#!/usr/local/bin/python3
import json

class Serializable(dict):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # hack to fix _json.so make_encoder serialize properly
        self.__setitem__('dummy', 1)

    def _myattrs(self):
        return [
            (x, self._repr(getattr(self, x))) 
            for x in self.__dir__() 
            if x not in Serializable().__dir__()
        ]

    def _repr(self, value):
        if isinstance(value, (str, int, float, list, tuple, dict)):
            return value
        else:
            return repr(value)

    def __repr__(self):
        return '<%s.%s object at %s>' % (
            self.__class__.__module__,
            self.__class__.__name__,
            hex(id(self))
        )

    def keys(self):
        return iter([x[0] for x in self._myattrs()])

    def values(self):
        return iter([x[1] for x in self._myattrs()])

    def items(self):
        return iter(self._myattrs())

Now to make your classes serializable with the regular encoder, extend 'Serializable':


class MySerializableClass(Serializable):

    attr_1 = 'first attribute'
    attr_2 = 23

    def my_function(self):
        print('do something here')


obj = MySerializableClass()

print(obj) will print something like:

<__main__.MySerializableClass object at 0x1073525e8>

print(json.dumps(obj, indent=4)) will print something like:

{
    "attr_1": "first attribute",
    "attr_2": 23,
    "my_function": "<bound method MySerializableClass.my_function of <__main__.MySerializableClass object at 0x1073525e8>>"
}

Answer 5

Answered by ribamar

The problem with overriding JSONEncoder().default is that you can do it only once. If you then stumble upon a special data type that does not work with that pattern (for example, if you use a strange encoding), you have no second override to fall back on. With the pattern below, you can always make your class JSON serializable, provided that the class field you want to serialize is serializable itself (and can be added to a Python list, which is true of almost anything). Otherwise, you have to recursively apply the same pattern to your JSON field (or extract the serializable data from it):

# base class that will make all derivatives JSON serializable:
class JSONSerializable(list): # need to derive from a serializable class.

  def __init__(self, value = None):
    # Assigning to self would only rebind the local name; fill the list itself.
    list.__init__(self, [ value ])

  def setJSONSerializableValue(self, value):
    self[:] = [ value ]  # replace the contents in place

  def getJSONSerializableValue(self):
    return self[0] if len(self) else None


# derive  your classes from JSONSerializable:
class MyJSONSerializableObject(JSONSerializable):

  def __init__(self): # or any other function
    # .... 
    # suppose your__json__field is the class member to be serialized. 
    # it has to be serializable itself. 
    # Every time you want to set it, call this function:
    self.setJSONSerializableValue(your__json__field)
    # ... 
    # ... and when you need access to it,  get this way:
    do_something_with_your__json__field(self.getJSONSerializableValue())


# now you have a JSON default-serializable class:
a = MyJSONSerializableObject()
print(json.dumps(a))
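
A concrete, runnable variant of this pattern, as a sketch only. It assumes the value is stored by mutating the list in place, and the Temperature subclass and its celsius field are hypothetical stand-ins for the placeholders above.

```python
import json

class JSONSerializable(list):
    """Base class: stores one JSON-serializable value in a one-element list."""
    def __init__(self, value=None):
        super().__init__([value])

    def setJSONSerializableValue(self, value):
        self[:] = [value]  # mutate the list in place

    def getJSONSerializableValue(self):
        return self[0] if len(self) else None

class Temperature(JSONSerializable):
    """Hypothetical subclass; the reading dict is the serialized field."""
    def __init__(self, celsius):
        super().__init__({'celsius': celsius})

t = Temperature(21.5)
print(json.dumps(t))  # -> [{"celsius": 21.5}]
```

Note that the serialized form is a one-element JSON array wrapping the value, since the object is encoded as the list it derives from.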

Answer 6

Answered by Sławomir Lenart

For a production environment, prefer preparing your own json module with your own custom encoder, to make it clear that you are overriding something. Monkey-patching is not recommended, but you can monkey-patch in your test environment.

For example,


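import datetime
import json

import phonenumbers  # third-party package


class JSONDatetimeAndPhonesEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            # Keep only the date part of a datetime.
            return obj.date().isoformat()
        elif isinstance(obj, datetime.date):
            # A plain date has no .date() method.
            return obj.isoformat()
        elif isinstance(obj, str):  # basestring in Python 2
            # Note: the encoder serializes plain strings itself, so
            # default() is normally not called for them.
            try:
                number = phonenumbers.parse(obj)
            except phonenumbers.NumberParseException:
                return json.JSONEncoder.default(self, obj)
            else:
                return phonenumbers.format_number(number, phonenumbers.PhoneNumberFormat.NATIONAL)
        else:
            return json.JSONEncoder.default(self, obj)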

you want:


payload = json.dumps(your_data, cls=JSONDatetimeAndPhonesEncoder)


or:


payload = your_dumps(your_data)


or:


payload = your_json.dumps(your_data)
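
One way your_dumps might be defined, as a sketch: bind the custom encoder once with functools.partial. DateEncoder here is a hypothetical, stdlib-only stand-in for the project's real encoder.

```python
import datetime
import functools
import json

class DateEncoder(json.JSONEncoder):
    """Hypothetical stand-in for a project-specific encoder."""
    def default(self, obj):
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        return json.JSONEncoder.default(self, obj)

# your_dumps behaves like json.dumps but always uses the custom encoder.
your_dumps = functools.partial(json.dumps, cls=DateEncoder)

payload = your_dumps({'when': datetime.date(2020, 1, 31)})
print(payload)  # -> {"when": "2020-01-31"}
```

Placing such a function in a small module of your own gives the your_json.dumps spelling as well.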


However, in a testing environment, go ahead:

@pytest.fixture(scope='session', autouse=True)
def testenv_monkey_patching():
    json._default_encoder = JSONDatetimeAndPhonesEncoder()

which will apply your encoder to all json.dumps occurrences.
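
One caveat, at least in CPython's implementation of the json module: json._default_encoder is a private attribute, and json.dumps() only uses it when called with default arguments. Passing options such as indent makes dumps() build a fresh JSONEncoder, which bypasses the patch. A small sketch demonstrating this, with a hypothetical DateEncoder and the patch undone afterwards:

```python
import datetime
import json

class DateEncoder(json.JSONEncoder):
    """Hypothetical encoder used only for this demonstration."""
    def default(self, obj):
        if isinstance(obj, datetime.date):
            return obj.isoformat()
        return json.JSONEncoder.default(self, obj)

original_encoder = json._default_encoder   # private attribute, may change
json._default_encoder = DateEncoder()
try:
    day = datetime.date(2020, 1, 31)
    plain = json.dumps([day])              # uses the patched encoder
    try:
        json.dumps([day], indent=2)        # builds a fresh JSONEncoder
        indent_failed = False
    except TypeError:
        indent_failed = True
finally:
    json._default_encoder = original_encoder  # undo the patch

print(plain)          # -> ["2020-01-31"]
print(indent_failed)  # -> True
```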
