Python: NumPy array is not JSON serializable

Notice: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must comply with the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/26646362/

Time: 2020-08-19 00:48:56  Source: igfitidea

NumPy array is not JSON serializable

Tags: python, json, django, numpy

Asked by Karnivaurus

After creating a NumPy array, and saving it as a Django context variable, I receive the following error when loading the webpage:

array([   0,  239,  479,  717,  952, 1192, 1432, 1667], dtype=int64) is not JSON serializable

What does this mean?
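
For reference, the error can be reproduced outside Django in a few lines. This is only a minimal sketch, assuming the context variable eventually reaches Python's standard json module; the variable names are illustrative and not part of the original question.

```python
import json

import numpy as np

# Stand-in for the array from the question (shortened)
arr = np.array([0, 239, 479], dtype=np.int64)

error_message = None
try:
    json.dumps(arr)  # the json module only knows built-in types such as list and int
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The encoder raises a TypeError because a NumPy array is not one of the built-in types it knows how to write.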

Accepted answer by travelingbones

I regularly "jsonify" np.arrays. Try using the ".tolist()" method on the arrays first, like this:

import numpy as np
import codecs, json 

a = np.arange(10).reshape(2, 5)  # a 2-by-5 array
b = a.tolist()  # nested lists with the same data and indices
file_path = "/path.json"  # your path variable
# Save the array in .json format; the with-block closes the file afterwards
with codecs.open(file_path, 'w', encoding='utf-8') as f:
    json.dump(b, f, separators=(',', ':'), sort_keys=True, indent=4)

In order to "unjsonify" the array use:

# Read the file back; the with-block closes it afterwards
with codecs.open(file_path, 'r', encoding='utf-8') as f:
    obj_text = f.read()
b_new = json.loads(obj_text)
a_new = np.array(b_new)
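
One detail worth checking with this approach is what survives the round trip. The sketch below assumes a temporary scratch file (the path and the sample values are hypothetical) and suggests that the values come back unchanged while the original dtype does not, since np.array() rebuilds the array from plain Python floats.

```python
import json
import os
import tempfile

import numpy as np

a = np.array([[0.5, 1.5], [2.5, 3.5]], dtype=np.float32)

# Hypothetical scratch location; any writable path would do
file_path = os.path.join(tempfile.mkdtemp(), "array.json")

with open(file_path, "w", encoding="utf-8") as f:
    json.dump(a.tolist(), f)

with open(file_path, "r", encoding="utf-8") as f:
    a_new = np.array(json.load(f))

print(np.array_equal(a, a_new))  # compares the values element by element
print(a.dtype, a_new.dtype)      # the dtype before and after the round trip
```

If the dtype matters, it can be passed explicitly when rebuilding, e.g. np.array(b_new, dtype=np.float32).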

Answer by ntk4

Also, there is some very interesting further information on lists vs. arrays in Python in the question "Python List vs. Array - when to use?".

Note that once I convert my arrays to lists before saving them in a JSON file, I can (in my current deployment, anyway) keep using the data in list form after reading the file back, rather than converting it back to an array.

It also looks nicer on the screen (in my opinion) as a list (comma-separated) than as an array (not comma-separated).

Building on @travelingbones's .tolist() method above, I've been using the following (which also catches a few errors I've run into):

SAVE DICTIONARY

def writeDict(values, name):
    writeName = DIR+name+'.json'
    with open(writeName, "w") as outfile:
        json.dump(values, outfile)

READ DICTIONARY

def readDict(name):
    readName = DIR+name+'.json'
    try:
        with open(readName, "r") as infile:
            dictValues = json.load(infile)
            return(dictValues)
    except (IOError, ValueError) as e:
        # IOError: the file is missing or unreadable; ValueError: it is not valid JSON
        print(e)
        return None

Hope this helps!

Answer by Roei Bahumi

Here is an implementation that works for me and removes all NaNs (assuming the data are simple objects, i.e. lists or dicts):

from numpy import isnan

def remove_nans(my_obj, val=None):
    if isinstance(my_obj, list):
        for i, item in enumerate(my_obj):
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[i] = remove_nans(my_obj[i], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[i] = val
                except Exception:
                    pass

    elif isinstance(my_obj, dict):
        for key, item in my_obj.items():  # items() on Python 3; iteritems() exists only on Python 2
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[key] = remove_nans(my_obj[key], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[key] = val
                except Exception:
                    pass

    return my_obj
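
The reason NaNs matter here is that json.dumps writes them as the bare token NaN, which strict JSON parsers may reject. As a rough illustration of the same idea for the common case of an array converted with .tolist(), here is a small non-mutating sketch; the helper name nan_to_none is made up for this example.

```python
import json
import math

import numpy as np


def nan_to_none(value):
    """Return a copy of a nested list with every float NaN replaced by None."""
    if isinstance(value, list):
        return [nan_to_none(item) for item in value]
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


a = np.array([[1.0, np.nan], [3.0, 4.0]])
encoded = json.dumps(nan_to_none(a.tolist()))
print(encoded)  # [[1.0, null], [3.0, 4.0]]
```

Unlike the function above, this version leaves its input untouched and only handles lists, not dicts.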

Answer by JLT

I had a similar problem with a nested dictionary with some numpy.ndarrays in it.

def jsonify(data):
    json_data = dict()
    for key, value in data.items():  # items() on Python 3; iteritems() exists only on Python 2
        if isinstance(value, list): # for lists
            value = [ jsonify(item) if isinstance(item, dict) else item for item in value ]
        if isinstance(value, dict): # for nested dicts
            value = jsonify(value)
        if isinstance(key, int): # if key is integer: > to string
            key = str(key)
        if type(value).__module__=='numpy': # if value is numpy.*: > to python list
            value = value.tolist()
        json_data[key] = value
    return json_data
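
Note that the list branch above only recurses into dicts, so an array sitting directly inside a list would be left as it is. A possible generalisation is sketched below; the name to_jsonable is hypothetical, and the sketch assumes it is acceptable to turn every dictionary key into a string.

```python
import json

import numpy as np


def to_jsonable(value):
    """Recursively convert dicts, lists, tuples and NumPy objects to built-in types."""
    if isinstance(value, dict):
        return {str(key): to_jsonable(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(item) for item in value]
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, np.generic):  # NumPy scalars such as np.float32
        return value.item()
    return value


data = {1: [np.arange(2), {"k": np.float32(2.0)}], "m": np.eye(2)}
encoded = json.dumps(to_jsonable(data))
print(encoded)  # {"1": [[0, 1], {"k": 2.0}], "m": [[1.0, 0.0], [0.0, 1.0]]}
```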

Answer by John Zwinck

You can use Pandas:

import pandas as pd
pd.Series(your_array).to_json(orient='values')

Answer by Mark

This is not supported by default, but you can make it work quite easily! There are several things you'll want to encode if you want the exact same data back:

  • The data itself, which you can get with obj.tolist(), as @travelingbones mentioned. Sometimes this may be good enough.
  • The data type. I feel this is important in quite a few cases.
  • The dimensions (not necessarily 2D), which could be derived from the above if you assume the input is indeed always a 'rectangular' grid.
  • The memory order (row- or column-major). This doesn't often matter, but sometimes it does (e.g. performance), so why not save everything?

Furthermore, your numpy array could be part of your data structure, e.g. you have a list with some matrices inside. For that you could use a custom encoder which basically does the above.
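
To make the four items above concrete, here is one way such an encoder and decoder could look. This is only a sketch under assumptions: the "__ndarray__" dictionary layout and the function names are invented for this example (it is not meant to reproduce the format of any particular library), and it only covers simple dtypes whose string name can be passed back to NumPy.

```python
import json

import numpy as np


def encode_array(obj):
    """default= hook: store an ndarray together with dtype, shape and memory order."""
    if isinstance(obj, np.ndarray):
        fortran = obj.flags.f_contiguous and not obj.flags.c_contiguous
        return {
            "__ndarray__": obj.tolist(),
            "dtype": str(obj.dtype),
            "shape": list(obj.shape),
            "order": "F" if fortran else "C",
        }
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def decode_array(dct):
    """object_hook: rebuild an ndarray from a dict written by encode_array."""
    if "__ndarray__" in dct:
        arr = np.array(dct["__ndarray__"], dtype=dct["dtype"]).reshape(dct["shape"])
        return np.asfortranarray(arr) if dct["order"] == "F" else arr
    return dct


# A column-major int16 matrix placed inside a list, as in the scenario described above
matrix = np.asfortranarray(np.arange(6, dtype=np.int16).reshape(2, 3))
text = json.dumps(["some list", matrix], default=encode_array)
restored = json.loads(text, object_hook=decode_array)[1]

print(restored.dtype, restored.shape, restored.flags.f_contiguous)
```

Because the hooks are passed through default= and object_hook=, arrays nested anywhere inside lists or dicts are handled without walking the structure by hand.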

This should be enough to implement a solution. Or you could use json-tricks, which does just this (and supports various other types) (disclaimer: I made it).

pip install json-tricks

Then

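from datetime import datetime
from decimal import Decimal
from fractions import Fraction
from numpy import arange
from json_tricks import dumps

data = [
    arange(0, 10, 1, dtype=int).reshape((2, 5)),
    datetime(year=2017, month=1, day=19, hour=23, minute=00, second=00),
    1 + 2j,
    Decimal(42),
    Fraction(1, 3),
    MyTestCls(s='ub', dct={'7': 7}),  # a user-defined class (definition not shown here)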
    set(range(7)),
]
# Encode with metadata to preserve types when decoding
print(dumps(data))

Answer by karlB

Store as JSON a numpy.ndarray or any nested-list composition.

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)
json_dump = json.dumps({'a': a, 'aa': [2, (2, 3, 4), a], 'bb': [2]}, cls=NumpyEncoder)
print(json_dump)

Will output:

(2, 3)
{"a": [[1, 2, 3], [4, 5, 6]], "aa": [2, [2, 3, 4], [[1, 2, 3], [4, 5, 6]]], "bb": [2]}

To restore from JSON:

json_load = json.loads(json_dump)
a_restored = np.asarray(json_load["a"])
print(a_restored)
print(a_restored.shape)

Will output:

[[1 2 3]
 [4 5 6]]
(2, 3)

Answer by tsveti_iko

I found this to be the best solution if you have nested numpy arrays in a dictionary:

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
            np.int16, np.int32, np.int64, np.uint8,
            np.uint16, np.uint32, np.uint64)):
            return int(obj)
        elif isinstance(obj, (np.float16, np.float32,
            np.float64)):  # np.float_ (an alias of np.float64) was removed in NumPy 2.0
            return float(obj)
        elif isinstance(obj,(np.ndarray,)): #### This is the fix
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

dumped = json.dumps(data, cls=NumpyEncoder)

with open(path, 'w') as f:
    f.write(dumped)  # dumped is already a JSON string, so write it as-is
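
As a possible shorter variant of the same encoder, NumPy's abstract scalar classes np.integer and np.floating can stand in for the explicit lists of types. The sketch below is an alternative under that assumption, not the original author's code; the class name and the sample data are made up.

```python
import json

import numpy as np


class CompactNumpyEncoder(json.JSONEncoder):
    """JSON encoder that converts NumPy scalars and arrays to built-in types."""

    def default(self, obj):
        if isinstance(obj, np.integer):   # covers int8 ... int64 and uint8 ... uint64
            return int(obj)
        if isinstance(obj, np.floating):  # covers float16, float32 and float64
            return float(obj)
        if isinstance(obj, np.bool_):
            return bool(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)


data = {
    "count": np.int32(3),
    "ratio": np.float32(0.5),
    "flag": np.bool_(True),
    "values": np.arange(3),
}
dumped = json.dumps(data, cls=CompactNumpyEncoder)
print(dumped)  # {"count": 3, "ratio": 0.5, "flag": true, "values": [0, 1, 2]}
```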

Thanks to this guy.

Answer by KS HARSHA

This is a different kind of answer, but it might help people who are trying to save data and then read it back.
There is hickle, which is faster and easier than pickle.
I tried to save and read my data with pickle dump, but ran into a lot of problems when reading it back; I wasted an hour and still hadn't found a solution, even though I was working with my own data to create a chat bot.

vec_x and vec_y are numpy arrays:

import hickle as hkl

data = [vec_x, vec_y]
hkl.dump(data, 'new_data_file.hkl')

Then you just read it and perform the operations:

data2 = hkl.load( 'new_data_file.hkl' )

Answer by steco

You could also use the default argument, for example:

def myconverter(o):
    if isinstance(o, np.float32):
        return float(o)

json.dumps(data, default=myconverter)  # dumps returns a string; json.dump would also need a file object
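
One thing to watch with a converter like myconverter: when it falls through without an explicit return, it returns None, and json then writes null for that value instead of reporting a problem. A slightly more defensive sketch is shown below; the function name and the sample data are made up for illustration.

```python
import json

import numpy as np


def numpy_default(o):
    """default= hook that converts NumPy objects and rejects everything else."""
    if isinstance(o, np.generic):   # any NumPy scalar, e.g. np.float32 or np.int64
        return o.item()
    if isinstance(o, np.ndarray):
        return o.tolist()
    # Raising keeps unsupported objects from being silently written as null
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


encoded = json.dumps({"x": np.float32(1.5), "y": np.array([1, 2])}, default=numpy_default)
print(encoded)  # {"x": 1.5, "y": [1, 2]}
```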