Convert numpy type to python

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/27050108/

Time: 2020-08-19 01:22:17  Source: igfitidea

Tags: python, json, numpy, pandas

Asked by ubh

I have a list of dicts in the following form that I generate from pandas. I want to convert it to JSON.

list_val = [{1.0: 685}, {2.0: 8}]
output = json.dumps(list_val)

However, json.dumps throws an error: TypeError: 685 is not JSON serializable

I am guessing it's a type conversion issue from numpy to python(?).

However, when I convert the values v of each dict in the array using np.int32(v) it still throws the error.

EDIT: Here's the full code

new = df[df[label] == label_new]
ks_dict = json.loads(content)
ks_list = ks_dict['variables']
freq_counts = []

for ks_var in ks_list:
    freq_var = dict()
    freq_var["name"] = ks_var["name"]
    ks_series = new[ks_var["name"]]
    temp_df = ks_series.value_counts().to_dict()
    freq_var["new"] = [{u: np.int32(v)} for (u, v) in temp_df.iteritems()]
    freq_counts.append(freq_var)

out = json.dumps(freq_counts)

Accepted answer by mgilson

It looks like you're correct:

>>> import numpy
>>> import json
>>> json.dumps(numpy.int32(685))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: 685 is not JSON serializable

The unfortunate thing here is that numpy numbers' __repr__ doesn't give you any hint about what type they are. They're running around masquerading as ints when they aren't (gasp). Ultimately, it looks like json is telling you that an int isn't serializable, but really, it's telling you that this particular np.int32 (or whatever type you actually have) isn't serializable. (No real surprise there -- no np.int32 is serializable.) This is also why the dict that you inevitably printed before passing it to json.dumps looks like it just has integers in it as well.
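One way to see what is actually sitting in the container is to inspect type(...) rather than the printed value. The following is a minimal sketch, assuming Python 3 and a value that came out of a numpy/pandas computation; the data is illustrative and not taken from the question's DataFrame:

```python
import json
import numpy as np

# Illustrative data: the value is a numpy integer, not a Python int.
list_val = [{1.0: np.int32(685)}]
value = list_val[0][1.0]

print(str(value))              # the digits alone give no hint about the type
print(type(value))             # reveals the numpy scalar type
print(isinstance(value, int))  # False on Python 3

# json.dumps rejects the numpy scalar ...
try:
    json.dumps(value)
    dumps_failed = False
except TypeError:
    dumps_failed = True

# ... but accepts the same number once it is a plain Python int.
converted = json.dumps(int(value))
```

Checking the type this way also explains why wrapping the values in np.int32(v) does not help: the result is still a numpy scalar.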

The easiest workaround here is probably to write your own serializer [1]:

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, numpy.integer):
            return int(obj)
        elif isinstance(obj, numpy.floating):
            return float(obj)
        elif isinstance(obj, numpy.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

You use it like this:

json.dumps(numpy.float32(1.2), cls=MyEncoder)
json.dumps(numpy.arange(12), cls=MyEncoder)
json.dumps({'a': numpy.int32(42)}, cls=MyEncoder)

etc.

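For a structure shaped like the question's freq_counts, the same encoder applies unchanged because json.dumps calls default for every nested value it cannot handle. This is a sketch only: NumpyEncoder repeats the MyEncoder logic so the example is self-contained, and the data is a hypothetical stand-in for the real value_counts output.

```python
import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """Same idea as MyEncoder, repeated here so the example runs on its own."""
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)

# Hypothetical stand-in for the question's freq_counts list.
freq_counts = [{"name": "x", "new": [{1.0: np.int64(685)}, {2.0: np.int64(8)}]}]

out = json.dumps(freq_counts, cls=NumpyEncoder)
print(out)
```

Note that default is only consulted for values; dictionary keys must already be of a type json accepts, which the Python float keys here are.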

[1] Or you could just write the default function and pass that as the default keyword argument to json.dumps. In this scenario, you'd replace the last line with raise TypeError, but ... meh. The class is more extensible :-)
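The function-based variant mentioned in the footnote could look roughly like this. It is a sketch under the assumption that only numpy integers, floats, and arrays need handling; the name numpy_default is made up for illustration.

```python
import json
import numpy as np

def numpy_default(obj):
    """Fallback used by json.dumps for objects it cannot serialize itself."""
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    # Anything else is still an error, mirroring the base encoder's behaviour.
    raise TypeError(repr(obj) + " is not JSON serializable")

result = json.dumps({"a": np.int32(42), "b": np.arange(3)}, default=numpy_default)
print(result)
```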

Answer by mobiusklein

If you leave the data in any of the pandas objects, the library supplies a to_json function on Series, DataFrame, and all of the other higher dimension cousins.

See Series.to_json()

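As a rough illustration of this approach, the frequency counts can be serialized straight from the Series returned by value_counts, without going through to_dict first. The sample data below is invented for the example.

```python
import json
import pandas as pd

# Invented sample column standing in for one of the question's series.
series = pd.Series(["a", "a", "b"])

counts = series.value_counts()  # still a pandas Series
out = counts.to_json()          # pandas does the numpy-to-JSON conversion

# Parse it back only to show that the result is ordinary JSON.
parsed = json.loads(out)
print(parsed)
```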

Answer by Emanuele Paolini

You could also convert the array to a python list (use the tolist method) and then convert the list to json.
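A short sketch of that conversion, using made-up counts: tolist turns the array elements into native Python numbers, and item does the same for a single numpy scalar.

```python
import json
import numpy as np

counts = np.array([685, 8], dtype=np.int32)

# tolist() returns a plain list of Python ints, which json can serialize.
as_list = counts.tolist()
out = json.dumps(as_list)

# For a single numpy scalar, item() yields the equivalent Python object.
scalar = np.int32(685).item()
scalar_out = json.dumps(scalar)
```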

Answer by ringsaturn

You can use our fork of ujson to deal with NumPy int64: caiyunapp/ultrajson, an ultra fast JSON decoder and encoder written in C with Python bindings and NumPy bindings.

pip install nujson

Then

>>> import numpy as np
>>> import nujson as ujson
>>> a = {"a": np.int64(100)}
>>> ujson.dumps(a)
'{"a":100}'
>>> a["b"] = np.float64(10.9)
>>> ujson.dumps(a)
'{"a":100,"b":10.9}'
>>> a["c"] = np.str_("12")
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12"}'
>>> a["d"] = np.array(list(range(10)))
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12","d":[0,1,2,3,4,5,6,7,8,9]}'
>>> a["e"] = np.repeat(3.9, 4)
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12","d":[0,1,2,3,4,5,6,7,8,9],"e":[3.9,3.9,3.9,3.9]}'