Python: save dictionaries through numpy.save

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/40219946/

Time: 2020-08-19 23:18:28  Source: igfitidea

Tags: python, numpy, dictionary

Asked by ramu

I have a large data set (millions of rows) in memory, in the form of numpy arrays and dictionaries.

Once this data is constructed, I want to store it in files so that I can later load those files into memory quickly, without reconstructing the data from scratch.

The np.save and np.load functions do the job smoothly for numpy arrays.
But I am facing problems with dict objects.

See the sample below. d2 is the dictionary loaded from the file. As Out[28] shows, it was loaded into d2 as a numpy array, not as a dict, so further dict operations such as get do not work.

Is there a way to load the data from the file as a dict (instead of a numpy array)?

In [25]: d1={'key1':[5,10], 'key2':[50,100]}

In [26]: np.save("d1.npy", d1)

In [27]: d2=np.load("d1.npy")

In [28]: d2
Out[28]: array({'key2': [50, 100], 'key1': [5, 10]}, dtype=object)

In [30]: d1.get('key1')  #original dict before saving into file
Out[30]: [5, 10]

In [31]: d2.get('key2')  #dictionary loaded from the file
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-23e02e45bf22> in <module>()
----> 1 d2.get('key2')

AttributeError: 'numpy.ndarray' object has no attribute 'get'

Answered by Kennet Celeste

It's a zero-dimensional object array that wraps the dict (not a structured array in the NumPy sense). Use d2.item() to retrieve the actual dict object first:

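import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}
np.save("d1.npy", d1)
# NumPy >= 1.16.3 needs allow_pickle=True to load object arrays (trusted files only)
d2 = np.load("d1.npy", allow_pickle=True)
print(d1.get('key1'))
print(d2.item().get('key2'))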

Result:

[5, 10]
[50, 100]
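To see why item() is needed, the loaded object can be inspected directly. The following is a minimal sketch, not part of the original answer; the temporary directory and the variable names loaded and restored are illustrative assumptions. It assumes a NumPy version that accepts the allow_pickle argument of np.load, which should only be enabled for files you trust.

```python
import os
import tempfile

import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d1.npy")
    np.save(path, d1)  # the dict is pickled inside an object array
    # allow_pickle=True is required on NumPy >= 1.16.3 for object arrays.
    loaded = np.load(path, allow_pickle=True)

print(type(loaded))   # an ndarray, not a dict
print(loaded.shape)   # an empty shape tuple: zero dimensions, one element
print(loaded.dtype)   # object

restored = loaded.item()  # unwrap the single element, which is the dict
print(restored.get('key2'))
```

Because the array has zero dimensions, item() with no arguments returns its only element, which is the original dict.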

Answered by Kh40tiK

The pickle module can be used. Example code:

from __future__ import print_function  # must come before any other import
from six.moves import cPickle as pickle  # for performance on Python 2
import numpy as np

def save_dict(di_, filename_):
    with open(filename_, 'wb') as f:
        pickle.dump(di_, f)

def load_dict(filename_):
    with open(filename_, 'rb') as f:
        ret_di = pickle.load(f)
    return ret_di

if __name__ == '__main__':
    g_data = {
        'm':np.random.rand(4,4),
        'n':np.random.rand(2,2,2)
    }
    save_dict(g_data, './data.pkl')
    g_data2 = load_dict('./data.pkl')
    print(g_data['m'] == g_data2['m'])
    print(g_data['n'] == g_data2['n'])

You may also save multiple Python objects in a single pickle file. In that case, each pickle.load call loads a single object, in the order the objects were dumped.
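A minimal sketch of storing several objects in one pickle file, assuming Python 3 and its standard pickle module. The file name, the temporary directory, and the names first and second are illustrative and do not come from the original answer.

```python
import os
import pickle
import tempfile

import numpy as np

first = {'m': np.arange(4).reshape(2, 2)}
second = {'key1': [5, 10], 'key2': [50, 100]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pkl")

    # Dump the objects one after another into the same file.
    with open(path, 'wb') as f:
        pickle.dump(first, f)
        pickle.dump(second, f)

    # Each pickle.load call reads back exactly one object, in dump order.
    with open(path, 'rb') as f:
        loaded_first = pickle.load(f)
        loaded_second = pickle.load(f)

print(np.array_equal(first['m'], loaded_first['m']))
print(loaded_second == second)
```

As with np.load and allow_pickle, unpickling can execute arbitrary code, so only load pickle files from sources you trust.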
