Python: Saving a dictionary with numpy.save

Question

Asked by ramu

I have a large data set (millions of rows) in memory, in the form of numpy arrays and dictionaries.

Once this data is constructed, I want to store it in files so that I can later load those files into memory quickly, without reconstructing the data from scratch.

The np.save and np.load functions do the job smoothly for numpy arrays.
But I am facing problems with dict objects.

See the sample below. d2 is the dictionary that was loaded from the file. As Out[28] shows, it was loaded into d2 as a numpy array, not as a dict, so further dict operations such as get do not work.

Is there a way to load the data from the file as a dict (instead of a numpy array)?

In [25]: d1={'key1':[5,10], 'key2':[50,100]}

In [26]: np.save("d1.npy", d1)

In [27]: d2=np.load("d1.npy")

In [28]: d2
Out[28]: array({'key2': [50, 100], 'key1': [5, 10]}, dtype=object)

In [30]: d1.get('key1')  #original dict before saving into file
Out[30]: [5, 10]

In [31]: d2.get('key2')  #dictionary loaded from the file
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-31-23e02e45bf22> in <module>()
----> 1 d2.get('key2')

AttributeError: 'numpy.ndarray' object has no attribute 'get'

Answer 1

Answered by Kennet Celeste

np.save wraps the dict in a 0-dimensional object array (not a structured array), and that array is what np.load returns. Use d2.item() to retrieve the actual dict object first:

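import numpy as np

d1 = {'key1': [5, 10], 'key2': [50, 100]}
np.save("d1.npy", d1)
# NumPy 1.16.3 and later refuse to unpickle object arrays unless allow_pickle=True
d2 = np.load("d1.npy", allow_pickle=True)
print(d1.get('key1'))
print(d2.item().get('key2'))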

Result:

[5, 10]
[50, 100]
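
To see why the .item() call is needed, the following minimal sketch inspects what np.load hands back. The helper name roundtrip_dict and the temporary directory are assumptions made for illustration and are not part of the original answer; allow_pickle=True is needed on NumPy 1.16.3 and later.

```python
import os
import tempfile

import numpy as np


def roundtrip_dict(d):
    """Save a dict with np.save, load it back, and return (loaded_array, recovered_dict).

    Hypothetical helper, for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "d1.npy")
        np.save(path, d)  # the dict is wrapped in a 0-d object array and pickled
        arr = np.load(path, allow_pickle=True)
    return arr, arr.item()  # .item() unwraps the single element, i.e. the dict


arr, d2 = roundtrip_dict({'key1': [5, 10], 'key2': [50, 100]})
print(arr.shape, arr.dtype)  # () object
print(type(d2).__name__)
print(d2.get('key2'))
```

The empty shape is the telltale sign: the array holds exactly one Python object, so .item() returns the original dict rather than a copy converted to numpy types.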

Answer 2

Answered by Kh40tiK

The pickle module can be used. Example code:

from __future__ import print_function  # must come before any other import
from six.moves import cPickle as pickle  # C implementation on Python 2; plain pickle on Python 3
import numpy as np

def save_dict(di_, filename_):
    with open(filename_, 'wb') as f:
        pickle.dump(di_, f)

def load_dict(filename_):
    with open(filename_, 'rb') as f:
        ret_di = pickle.load(f)
    return ret_di

if __name__ == '__main__':
    g_data = {
        'm':np.random.rand(4,4),
        'n':np.random.rand(2,2,2)
    }
    save_dict(g_data, './data.pkl')
    g_data2 = load_dict('./data.pkl')
    print(g_data['m'] == g_data2['m'])
    print(g_data['n'] == g_data2['n'])

You may also save multiple Python objects in a single pickle file. In that case, each pickle.load call loads a single object.
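
Storing several objects in one file could look like the sketch below. The helper names save_many and load_many and the file name many.pkl are hypothetical; the sketch assumes Python 3 and relies only on pickle.load raising EOFError once the file is exhausted.

```python
import os
import pickle
import tempfile


def save_many(objects, filename):
    """Write each object with its own pickle.dump call into one file."""
    with open(filename, 'wb') as f:
        for obj in objects:
            pickle.dump(obj, f)


def load_many(filename):
    """Call pickle.load repeatedly until the end of the file is reached."""
    loaded = []
    with open(filename, 'rb') as f:
        while True:
            try:
                loaded.append(pickle.load(f))  # one object per call
            except EOFError:
                break  # no more pickled objects in the file
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'many.pkl')
    save_many([{'key1': [5, 10]}, {'key2': [50, 100]}], path)
    restored = load_many(path)

print(restored)
```

Objects come back in the order they were written, so the reader has to know (or store) how many objects to expect or read until EOFError as shown here.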
