Original URL: http://stackoverflow.com/questions/15579649/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
python dict to numpy structured array
Asked by Christa
I have a dictionary that I need to convert to a NumPy structured array. I'm using the arcpy function NumPyArrayToTable, so a NumPy structured array is the only data format that will work.
Based on this thread: "Writing to numpy array from dictionary", and this thread: "How to convert Python dictionary object to numpy array", I've tried this:
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
names = ['id','data']
formats = ['f8','f8']
dtype = dict(names = names, formats=formats)
array=numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)
But I keep getting the error "expected a readable buffer object".
The method below works, but it is stupid and obviously won't work for real data. I know there is a more graceful approach; I just can't figure it out.
totable = numpy.array([[key,val] for (key,val) in result.iteritems()])
array=numpy.array([(totable[0,0],totable[0,1]),(totable[1,0],totable[1,1])],dtype)
Accepted answer by unutbu
You could use np.array(list(result.items()), dtype=dtype):
import numpy as np
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
names = ['id','data']
formats = ['f8','f8']
dtype = dict(names = names, formats=formats)
array = np.array(list(result.items()), dtype=dtype)
print(repr(array))
yields
array([(0.0, 1.1181753789488595), (1.0, 0.5566080288678394),
       (2.0, 0.4718269778030734), (3.0, 0.48716683119447185), (4.0, 1.0),
       (5.0, 0.1395076201641266), (6.0, 0.20941558441558442)],
      dtype=[('id', '<f8'), ('data', '<f8')])
If you don't want to create the intermediate list of tuples, list(result.items()), then you could instead use np.fromiter:
In Python 2:
array = np.fromiter(result.iteritems(), dtype=dtype, count=len(result))
In Python 3:
array = np.fromiter(result.items(), dtype=dtype, count=len(result))
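As a rough, self-contained sketch of the np.fromiter variant on Python 3: the dictionary is shortened here for illustration, and the dtype is written as an explicit np.dtype rather than the names/formats dict used above (an assumption made for clarity, not a requirement). Once the array is built, each field can be read back by name.

```python
import numpy as np

# Shortened illustrative data; the original question uses seven entries.
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734}

# Same layout as dict(names=['id', 'data'], formats=['f8', 'f8']).
dtype = np.dtype([('id', 'f8'), ('data', 'f8')])

# result.items() yields (key, value) tuples, so each item becomes one record.
# count lets NumPy allocate the output once instead of growing it.
array = np.fromiter(result.items(), dtype=dtype, count=len(result))

print(array['id'])    # the keys, as a float column
print(array['data'])  # the values, as a float column
```

Passing count is optional, but it avoids repeated reallocation when the number of items is known in advance.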
Why using the list [key, val] does not work:
By the way, your attempt,
numpy.array([[key,val] for (key,val) in result.iteritems()],dtype)
was very close to working. If you had changed the list [key, val] to the tuple (key, val), it would have worked. Of course,
numpy.array([(key,val) for (key,val) in result.iteritems()], dtype)
is the same thing as
numpy.array(result.items(), dtype)
in Python 2, or
numpy.array(list(result.items()), dtype)
in Python 3.
np.array treats lists differently than tuples. Robert Kern explains:
As a rule, tuples are considered "scalar" records and lists are recursed upon. This rule helps numpy.array() figure out which sequences are records and which are other sequences to be recursed upon; i.e. which sequences create another dimension and which are the atomic elements.
Since (0.0, 1.1181753789488595) is considered one of those atomic elements, it should be a tuple, not a list.
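The tuple-versus-list rule can be checked with a small experiment. This is only a sketch on shortened data: what the list version does depends on the NumPy and Python version (older releases raise an error such as the one in the question, while newer ones may build a 2-D array instead), so the code only records the resulting shape rather than assuming a particular outcome.

```python
import numpy as np

result = {0: 1.118, 1: 0.557, 2: 0.472}  # shortened illustrative data
dtype = np.dtype([('id', 'f8'), ('data', 'f8')])

# Tuples are atomic records: one record per (key, value) pair.
from_tuples = np.array([(k, v) for k, v in result.items()], dtype=dtype)
print(from_tuples.shape)  # (3,)

# Lists are recursed into, so each [key, val] is seen as an extra dimension,
# not as a single record. Depending on the version this raises or yields a
# differently shaped array.
try:
    from_lists = np.array([[k, v] for k, v in result.items()], dtype=dtype)
    lists_shape = from_lists.shape
except (TypeError, ValueError) as exc:
    lists_shape = None
    print("list version failed:", exc)

print(lists_shape)
```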
Answer by dgdm
Let me propose an improved method for when the values of the dictionary are lists of the same length:
import numpy

def dctToNdarray(dd, szFormat='f8'):
    '''
    Convert a 'rectangular' dictionary to a NumPy structured array.

    Parameters:
        dd : dictionary whose values are lists of the same length
    Returns:
        data : NumPy structured array with one field per key
    '''
    names = list(dd.keys())
    formats = [szFormat] * len(names)
    dtype = dict(names=names, formats=formats)
    # Each record is a tuple holding the i-th element of every list.
    nrows = len(dd[names[0]])
    values = [tuple(dd[k][i] for k in names) for i in range(nrows)]
    return numpy.array(values, dtype=dtype)

dd = {'a': [1, 2.05, 25.48], 'b': [2, 1.07, 9], 'c': [3, 3.01, 6.14]}
data = dctToNdarray(dd)
print(data.dtype.names)
print(data)
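A more compact variant of the same idea, offered only as a sketch: zip(*...) transposes the equal-length lists into row tuples, which np.array then treats as records, as in the accepted answer. It assumes every list has the same length (zip silently truncates to the shortest otherwise) and uses 'f8' for every field.

```python
import numpy as np

dd = {'a': [1, 2.05, 25.48], 'b': [2, 1.07, 9], 'c': [3, 3.01, 6.14]}

names = list(dd.keys())
dtype = np.dtype([(name, 'f8') for name in names])

# Transpose the columns into row tuples: (1, 2, 3), (2.05, 1.07, 3.01), ...
rows = list(zip(*(dd[name] for name in names)))
data = np.array(rows, dtype=dtype)

print(data['b'])  # one column, selected by field name
print(data[0])    # one record, selected by position
```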
Answer by Federico Ressi
I would prefer storing keys and values in separate arrays. This is often more practical. A structure of arrays is a good replacement for an array of structures. Most of the time you only have to process a subset of your data (in this case, either the keys or the values), so operating on just one of the two arrays is more efficient than operating on records in which the two are interleaved.
But in case this is not possible, I would suggest using an array organized by column instead of by row. That way you get the same benefit as having two arrays, but packed into one.
import numpy as np
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
names = 0   # row index holding the keys
values = 1  # row index holding the values
array = np.empty(shape=(2, len(result)), dtype=float)
array[names] = list(result.keys())
array[values] = list(result.values())
But my favorite is this (simpler):
import numpy as np
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
arrays = {'names': np.array(list(result.keys()), dtype=float),
          'values': np.array(list(result.values()), dtype=float)}
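To illustrate why separate arrays can be convenient, here is a minimal sketch in which a condition is computed on the values array alone and then used to select the matching keys. The 0.5 threshold and the variable names are arbitrary choices for this example.

```python
import numpy as np

result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734,
          3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266,
          6: 0.20941558441558442}

keys = np.array(list(result.keys()), dtype=float)
values = np.array(list(result.values()), dtype=float)

# Only the values array is touched to build the mask ...
mask = values > 0.5
# ... and the same mask then picks the corresponding keys.
selected_keys = keys[mask]

print(selected_keys)  # [0. 1. 4.]
```

The two arrays stay aligned only as long as they are always filtered or reordered together, which is the price of not keeping each key and value in one record.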
Answer by dgdm
It is even simpler if you accept using pandas:
import pandas
result = {0: 1.1181753789488595, 1: 0.5566080288678394, 2: 0.4718269778030734, 3: 0.48716683119447185, 4: 1.0, 5: 0.1395076201641266, 6: 0.20941558441558442}
df = pandas.DataFrame(result, index=[0])
print(df)
gives:
          0         1         2         3  4         5         6
0  1.118175  0.556608  0.471827  0.487167  1  0.139508  0.209416
Answer by Can Hicabi Tartanoglu
Similar to the accepted answer, if you want to create an array from dictionary keys:
np.array( tuple(dict.keys()) )
If you want to create an array from dictionary values:
np.array( tuple(dict.values()) )
