Original source: http://stackoverflow.com/questions/18996714/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to turn a pandas DataFrame row into an OrderedDict fast
Asked by user1246428
I'm looking for a fast way to get a row of a pandas DataFrame into an OrderedDict without using a list. Lists are fine, but with large data sets they take too long. I am using the fiona GIS reader, whose rows are OrderedDicts, with the schema giving the data types. I use pandas to join data. In many cases the rows will have mixed types, so I was thinking that converting to a numpy array of string type might do the trick.
Answered by Andy Hayden
Unfortunately, you can't just use apply, since it fits the result back into a DataFrame:
In [1]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
In [2]: df
Out[2]:
   a  b
0  1  2
1  3  4
In [3]: from collections import OrderedDict
In [4]: df.apply(OrderedDict)
Out[4]:
   a  b
0  1  2
1  3  4
But you can use a list comprehension with iterrows:
In [5]: [OrderedDict(row) for i, row in df.iterrows()]
Out[5]: [OrderedDict([('a', 1), ('b', 2)]), OrderedDict([('a', 3), ('b', 4)])]
If whatever you are working with can accept a generator rather than a list, that will usually be more efficient:
In [6]: (OrderedDict(row) for i, row in df.iterrows())
Out[6]: <generator object <genexpr> at 0x10466da50>
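As a variation on the generator idea above, here is a minimal sketch that yields one OrderedDict per row lazily. It swaps iterrows for itertuples, which the pandas documentation describes as generally faster; that substitution and the helper name iter_ordered_rows are assumptions of this sketch, not part of the original answer.

```python
from collections import OrderedDict

import pandas as pd


def iter_ordered_rows(df):
    """Yield one OrderedDict per row, lazily (hypothetical helper name)."""
    columns = list(df.columns)
    # name=None makes itertuples yield plain tuples instead of namedtuples;
    # index=False leaves the index out of each tuple.
    for values in df.itertuples(index=False, name=None):
        yield OrderedDict(zip(columns, values))


df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
rows = iter_ordered_rows(df)

# Rows are produced only as they are requested.
first = next(rows)
remaining = list(rows)
```

Because nothing is materialized up front, only one row's OrderedDict needs to exist at a time, which may help with the large data sets mentioned in the question.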
Answered by jezrael
This is implemented in pandas 0.21.0+ in the function to_dict with the parameter into:
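from collections import OrderedDict
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
print(df)
   a  b
0  1  2
1  3  4
# orient='index' keys the outer mapping by index label, one inner mapping per row
d = df.to_dict(into=OrderedDict, orient='index')
print(d)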
OrderedDict([(0, OrderedDict([('a', 1), ('b', 2)])), (1, OrderedDict([('a', 3), ('b', 4)]))])
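If a plain list of rows is wanted rather than a mapping keyed by index, the same into parameter can be combined with orient='records'. The following is a small sketch under that assumption, using the same example frame; it also shows converting a single row through Series.to_dict, which accepts into as well.

```python
from collections import OrderedDict

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

# orient='records' drops the index and returns a list with one mapping per row;
# into=OrderedDict selects the mapping class used for each row.
records = df.to_dict(orient='records', into=OrderedDict)

# A single row is a Series, and Series.to_dict also takes the into parameter.
first_row = df.iloc[0].to_dict(into=OrderedDict)
```

Note that orient='records' builds the whole list at once, so for very large frames a generator-based approach may still be preferable.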

