pandas: how to quickly convert a pandas DataFrame row to an OrderedDict

Question

Asked by user1246428

Looking for a fast way to get a row of a pandas DataFrame into an OrderedDict without using a list. Lists are fine, but with large data sets they take too long. I am using the fiona GIS reader, where the rows are OrderedDicts and the schema gives the data types. I use pandas to join data. In many cases the rows will have different types, so I was thinking that turning them into a numpy array of string type might do the trick.


Answer 1

Answered by Andy Hayden

Unfortunately, you can't just use apply, since it fits the result back into a DataFrame:


In [1]: df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

In [2]: df
Out[2]: 
   a  b
0  1  2
1  3  4

In [3]: from collections import OrderedDict

In [4]: df.apply(OrderedDict)
Out[4]: 
   a  b
0  1  2
1  3  4

But you can use a list comprehension with iterrows:


In [5]: [OrderedDict(row) for i, row in df.iterrows()]
Out[5]: [OrderedDict([('a', 1), ('b', 2)]), OrderedDict([('a', 3), ('b', 4)])]

If whatever consumes the rows can accept a generator rather than a list, a generator will usually be more efficient:


In [6]: (OrderedDict(row) for i, row in df.iterrows())
Out[6]: <generator object <genexpr> at 0x10466da50>
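
As a small illustration of the generator approach, here is a minimal sketch that yields one OrderedDict per row lazily. Note that it swaps iterrows for itertuples(index=False, name=None), which is not what the answer above uses; the pandas documentation describes itertuples as generally faster than iterrows, but whether that holds for your data is something to measure. The helper name iter_ordered_rows is hypothetical.

```python
from collections import OrderedDict

import pandas as pd


def iter_ordered_rows(frame):
    """Yield one OrderedDict per row, in column order (hypothetical helper)."""
    columns = list(frame.columns)
    # name=None makes itertuples return plain tuples instead of namedtuples;
    # index=False leaves the index out of each tuple.
    for values in frame.itertuples(index=False, name=None):
        yield OrderedDict(zip(columns, values))


df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

gen = iter_ordered_rows(df)   # nothing is computed yet
first = next(gen)             # first row as an OrderedDict
remaining = list(gen)         # the rest, materialised only on demand
```

Because each row is built only when requested, the consumer never has to hold the full list of dicts in memory.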

Answer 2

Answered by jezrael

This is implemented in pandas 0.21.0+ in the function to_dict, via the parameter into:


from collections import OrderedDict
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
print(df)
   a  b
0  1  2
1  3  4

d = df.to_dict(into=OrderedDict, orient='index')
print(d)
OrderedDict([(0, OrderedDict([('a', 1), ('b', 2)])), (1, OrderedDict([('a', 3), ('b', 4)]))])
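
If a flat list with one OrderedDict per row is wanted (the shape produced in Answer 1) rather than a mapping keyed by index label, the same into parameter can be combined with orient='records'. This is a minimal sketch assuming pandas 0.21.0 or later; the variable name records is only illustrative.

```python
from collections import OrderedDict

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])

# orient='records' returns a list with one mapping per row;
# into=OrderedDict asks pandas to build each mapping as an OrderedDict.
records = df.to_dict(orient='records', into=OrderedDict)

for record in records:
    print(record)
```

This still builds the full list in memory, so for very large frames the generator approach may remain preferable.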
