pandas: convert a dictionary with list values into a DataFrame

Question

Asked by stoves

I spent a while looking through SO, and it seems I have a unique problem.

I have a dictionary that looks like the following:

dict={
    123: [2,4],
    234: [6,8],
    ...
}

I want to convert this dictionary, whose values are lists, into a three-column DataFrame like the following:

time, value1, value2
123, 2, 4
234, 6, 8
...

I can run:

pandas.DataFrame(dict)

but this generates the following:

123, 234, ...
2, 6, ...
4, 8, ...

This is probably a simple fix, but I'm still picking up pandas.

Answer 1

Answered by Roger Fan

You can either preprocess the data as levi suggests, or you can transpose the data frame after creating it.

import pandas as pd

testdict = {
    123: [2, 4],
    234: [6, 8],
    456: [10, 12],
}
df = pd.DataFrame(testdict)  # each key becomes a column
df = df.transpose()          # flip so each key becomes a row

print(df)
#       0   1
# 123   2   4
# 234   6   8
# 456  10  12
Answer 2

Answered by Robert Yi

It may be of interest to some that Roger Fan's pandas.DataFrame(dict) approach is actually pretty slow if you have a lot of keys. A faster way is to preprocess the data into separate lists and then create a DataFrame from those lists. (Perhaps this was explained in levi's answer, but it is gone now.)

For example, consider this dictionary, dict1, where each value is a two-element sequence. Specifically, dict1[i] = (i*10, i*100), for ease of checking the final DataFrame.

import numpy as np

keys = range(1000)
# Pair i*10 with i*100 for each i
values = zip(np.arange(1000) * 10, np.arange(1000) * 100)
dict1 = dict(zip(keys, values))

Building the DataFrame straight from the dictionary and transposing it takes roughly 30 times as long. For example:

import time

t = time.time()
test1 = pd.DataFrame(dict1).transpose()  # build wide, then flip
print(time.time() - t)

0.118762016296

versus:

t = time.time()
keys = []
list1 = []
list2 = []
for k in dict1:
    keys.append(k)
    list1.append(dict1[k][0])  # first element of each value
    list2.append(dict1[k][1])  # second element of each value
test2 = pd.DataFrame({'element1': list1, 'element2': list2}, index=keys)
print(time.time() - t)

0.00310587882996
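
A single time.time() measurement can be noisy, so here is a hedged sketch of the same comparison using the standard-library timeit module. The function names via_transpose and via_lists are made up for this illustration, and the absolute timings will vary by machine and pandas version, so no expected numbers are shown.

```python
import timeit

import pandas as pd

# Same shape of data as above: dict1[i] = (i*10, i*100)
dict1 = {i: (i * 10, i * 100) for i in range(1000)}


def via_transpose():
    # Build a 2 x 1000 frame (one column per key), then flip it.
    return pd.DataFrame(dict1).transpose()


def via_lists():
    # Split the values into one list per output column first.
    keys, list1, list2 = [], [], []
    for k, (a, b) in dict1.items():
        keys.append(k)
        list1.append(a)
        list2.append(b)
    return pd.DataFrame({"element1": list1, "element2": list2}, index=keys)


# Both approaches should hold the same values; only the column labels differ.
assert (via_transpose().to_numpy() == via_lists().to_numpy()).all()

for fn in (via_transpose, via_lists):
    # Take the best of 3 repeats, each averaging 5 calls.
    best = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best:.6f} s per call")
```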
