pandas: best way to add a dictionary to a DataFrame

Question

Asked by Rutger Hofste

I have a Pandas Dataframe and want to add the data from a dictionary uniformly to all rows in my dataframe. Currently I loop over the dictionary and set the value to my new columns. Is there a more efficient way to do this?

Notebook:

# coding: utf-8
import pandas as pd

df = pd.DataFrame({'age': [1, 2, 3], 'name': ['Foo', 'Bar', 'Barbie']})
d = {"blah": 42, "blah-blah": "bar"}
for k, v in d.items():
    df[k] = v
df

Answer 1

Answered by jezrael

Use assign if none of the keys are numeric (assign takes the columns as keyword arguments, so every key must be a string):

df = df.assign(**d)
print(df)
   age    name  blah blah-blah
0    1     Foo    42       bar
1    2     Bar    42       bar
2    3  Barbie    42       bar
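
A minimal sketch of why the keys must be strings: the dictionary is unpacked into keyword arguments, and Python only accepts string keyword names. The helper name add_constants is hypothetical and not part of pandas; it only wraps DataFrame.assign.

```python
import pandas as pd


def add_constants(df, d):
    """Return a new frame with each key of d added as a constant column.

    Hypothetical helper for illustration; it only wraps DataFrame.assign.
    """
    return df.assign(**d)


df = pd.DataFrame({"age": [1, 2, 3], "name": ["Foo", "Bar", "Barbie"]})

# String keys work, even ones that are not valid identifiers ("blah-blah").
out = add_constants(df, {"blah": 42, "blah-blah": "bar"})

# A non-string key is rejected when the dict is unpacked into keywords.
try:
    add_constants(df, {8: 42})
    numeric_key_ok = True
except TypeError:
    numeric_key_ok = False
```

Note that assign returns a new DataFrame and leaves df itself unchanged, which is why the answer rebinds the result to df.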

If some keys may be numeric, join works well:

d = {8: 42, "blah-blah": "bar"}
df = df.join(pd.DataFrame(d, index=df.index))
print(df)

   age    name   8 blah-blah
0    1     Foo  42       bar
1    2     Bar  42       bar
2    3  Barbie  42       bar
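
The join approach can be broken into two steps, sketched below. The non-default index labels (10, 20, 30) are an assumption added for illustration; they show why index=df.index is passed: it tells pandas how many rows to repeat each scalar over, and it gives join matching labels to align on.

```python
import pandas as pd

# Same data as the question, but with non-default row labels (an assumption).
df = pd.DataFrame(
    {"age": [1, 2, 3], "name": ["Foo", "Bar", "Barbie"]},
    index=[10, 20, 30],
)
d = {8: 42, "blah-blah": "bar"}

# Step 1: build a frame of constants; each scalar is repeated once per label.
constants = pd.DataFrame(d, index=df.index)

# Step 2: join aligns on the index, so every row of df gets the same values.
joined = df.join(constants)
```

Without an index, a dictionary that holds only scalars cannot be turned into a DataFrame, because pandas has no way to know how many rows to create.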

Answer 2

Answered by Anton vBR

In my opinion the answer is no. Looping over the key-value pairs of a dict is already efficient, and assigning columns with df[k] = v is more readable. When you come back to the code later, what matters is remembering why you did something, not whether you saved a few microseconds. The only thing missing is a comment explaining why you do it.

d = {"blah":42,"blah-blah":"bar"}

# Add columns to compensate for missing values in document XXX
for k,v in d.items():
    df[k] = v

Timings (the error margins are too large to separate the two; I'd say they are equivalent in speed):

Your solution:

809 μs ± 70 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

df.assign():

893 μs ± 24.2 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
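
A rough sketch of how such a comparison could be reproduced with the standard timeit module. The function names and the repetition count are assumptions, not the setup used for the numbers above, and the loop version also pays for a copy so that each run starts from the same frame.

```python
import timeit

import pandas as pd

base = pd.DataFrame({"age": [1, 2, 3], "name": ["Foo", "Bar", "Barbie"]})
d = {"blah": 42, "blah-blah": "bar"}


def with_loop():
    df = base.copy()  # copy so every run starts without the new columns
    for k, v in d.items():
        df[k] = v
    return df


def with_assign():
    return base.assign(**d)  # assign already returns a new frame


n = 200  # arbitrary repetition count for this sketch
loop_s = timeit.timeit(with_loop, number=n) / n
assign_s = timeit.timeit(with_assign, number=n) / n
print(f"loop:   {loop_s * 1e6:.0f} µs per call")
print(f"assign: {assign_s * 1e6:.0f} µs per call")
```

Absolute numbers depend on the machine and the pandas version, so only the relative difference is worth reading, and on a frame this small it is likely to be within the noise.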
