Python: how to apply a function on every row of a dataframe?

Notice: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/33518124/

Time: 2020-08-19 13:30:13  Source: igfitidea

How to apply a function on every row on a dataframe?

python function pandas

Asked by Koen

I am new to Python and I am not sure how to solve the following problem.

I have a function:

def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q

Say I have the dataframe

df = pd.DataFrame({"D": [10,20,30], "p": [20, 30, 10]})

    D   p
0   10  20
1   20  30
2   30  10

ch=0.2
ck=5

Here ch and ck are floats. Now I want to apply the formula to every row of the dataframe and store the result as an extra column 'Q'. An example (that does not work) would be:

df['Q']= map(lambda p, D: EOQ(D,p,ck,ch),df['p'], df['D']) 

(returns only 'map' types)

I will need this type of processing more in my project and I hope to find something that works.

Accepted answer by EdChum

The following should work:

import math

def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q
ch=0.2
ck=5
# axis=1 passes each row to the lambda as a Series
df['Q'] = df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)
df
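
As a side note on why the attempt in the question fails: in Python 3, map returns a lazy iterator rather than a list, so assigning it directly stores a map object instead of values. A minimal sketch, assuming Python 3 and the same df, ck and ch as in the question, that materialises the results with list (the name q_zip is only illustrative):

```python
import math
import pandas as pd

def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))

df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch = 0.2
ck = 5

# map() is lazy in Python 3, so wrap it in list() before assigning
df["Q"] = list(map(lambda p, D: EOQ(D, p, ck, ch), df["p"], df["D"]))

# Equivalent list comprehension over the paired columns (illustrative name)
q_zip = [EOQ(D, p, ck, ch) for D, p in zip(df["D"], df["p"])]
```

This still calls EOQ once per row in Python, so it is a fix for the map attempt rather than a performance improvement.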

If all you're doing is calculating the square root of some result, then use the np.sqrt function instead; it is vectorised and will be significantly faster:

In [80]:
df['Q'] = np.sqrt((2*df['D']*ck)/(ch*df['p']))

df
Out[80]:
    D   p          Q
0  10  20   5.000000
1  20  30   5.773503
2  30  10  12.247449
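
The vectorised form works because arithmetic on pandas Series is applied element by element, so the whole formula can be written on the columns directly. A small self-contained sketch, assuming the sample data from the question, that checks both approaches give the same values (q_apply, q_vec and same are illustrative names):

```python
import math
import numpy as np
import pandas as pd

def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))

df = pd.DataFrame({"D": [10, 20, 30], "p": [20, 30, 10]})
ch = 0.2
ck = 5

# Row-by-row: one Python function call per row
q_apply = df.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1)

# Vectorised: the formula is evaluated on whole columns at once
q_vec = np.sqrt((2 * df["D"] * ck) / (ch * df["p"]))

# Compare the two results up to floating-point tolerance
same = np.allclose(q_apply, q_vec)
print(same)
```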

Timings

For a 30k row df:

In [92]:

import math
ch=0.2
ck=5
def EOQ(D,p,ck,ch):
    Q = math.sqrt((2*D*ck)/(ch*p))
    return Q

%timeit np.sqrt((2*df['D']*ck)/(ch*df['p']))
%timeit df.apply(lambda row: EOQ(row['D'], row['p'], ck, ch), axis=1)
1000 loops, best of 3: 622 μs per loop
1 loops, best of 3: 1.19 s per loop

You can see that the np method is ~1900x faster (622 μs versus 1.19 s).
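
To repeat the comparison outside IPython, something like the following could be used. It is only a sketch: the 30k-row frame here is filled with random positive integers as an assumption (the original test data is not shown), and the measured times will differ by machine and library version.

```python
import math
import timeit
import numpy as np
import pandas as pd

def EOQ(D, p, ck, ch):
    return math.sqrt((2 * D * ck) / (ch * p))

ch = 0.2
ck = 5

# Assumed test data: 30,000 rows of random positive integers (not the original df)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "D": rng.integers(1, 100, size=30_000),
    "p": rng.integers(1, 100, size=30_000),
})

# Average the fast vectorised call over 10 runs; run the slow apply once
t_vec = timeit.timeit(
    lambda: np.sqrt((2 * df["D"] * ck) / (ch * df["p"])), number=10
) / 10
t_apply = timeit.timeit(
    lambda: df.apply(lambda row: EOQ(row["D"], row["p"], ck, ch), axis=1), number=1
)

print(f"vectorised: {t_vec:.6f} s per call")
print(f"apply:      {t_apply:.6f} s per call")
```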
