Pandas Dataframe: How to update multiple columns by applying a function?

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/32603051/

Time: 2020-09-13 23:54:00  Source: igfitidea

python pandas

Asked by John Smith

I have a Dataframe df like this:

   A   B   C    D
2  1   O   s    h
4  2   P    
7  3   Q
9  4   R   h    m

I have a function f to calculate C and D based on B for a row:

def f(p):  # p is the value of column B for a row
    return p + 'k', p + 'n'

How can I populate the missing values for rows 4 and 7 by applying the function f to the Dataframe?

The expected outcome is like below:

   A   B   C    D
2  1   O   s    h
4  2   P   Pk   Pn
7  3   Q   Qk   Qn
9  4   R   h    m

The function f has to be used, as the real function is very complicated. Also, the function only needs to be applied to the rows that are missing C and D.

Answered by Fabio Lamanna

Maybe there is a more elegant way, but I would do it this way:

df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])

This applies the function to column B and takes the first and second values of each output. It returns:

   A  B   C   D
0  1  O  Ok  On
1  2  P  Pk  Pn
2  3  Q  Qk  Qn
3  4  R  Rk  Rn

EDIT:

A more concise way, thanks to this answer:

df[['C', 'D']] = df['B'].apply(lambda x: pd.Series(f(x)))
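
Note that the snippets above overwrite C and D in every row, as the output shows, while the question only asks for the rows missing C and D. Here is a minimal sketch of restricting the call to those rows. It assumes the missing cells are stored as NaN (the question does not say how they are stored); `mask` and `filled` are names introduced here for illustration only.

```python
import numpy as np
import pandas as pd


def f(p):
    # p is the value of column B for one row
    return p + 'k', p + 'n'


# Example frame from the question; missing cells are assumed to be NaN
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9],
)

# Rows where both C and D are missing
mask = df['C'].isna() & df['D'].isna()

# Call f once per masked row and expand the returned tuples into two columns
filled = pd.DataFrame(
    df.loc[mask, 'B'].map(f).tolist(),
    index=df.index[mask],
    columns=['C', 'D'],
)

# Write the results back; rows outside the mask are left untouched
df.loc[mask, ['C', 'D']] = filled
print(df)
```

With this approach f runs only for the masked rows, which may matter if the real function is expensive.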

Answered by Zenith

I have an easier way to do it, if the table is not too big:

def f(row):  # row is one row of the Dataframe
    # assumes missing values are stored as empty strings
    if row['C'] == '':
        row['C'] = row['B'] + 'k'
    if row['D'] == '':
        row['D'] = row['B'] + 'n'
    return row
df = df.apply(f, axis=1)
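
The check above compares against empty strings, so it would skip cells stored as NaN, and it inlines the logic instead of calling the asker's f. Below is a sketch of a row-wise variant that handles both cases and still calls f. The names `is_missing` and `fill_row` are hypothetical helpers introduced for this example, and the sample frame mixes NaN and '' purely for illustration.

```python
import numpy as np
import pandas as pd


def f(p):
    # The asker's function: p is the value of column B for one row
    return p + 'k', p + 'n'


def is_missing(value):
    # Hypothetical helper: treat NaN/None and '' as missing
    return pd.isna(value) or value == ''


def fill_row(row):
    # Hypothetical wrapper: call f only when both C and D are missing
    if is_missing(row['C']) and is_missing(row['D']):
        row = row.copy()  # avoid mutating the object pandas passed in
        row['C'], row['D'] = f(row['B'])
    return row


df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, '', 'h'],
     'D': ['h', np.nan, '', 'm']},
    index=[2, 4, 7, 9],
)

df = df.apply(fill_row, axis=1)
print(df)
```

Row-wise apply is simple to read but slow on large frames, which matches the caveat about table size.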

Answered by Colonel Beauvel

If you want to use your function as is, here is a one-liner:

df.update(df.B.apply(lambda x: pd.Series(dict(zip(['C','D'],f(x))))), overwrite=False)

In [350]: df
Out[350]:
   A  B   C   D
2  1  O   s   h
4  2  P  Pk  Pn
7  3  Q  Qk  Qn
9  4  R   h   m
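
To make the role of overwrite=False concrete, here is a self-contained sketch of the one-liner on the question's frame, assuming the missing cells are NaN. The `calls` list and the `candidates` variable are additions for illustration only. They show that f still runs for every row, and that update then keeps the existing non-missing values.

```python
import numpy as np
import pandas as pd

calls = []  # illustration only: records which values f was called with


def f(p):
    calls.append(p)
    return p + 'k', p + 'n'


df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9],
)

# Candidate values for C and D, computed for every row
candidates = df['B'].apply(lambda x: pd.Series(dict(zip(['C', 'D'], f(x)))))

# overwrite=False: only cells that are NA in df are filled from candidates
df.update(candidates, overwrite=False)

print(df)
print(calls)
```

So this approach gives the expected result, but it does not avoid calling f on rows that already have C and D.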

You can also do:

df1 = df.copy()

df[['C','D']] = df.apply(lambda x: pd.Series([x['B'] + 'k', x['B'] + 'n']), axis=1)

df1.update(df, overwrite=False)

Answered by Nader Hisham

Simply do the following:

df.loc[df.C.isnull(), 'C'] = df.loc[df.C.isnull(), 'B'] + 'k'

df.loc[df.D.isnull(), 'D'] = df.loc[df.D.isnull(), 'B'] + 'n'

Check the link indexing-view-versus-copy if you want to know why I've used loc.