Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original Stack Overflow post: http://stackoverflow.com/questions/44061607/

Date: 2020-09-14 03:37:50  Source: igfitidea

Pandas Lambda Function with NaN Support

Tags: python, python-3.x, pandas, lambda, nan

Asked by Tyler Russell

I am trying to write a lambda function in Pandas that checks whether Col1 is NaN and, if so, uses another column's data. I am having trouble getting the code below to execute correctly.

import pandas as pd
import numpy as np
df=pd.DataFrame({ 'Col1' : [1,2,3,np.NaN], 'Col2': [7, 8, 9, 10]})  
df2=df.apply(lambda x: x['Col2'] if x['Col1'].isnull() else x['Col1'], axis=1)

Does anyone have any good idea on how to write a solution like this with a lambda function or have I exceeded the abilities of lambda? If not, do you have another solution? Thanks.

Answered by jezrael

You need pandas.isnull to check whether a scalar is NaN:

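import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan],
                   'Col2': [8, 9, 7, 10]})

# pd.isnull accepts scalars, unlike the Series method .isnull()
df2 = df.apply(lambda x: x['Col2'] if pd.isnull(x['Col1']) else x['Col1'], axis=1)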

print (df)
   Col1  Col2
0   1.0     8
1   2.0     9
2   3.0     7
3   NaN    10

print (df2)
0     1.0
1     2.0
2     3.0
3    10.0
dtype: float64
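
As a hedged illustration of why the original `x['Col1'].isnull()` call fails: inside `apply(..., axis=1)` each `x['Col1']` is a single scalar, which has no `.isnull()` method, while the top-level `pd.isnull` accepts scalars. The sketch below reuses the frame from this answer; the names `value`, `has_method`, `is_missing`, and `vectorized` are my own, and the `Series.where` line is an optional vectorized alternative that is not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col1": [1, 2, 3, np.nan], "Col2": [8, 9, 7, 10]})

# Pull out the scalar that the lambda sees for the last row.
value = df.iloc[3]["Col1"]

# Scalars do not carry the Series method .isnull() ...
has_method = hasattr(value, "isnull")

# ... but the top-level pd.isnull function works on scalars.
is_missing = bool(pd.isnull(value))

# Optional vectorized alternative (an assumption, not from the answer):
# keep Col1 where it is present, otherwise take Col2.
vectorized = df["Col1"].where(df["Col1"].notna(), df["Col2"])
```

For larger frames the vectorized form is usually preferable to a row-wise `apply`, since it avoids calling a Python function for every row.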

But it is better to use Series.combine_first:

df['Col1'] = df['Col1'].combine_first(df['Col2'])

print (df)
   Col1  Col2
0   1.0     8
1   2.0     9
2   3.0     7
3  10.0    10
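
One detail worth knowing about `combine_first` is that it aligns on the index and returns the union of both indexes. A minimal sketch, using two made-up Series `a` and `b` (hypothetical data, not from the question):

```python
import numpy as np
import pandas as pd

# Two Series whose indexes only partly overlap (illustrative values).
a = pd.Series([1.0, np.nan], index=[0, 1])
b = pd.Series([20.0, 30.0], index=[1, 2])

# Values from `a` win where present; gaps are filled from `b`,
# and labels that exist only in `b` are added to the result.
combined = a.combine_first(b)
```

Here `combined` holds 1.0 at label 0 (from `a`), 20.0 at label 1 (filled from `b`), and 30.0 at label 2 (present only in `b`).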

Another solution uses Series.update, which overwrites Col1 wherever Col2 has a non-NaN value:

df['Col1'].update(df['Col2'])
print (df)
   Col1  Col2
0   8.0     8
1   9.0     9
2   7.0     7
3  10.0    10
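
Because `Series.update` takes every non-NaN value from the other Series, it replaces existing values too, as the output above shows. If only the missing entries should change, one option is to pass just the rows where Col1 is NaN. The sketch below works on standalone Series (`col1`, `col2`, `overwritten`, and `filled` are names I chose for illustration):

```python
import numpy as np
import pandas as pd

col1 = pd.Series([1.0, 2.0, 3.0, np.nan])
col2 = pd.Series([8, 9, 7, 10])

# Plain update: every position is overwritten, since col2 has no NaN.
overwritten = col1.copy()
overwritten.update(col2)

# Restricted update: only pass the values at positions where col1 is missing.
filled = col1.copy()
filled.update(col2[col1.isna()])
```

The restricted form gives the same result as `combine_first` for this data, so it is mostly useful when an in-place modification is specifically wanted.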

Answered by Gerges

Assuming that you do have a second column, that is:

df = pd.DataFrame({'Col1': [1, 2, 3, np.nan], 'Col2': [1, 2, 3, 4]})

The correct solution to this problem would be:

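# Assign the result back; fillna(..., inplace=True) on a selected column
# is deprecated in recent pandas versions and may not update df
df['Col1'] = df['Col1'].fillna(df['Col2'])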
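
A small sketch of how `fillna` behaves when given another Series, under the assumption that the result is assigned back to the column rather than modified in place. The variable names `filled` and `still_missing` are mine:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col1": [1, 2, 3, np.nan], "Col2": [1, 2, 3, 4]})

# fillna returns a new Series; only the NaN positions take values from Col2.
filled = df["Col1"].fillna(df["Col2"])

# The frame itself is unchanged until the result is assigned back.
still_missing = int(df["Col1"].isna().sum())

df["Col1"] = filled
```

After the assignment, Col1 no longer contains NaN and Col2 is untouched.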

Answered by Allen

You need to use np.isnan():

import numpy as np
df2 = df.apply(lambda x: 2 if np.isnan(x['Col1']) else 1, axis=1)

df2
Out[1307]: 
0    1
1    1
2    1
3    2
dtype: int64
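
A caveat with `np.isnan` is that it only accepts numeric input, whereas `pd.isnull` also handles `None` and strings. The sketch below compares the two on a few sample values (the list `samples` is made up for illustration):

```python
import numpy as np
import pandas as pd

samples = [1.0, np.nan, None, "text"]

# pd.isnull treats both NaN and None as missing and accepts any scalar.
pandas_flags = [bool(pd.isnull(v)) for v in samples]

# np.isnan raises TypeError for inputs it cannot treat as numbers.
numpy_flags = []
for v in samples:
    try:
        numpy_flags.append(bool(np.isnan(v)))
    except TypeError:
        numpy_flags.append("TypeError")
```

For a purely numeric column such as Col1 both checks agree; the difference only matters for object columns or mixed data.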

Answered by jiahe

With pandas 0.24.2, I use:

df.apply(lambda x: x['col_name'] if x[col1] is np.nan else expressions_another, axis=1)

because pd.isnull() did not work for me.

In my work, I observed the following:

This produces no results:

df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if pd.isnull(x['cnumpday']) else np.nan, axis=1)

This produces results:

df['prop'] = df.apply(lambda x: (x['buynumpday'] / x['cnumpday']) if x['cnumpday'] is not np.nan else np.nan, axis=1)
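
One plausible reading of the two snippets above, offered as a sketch rather than a diagnosis of the author's data: the first version divides only when `cnumpday` is null (the condition looks inverted), so every row ends up NaN, while `is not np.nan` is an identity check that is true for almost any value, so the division always runs. The frame below uses the same column names with made-up numbers, and `pd.notnull` is my substitution for the intended check.

```python
import numpy as np
import pandas as pd

# Identity checks are unreliable: a NaN created elsewhere is a different object.
other_nan = float("nan")
same_object = other_nan is np.nan
detected = bool(pd.isnull(other_nan))

# Hypothetical data using the column names from the answer.
df = pd.DataFrame({"buynumpday": [4.0, 6.0, 5.0],
                   "cnumpday": [2.0, np.nan, 5.0]})

# Condition as written in the first snippet: divides only when cnumpday is null.
inverted = df.apply(
    lambda x: x["buynumpday"] / x["cnumpday"] if pd.isnull(x["cnumpday"]) else np.nan,
    axis=1,
)

# Condition flipped with pd.notnull: divides only when cnumpday is present.
prop = df.apply(
    lambda x: x["buynumpday"] / x["cnumpday"] if pd.notnull(x["cnumpday"]) else np.nan,
    axis=1,
)

# Plain column division gives the same values, since NaN propagates.
vectorized = df["buynumpday"] / df["cnumpday"]
```

With this data `inverted` is NaN in every row, while `prop` and `vectorized` both give 2.0, NaN, 1.0, which suggests `pd.isnull` itself behaves as documented.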