pandas error: float object has no attribute notnull
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/44877663/
Error: float object has no attribute notnull
Question
I have a dataframe:
a b c
0 nan Y nan
1 23 N 3
2 nan N 2
3 44 Y nan
I wish to have this output:
a b c d
0 nan Y nan nan
1 23 N 3 69
2 nan N 2 nan
3 44 Y nan 44
I want the following condition: when column a is null, d is null; otherwise, if column b is N and column c is not null, d equals column a * column c; otherwise, d equals column a.
I have written this code, but I get the error:
def f4(row):
    if row['a']==np.nan:
        return np.nan
    elif row['b']=="N" & row(row['c'].notnull()):
        return row['a']*row['c']
    else:
        return row['a']
DF['P1']=DF.apply(f4,axis=1)
Can anyone help me point out where my mistake is? I have referred to "Creating a new column based on if-elif-else condition" and tried that approach, but I still get the error.
Accepted answer by Scott Boston
You don't need apply; use np.where:
df['d'] = np.where(df.a.isnull(),
                   np.nan,
                   np.where((df.b == "N")&(~df.c.isnull()),
                            df.a*df.c,
                            df.a))
Output:
a b c d
0 NaN Y NaN NaN
1 23.0 N 3.0 69.0
2 NaN N 2.0 NaN
3 44.0 Y NaN 44.0
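For reference, the nested np.where above can be run end to end as the following minimal sketch. The frame is rebuilt by hand from the question's sample data, which is an assumption, since the original construction code is not shown.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question (assumed construction).
df = pd.DataFrame({
    "a": [np.nan, 23, np.nan, 44],
    "b": ["Y", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})

# Outer where: d is NaN wherever a is NaN.
# Inner where: a * c when b == "N" and c is present, otherwise a.
df["d"] = np.where(
    df["a"].isnull(),
    np.nan,
    np.where((df["b"] == "N") & df["c"].notnull(), df["a"] * df["c"], df["a"]),
)

print(df)
```

Here df["c"].notnull() is used in place of ~df.c.isnull(); the two select the same rows.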
Answer by juanpa.arrivillaga
Since you just want NaNs to be propagated, multiplying the columns takes care of that for you:
>>> df = pd.read_clipboard()
>>> df
a b c
0 NaN Y NaN
1 23.0 N 3.0
2 NaN N 2.0
3 44.0 Y NaN
>>> df.a * df.c
0 NaN
1 69.0
2 NaN
3 NaN
dtype: float64
>>>
If you want to do it conditionally, you can use np.where here instead of .apply. All you need is the following:
>>> df
a b c
0 NaN Y NaN
1 23.0 N 3.0
2 NaN N 2.0
3 44.0 Y NaN
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan, 69., nan, 44.])
This is the default behavior for most operations involving NaN. So, you can simply assign the result of the above:
>>> df['d'] = np.where(df.b == 'N', df.a*df.c, df.a)
>>> df
a b c d
0 NaN Y NaN NaN
1 23.0 N 3.0 69.0
2 NaN N 2.0 NaN
3 44.0 Y NaN 44.0
>>>
To elaborate on what this expression is doing:
np.where(df.b == 'N', df.a*df.c, df.a)
You can think of it as "where df.b == 'N', give me the result of df.a * df.c; otherwise, give me just df.a":
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan, 69., nan, 44.])
Also note, if your dataframe were a little different:
>>> df
a b c
0 NaN Y NaN
1 23.0 N 3.0
2 NaN N 2.0
3 44.0 Y NaN
>>> df.loc[0,'a'] = 99
>>> df.loc[0, 'b']= 'N'
>>> df
a b c
0 99.0 N NaN
1 23.0 N 3.0
2 NaN N 2.0
3 44.0 Y NaN
Then the following would not be equivalent:
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan, 69., nan, 44.])
>>> np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
array([ 99., 69., nan, 44.])
So you might want to use the slightly more verbose version:
>>> df['d'] = np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
>>> df
a b c d
0 99.0 N NaN 99.0
1 23.0 N 3.0 69.0
2 NaN N 2.0 NaN
3 44.0 Y NaN 44.0
>>>
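As a side note, the full three-way rule from the question (a missing, then b == 'N' with c present, then the fallback) can also be written with np.select, which is a different NumPy function from the np.where used in this answer. The following is a minimal sketch under that substitution, using the modified sample data from above rebuilt by hand.

```python
import numpy as np
import pandas as pd

# Modified sample data from the answer above (row 0 has a=99, b="N", c=NaN).
df = pd.DataFrame({
    "a": [99, 23, np.nan, 44],
    "b": ["N", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})

conditions = [
    df["a"].isnull(),                      # a missing -> d missing
    (df["b"] == "N") & df["c"].notnull(),  # b is N and c present -> a * c
]
choices = [np.nan, df["a"] * df["c"]]

# np.select takes the first matching condition per row, else the default.
df["d"] = np.select(conditions, choices, default=df["a"])

print(df)
```

Listing the conditions in priority order can be easier to read than nesting np.where calls once there are more than two branches.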
Answer by Vaishali
You can try:
df['d'] = np.where((df.b == 'N') & (pd.notnull(df.c)), df.a*df.c, np.where(pd.notnull(df.a), df.a, np.nan))
a b c d
0 NaN Y NaN NaN
1 23.0 N 3.0 69.0
2 NaN N 2.0 NaN
3 44.0 Y NaN 44.0
See the documentation for pandas notnull. In your current code, you need to change row['c'].notnull() to pd.notnull(row['c']) for it to work, since row['c'] is a scalar float inside apply, not a Series. np.where should be more efficient, though.
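def f4(row):
    # NaN never equals itself, so test for a missing value with pd.isnull
    if pd.isnull(row['a']):
        return np.nan
    # row['c'] is a scalar here, so call pd.notnull on it
    elif (row['b'] == "N") and pd.notnull(row['c']):
        return row['a'] * row['c']
    else:
        return row['a']
df['d'] = df.apply(f4, axis=1)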
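To see why the scalar checks in the question misbehave, here is a small sketch. It assumes that a missing cell reaches the row function as a plain float NaN, which is what the error message in the title suggests.

```python
import numpy as np
import pandas as pd

value = np.nan  # stand-in for what row['c'] holds for a missing cell

# NaN never compares equal to anything, including itself,
# so a test like row['a'] == np.nan is never true.
print(value == np.nan)

# pd.isnull and pd.notnull accept scalars as well as Series.
print(pd.isnull(value))
print(pd.notnull(3.0))

# A float has no .notnull method, hence the AttributeError.
print(hasattr(value, "notnull"))
```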
Answer by Max Kleiner
Use:
pd.isnull(df['Description'][i])
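A minimal sketch of how that cell-by-cell check might be used. The 'Description' column name comes from the line above; the frame contents and the loop over the index are hypothetical and only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; only the 'Description' column name is from the answer.
df = pd.DataFrame({"Description": ["widget", np.nan, "gadget"]})

# Check each cell individually; pd.isnull works on scalar values.
missing = [i for i in df.index if pd.isnull(df["Description"][i])]

print(missing)
```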