pandas error: float object has no attribute notnull

Disclaimer: This page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/44877663/

Time: 2020-09-14 03:55:26  Source: igfitidea

Error: float object has no attribute notnull

python pandas

Question

I have a dataframe:

  a     b     c
0 nan   Y     nan
1  23   N      3
2 nan   N      2
3  44   Y     nan

I wish to have this output:

  a     b     c      d
0 nan   Y     nan   nan
1  23   N      3     69
2 nan   N      2    nan
3  44   Y     nan    44

I want column d to follow this condition: when column a is null, d is null; otherwise, if column b is N and column c is not null, d equals column a * column c; otherwise d equals column a.

I wrote this code, but I get the error:

def f4(row):
    if row['a']==np.nan:
       return np.nan
    elif row['b']=="N" & row(row['c'].notnull()):
       return row['a']*row['c']
    else:
       return row['a']

 DF['P1']=DF.apply(f4,axis=1)

Can anyone help me point out where my mistake is? I referred to "Creating a new column based on if-elif-else condition" and tried that approach, but I still get the error.

Accepted answer by Scott Boston

You don't need apply, use np.where:

df['d'] = np.where(df.a.isnull(),
         np.nan,
         np.where((df.b == "N")&(~df.c.isnull()),
                  df.a*df.c,
                  df.a))

Output:

      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
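
For anyone who wants to run this end to end, here is a minimal sketch of the same nested np.where. It assumes the question's data is rebuilt by hand as a frame named df; the construction step is my addition, not part of the answer.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame by hand (assumed construction, for illustration).
df = pd.DataFrame({
    "a": [np.nan, 23, np.nan, 44],
    "b": ["Y", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})

# Outer where: d is NaN wherever a is NaN.
# Inner where: a * c when b == "N" and c is present, otherwise a.
df["d"] = np.where(
    df["a"].isnull(),
    np.nan,
    np.where((df["b"] == "N") & df["c"].notnull(), df["a"] * df["c"], df["a"]),
)
print(df)
```

The parentheses around df["b"] == "N" matter: & binds more tightly than ==, so leaving them out changes what gets compared.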

Answer by juanpa.arrivillaga

Since you just want NaNs to be propagated, multiplying the columns takes care of that for you:

>>> df = pd.read_clipboard()
>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.a * df.c
0     NaN
1    69.0
2     NaN
3     NaN
dtype: float64
>>>

If you want to do it on a condition, you can use np.where here instead of .apply. All you need is the following:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])

This is the default behavior for most operations involving NaN. So, you can simply assign the result of the above:

>>> df['d'] = np.where(df.b == 'N', df.a*df.c, df.a)
>>> df
      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>

To elaborate on what this expression is doing:

np.where(df.b == 'N', df.a*df.c, df.a)

You can think of it as "where df.b == 'N', give me the result of df.a * df.c; otherwise, give me just df.a":

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])

Also note, if your dataframe were a little different:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.loc[0,'a'] = 99
>>> df.loc[0, 'b']= 'N'
>>> df
      a  b    c
0  99.0  N  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN

Then the following would not be equivalent:

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])
>>> np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
array([ 99.,  69.,  nan,  44.])

So you might want to use the slightly more verbose form:

>>> df['d'] = np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
>>> df
      a  b    c     d
0  99.0  N  NaN  99.0
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>
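
If you need this in more than one place, the verbose form can be wrapped in a small function. This is only a sketch: add_d is a hypothetical helper name of mine, and the frame is the modified one from above, rebuilt by hand.

```python
import numpy as np
import pandas as pd

def add_d(frame):
    """Hypothetical helper: return a copy of `frame` with column d added."""
    out = frame.copy()
    # Multiply only where b is "N" and c actually has a value.
    use_product = (out["b"] == "N") & out["c"].notnull()
    out["d"] = np.where(use_product, out["a"] * out["c"], out["a"])
    return out

# The modified frame from above: row 0 has a value in a but NaN in c.
df = pd.DataFrame({
    "a": [99, 23, np.nan, 44],
    "b": ["N", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})
result = add_d(df)
print(result)
```

Working on a copy leaves the input frame untouched, which makes it easier to compare the two variants side by side.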

Answer by Vaishali

You can try

df['d'] = np.where((df.b == 'N') & (pd.notnull(df.c)), df.a*df.c, np.where(pd.notnull(df.a), df.a, np.nan))


    a       b   c      d
0   NaN     Y   NaN    NaN
1   23.0    N   3.0    69.0
2   NaN     N   2.0    NaN
3   44.0    Y   NaN    44.0

See the documentation for pandas notnull. In your current code, the key change is replacing series.notnull with pd.notnull(series) for it to work, though np.where should be more efficient:

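import numpy as np
import pandas as pd

def f4(row):
    # NaN never equals itself, so test with pd.isnull rather than == np.nan
    if pd.isnull(row['a']):
        return np.nan
    # row['c'] is a scalar here, so use pd.notnull(...) and plain `and`
    elif row['b'] == "N" and pd.notnull(row['c']):
        return row['a'] * row['c']
    else:
        return row['a']

df['d'] = df.apply(f4, axis=1)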
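
To see where the original error comes from, here is a small sketch using a bare np.nan as a stand-in for what row['c'] holds inside apply(..., axis=1). The variable names are mine, for illustration only.

```python
import numpy as np
import pandas as pd

value = np.nan  # stand-in for a missing cell: a plain Python float

# A float has no .notnull() method; that method lives on Series and DataFrame.
has_method = hasattr(value, "notnull")

# NaN never compares equal to anything, including itself,
# so a check like row['a'] == np.nan can never succeed.
equal_to_nan = (value == np.nan)

# pd.isnull and pd.notnull accept scalars, so they work inside a row function.
is_missing = pd.isnull(value)
is_present = pd.notnull(3.0)

print(has_method, equal_to_nan, is_missing, is_present)
```

The question's row['b']=="N" & ... has a separate problem: & binds more tightly than ==, so each comparison needs its own parentheses, or use `and` for scalars.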

Answer by Max Kleiner

Use

pd.isnull(df['Description'][i])