pandas error: 'float' object has no attribute 'notnull'

Question

I have a dataframe:

  a     b     c
0 nan   Y     nan
1  23   N      3
2 nan   N      2
3  44   Y     nan

I wish to have this output:

  a     b     c      d
0 nan   Y     nan   nan
1  23   N      3     69
2 nan   N      2    nan
3  44   Y     nan    44

I want the following condition: when column a is null, d is null; otherwise, if column b is N and column c is not null, d equals column a * column c; otherwise, d equals column a.

I wrote this code, but I get an error:

def f4(row):
    if row['a']==np.nan:
       return np.nan
    elif row['b']=="N" & row(row['c'].notnull()):
       return row['a']*row['c']
    else:
       return row['a']

 DF['P1']=DF.apply(f4,axis=1)

Can anyone help me point out where my mistake is? I referred to "Creating a new column based on if-elif-else condition" and tried that approach, but I still get the error.

Answer 1

Accepted answer by Scott Boston

You don't need apply, use np.where:

df['d'] = np.where(df.a.isnull(),
         np.nan,
         np.where((df.b == "N")&(~df.c.isnull()),
                  df.a*df.c,
                  df.a))

Output:

      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
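
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption reconstructed from the table in the question; the nested np.where logic is the same as above, written with notnull() in place of ~isnull().

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (the exact dtypes are an assumption).
df = pd.DataFrame({
    "a": [np.nan, 23, np.nan, 44],
    "b": ["Y", "N", "N", "Y"],
    "c": [np.nan, 3, 2, np.nan],
})

# Outer where: d is NaN wherever a is NaN.
# Inner where: multiply a by c only when b == "N" and c is present,
# otherwise fall back to a.
df["d"] = np.where(
    df["a"].isnull(),
    np.nan,
    np.where((df["b"] == "N") & df["c"].notnull(), df["a"] * df["c"], df["a"]),
)

print(df)
```

The whole column is computed in one vectorized pass, so no row-wise function is involved and the scalar-versus-Series confusion from the question cannot arise.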

Answer 2

Answer by juanpa.arrivillaga

Since you just want NaNs to be propagated, multiplying the columns takes care of that for you:

>>> df = pd.read_clipboard()
>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.a * df.c
0     NaN
1    69.0
2     NaN
3     NaN
dtype: float64
>>>

If you want to do it on a condition, you can use np.where here instead of .apply. All you need is the following:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])

NaN propagation is the default behavior for most operations involving NaN, so you can simply assign the result of the above:

>>> df['d'] = np.where(df.b == 'N', df.a*df.c, df.a)
>>> df
      a  b    c     d
0   NaN  Y  NaN   NaN
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>

Just to elaborate on what this expression is doing:

np.where(df.b == 'N', df.a*df.c, df.a)

You can think of it as "where df.b == 'N', give me the result of df.a * df.c; otherwise, give me just df.a":

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])
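
To make that element-wise reading concrete, the sketch below compares np.where with an explicit Python loop over the same inputs. The three small arrays are made-up illustration data, not the frame from the question.

```python
import numpy as np

# Hypothetical inputs, chosen only to illustrate the selection rule.
cond = np.array([True, False, True])
x = np.array([10.0, 20.0, 30.0])
y = np.array([1.0, 2.0, 3.0])

# Vectorized: take x[i] where cond[i] is True, otherwise y[i].
vectorized = np.where(cond, x, y)

# Equivalent explicit loop, shown only for comparison.
looped = np.array([xi if ci else yi for ci, xi, yi in zip(cond, x, y)])

print(vectorized)
print(looped)
```

Note that both candidate arrays are fully computed before np.where picks between them, which is why df.a*df.c is evaluated even for rows where b is 'Y'.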

Also note, if your dataframe were a little different:

>>> df
      a  b    c
0   NaN  Y  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN
>>> df.loc[0,'a'] = 99
>>> df.loc[0, 'b']= 'N'
>>> df
      a  b    c
0  99.0  N  NaN
1  23.0  N  3.0
2   NaN  N  2.0
3  44.0  Y  NaN

Then the following would not be equivalent:

>>> np.where(df.b == 'N', df.a*df.c, df.a)
array([ nan,  69.,  nan,  44.])
>>> np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
array([ 99.,  69.,  nan,  44.])

So you might want to use the slightly more verbose version:

>>> df['d'] = np.where((df.b == 'N') & (~df.c.isnull()), df.a*df.c, df.a)
>>> df
      a  b    c     d
0  99.0  N  NaN  99.0
1  23.0  N  3.0  69.0
2   NaN  N  2.0   NaN
3  44.0  Y  NaN  44.0
>>>

Answer 3

Answer by Vaishali

You can try:

df['d'] = np.where((df.b == 'N') & (pd.notnull(df.c)), df.a*df.c, np.where(pd.notnull(df.a), df.a, np.nan))


    a       b   c      d
0   NaN     Y   NaN    NaN
1   23.0    N   3.0    69.0
2   NaN     N   2.0    NaN
3   44.0    Y   NaN    44.0

See the documentation for pandas notnull. In your current code, row['c'] is a single float rather than a Series, so it has no .notnull method; changing series.notnull() to pd.notnull(series) makes it work, though np.where should be more efficient:

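def f4(row):
    # NaN never equals itself, so test for missing values with pd.isnull
    if pd.isnull(row['a']):
        return np.nan
    # row['c'] is a scalar here, so use pd.notnull rather than a .notnull method
    elif row['b'] == "N" and pd.notnull(row['c']):
        return row['a'] * row['c']
    else:
        return row['a']

df['d'] = df.apply(f4, axis=1)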
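
As background on why the scalar checks matter in a row-wise function, here is a minimal sketch of the two pitfalls in the question's code. It assumes a missing cell reaches the function as a plain float NaN, which matches the error message in the title; the variable names are for illustration only.

```python
import numpy as np
import pandas as pd

# Stand-in for what row['a'] or row['c'] holds when the cell is missing.
value = np.nan

# NaN never compares equal to anything, including itself,
# so a test like row['a'] == np.nan can never be True.
nan_equals_nan = (value == np.nan)

# A plain float has no .notnull method; Series and DataFrame objects do.
float_has_notnull = hasattr(value, "notnull")

# The top-level pandas functions accept scalars as well as Series.
is_missing = pd.isnull(value)
is_present = pd.notnull(3.0)

print(nan_equals_nan, float_has_notnull, is_missing, is_present)
```

The same reasoning explains why a comparison against np.nan in the first branch goes unnoticed: that branch never fires, and the NaN is only returned because the final else passes row['a'] through unchanged.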

Answer 4

Answer by Max Kleiner

Use

pd.isnull(df['Description'][i])
