Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original URL: http://stackoverflow.com/questions/45811610/

Time: 2020-09-14 04:17:26  Source: igfitidea

The truth value of a Series is ambiguous in dataframe

Tags: python, python-2.7, pandas

Asked by DYZ

I have the following code. I'm trying to create a new field in a pandas DataFrame with a simple condition:

if df_reader['email1_b']=='NaN':
    df_reader['email1_fin']=df_reader['email1_a']
else:
    df_reader['email1_fin']=df_reader['email1_b']

But I get this strange error:

ValueError                                Traceback (most recent call last)
<ipython-input-92-46d604271768> in <module>()
----> 1 if df_reader['email1_b']=='NaN':
      2     df_reader['email1_fin']=df_reader['email1_a']
      3 else:
      4     df_reader['email1_fin']=df_reader['email1_b']

/home/user/GL-env_py-gcc4.8.5/lib/python2.7/site-packages/pandas/core/generic.pyc in __nonzero__(self)
    953         raise ValueError("The truth value of a {0} is ambiguous. "
    954                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 955                          .format(self.__class__.__name__))
    956 
    957     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can anybody explain what I need to do about this?

Answered by DYZ

df_reader['email1_b']=='NaN' is a vector of Boolean values (one per row), but if needs a single Boolean value to work. Use this instead:

import numpy as np

df_reader['email1_fin'] = np.where(df_reader['email1_b'] == 'NaN',
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
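
As a rough illustration of how the np.where call behaves, here is a minimal sketch on a made-up three-row frame. The email addresses are hypothetical sample data; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": ["NaN", "b2@example.com", "NaN"],  # the *string* 'NaN'
})

# The comparison yields one Boolean per row, not a single True/False.
mask = df_reader["email1_b"] == "NaN"

# np.where takes email1_a where mask is True, otherwise email1_b.
df_reader["email1_fin"] = np.where(
    mask, df_reader["email1_a"], df_reader["email1_b"]
)

print(mask.tolist())
print(df_reader["email1_fin"].tolist())
```

The selection happens row by row, which is why no if statement is needed.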

As a side note, are you sure the values are the string 'NaN'? Are they not real NaN (missing) values? In the latter case, your expression should be:

df_reader['email1_fin'] = np.where(df_reader['email1_b'].isnull(), 
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
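
To see why the distinction matters, the following sketch (again with hypothetical sample data) compares the two tests on a column that holds a real missing value rather than the text 'NaN'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a real missing value in email1_b.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com"],
    "email1_b": [np.nan, "b2@example.com"],
})

# Comparing against the string 'NaN' does not match a real NaN.
string_match = df_reader["email1_b"] == "NaN"

# isnull() is the test that detects missing values.
missing = df_reader["email1_b"].isnull()

df_reader["email1_fin"] = np.where(
    missing, df_reader["email1_a"], df_reader["email1_b"]
)

print(string_match.tolist())
print(missing.tolist())
print(df_reader["email1_fin"].tolist())
```

With real missing values, the string comparison is False on every row, so the first np.where variant would silently copy the NaN through.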

Answered by EdChum

if expects a single scalar value; it doesn't understand an array of booleans, which is what your condition returns. If you think about it, what should it do if just a single value in this array is False (or True)?
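
The ambiguity can be reproduced in a few lines. This is a small sketch with a hypothetical Boolean Series named flags; it shows that putting the Series in an if raises the error from the question, while any() and all() each reduce it to one unambiguous value.

```python
import pandas as pd

# Hypothetical Boolean Series standing in for the result of a comparison.
flags = pd.Series([True, False, True])

message = None
try:
    if flags:  # pandas refuses to collapse several values into one
        pass
except ValueError as exc:
    message = str(exc)

# Explicit reductions state which question is being asked.
any_true = bool(flags.any())  # is at least one value True?
all_true = bool(flags.all())  # are all values True?

print(message)
print(any_true, all_true)
```

Note that any() and all() answer a question about the whole column, so they are not a substitute for the row-by-row selection needed here.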

To do this properly, you can do the following:

df_reader['email1_fin'] = np.where(df_reader['email1_b'] == 'NaN', df_reader['email1_a'], df_reader['email1_b'])

Also, you seem to be comparing against the str 'NaN' rather than the numerical NaN. Is this intended?
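
If it is unclear which of the two the column contains, one possible approach is to treat both as blank. The sketch below is not from the answers above: it swaps in the pandas method Series.where in place of np.where, and the sample data and the name is_blank are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing the string 'NaN' and a real NaN.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": ["NaN", np.nan, "b3@example.com"],
})

b = df_reader["email1_b"]

# A row counts as blank if it holds the text 'NaN' or a missing value.
is_blank = (b == "NaN") | b.isnull()

# Series.where keeps b where the condition is True and takes the
# fallback (email1_a) where it is False, hence the negation.
df_reader["email1_fin"] = b.where(~is_blank, df_reader["email1_a"])

print(df_reader["email1_fin"].tolist())
```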
