Python: how to iterate through rows of a dataframe and check whether a value in a column is NaN

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/33124117/

Time: 2020-08-19 12:48:39  Source: igfitidea

How to iterate through rows of a dataframe and check whether value in a column row is NaN

Tags: python, pandas, iteration, row, dataframe

Asked by sequence_hard

I have a beginner question. I have a dataframe that I am iterating over, and I want to check whether the value in each row of Column2 is NaN or not, so that I can perform an action on the value if it is not NaN. My DataFrame looks like this:

df:

  Column1  Column2
0    a        hey
1    b        NaN
2    c        up

What I am trying right now is:

for item, frame in df['Column2'].iteritems():
    if frame.notnull() == True:
        print 'frame'

The thought behind that is that I iterate over the rows in Column2 and print frame for every row that has a value (which is a string). What I get, however, is this:

AttributeError                            Traceback (most recent call last)
<ipython-input-80-8b871a452417> in <module>()
      1 for item, frame in df['Column2'].iteritems():
----> 2     if frame.notnull() == True:
      3         print 'frame'

AttributeError: 'float' object has no attribute 'notnull'

When I only run the first line of my code, I get

0
hey
1
nan
2
up

which suggests that the floats in the output of the first line are the cause of the error. Can anybody tell me how I can accomplish what I want?

Accepted answer by Anand S Kumar

As you already understand, frame in

for item, frame in df['Column2'].iteritems():

is each row value in the column, so its type is the type of the elements in the column (which most probably is not Series or DataFrame). Hence, calling frame.notnull() on it does not work.

You should instead try:

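import pandas as pd

# Note: Series.iteritems() was removed in pandas 2.0; use .items() there.
for item, frame in df['Column2'].iteritems():
    if pd.notnull(frame):  # pd.notnull accepts scalars, unlike frame.notnull()
        print(frame)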
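
For readers on a current pandas release, the same loop can be sketched as follows. This is a minimal illustration rather than the answerer's own code: it rebuilds the question's DataFrame by hand, uses Series.items() in place of iteritems(), and the list name present is an assumption introduced only so the result can be inspected.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

# Series.items() yields (index, value) pairs, one per row of the column.
present = []  # hypothetical name: collects the values that are not NaN
for index, value in df["Column2"].items():
    if pd.notna(value):  # pd.notna is an alias of pd.notnull
        present.append(value)
        print(index, value)
```

The loop prints only the rows with index 0 and 2, because the value at index 1 is NaN.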

Answer by Hackaholic

Try this:

df[df['Column2'].notnull()]

The above code returns the rows of the DataFrame in which Column2 is not null.
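
To make the filtering step concrete, here is a small sketch that separates the boolean mask from the selection. The DataFrame is rebuilt from the question; the variable names mask, filtered, and same are assumptions for illustration, and dropna(subset=...) is shown only as an alternative spelling for this particular case.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

# notnull() on the whole column returns a boolean Series, one flag per row.
mask = df["Column2"].notnull()

# Indexing the DataFrame with that mask keeps only the rows flagged True.
filtered = df[mask]
print(filtered)

# dropna restricted to Column2 selects the same rows.
same = df.dropna(subset=["Column2"])
```

No explicit loop is needed here; the rows are selected in one vectorized step.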

Answer by Evan Wright

Using iteritems on a Series (which is what you get when you take a column from a DataFrame) iterates over (index, value) pairs. So your item will take the values 0, 1, and 2 in the three iterations of the loop, and your frame will take the values 'hey', NaN, and 'up' (so "frame" is probably a bad name for it). The error comes from trying to call the method notnull on NaN (which is represented as a floating-point number).

You can use the function pd.notnull instead:

In [3]: pd.notnull(np.nan)
Out[3]: False

In [4]: pd.notnull('hey')
Out[4]: True
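
The following sketch illustrates why the method call fails while the module-level function works. It assumes only numpy and pandas; the names value and checks are introduced here for illustration.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is an ordinary Python float, so it has no pandas methods attached.
print(type(value))                # <class 'float'>
print(hasattr(value, "notnull"))  # False, which is why the AttributeError occurs

# The module-level function accepts scalars of any type.
checks = [pd.notnull(v) for v in ("hey", np.nan, None)]
print(checks)                     # [True, False, False]

# NaN never compares equal to itself, so == is not a usable test for it.
print(value == value)             # False
```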

Another way would be to use notnull on the whole Series, and then iterate over those values (which are now booleans):

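# Series.iteritems() was removed in pandas 2.0; use .items() there.
for _, value in df['Column2'].notnull().iteritems():
    if value:  # True where Column2 is not NaN
        print('frame')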
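
Iterating over the boolean flags alone loses the original values, so if the goal is to act on each non-NaN entry, one possible variation is to use the mask to select the values first. The sketch below assumes the question's DataFrame; upper-casing the strings is a placeholder action chosen for illustration, not something taken from the question, and the name visited is likewise hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

mask = df["Column2"].notnull()

# Iterating over the masked column visits only the rows that hold a value.
visited = [(index, value) for index, value in df.loc[mask, "Column2"].items()]
print(visited)

# Placeholder action (an assumption): upper-case the strings that are present.
# Rows where the mask is False keep their NaN.
df.loc[mask, "Column2"] = df.loc[mask, "Column2"].str.upper()
print(df)
```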