How to iterate through rows of a dataframe and check whether value in a column row is NaN
Original question: http://stackoverflow.com/questions/33124117/
Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, link to the original question, and attribute it to the original authors (not me): Stack Overflow
Asked by sequence_hard
I have a beginner question. I have a DataFrame that I am iterating over, and I want to check whether the value in a Column2 row is NaN, so that I can perform an action on the value if it is not NaN. My DataFrame looks like this:
df:
  Column1 Column2
0       a     hey
1       b     NaN
2       c      up
What I am trying right now is:
for item, frame in df['Column2'].iteritems():
    if frame.notnull() == True:
        print 'frame'
The idea is to iterate over the rows in Column2 and print frame for every row that has a value (which is a string). What I get, however, is this:
AttributeError Traceback (most recent call last)
<ipython-input-80-8b871a452417> in <module>()
1 for item, frame in df['Column2'].iteritems():
----> 2 if frame.notnull() == True:
3 print 'frame'
AttributeError: 'float' object has no attribute 'notnull'
When I run only the first line of my code, I get:
0 hey
1 nan
2 up
which suggests that the floats in the output of the first line are the cause of the error. Can anybody tell me how I can accomplish what I want?
Accepted answer by Anand S Kumar
As you already understand, frame in
for item, frame in df['Column2'].iteritems():
is the value of each row in the column, so its type is the type of the elements in the column (which most probably is not Series or DataFrame). Hence, calling frame.notnull() on it will not work.
You should instead try:
for item, frame in df['Column2'].iteritems():
    if pd.notnull(frame):
        print(frame)
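For readers on current pandas, here is a minimal sketch of the same loop. It assumes the sample DataFrame from the question (rebuilt here by hand) and uses Series.items(), since iteritems() was removed in pandas 2.0. The names idx, value, and non_null_values are illustrative only and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

non_null_values = []  # illustrative container, not part of the original answer

# Series.items() yields (index, value) pairs; it replaces iteritems(),
# which was removed in pandas 2.0.
for idx, value in df["Column2"].items():
    # pd.notna is the module-level check that works on plain scalars.
    if pd.notna(value):
        non_null_values.append(value)
        print(idx, value)
```

pd.notna and pd.notnull are aliases, so either name works in the condition.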
Answer by Hackaholic
Try this:
df[df['Column2'].notnull()]
The above code will give you the rows for which Column2 is not null.
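As a rough illustration of how that filter can replace the loop entirely, the sketch below builds the boolean mask once and then applies an action only to the non-null values. The sample DataFrame is rebuilt from the question, and upper-casing is a stand-in for whatever action is actually needed; the names mask, filtered, and shouted are hypothetical.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

# Boolean Series: True where Column2 holds a value, False where it is NaN.
mask = df["Column2"].notnull()

# Keep only the rows whose Column2 is not null.
filtered = df[mask]
print(filtered)

# Example action (an assumption for illustration): upper-case the kept values.
shouted = df.loc[mask, "Column2"].str.upper()
print(shouted.tolist())
```

Working on the mask avoids a Python-level loop, which usually matters once the frame grows beyond a few thousand rows.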
Answer by Evan Wright
Using iteritems on a Series (which is what you get when you take a column from a DataFrame) iterates over (index, value) pairs. So your item will take the values 0, 1, and 2 in the three iterations of the loop, and your frame will take the values 'hey', NaN, and 'up' (so "frame" is probably a bad name for it). The error comes from trying to call the method notnull on NaN (which is represented as a floating-point number).
You can use the function pd.notnull instead:
In [3]: pd.notnull(np.nan)
Out[3]: False
In [4]: pd.notnull('hey')
Out[4]: True
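To make the cause of the AttributeError concrete, the following small sketch checks the properties of NaN described above. It relies only on NumPy and pandas; the variable name value is illustrative.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is an ordinary Python float, so it has no pandas methods.
print(type(value))
print(hasattr(value, "notnull"))

# NaN never compares equal to itself, so `value == np.nan` is not a usable test.
print(value == value)

# The module-level function accepts any scalar, whatever its type.
print(pd.notnull(value), pd.notnull("hey"), pd.notnull(None))
```

This is why the check has to be a function applied to the value rather than a method called on it.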
Another way would be to use notnull on the whole Series, and then iterate over those values (which are now boolean):
for _, value in df['Column2'].notnull().iteritems():
    if value:
        print('frame')
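One possible variation on this approach, sketched here under the assumption that the actual cell value is wanted rather than a fixed string, is to walk the column and its boolean mask together with zip. The sample DataFrame is rebuilt from the question, and the names column, is_present, and kept are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({"Column1": ["a", "b", "c"],
                   "Column2": ["hey", np.nan, "up"]})

column = df["Column2"]
kept = []  # illustrative container for the values that pass the check

# Pair every cell with its precomputed "is not null" flag.
for value, is_present in zip(column, column.notnull()):
    if is_present:
        kept.append(value)
        print(value)
```

Computing the mask once on the whole Series keeps the null check vectorized, while the loop body still has access to each value.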