Python Pandas: Check if all columns in rows value is NaN

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39298372/

Time: 2020-09-14 01:56:53  Source: igfitidea

Tags: python, pandas, nan

Asked by Baig

Please accept my apologies if my question has already been answered. I tried to find a solution, but all I could find was the dropna solution for all NaNs in a dataframe. I have a dataframe with 6 columns and 500 rows. I need to check whether all the values in any particular row are NaN so that I can drop those rows from my dataset. In the example below, rows 2, 6 and 7 contain only NaN from Col1 to Col6:

    Col1    Col2    Col3    Col4    Col5    Col6
    12      25      02      78      88      90
    Nan     Nan     Nan     Nan     Nan     Nan
    Nan     35      03      11      65      53
    Nan     Nan     Nan     Nan     22      21
    Nan     15      93      111     165     153
    Nan     Nan     Nan     Nan     Nan     Nan
    Nan     Nan     Nan     Nan     Nan     Nan
    141     121     Nan     Nan     Nan     Nan

Please note that the top row is just headings; my data starts from the 2nd row onwards. I would be grateful if anyone could point me in the right direction to solve this puzzle.

My second question: after deleting the rows where all columns are NaN, what is the best way to delete the rows where 4 or 5 columns are missing data?

My last question: after deleting the rows with mostly NaN values, how can I create a box plot of the remaining rows (for example, 450 rows)?

Any response will be highly appreciated.


Regards,


Accepted answer by Ami Tavory

I need to check if in any particular row all the values are NaN so that I can drop them from my dataset.


That's exactly what pd.DataFrame.dropna(how='all') does:

In [3]: df = pd.DataFrame({'a': [None, 1, None], 'b': [None, 1, 2]})

In [4]: df
Out[4]: 
     a    b
0  NaN  NaN
1  1.0  1.0
2  NaN  2.0

In [5]: df.dropna(how='all')
Out[5]: 
     a    b
1  1.0  1.0
2  NaN  2.0
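The question also asks how to drop rows where 4 or 5 of the 6 columns are missing. A minimal sketch, assuming the sample from the question has been loaded with "Nan" as real missing values: dropna also accepts a thresh parameter, the minimum number of non-NaN values a row needs in order to be kept. The choice thresh=3 below is an assumption that follows from "4 or 5 missing out of 6".

```python
import numpy as np
import pandas as pd

nan = np.nan
# The sample table from the question, typed in by hand for illustration.
df = pd.DataFrame(
    [[12, 25, 2, 78, 88, 90],
     [nan] * 6,
     [nan, 35, 3, 11, 65, 53],
     [nan, nan, nan, nan, 22, 21],
     [nan, 15, 93, 111, 165, 153],
     [nan] * 6,
     [nan] * 6,
     [141, 121, nan, nan, nan, nan]],
    columns=['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6'])

# Drop only the rows in which every column is NaN.
no_empty = df.dropna(how='all')

# Keep rows with at least 3 non-NaN values, i.e. drop rows
# that are missing 4 or more of the 6 columns.
mostly_full = df.dropna(thresh=3)

print(no_empty.index.tolist())
print(mostly_full.index.tolist())
```

Both calls return a new frame and leave df unchanged. The two options are used in separate calls here, since recent pandas versions do not allow how and thresh to be combined.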

Regarding your last question (the box plot), pd.DataFrame.boxplot will do that. You can specify the columns you want (if needed) with the column parameter. See also the example in the docs.
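A rough sketch of the box plot step, under stated assumptions: the data here is random and only stands in for the roughly 450 rows left after cleaning, the output file name is hypothetical, and the Agg backend is chosen only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cleaned dataset (450 rows, 6 columns).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(450, 6)),
                  columns=['Col1', 'Col2', 'Col3', 'Col4', 'Col5', 'Col6'])

# Draw one box per selected column; omit `column` to plot every numeric column.
ax = df.boxplot(column=['Col1', 'Col2', 'Col3'])
ax.set_ylabel('value')

# Write the figure to disk (hypothetical file name).
plt.savefig('boxplot.png')
```

In an interactive session, plt.show() can be used instead of saving the figure to a file.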

Answer by Wong Tat Yau

For those who arrived here searching for the question in the title:

Check if all columns in rows value is NaN


A simple approach would be:


df[list_of_cols_to_check].isnull().apply(lambda x: all(x), axis=1)


import pandas as pd
import numpy as np


df = pd.DataFrame({'movie': [np.nan, 'thg', 'mol', 'mol', 'lob', 'lob'],
                  'rating': [np.nan, 4., 5., np.nan, np.nan, np.nan],
                  'name':   ['John', np.nan, 'N/A', 'Graham', np.nan, np.nan]}) 
df.head()




To check whether all columns are NaN:

cols_to_check = df.columns
df['is_na'] = df[cols_to_check].isnull().apply(lambda x: all(x), axis=1) 
df.head() 
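The same row-wise check can also be written without apply. A minimal sketch, assuming the frame from this answer with one extra all-NaN row appended purely for illustration (the original frame has no row in which every column is missing): isna().all(axis=1) reduces each row to a single boolean.

```python
import numpy as np
import pandas as pd

# Frame from the answer, plus one all-NaN row added here for illustration.
df = pd.DataFrame({'movie': [np.nan, 'thg', 'mol', 'mol', 'lob', 'lob', np.nan],
                   'rating': [np.nan, 4., 5., np.nan, np.nan, np.nan, np.nan],
                   'name': ['John', np.nan, 'N/A', 'Graham', np.nan, np.nan, np.nan]})

# True where every column in the row is missing.
all_na = df.isna().all(axis=1)

# The apply-based form gives the same booleans, row by row.
via_apply = df.isnull().apply(lambda x: all(x), axis=1)

# Keep only the rows that have at least one value.
non_empty = df[~all_na]

print(all_na.tolist())
print(non_empty)
```

Note that the string 'N/A' counts as a regular value here; only NaN and None are treated as missing.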




To check whether the columns 'name' and 'rating' are both NaN:

cols_to_check = ['name', 'rating']
df['is_na'] = df[cols_to_check].isnull().apply(lambda x: all(x), axis=1) 
df.head()  
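If the goal is to drop those rows rather than flag them, dropna can be restricted to the same columns. A small sketch, assuming the frame defined earlier in this answer: subset limits the check to the listed columns, and how='all' drops a row only when all of them are missing.

```python
import numpy as np
import pandas as pd

# Frame from the answer above.
df = pd.DataFrame({'movie': [np.nan, 'thg', 'mol', 'mol', 'lob', 'lob'],
                   'rating': [np.nan, 4., 5., np.nan, np.nan, np.nan],
                   'name': ['John', np.nan, 'N/A', 'Graham', np.nan, np.nan]})

cols_to_check = ['name', 'rating']

# Flag rows where both 'name' and 'rating' are missing.
is_na = df[cols_to_check].isna().all(axis=1)

# Drop those rows; other columns (here 'movie') are ignored by the check.
kept = df.dropna(subset=cols_to_check, how='all')

print(is_na.tolist())
print(kept)
```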
