Original source: http://stackoverflow.com/questions/39298372/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Python Pandas: Check if all columns in rows value is NaN
Asked by Baig
Apologies if this has already been answered. I looked for a solution, but all I could find was using dropna to remove every NaN in a dataframe. I have a dataframe with 6 columns and 500 rows. I need to check whether all the values in any given row are NaN so that I can drop those rows from my dataset. In the example below, rows 2, 6 and 7 contain only NaN from Col1 to Col6:
Col1  Col2  Col3  Col4  Col5  Col6
12    25    02    78    88    90
NaN   NaN   NaN   NaN   NaN   NaN
NaN   35    03    11    65    53
NaN   NaN   NaN   NaN   22    21
NaN   15    93    111   165   153
NaN   NaN   NaN   NaN   NaN   NaN
NaN   NaN   NaN   NaN   NaN   NaN
141   121   NaN   NaN   NaN   NaN
Please note that the top row is just headings and my data starts from the second row onwards. I would be grateful if anyone could point me in the right direction.
My second question: after deleting the rows where every column is NaN, if I also want to delete the rows where 4 or 5 columns are missing, what is the best solution?
My last question: after deleting the rows that are mostly NaN, how can I create a box plot from the remaining rows (for example, 450 rows)?
Any response will be highly appreciated.
Regards,
Accepted answer by Ami Tavory
I need to check if in any particular row all the values are NaN so that I can drop them from my dataset.
That's exactly what pd.DataFrame.dropna(how='all') does:
In [3]: df = pd.DataFrame({'a': [None, 1, None], 'b': [None, 1, 2]})

In [4]: df
Out[4]:
     a    b
0  NaN  NaN
1  1.0  1.0
2  NaN  2.0

In [5]: df.dropna(how='all')
Out[5]:
     a    b
1  1.0  1.0
2  NaN  2.0
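The question also asked about dropping rows where 4 or 5 of the 6 columns are missing. A minimal sketch of one way to do that is the thresh parameter of dropna, which keeps only rows with at least that many non-NaN values. The dataframe below is rebuilt from the table in the question, and the choice of thresh=3 is an assumption that "4 or more missing out of 6" is the cutoff wanted:

```python
import numpy as np
import pandas as pd

nan = np.nan
# Data copied from the example table in the question (8 rows, 6 columns).
df = pd.DataFrame(
    [
        [12, 25, 2, 78, 88, 90],
        [nan, nan, nan, nan, nan, nan],
        [nan, 35, 3, 11, 65, 53],
        [nan, nan, nan, nan, 22, 21],
        [nan, 15, 93, 111, 165, 153],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [141, 121, nan, nan, nan, nan],
    ],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5", "Col6"],
)

# Drop only the rows in which every column is NaN.
no_empty_rows = df.dropna(how="all")
print(no_empty_rows.index.tolist())  # [0, 2, 3, 4, 7]

# Keep rows with at least 3 non-NaN values, i.e. drop rows
# that are missing 4, 5 or all 6 of the columns.
mostly_filled = df.dropna(thresh=3)
print(mostly_filled.index.tolist())  # [0, 2, 4]
```

Since a row with 6 missing values also fails the thresh=3 test, the second call alone covers both steps here.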
Regarding your question about box plots, pd.DataFrame.boxplot will do that. You can restrict the plot to specific columns, if needed, with the column parameter. See also the example in the docs.
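As a rough illustration of the boxplot call, the sketch below cleans a small dataframe and then plots two of its columns. The data is a shortened copy of the question's table, and the Agg backend and the choice of columns are assumptions made only so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened

import numpy as np
import pandas as pd

nan = np.nan
# A few rows taken from the question's example table.
df = pd.DataFrame(
    [
        [12, 25, 2, 78, 88, 90],
        [nan, nan, nan, nan, nan, nan],
        [nan, 35, 3, 11, 65, 53],
        [nan, nan, nan, nan, 22, 21],
        [nan, 15, 93, 111, 165, 153],
    ],
    columns=["Col1", "Col2", "Col3", "Col4", "Col5", "Col6"],
)

# Remove the rows that are mostly NaN before plotting.
cleaned = df.dropna(thresh=3)

# Draw one box per requested column; NaN values inside a column are ignored.
ax = cleaned.boxplot(column=["Col5", "Col6"])
fig = ax.get_figure()
# fig.savefig("boxplot.png")  # uncomment to write the plot to a file
```

Leaving out the column argument plots every numeric column of the dataframe.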
Answer by Wong Tat Yau
For those who arrived here searching for the question in the title:
Check if all columns in rows value is NaN
A simple approach would be:
df[list_of_cols_to_check].isnull().apply(lambda x: all(x), axis=1)  # list_of_cols_to_check is a list of column names
import pandas as pd
import numpy as np
df = pd.DataFrame({'movie': [np.nan, 'thg', 'mol', 'mol', 'lob', 'lob'],
                   'rating': [np.nan, 4., 5., np.nan, np.nan, np.nan],
                   'name': ['John', np.nan, 'N/A', 'Graham', np.nan, np.nan]})
df.head()
To check whether all columns are NaN:
cols_to_check = df.columns
df['is_na'] = df[cols_to_check].isnull().apply(lambda x: all(x), axis=1)
df.head()
To check whether the columns 'name' and 'rating' are both NaN:
cols_to_check = ['name', 'rating']
df['is_na'] = df[cols_to_check].isnull().apply(lambda x: all(x), axis=1)
df.head()
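The row-wise apply above can likely be replaced by the built-in all(axis=1) reduction, which gives the same boolean mask without calling a Python function per row. A minimal sketch on the same sample data follows; the variable names mask_apply, mask_all, kept and kept_dropna are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'movie': [np.nan, 'thg', 'mol', 'mol', 'lob', 'lob'],
                   'rating': [np.nan, 4., 5., np.nan, np.nan, np.nan],
                   'name': ['John', np.nan, 'N/A', 'Graham', np.nan, np.nan]})

cols_to_check = ['name', 'rating']

# The approach from the answer: a Python-level all() applied to each row.
mask_apply = df[cols_to_check].isnull().apply(lambda x: all(x), axis=1)

# The vectorized form: True where every checked column is NaN.
mask_all = df[cols_to_check].isnull().all(axis=1)
print(mask_all.tolist())  # [False, False, False, False, True, True]

# Drop the flagged rows, either with the mask or directly with dropna.
kept = df[~mask_all]
kept_dropna = df.dropna(subset=cols_to_check, how='all')
print(kept.index.tolist())  # [0, 1, 2, 3]
```

Note that the string 'N/A' in the name column is an ordinary value, not a missing one, so it does not count as NaN here.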