Python: a better way to drop NaN rows in pandas

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/36370839/

Time: 2020-08-19 17:47:46  Source: igfitidea

better way to drop nan rows in pandas

python pandas

Asked by kilojoules

On my own I found a way to drop NaN rows from a pandas dataframe. Given a dataframe dat with a column x that contains NaN values, is there a more elegant way to drop each row of dat that has a NaN value in the x column?

dat = dat[np.logical_not(np.isnan(dat.x))]
dat = dat.reset_index(drop=True)

Answered by TerminalWitchcraft

Use dropna:

dat.dropna()

You can pass the how parameter to drop a row either when all of its values are NaN or when any of them is NaN:

dat.dropna(how='any')    #to drop if any value in the row has a nan
dat.dropna(how='all')    #to drop if all values in the row are nan
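To make the difference between the two settings concrete, here is a small sketch on made-up data (the frame and the names any_dropped and all_dropped are assumptions for illustration). Note that dropna returns a new DataFrame and leaves dat unchanged unless the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original question.
dat = pd.DataFrame({
    "x": [1.0, np.nan, np.nan, 4.0],
    "y": [10.0, 20.0, np.nan, np.nan],
})

any_dropped = dat.dropna(how="any")  # keeps only rows with no NaN at all
all_dropped = dat.dropna(how="all")  # drops only rows that are entirely NaN

print(any_dropped.index.tolist())
print(all_dropped.index.tolist())
```

With this data, how='any' keeps only the first row, while how='all' removes only the third row, where both values are NaN.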

Hope that answers your question!

Edit 1: In case you want to drop rows containing NaN values only in particular column(s), as suggested by J. Doe in his answer below, you can use the following:

dat.dropna(subset=col_list)  # col_list is a list of column names to check for NaN values, e.g. ['x', 'y']

Answered by J. Doe

To expand on Hitesh's answer: if you want to drop rows where 'x' specifically is NaN, you can use the subset parameter. His answer will also drop rows where other columns have NaNs.

dat.dropna(subset=['x'])
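As a minimal sketch of the subset approach, the snippet below uses a small made-up frame (the values and the name cleaned are assumptions for illustration, not from the question). It also chains reset_index(drop=True), which the original question applied after filtering.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: NaN appears in both columns.
dat = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 2.0, 3.0]})

# Only NaN in column "x" causes a row to be dropped;
# the NaN in column "y" of the first row is left alone.
cleaned = dat.dropna(subset=["x"]).reset_index(drop=True)

print(cleaned)
```

The first row survives even though its y value is NaN, because only x is inspected.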

Answered by hRt

Just in case the commands in the previous answers don't work, try this: dat.dropna(subset=['x'], inplace=True)
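One likely reason the earlier commands can appear not to work is that dropna returns a new DataFrame and does not modify dat unless the result is assigned back. The sketch below, on made-up data, contrasts the two styles; the variable names result and returned are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
dat = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# Without inplace, dropna returns a new frame and dat is unchanged.
result = dat.dropna(subset=["x"])
rows_before = len(dat)  # still includes the NaN row

# With inplace=True, dat itself is modified and the call returns None.
returned = dat.dropna(subset=["x"], inplace=True)
rows_after = len(dat)

print(rows_before, rows_after, returned)
```

Assigning the result back, as in dat = dat.dropna(subset=['x']), has the same effect on dat as the inplace call.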

Answered by Chunxiao Li

bool_series=pd.notnull(dat["x"])
dat=dat[bool_series]
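The boolean-mask approach selects the same rows as dropna with subset. The sketch below checks this on made-up data and also includes the np.isnan form from the question; the frame and the names via_mask, via_dropna and via_numpy are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
dat = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", "c"]})

via_mask = dat[pd.notnull(dat["x"])]                  # boolean mask, as in this answer
via_dropna = dat.dropna(subset=["x"])                 # dropna restricted to column x
via_numpy = dat[np.logical_not(np.isnan(dat["x"]))]   # the form used in the question

print(via_mask.equals(via_dropna))
print(via_mask.equals(via_numpy))
```

All three keep the original index labels, so a reset_index(drop=True) is still needed if a contiguous index is wanted.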

Answered by Naveen Gabriel

To remove rows based on a NaN value in a particular column:

d = pd.DataFrame([[2, 3], [4, None]])  # create a data frame
d
Output:
    0   1
0   2   3.0
1   4   NaN

d = d[np.isfinite(d[1])]  # select rows where the column labelled 1 is not NaN
d

Output:
    0   1
0   2   3.0
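One caveat with np.isfinite is that it also filters out infinite values, not just NaN. The sketch below extends the example with a made-up row containing np.inf (an assumption added for illustration) and compares it with notna, which under pandas' default settings treats infinity as a regular value.

```python
import numpy as np
import pandas as pd

# Same shape as the example above, plus a hypothetical row holding infinity.
d = pd.DataFrame([[2, 3.0], [4, np.nan], [6, np.inf]])

finite_only = d[np.isfinite(d[1])]  # drops both the NaN row and the inf row
not_nan = d[d[1].notna()]           # drops only the NaN row

print(finite_only.index.tolist())
print(not_nan.index.tolist())
```

If infinite values should be kept, notna or dropna(subset=[1]) is probably the safer choice.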