Notice: this content comes from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/49712002/

Date: 2020-09-14 05:26:17  Source: igfitidea

Pandas dropna() function not working

Tags: python, pandas, data-science

Asked by Gilgamesh Skytrooper

I am trying to drop NA values from a pandas dataframe.

I have used dropna() (which should drop all NA rows from the dataframe), yet it does not work.

Here is the code:

import pandas as pd
import numpy as np
prison_data = pd.read_csv('https://andrewshinsuke.me/docs/compas-scores-two-years.csv')

That's how you get the data frame. As the following shows, the default read_csv method does indeed convert the NA data points to np.nan.

np.isnan(prison_data.head()['out_custody'][4])

Out[2]: True
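
A small self-contained sketch of the same check, using an inline CSV instead of the URL above. The sample rows and the names `csv_text` and `sample` are made up for illustration. It suggests that read_csv parses both empty fields and the string NA as NaN by default, and that isna().sum() counts the missing values per column:

```python
import io

import pandas as pd

# Hypothetical inline data standing in for the real CSV file.
csv_text = """id,in_custody,out_custody
1,2014-07-07,2014-07-14
5,,
6,NA,NA
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Empty fields and the literal string "NA" are both read as NaN by default.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# pd.isna also handles non-float values such as strings,
# where np.isnan would raise a TypeError.
print(pd.isna(sample.loc[2, "out_custody"]))
```

Counting missing values per column this way is usually a quicker sanity check than probing single cells.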

Conveniently, the head() of the DataFrame already contains NaN values (in the column out_custody), so printing prison_data.head() gives:

   id                name   first         last compas_screening_date   sex
0   1    miguel hernandez  miguel    hernandez            2013-08-14  Male
1   3         kevon dixon   kevon        dixon            2013-01-27  Male
2   4            ed philo      ed        philo            2013-04-14  Male
3   5         marcu brown   marcu        brown            2013-01-13  Male
4   6  bouthy pierrelouis  bouthy  pierrelouis            2013-03-26  Male

          dob  age          age_cat              race      ...
0  1947-04-18   69  Greater than 45             Other      ...
1  1982-01-22   34          25 - 45  African-American      ...
2  1991-05-14   24     Less than 25  African-American      ...
3  1993-01-21   23     Less than 25  African-American      ...
4  1973-01-22   43          25 - 45             Other      ...

   v_decile_score  v_score_text  v_screening_date  in_custody  out_custody
0               1           Low        2013-08-14  2014-07-07   2014-07-14
1               1           Low        2013-01-27  2013-01-26   2013-02-05
2               3           Low        2013-04-14  2013-06-16   2013-06-16
3               6        Medium        2013-01-13         NaN          NaN
4               1           Low        2013-03-26         NaN          NaN

   priors_count.1  start   end  event  two_year_recid
0               0      0   327      0               0
1               0      9   159      1               1
2               4      0    63      0               1
3               1      0  1174      0               0
4               2      0  1102      0               0

However, running prison_data.dropna() does not change the dataframe in any way.

prison_data.dropna()
np.isnan(prison_data.head()['out_custody'][4])


Out[3]: True

Answered by rafaelc

By default, df.dropna() returns a new DataFrame without the NaN values rather than changing df. So you have to assign the result back to the variable:

df = df.dropna()

If you want it to modify df in place, you have to specify that explicitly:

df.dropna(inplace=True)
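
To make the difference concrete, here is a minimal sketch on a toy frame. The column names and values are invented for illustration and are not taken from the question's data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

# Calling dropna() without using the result leaves df untouched.
df.dropna()
rows_before = len(df)

# Assigning the result keeps only the rows without NaN.
cleaned = df.dropna()
rows_after = len(cleaned)

# inplace=True modifies df itself and returns None.
result = df.dropna(inplace=True)
print(rows_before, rows_after, len(df), result)
```

Reassignment is often the safer of the two styles, since the inplace call returns None and cannot be chained.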

Answered by Gilgamesh Skytrooper

It wasn't working because there was at least one NaN in every row, and dropna() by default drops any row that contains a NaN.
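
When every row has a NaN somewhere, a plain dropna() returns an empty frame once its result is actually used. A hedged sketch of two ways around that, relying on the subset, axis, and how parameters of dropna. The frame below is made up: the column name out_custody echoes the question, while notes and all the values are invented:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 3, 4],
    "out_custody": ["2014-07-14", np.nan, "2013-06-16"],
    "notes": [np.nan, np.nan, np.nan],  # a column that is NaN in every row
})

# Default how="any": every row has a NaN somewhere, so nothing survives.
all_dropped = df.dropna()

# Only look at the columns that matter for the analysis.
by_subset = df.dropna(subset=["out_custody"])

# Or drop the all-NaN columns first, then drop the incomplete rows.
by_columns = df.dropna(axis=1, how="all").dropna()

print(len(all_dropped), len(by_subset), len(by_columns))
```

Which variant fits depends on whether the mostly empty columns are needed later. The subset form keeps them, the column-first form discards them.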