pandas dropna() function is not working

Question

Asked by Gilgamesh Skytrooper

I am trying to drop NA values from a pandas dataframe.

I have used dropna() (which should drop all NA rows from the dataframe). Yet, it does not work.

Here is the code:

import pandas as pd
import numpy as np
prison_data = pd.read_csv('https://andrewshinsuke.me/docs/compas-scores-two-years.csv')

That's how you get the data frame. As the following shows, the default read_csv method does indeed convert the NA data points to np.nan.

np.isnan(prison_data.head()['out_custody'][4])

Out[2]: True
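As a side note on that check, here is a minimal sketch of how read_csv treats missing fields, using a small made-up CSV held in memory rather than the real file (the column values below are hypothetical). It counts missing cells with isna(), which also works on text columns, whereas np.isnan raises a TypeError when it is handed a string.

```python
import io

import pandas as pd

# Hypothetical CSV: one "NA" field and one empty field stand in for missing data.
csv_text = "id,out_custody\n1,2014-07-14\n2,NA\n3,\n"
sample = pd.read_csv(io.StringIO(csv_text))

# With default settings, read_csv parses both "NA" and the empty field as NaN.
missing_per_column = sample.isna().sum()
print(missing_per_column)
```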

Conveniently, the head() of the DF already contains NaN values (in the column out_custody), so printing prison_data.head() gives:

   id                name   first         last compas_screening_date   sex
0   1    miguel hernandez  miguel    hernandez            2013-08-14  Male
1   3         kevon dixon   kevon        dixon            2013-01-27  Male
2   4            ed philo      ed        philo            2013-04-14  Male
3   5         marcu brown   marcu        brown            2013-01-13  Male
4   6  bouthy pierrelouis  bouthy  pierrelouis            2013-03-26  Male

      dob  age          age_cat              race      ...        
0  1947-04-18   69  Greater than 45             Other      ...
1  1982-01-22   34          25 - 45  African-American      ...
2  1991-05-14   24     Less than 25  African-American      ...
3  1993-01-21   23     Less than 25  African-American      ...
4  1973-01-22   43          25 - 45             Other      ...

   v_decile_score  v_score_text  v_screening_date  in_custody  out_custody
0               1           Low        2013-08-14  2014-07-07   2014-07-14
1               1           Low        2013-01-27  2013-01-26   2013-02-05
2               3           Low        2013-04-14  2013-06-16   2013-06-16
3               6        Medium        2013-01-13         NaN          NaN
4               1           Low        2013-03-26         NaN          NaN

   priors_count.1 start   end event two_year_recid
0               0     0   327     0              0
1               0     9   159     1              1
2               4     0    63     0              1
3               1     0  1174     0              0
4               2     0  1102     0              0

However, running prison_data.dropna() does not change the dataframe in any way.

prison_data.dropna()
np.isnan(prison_data.head()['out_custody'][4])


Out[3]: True

Answer 1

Answered by rafaelc

By default, df.dropna() returns a new DataFrame without the NaN values and leaves the original untouched. So you have to assign the result back to the variable:

df = df.dropna()

If you want it to modify df in place, you have to specify that explicitly:

df.dropna(inplace=True)
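A small sketch of the difference, using a made-up three-row frame (the column names a and b are hypothetical, not taken from the question's data):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the row at index 1 has a missing value in column "a".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

result = df.dropna()       # returns a new DataFrame; df itself is untouched
rows_before = len(df)      # still counts every original row
rows_after = len(result)   # counts only the rows without NaN

df = df.dropna()           # rebinding the name keeps the cleaned frame
print(rows_before, rows_after, len(df))
```

Note that dropna(inplace=True) returns None, so the two styles should not be mixed: writing df = df.dropna(inplace=True) would leave df set to None.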

Answer 2

Answered by Gilgamesh Skytrooper

It wasn't working because every row had at least one NaN, and dropna() with its default how='any' removes any row that contains a NaN in any column.
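If that is the situation, one option is to restrict the check to the columns that matter. The following is a sketch on made-up data, not the COMPAS file: the out_custody name is borrowed from the question, while the notes column and all values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which every row has a NaN in at least one column.
df = pd.DataFrame({
    "out_custody": ["2014-07-14", np.nan, "2013-06-16"],
    "notes": [np.nan, "some text", np.nan],
})

# Default how='any': a single NaN anywhere in the row is enough to drop it.
all_dropped = df.dropna()

# subset= limits the check to the listed columns.
by_column = df.dropna(subset=["out_custody"])

# how='all' drops a row only when every column in it is NaN.
only_empty_rows_dropped = df.dropna(how="all")

print(len(all_dropped), len(by_column), len(only_empty_rows_dropped))
```

Which variant fits depends on whether a missing value in an unrelated column should disqualify the row.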
