Pandas fillna: Output still has NaN values

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18127160/

Time: 2020-09-13 21:04:20  Source: igfitidea

Tags: python, pandas

Asked by Amelio Vazquez-Reina

I am having a strange problem in Pandas. I have a DataFrame with several NaN values. I thought I could fill those NaN values using column means (that is, fill every NaN value with its column mean), but when I try the following:

  col_means = mydf.apply(np.mean, 0)
  mydf = mydf.fillna(value=col_means)

I still see some NaN values. Why?

Is it because I have more NaN values in my original DataFrame than entries in col_means? And what exactly is the difference between fill-by-column and fill-by-row?

Answered by Andy Hayden

You can just call fillna with the df.mean() Series (which is dict-like):

In [11]: df = pd.DataFrame([[1, np.nan], [np.nan, 4], [5, 6]])

In [12]: df
Out[12]:
    0   1
0   1 NaN
1 NaN   4
2   5   6

In [13]: df.fillna(df.mean())
Out[13]:
   0  1
0  1  5
1  3  4
2  5  6
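
To illustrate the "dict-like" point, here is a small sketch (the variable names are made up for this example) suggesting that passing the df.mean() Series and passing an equivalent plain dict should give the same result, since fillna matches each fill value to a column by its label:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, np.nan], [np.nan, 4], [5, 6]])

# Series indexed by column label: one mean per column
col_means = df.mean()

# Fill using the Series directly
filled_with_series = df.fillna(col_means)

# Fill using a plain dict of {column label: fill value}
filled_with_dict = df.fillna(col_means.to_dict())

# Both approaches look up the fill value by column label
same = filled_with_series.equals(filled_with_dict)
print(same)
```

A column whose label is missing from the Series or dict is simply left unfilled.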

Note that df.mean() is the column-wise mean (one value per column, computed down the rows, i.e. axis=0), which gives the fill values:

In [14]: df.mean()
Out[14]:
0    3
1    5
dtype: float64
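
On the fill-by-column versus fill-by-row part of the question, the following is a rough sketch rather than part of the original answer. It assumes that transposing the frame is an acceptable way to fill each NaN with its row mean; the names by_column, by_row, and row_means are invented for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, np.nan], [np.nan, 4], [5, 6]])

# Fill by column: each NaN gets the mean of the column it sits in
by_column = df.fillna(df.mean())

# Fill by row: each NaN gets the mean of the row it sits in.
# fillna matches a Series against column labels, so transpose first
# (rows become columns), fill, then transpose back.
row_means = df.mean(axis=1)
by_row = df.T.fillna(row_means).T

print(by_column)
print(by_row)
```

The transpose trick is one option among several; it is used here only because it reuses the same fillna call shown above.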

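Note: if df.mean() itself contains NaN values (for example, when a column is entirely NaN), those NaN values are used as fill values by the DataFrame's fillna, so the gaps in that column remain. You may want to call fillna on this Series first, i.e.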

df.mean().fillna(0)
df.fillna(df.mean().fillna(0))
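
A minimal sketch of that situation, using a made-up frame in which column "b" is entirely NaN (the column names and the fallback value 0 are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" has no valid values at all
df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [np.nan, np.nan, np.nan]})

col_means = df.mean()  # the mean of "b" is itself NaN

# Naive fill: "a" is filled, but "b" is "filled" with NaN and stays empty
naive = df.fillna(col_means)
remaining_naive = int(naive.isna().sum().sum())

# Fill the means first so every column has a usable fallback value
fixed = df.fillna(col_means.fillna(0))
remaining_fixed = int(fixed.isna().sum().sum())

print(remaining_naive, remaining_fixed)
```

Whether 0 is a sensible fallback depends on the data; dropping the all-NaN column may be a better choice in some cases.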