Pandas fillna: Output still has NaN values

Question

Asked by Amelio Vazquez-Reina

I am having a strange problem in Pandas. I have a DataFrame with several NaN values. I thought I could fill those NaN values using column means (that is, fill every NaN value with its column mean), but when I try the following

  col_means = mydf.apply(np.mean, 0)
  mydf = mydf.fillna(value=col_means)

I still see some NaN values. Why?

Is it because I have more NaN values in my original DataFrame than entries in col_means? And what exactly is the difference between fill-by-column and fill-by-row?

Answer 1

Answered by Andy Hayden

You can just call fillna with the df.mean() Series (which is dict-like, so its values are matched to columns by label):

In [11]: df = pd.DataFrame([[1, np.nan], [np.nan, 4], [5, 6]])

In [12]: df
Out[12]:
    0   1
0   1 NaN
1 NaN   4
2   5   6

In [13]: df.fillna(df.mean())
Out[13]:
   0  1
0  1  5
1  3  4
2  5  6

Note that df.mean() here is the mean of each column (computed down the rows, the default axis=0), which gives the fill values:

In [14]: df.mean()
Out[14]:
0    3
1    5
dtype: float64
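
On the fill-by-column versus fill-by-row part of the question, here is a minimal sketch using the same small frame. The variable names are made up for illustration. It relies on fillna matching a Series' keys against column labels, so filling with row means is done here by transposing, filling, and transposing back. That is one possible approach rather than the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, np.nan], [np.nan, 4], [5, 6]])

# axis=0 (the default): one mean per column, indexed by column label
col_means = df.mean(axis=0)
# axis=1: one mean per row, indexed by row label
row_means = df.mean(axis=1)

# Fill by column: each NaN is replaced by the mean of its column
filled_by_col = df.fillna(col_means)

# Fill by row: transpose so rows become columns, fill, transpose back
filled_by_row = df.T.fillna(row_means).T

print(filled_by_col)
print(filled_by_row)
```

Both means skip NaN by default, so a row such as [1, NaN] has a row mean of 1.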

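Note: if df.mean() contains NaN values (for example, for a column that is entirely NaN), those NaNs are used as the fill values by the DataFrame's fillna, so the output still contains NaN. You may want to call fillna on this Series first, i.e.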

df.mean().fillna(0)
df.fillna(df.mean().fillna(0))
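
A small sketch of the case described in this note, assuming a hypothetical frame in which column "b" has no valid values at all. Whether the original poster's data looked like this is not stated in the question; the example only illustrates how NaN can survive a fill with column means.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" is entirely NaN
df = pd.DataFrame({"a": [1.0, np.nan, 5.0],
                   "b": [np.nan, np.nan, np.nan]})

means = df.mean()  # the mean of "b" is NaN because it has no valid values

# "a" is filled with its mean, but "b" is "filled" with NaN and stays NaN
still_nan = df.fillna(means)

# Fall back to 0 where the mean itself is NaN, as suggested above
filled = df.fillna(means.fillna(0))

print(still_nan)
print(filled)
```

The fallback value 0 is the one used in the answer; a different default may suit other data better.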
