pandas Panda - Fillna - TypeError: cannot label index with a null key
Original source: http://stackoverflow.com/questions/48516205/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Panda - Fillna - TypeError: cannot label index with a null key
Asked by D. Eggert
I am trying to work with a Pandas DataFrame that has some NaN values. When I run
df.fillna(df.mean())
I get the following error and cannot find a cause or a solution for it:
TypeError: cannot label index with a null key
All columns are int or float. I can even extract individual columns into an array, call fillna() on that array, and put the result back into the DataFrame.
Any ideas or hints? Thank you very much!
My code:
import pandas as pd
test = pd.read_csv("../input/test.csv")
test.fillna(test.mean(), inplace=True)
The files I am working with are test.csv and train.csv from Kaggle. I get the same error with both: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data
The full traceback is:
TypeError Traceback (most recent call last)
<ipython-input-29-ab3e419316e1> in <module>()
14
15 #Also test has NaN's
---> 16 test.fillna(test.mean(),inplace=True)
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in fillna(self, value, method, axis, inplace, limit, downcast, **kwargs)
2752 self).fillna(value=value, method=method, axis=axis,
2753 inplace=inplace, limit=limit,
-> 2754 downcast=downcast, **kwargs)
2755
2756 @Appender(_shared_docs['shift'] % _shared_doc_kwargs)
/opt/conda/lib/python3.6/site-packages/pandas/core/generic.py in fillna(self, value, method, axis, inplace, limit, downcast)
3645 if k not in result:
3646 continue
-> 3647 obj = result[k]
3648 obj.fillna(v, limit=limit, inplace=True, downcast=downcast)
3649 return result
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in __getitem__(self, key)
1962 return self._getitem_multilevel(key)
1963 else:
-> 1964 return self._getitem_column(key)
1965
1966 def _getitem_column(self, key):
/opt/conda/lib/python3.6/site-packages/pandas/core/frame.py in _getitem_column(self, key)
1972
1973 # duplicate columns & possible reduce dimensionality
-> 1974 result = self._constructor(self._data.get(key))
1975 if result.columns.is_unique:
1976 result = result[key]
/opt/conda/lib/python3.6/site-packages/pandas/core/internals.py in get(self, item, fastpath)
3603
3604 if isnull(item):
-> 3605 raise TypeError("cannot label index with a null key")
3606
3607 indexer = self.items.get_indexer_for([item])
TypeError: cannot label index with a null key
Answered by Pierluigi
The following example seems to work nicely:
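import numpy as np
import pandas as pd

# pandas.np is not available in recent pandas releases, so numpy is imported directly.
x = pd.DataFrame({
    'x_1': [0, 1, 2, 3, 0, 1, 2, None],
    'x_2': [0, 1, None, 3, 0, 1, 2, np.nan],
    'x_3': [0, 1, 2, 3, 0, 1, 2, None],
    'x_4': [0, 1, 2, 3, 0, np.nan, 2, None],
}, index=[0, 1, 2, 3, 4, 5, 6, 7])
# Replace each NaN with the mean of its own column.
x.fillna(x.mean(), inplace=True)
# head() shows only 5 rows by default; ask for all 8.
x.head(8)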
producing:
x_1 x_2 x_3 x_4
0 0.000000 0.000000 0.000000 0.000000
1 1.000000 1.000000 1.000000 1.000000
2 2.000000 1.166667 2.000000 2.000000
3 3.000000 3.000000 3.000000 3.000000
4 0.000000 0.000000 0.000000 0.000000
5 1.000000 1.000000 1.000000 1.333333
6 2.000000 2.000000 2.000000 2.000000
7 1.285714 1.166667 1.285714 1.333333
Take a closer look at your input data.
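One way to do that is sketched below. The traceback shows the exception being raised by an `isnull(item)` check on the key used to look up a column, so it may be worth checking whether any column label is itself null, and whether every column really is numeric. The frame and its column names here are made up for illustration; they are not taken from the Kaggle files.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three columns, one of which gets a NaN label.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [5.0, 4.0, np.nan],
    "c": ["x", "y", "z"],
})
df.columns = ["a", np.nan, "c"]

# Boolean mask marking column labels that are null.
null_labels = pd.isna(df.columns)
print(null_labels)

# Keep only the columns whose label is not null (positional mask).
clean = df.loc[:, ~null_labels]

# List the columns that are not numeric, to verify the "all int or float" assumption.
non_numeric = clean.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# Fill NaN values using the means of the numeric columns only.
filled = clean.fillna(clean.mean(numeric_only=True))
print(filled)
```

If the mask contains any True value, renaming or dropping that column before calling fillna() is a reasonable first thing to try.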
Answered by Joe
You can try:
df['your_column'] = df['your_column'].fillna(df['your_column'].mean())
This fills the NaN values in a column with the mean of that same column.
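The same per-column idea can be applied to every numeric column at once. The following is a minimal sketch, assuming a frame that mixes numeric and text columns; the column names `price`, `rooms`, and `zone` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two numeric columns and one text column.
df = pd.DataFrame({
    "price": [100.0, np.nan, 300.0],
    "rooms": [2.0, 4.0, np.nan],
    "zone": ["RL", None, "RM"],
})

# Select the numeric columns only, so that mean() is well defined.
numeric_cols = df.select_dtypes(include="number").columns

# Fill each numeric column's NaN values with that column's mean.
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
print(df)
```

Non-numeric columns such as `zone` are left untouched, so their missing values still need a separate strategy.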