Replace NaN values of pandas.DataFrame with values from list
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors: Stack Overflow
Original: http://stackoverflow.com/questions/42167429/
Asked by MeanStreet
In a Python script using the pandas library, I have a dataset of, say, 100 rows with a feature "X" containing 36 NaN values, and a list of size 36.
I want to replace all 36 missing values in column "X" with the 36 values I have in my list.
It's likely a dumb question, but I went through all the docs and couldn't find a way to do it.
Here's an example:
INPUT
Data: X Y
1 8
2 3
NaN 2
NaN 7
1 2
NaN 2
Filler
List: [8, 6, 3]
OUTPUT
Data: X Y
1 8
2 3
8 2
6 7
1 2
3 2
Answered by bunji
Start with your dataframe df:
print(df)
X Y
0 1.0 8
1 2.0 3
2 NaN 2
3 NaN 7
4 1.0 2
5 NaN 2
Define the values you want to fill with (note: your filler list must contain exactly as many elements as there are NaN values in the column):
filler = [8, 6, 3]
Filter your column (the one that contains the NaN values) and overwrite the selected rows with your filler:
# Use .loc instead of chained indexing (df.X[df.X.isnull()] = filler),
# which can trigger SettingWithCopyWarning and may not modify df.
df.loc[df.X.isnull(), 'X'] = filler
which gives:
print(df)
X Y
0 1.0 8
1 2.0 3
2 8.0 2
3 6.0 7
4 1.0 2
5 3.0 2
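For reference, the steps in this answer could be put together roughly as follows. This is only a sketch: the helper name fill_nans_from_list is made up for illustration (it is not part of pandas), and the length check is an added assumption based on the note that the filler must match the number of NaN values.

```python
import numpy as np
import pandas as pd


def fill_nans_from_list(df, column, values):
    """Return a copy of df with the NaNs in `column` replaced, in row order, by `values`.

    Hypothetical helper for illustration; not a pandas function.
    """
    mask = df[column].isna()
    n_missing = int(mask.sum())
    # Boolean-mask assignment needs exactly one filler per NaN.
    if n_missing != len(values):
        raise ValueError(f"expected {n_missing} filler values, got {len(values)}")
    out = df.copy()  # leave the caller's DataFrame untouched
    out.loc[mask, column] = values
    return out


df = pd.DataFrame({"X": [1, 2, np.nan, np.nan, 1, np.nan],
                   "Y": [8, 3, 2, 7, 2, 2]})
filled = fill_nans_from_list(df, "X", [8, 6, 3])
print(filled)
```

Returning a copy is a design choice made here so the original df keeps its NaNs; assigning in place with df.loc works the same way.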
Answered by Shijo
This may not be the most efficient approach, but it works :) First find the index labels of all the NaNs, then replace them in a loop. This assumes the list always has more elements than there are NaNs.
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [np.nan, 1, 2], 'B': [10, np.nan, np.nan], 'C': [[20, 21, 22], [23, 24, 25], np.nan]})
lst = [12, 35, 78]
print(df)  # DataFrame before filling (first table below)
index = df['B'].index[df['B'].isna()]  # row labels where 'B' is NaN
for item in index:
    # df.set_value was removed in pandas 1.0; df.at is its replacement.
    # The row label itself is used to index into lst (row n gets lst[n]).
    df.at[item, 'B'] = lst[item]
print(df)  # DataFrame after filling (table under "Output")
A B C
0 NaN 10.0 [20, 21, 22]
1 1.0 NaN [23, 24, 25]
2 2.0 NaN NaN
Output:
A B C
0 NaN 10.0 [20, 21, 22]
1 1.0 35.0 [23, 24, 25]
2 2.0 78.0 NaN
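If the goal is instead to consume the list in order of appearance (first NaN gets the first list element, and so on), a loop-free variant might look like the sketch below. Note that this is a different behaviour from the loop above, which looks up the list by row label, so the result here is 12 and 35 rather than 35 and 78. The slicing step is an assumption added to cope with a list that is longer than the number of NaNs.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"B": [10, np.nan, np.nan]})
lst = [12, 35, 78]  # longer than the number of NaNs

mask = df["B"].isna()
# Take only as many fillers as there are NaNs, in list order.
df.loc[mask, "B"] = lst[:int(mask.sum())]
print(df)
```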
Answered by Scratch'N'Purr
You could use a counter as an index marker to replace your NaNs with the values in your custom list:
import numpy as np
import pandas as pd

your_df = pd.DataFrame({'your_column': [0,1,2,np.nan,4,6,np.nan,np.nan,7,8,np.nan,9]})  # a df with 4 NaNs
print(your_df)
your_custom_list = [1,3,6,8]  # custom list with 4 fillers
# copy the values so the array is writable and independent of the DataFrame
your_column_vals = your_df['your_column'].to_numpy(copy=True)
i_custom = 0  # current position in the custom list
for i in range(len(your_column_vals)):
    if np.isnan(your_column_vals[i]):
        your_column_vals[i] = your_custom_list[i_custom]
        i_custom += 1  # move on to the next filler
your_df['your_column'] = your_column_vals
print(your_df)
Output:
your_column
0 0.0
1 1.0
2 2.0
3 NaN
4 4.0
5 6.0
6 NaN
7 NaN
8 7.0
9 8.0
10 NaN
11 9.0
your_column
0 0.0
1 1.0
2 2.0
3 1.0
4 4.0
5 6.0
6 3.0
7 6.0
8 7.0
9 8.0
10 8.0
11 9.0
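The same idea can also be written with a real Python iterator instead of a manual counter. The following is a minimal sketch, not the answerer's code; the name fillers is introduced here for illustration. A list that runs out before the NaNs do would raise StopIteration.

```python
import numpy as np
import pandas as pd

your_df = pd.DataFrame(
    {"your_column": [0, 1, 2, np.nan, 4, 6, np.nan, np.nan, 7, 8, np.nan, 9]}
)
your_custom_list = [1, 3, 6, 8]

fillers = iter(your_custom_list)  # hands out one filler per NaN, in order
your_df["your_column"] = [
    next(fillers) if pd.isna(v) else v
    for v in your_df["your_column"]
]
print(your_df)
```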