Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/42167429/

Replace NaN values of pandas.DataFrame with values from list

python pandas

Asked by MeanStreet

In a Python script using the pandas library, I have a dataset of, let's say, 100 rows with a feature "X" that contains 36 NaN values, and a list of size 36.

I want to replace all 36 missing values of column "X" with the 36 values I have in my list.

It's probably a dumb question, but I went through the documentation and couldn't find a way to do it.

Here's an example:

INPUT

Data:   X      Y
        1      8
        2      3
        NaN    2
        NaN    7
        1      2
        NaN    2

Filler list: [8, 6, 3]

OUTPUT

Data:   X      Y
        1      8
        2      3
        8      2
        6      7
        1      2
        3      2
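
To make the example reproducible, the input and the desired output might be built as follows. This is only a minimal sketch: the names data, filler_list and expected are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Input data from the example: column X has three missing values.
data = pd.DataFrame({
    "X": [1, 2, np.nan, np.nan, 1, np.nan],
    "Y": [8, 3, 2, 7, 2, 2],
})

# One filler value per NaN, in the order the NaNs appear.
filler_list = [8, 6, 3]

# The desired result after filling.
expected = pd.DataFrame({
    "X": [1.0, 2.0, 8.0, 6.0, 1.0, 3.0],
    "Y": [8, 3, 2, 7, 2, 2],
})

# Sanity check: the list length matches the number of missing values.
print(int(data["X"].isna().sum()) == len(filler_list))
```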

Answered by bunji

Start with your dataframe df:

print(df)

     X  Y
0  1.0  8
1  2.0  3
2  NaN  2
3  NaN  7
4  1.0  2
5  NaN  2

Define the values you want to fill with (note: your filler list must contain exactly as many elements as there are NaN values in the column):

filler = [8, 6, 3]

Filter your column (the one that contains the NaN values) and overwrite the selected rows with your filler. Use .loc for the assignment; the chained form df.X[df.X.isnull()] = filler can raise SettingWithCopyWarning and may not modify df at all when copy-on-write is enabled:

df.loc[df.X.isnull(), 'X'] = filler

which gives:

print(df)

     X  Y
0  1.0  8
1  2.0  3
2  8.0  2
3  6.0  7
4  1.0  2
5  3.0  2
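
Putting the steps above together, a self-contained sketch might look like the following. The helper name fill_nans_from_list and its length check are additions for illustration only; they are not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd


def fill_nans_from_list(df, column, filler):
    """Hypothetical helper: fill NaNs in `column` with `filler`, in row order."""
    mask = df[column].isna()
    n_missing = int(mask.sum())
    if n_missing != len(filler):
        raise ValueError(
            f"expected {n_missing} filler values, got {len(filler)}"
        )
    out = df.copy()  # leave the caller's frame untouched
    out.loc[mask, column] = filler
    return out


df = pd.DataFrame({"X": [1, 2, np.nan, np.nan, 1, np.nan],
                   "Y": [8, 3, 2, 7, 2, 2]})
filled = fill_nans_from_list(df, "X", [8, 6, 3])
print(filled)
```

Checking the length up front gives a clearer error than the shape mismatch pandas would otherwise report, and working on a copy keeps the original frame available for comparison.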

Answered by Shijo

This may not be the most efficient approach, but it works :) First find the index of every NaN, then replace them in a loop. This assumes the list is always longer than the number of NaNs.

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [np.nan, 1, 2], 'B': [10, np.nan, np.nan], 'C': [[20, 21, 22], [23, 24, 25], np.nan]})
lst=[12,35,78]

index = df.index[df['B'].isnull()]  # row labels where 'B' is NaN
for item in index:
    # Row label n receives lst[n]: the list is indexed by row label, not by NaN count.
    # df.set_value was removed in pandas 1.0; df.at is the scalar setter.
    df.at[item, 'B'] = lst[item]

print(df)

Input:

     A     B             C
0  NaN  10.0  [20, 21, 22]
1  1.0   NaN  [23, 24, 25]
2  2.0   NaN           NaN

Output:

     A     B             C
0  NaN  10.0  [20, 21, 22]
1  1.0  35.0  [23, 24, 25]
2  2.0  78.0           NaN
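
If the goal is instead to consume the list in order (the first NaN gets the first value, and so on) rather than by row position, a vectorized sketch could look like this. It assumes, as stated above, that the list holds at least as many values as there are NaNs; the variable names mask and n_missing are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 1, 2], "B": [10, np.nan, np.nan]})
lst = [12, 35, 78]

mask = df["B"].isna()          # True where B is missing
n_missing = int(mask.sum())    # number of values needed from lst

# Take only as many values as there are NaNs; extra list items are ignored.
df.loc[mask, "B"] = lst[:n_missing]

print(df)
```

With this variant, rows 1 and 2 receive 12 and 35, and the unused 78 is dropped.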

Answered by Scratch'N'Purr

You can use a counter as a position marker into your custom list and replace each NaN in turn with the next value:

import numpy as np
import pandas as pd

your_df = pd.DataFrame({'your_column': [0,1,2,np.nan,4,6,np.nan,np.nan,7,8,np.nan,9]})  # a df with 4 NaN's
print(your_df)

your_custom_list = [1,3,6,8]  # custom list with 4 fillers

your_column_vals = your_df['your_column'].to_numpy(copy=True)  # writable copy; .values can be a read-only view under copy-on-write

i_custom = 0  # starting index on your iterator for your custom list
for i in range(len(your_column_vals)):
    if np.isnan(your_column_vals[i]):
        your_column_vals[i] = your_custom_list[i_custom]
        i_custom += 1  # increase the index

your_df['your_column'] = your_column_vals

print(your_df)

Output:

    your_column
0           0.0
1           1.0
2           2.0
3           NaN
4           4.0
5           6.0
6           NaN
7           NaN
8           7.0
9           8.0
10          NaN
11          9.0
    your_column
0           0.0
1           1.0
2           2.0
3           1.0
4           4.0
5           6.0
6           3.0
7           6.0
8           7.0
9           8.0
10          8.0
11          9.0
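
The same result can probably be reached without an explicit loop by assigning through a boolean mask. This is a sketch under the assumption that the list has exactly one value per NaN; it reuses the names from the snippet above, and vals is an illustrative name.

```python
import numpy as np
import pandas as pd

your_df = pd.DataFrame(
    {"your_column": [0, 1, 2, np.nan, 4, 6, np.nan, np.nan, 7, 8, np.nan, 9]}
)
your_custom_list = [1, 3, 6, 8]

# Work on a writable copy of the column's values.
vals = your_df["your_column"].to_numpy(copy=True)

# Boolean-mask assignment fills the NaN slots in order of appearance.
vals[np.isnan(vals)] = your_custom_list

your_df["your_column"] = vals
print(your_df)
```

NumPy raises an error if the number of list values does not match the number of NaN slots, so a length mismatch is caught instead of silently misaligning the data.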