Replace NaN values of a pandas.DataFrame with values from a list

Question

Asked by MeanStreet

In a Python script using the pandas library, I have a dataset of, say, 100 rows with a feature "X" containing 36 NaN values, and a list of size 36.

I want to replace all the 36 missing values of the column "X" by the 36 values I have in my list.

It's likely to be a dumb question, but I went through all the doc and couldn't find a way to do it.

Here's an example :

INPUT

Data:   X      Y
        1      8
        2      3
        NaN    2
        NaN    7
        1      2
        NaN    2

Filler

List: [8, 6, 3]

OUTPUT

Data:   X      Y
        1      8
        2      3
        8      2
        6      7
        1      2
        3      2
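For anyone who wants to reproduce the example, here is a minimal sketch that builds the input above. The names `df`, `filler`, and `n_missing` are just illustrative choices, not part of the original data.

```python
import numpy as np
import pandas as pd

# The example input: column X has three missing values
df = pd.DataFrame({
    "X": [1, 2, np.nan, np.nan, 1, np.nan],
    "Y": [8, 3, 2, 7, 2, 2],
})

# The filler list, one value per missing entry
filler = [8, 6, 3]

# Count the NaN values in X to confirm it matches the list length
n_missing = int(df["X"].isna().sum())
print(df)
print(n_missing, len(filler))
```

Because X contains NaN, pandas stores the column as float, so the integers show up as 1.0, 2.0, and so on.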

Answer 1

Answered by bunji

Start with your dataframe df

print(df)

     X  Y
0  1.0  8
1  2.0  3
2  NaN  2
3  NaN  7
4  1.0  2
5  NaN  2

Define the values you want to fill with (note: your filler list must contain exactly as many elements as there are NaN values in the column):

filler = [8, 6, 3]

Select the rows of the column that contain NaN values and overwrite them with your filler. Use .loc for the assignment; the chained form df.X[df.X.isnull()] = filler is discouraged because it can raise SettingWithCopyWarning and may not modify df:

df.loc[df.X.isnull(), 'X'] = filler

which gives:

print(df)

     X  Y
0  1.0  8
1  2.0  3
2  8.0  2
3  6.0  7
4  1.0  2
5  3.0  2
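The steps above can be wrapped in a small helper that checks the length requirement before assigning. This is only a sketch: `fill_nans_from_list` is a hypothetical name, not a pandas function, and it assumes you want a new DataFrame back rather than an in-place change.

```python
import numpy as np
import pandas as pd

def fill_nans_from_list(df, column, values):
    """Return a copy of df with the NaNs in `column` replaced, in order, by `values`."""
    mask = df[column].isna()
    n_missing = int(mask.sum())
    if n_missing != len(values):
        raise ValueError(
            f"{column!r} has {n_missing} NaN values but {len(values)} fillers were given"
        )
    out = df.copy()
    # Boolean-mask assignment through .loc fills the selected rows top to bottom
    out.loc[mask, column] = values
    return out

df = pd.DataFrame({
    "X": [1, 2, np.nan, np.nan, 1, np.nan],
    "Y": [8, 3, 2, 7, 2, 2],
})
filled = fill_nans_from_list(df, "X", [8, 6, 3])
print(filled)
```

Raising on a length mismatch is a design choice; without the check, pandas would raise its own, less specific error when the lengths differ.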

Answer 2

Answered by Shijo

This may not be the most efficient approach, but it works :) First find the index of every NaN, then replace them one by one in a loop. This assumes the list has at least as many elements as there are NaNs.

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [np.nan, 1, 2], 'B': [10, np.nan, np.nan], 'C': [[20, 21, 22], [23, 24, 25], np.nan]})
lst = [12, 35, 78]
print(df)

     A     B             C
0  NaN  10.0  [20, 21, 22]
1  1.0   NaN  [23, 24, 25]
2  2.0   NaN           NaN

nan_index = df.index[df['B'].isnull()]  # row labels where B is NaN
for cnt, item in enumerate(nan_index):
    df.at[item, 'B'] = lst[cnt]  # fill the n-th NaN with the n-th value from the list
print(df)

Output:

     A     B             C
0  NaN  10.0  [20, 21, 22]
1  1.0  12.0  [23, 24, 25]
2  2.0  35.0           NaN
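If you would rather avoid the explicit loop, one possible alternative is to build a Series of replacements indexed by the NaN positions and pass it to `fillna`, which aligns on the index. This is a sketch under the same assumption that the list has at least as many values as there are NaNs; the name `replacements` is illustrative, and the list column C is left out to keep the example small.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [np.nan, 1, 2], "B": [10, np.nan, np.nan]})
lst = [12, 35, 78]  # longer than the number of NaNs in B

# Row labels where B is missing
nan_index = df.index[df["B"].isna()]

# Take only as many fillers as needed and label them with the NaN positions
replacements = pd.Series(lst[:len(nan_index)], index=nan_index, dtype="float64")

# fillna aligns on the index, so each NaN receives its own value
df["B"] = df["B"].fillna(replacements)
print(df)
```

Any surplus values in the list (78 here) are simply ignored.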

Answer 3

Answered by Scratch'N'Purr

You can use a counter as an index marker into your custom list, advancing it each time a NaN is replaced:

import numpy as np
import pandas as pd

your_df = pd.DataFrame({'your_column': [0,1,2,np.nan,4,6,np.nan,np.nan,7,8,np.nan,9]})  # a df with 4 NaN's
print(your_df)

your_custom_list = [1,3,6,8]  # custom list with 4 fillers

# copy the values so the array is writable and your_df only changes on reassignment
your_column_vals = your_df['your_column'].values.copy()

i_custom = 0  # position of the next unused filler in your custom list
for i in range(len(your_column_vals)):
    if np.isnan(your_column_vals[i]):
        your_column_vals[i] = your_custom_list[i_custom]
        i_custom += 1  # move on to the next filler

your_df['your_column'] = your_column_vals

print(your_df)

Output:

    your_column
0           0.0
1           1.0
2           2.0
3           NaN
4           4.0
5           6.0
6           NaN
7           NaN
8           7.0
9           8.0
10          NaN
11          9.0

    your_column
0           0.0
1           1.0
2           2.0
3           1.0
4           4.0
5           6.0
6           3.0
7           6.0
8           7.0
9           8.0
10          8.0
11          9.0
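The same result can probably be had without the Python-level loop by using a NumPy boolean mask. A minimal sketch, assuming the list has exactly as many values as there are NaNs (NumPy raises an error otherwise); `vals` and `mask` are illustrative names.

```python
import numpy as np
import pandas as pd

your_df = pd.DataFrame(
    {"your_column": [0, 1, 2, np.nan, 4, 6, np.nan, np.nan, 7, 8, np.nan, 9]}
)
your_custom_list = [1, 3, 6, 8]  # one filler per NaN

# Work on a writable copy of the column values
vals = your_df["your_column"].to_numpy(copy=True)

# True where the value is NaN
mask = np.isnan(vals)

# Masked assignment fills the NaN slots in order
vals[mask] = your_custom_list

your_df["your_column"] = vals
print(your_df)
```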
