How to set values in a pandas DataFrame based on NaN values in another column?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/37962759/

Time: 2020-09-14 01:26:42  Source: igfitidea

Tags: python, python-2.7, pandas, nan

Asked by Rocketq

I have a DataFrame named df with original shape (4361, 15). Some values in the agefm column are NaN:

> df[df.agefm.isnull() == True].agefm.shape
(2282,)

Then I create a new column and set all its values to 0:

df['nevermarr'] = 0

I would like to set nevermarr to 1 in every row where agefm is NaN:

df[df.agefm.isnull() == True].nevermarr = 1

Nothing changed:

> df['nevermarr'].sum()
0

What am I doing wrong?

Answered by jezrael

The best option is to use numpy.where:

df['nevermarr'] = np.where(df.agefm.isnull(), 1, 0)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          0
2    6.0          0

Or use loc; the == True comparison can be omitted:

df.loc[df.agefm.isnull(), 'nevermarr'] = 1
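For context on why the assignment in the question has no effect: boolean indexing such as df[df.agefm.isnull()] returns a new DataFrame, so the value is set on a temporary copy that is then discarded. A minimal sketch, assuming a made-up three-row frame (the data is illustrative, not the asker's):

```python
import numpy as np
import pandas as pd

# Illustrative data, not the asker's original frame.
df = pd.DataFrame({'agefm': [np.nan, 5.0, np.nan]})
df['nevermarr'] = 0

# Boolean indexing builds a new DataFrame, so this sets the value on a
# temporary copy (some pandas versions emit a warning here).
df[df.agefm.isnull()].nevermarr = 1
chained_sum = int(df['nevermarr'].sum())

# .loc selects the rows and the column in one step and writes to df itself.
df.loc[df.agefm.isnull(), 'nevermarr'] = 1
loc_sum = int(df['nevermarr'].sum())

print(chained_sum, loc_sum)
```

The first sum reflects the unchanged column, the second reflects the rows that loc actually updated.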

Or mask:

df['nevermarr'] = df.nevermarr.mask(df.agefm.isnull(), 1)
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3
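As a quick illustration of what mask does here: Series.mask(cond, other) replaces the entries where cond is True and leaves the rest untouched, while Series.where is its complement. The sketch below reuses the values from the sample frame; the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

agefm = pd.Series([np.nan, 5.0, 6.0])
nevermarr = pd.Series([7, 2, 3])

is_missing = agefm.isnull()

# mask: replace values where the condition is True.
via_mask = nevermarr.mask(is_missing, 1)

# where: keep values where the condition is True, so the condition
# is negated to get the same effect.
via_where = nevermarr.where(~is_missing, 1)

print(via_mask.tolist())
print(via_where.tolist())
```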

Sample:

import pandas as pd
import numpy as np

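# Keys are listed in the order shown in the printed output, because newer
# pandas versions keep dict insertion order for columns.
df = pd.DataFrame({'agefm':[np.nan,5,6],
                   'nevermarr':[7,2,3]})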

print (df)
   agefm  nevermarr
0    NaN          7
1    5.0          2
2    6.0          3

df.loc[df.agefm.isnull(), 'nevermarr'] = 1
print (df)
   agefm  nevermarr
0    NaN          1
1    5.0          2
2    6.0          3
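The three approaches above can be compared side by side. This is a minimal sketch on an illustrative four-row frame, assuming nevermarr starts as a column of zeros as in the question; the helper variable names are not from the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative data, not the asker's original frame.
df = pd.DataFrame({'agefm': [np.nan, 5.0, 6.0, np.nan]})
missing = df.agefm.isnull()

# 1) numpy.where builds the whole column in one call.
via_where = pd.Series(np.where(missing, 1, 0), index=df.index)

# 2) loc overwrites only the matching rows of an existing column.
tmp = df.assign(nevermarr=0)
tmp.loc[missing, 'nevermarr'] = 1
via_loc = tmp['nevermarr']

# 3) mask replaces values where the condition is True.
via_mask = pd.Series(0, index=df.index).mask(missing, 1)

same = via_where.tolist() == via_loc.tolist() == via_mask.tolist()
print(same)
```

One difference worth keeping in mind: numpy.where rewrites every row of the column, whereas loc and mask only touch the rows where agefm is NaN. That is why the numpy.where output above shows 0 in the non-NaN rows, while the loc and mask outputs keep the existing values 2 and 3.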