pandas: row-by-row fillna with respect to a specific column?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/24015379/
Row-by-row fillna with respect to a specific column?
Asked by wfh
I have the following pandas dataframe and I would like to fill the NaNs in columns A-C in a row-wise fashion with values from column D. Is there an explicit way to do this, where I can specify that all the NaNs should be filled row-wise from the values in column D? I couldn't find a way to do this explicitly in fillna().
Note that there are additional columns E-Z, which have their own NaNs and may have other rules for filling in NaNs, and should be left untouched.
A B C D E
158 158 158 177 ...
158 158 158 177 ...
NaN NaN NaN 177 ...
158 158 158 177 ...
NaN NaN NaN 177 ...
I would like to get this, for columns A-C only:
A B C D E
158 158 158 177 ...
158 158 158 177 ...
177 177 177 177 ...
158 158 158 177 ...
177 177 177 177 ...
Thanks.
Answered by joris
Using the fillna function:
df.fillna(axis=1, method='backfill')
will do if there are no NaNs in the other columns.
If there are, and you want to leave them untouched, I think the only option along these lines is to perform the fillna on a subset of your dataframe. With an example dataframe:
In [45]: df
Out[45]:
A B C D E F
0 158 158 158 177 1 10
1 158 158 158 177 2 20
2 NaN NaN NaN 177 3 30
3 158 158 158 177 NaN 40
4 NaN NaN NaN 177 5 50
In [48]: df[['A', 'B', 'C', 'D']] = df[['A', 'B', 'C', 'D']].fillna(axis=1, method='backfill')
In [49]: df
Out[49]:
A B C D E F
0 158 158 158 177 1 10
1 158 158 158 177 2 20
2 177 177 177 177 3 30
3 158 158 158 177 NaN 40
4 177 177 177 177 5 50
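For reference, the subset approach above can be written as a self-contained script. This is only a minimal sketch: it rebuilds the example data by hand, and it assumes a recent pandas version, where DataFrame.bfill(axis=1) is the equivalent spelling of fillna(axis=1, method='backfill') (the method argument of fillna is deprecated in newer releases).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the session above.
df = pd.DataFrame({
    'A': [158, 158, np.nan, 158, np.nan],
    'B': [158, 158, np.nan, 158, np.nan],
    'C': [158, 158, np.nan, 158, np.nan],
    'D': [177, 177, 177, 177, 177],
    'E': [1, 2, 3, np.nan, 5],
    'F': [10, 20, 30, 40, 50],
})

# D must be the last column of the subset, since back-fill pulls
# values from the right-hand neighbour within each row.
cols = ['A', 'B', 'C', 'D']

# Back-fill along each row, restricted to the subset, so the NaN
# in column E is left untouched.
df[cols] = df[cols].bfill(axis=1)

print(df)
```

Note that this still depends on the column order: any NaN in A-C is filled from the next valid value to its right, which is D only because D sits at the end of the selected subset.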
Update: If you don't want to depend on the column order, you can also specify the values to use to fill each row (like .fillna(value=df['D'])). The only problem is that this only works for a Series (when it is a dataframe, it tries to map the different fill values to the different columns, not the rows). So it works if you use apply to do it column by column:
In [60]: df[['A', 'B', 'C']].apply(lambda x: x.fillna(value=df['D']))
Out[60]:
A B C
0 158 158 158
1 158 158 158
2 177 177 177
3 158 158 158
4 177 177 177
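The apply call above returns a new frame with only columns A-C. To write the result back while leaving the other columns alone, it can be assigned to the same column subset. A minimal sketch, assuming the same example data; target_cols is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Same example data as in the session above.
df = pd.DataFrame({
    'A': [158, 158, np.nan, 158, np.nan],
    'B': [158, 158, np.nan, 158, np.nan],
    'C': [158, 158, np.nan, 158, np.nan],
    'D': [177, 177, 177, 177, 177],
    'E': [1, 2, 3, np.nan, 5],
    'F': [10, 20, 30, 40, 50],
})

target_cols = ['A', 'B', 'C']  # illustrative name, not from the original answer

# apply passes each column as a Series, and Series.fillna aligns
# df['D'] on the row index, so each NaN gets the D value of its own row.
df[target_cols] = df[target_cols].apply(lambda col: col.fillna(df['D']))

print(df)
```

This version does not rely on D being adjacent to, or to the right of, the columns being filled.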

