How to remove NaN values while combining two columns in a pandas DataFrame?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Original: http://stackoverflow.com/questions/34989341/
Asked by imSonuGupta
I am trying to remove nan while combining two columns of a DataFrame, but I have not been able to.
The data looks like this:
feedback_id _id
568a8c25cac4991645c287ac nan
568df45b177e30c6487d3603 nan
nan 568df434832b090048f34974
nan 568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711 nan
nan 568e5a38b4a797c664143dda
I want:
feedback_request_id
568a8c25cac4991645c287ac
568df45b177e30c6487d3603
568df434832b090048f34974
568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711
568e5a38b4a797c664143dda
Here is my code:
df3['feedback_request_id'] = ('' if df3['_id'].empty else df3['_id'].map(str)) + ('' if df3['feedback_id'].empty else df3['feedback_id'].map(str))
Output I'm getting:
feedback_request_id
568a8c25cac4991645c287acnan
568df45b177e30c6487d3603nan
nan568df434832b090048f34974
nan568cd22e9e82dfc166d7dff1
568df3f0832b090048f34711nan
nan568e5a38b4a797c664143dda
I have also tried this:
df3['feedback_request_id'] = ('' if df3['_id']=='nan' else df3['_id'].map(str)) + ('' if df3['feedback_id']=='nan' else df3['feedback_id'].map(str))
But it raises this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
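The error appears because a conditional expression needs a single True or False, while a test on a whole column produces one boolean per row. Below is a minimal sketch of an element-wise choice using Series.where instead. It assumes the missing entries are real NaN values rather than the string 'nan', and it uses only three of the sample rows for brevity.

```python
import numpy as np
import pandas as pd

# Three sample rows from the question; missing entries assumed to be real NaN
df3 = pd.DataFrame({
    "feedback_id": ["568a8c25cac4991645c287ac", np.nan, np.nan],
    "_id": [np.nan, "568df434832b090048f34974", "568cd22e9e82dfc166d7dff1"],
})

# A per-row boolean Series cannot be reduced to one truth value by `if`
try:
    "" if df3["_id"].isna() else df3["_id"]
    raised = False
except ValueError:
    raised = True

# Element-wise choice: keep _id where it is present, otherwise take feedback_id
df3["feedback_request_id"] = df3["_id"].where(df3["_id"].notna(), df3["feedback_id"])
print(df3["feedback_request_id"])
```

Series.where keeps the values where the condition holds and takes the replacement elsewhere, so no string concatenation (and no stray "nan" text) is involved.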
Answered by jezrael
You can use combine_first or fillna:
print(df['feedback_id'].combine_first(df['_id']))
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
print(df['feedback_id'].fillna(df['_id']))
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
Name: feedback_id, dtype: object
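For a self-contained run, the sketch below rebuilds the sample data from the question and stores the result under the column name the asker wanted. The DataFrame construction is an assumption about how the data was loaded, and the missing entries are assumed to be real NaN values.

```python
import numpy as np
import pandas as pd

# Sample data from the question; construction is assumed for illustration
df = pd.DataFrame({
    "feedback_id": [
        "568a8c25cac4991645c287ac", "568df45b177e30c6487d3603", np.nan,
        np.nan, "568df3f0832b090048f34711", np.nan,
    ],
    "_id": [
        np.nan, np.nan, "568df434832b090048f34974",
        "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda",
    ],
})

# Both calls fill the gaps in feedback_id with the aligned values from _id
combined = df["feedback_id"].combine_first(df["_id"])
filled = df["feedback_id"].fillna(df["_id"])

# Store the merged values under the name used in the question
df["feedback_request_id"] = combined
print(df["feedback_request_id"])
```

In both calls the values of feedback_id take priority, and _id is consulted only where feedback_id is missing.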
Answered by BallpointBen
If you want a solution that doesn't require referencing df twice or any of its columns explicitly:
df.bfill(axis=1).iloc[:, 0]
With two columns, this fills each null in the left column with the value from the right column, then selects the left column.
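A small sketch of the backfill approach on the sample data follows. It assumes the frame holds only the two columns to merge, in the order shown in the question, since the result depends on column order. The rename call and the name feedback_request_id are added here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; only the two columns to merge are present
df = pd.DataFrame({
    "feedback_id": [
        "568a8c25cac4991645c287ac", "568df45b177e30c6487d3603", np.nan,
        np.nan, "568df3f0832b090048f34711", np.nan,
    ],
    "_id": [
        np.nan, np.nan, "568df434832b090048f34974",
        "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda",
    ],
})

# Backfill along each row (right to left), then take the first column
result = df.bfill(axis=1).iloc[:, 0].rename("feedback_request_id")
print(result)
```

If the frame had extra columns, they would take part in the backfill too, so selecting the relevant columns first would be the safer choice in that case.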
Answered by jpp
For an in-place solution, you can use pd.Series.update with pd.DataFrame.pop:
df['feedback_id'].update(df.pop('_id'))
print(df)
feedback_id
0 568a8c25cac4991645c287ac
1 568df45b177e30c6487d3603
2 568df434832b090048f34974
3 568cd22e9e82dfc166d7dff1
4 568df3f0832b090048f34711
5 568e5a38b4a797c664143dda
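Depending on the pandas version and its copy-on-write settings, an update applied to a column selected as df['feedback_id'] may not write back to df. The sketch below is an assumed alternative, not the answer's method: it swaps update for fillna and assigns the result explicitly. Note also that Series.update overwrites existing values with any non-NA value from the other Series, whereas fillna only fills gaps; the two agree here because each row has exactly one value.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "feedback_id": [
        "568a8c25cac4991645c287ac", "568df45b177e30c6487d3603", np.nan,
        np.nan, "568df3f0832b090048f34711", np.nan,
    ],
    "_id": [
        np.nan, np.nan, "568df434832b090048f34974",
        "568cd22e9e82dfc166d7dff1", np.nan, "568e5a38b4a797c664143dda",
    ],
})

# pop removes _id from df and returns it; fillna uses it to fill the gaps,
# and the explicit assignment stores the result back on df
df["feedback_id"] = df["feedback_id"].fillna(df.pop("_id"))
print(df)
```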