Original question: http://stackoverflow.com/questions/37443082/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Using lambda if condition on different columns in Pandas dataframe
Asked by PeterL
I have a simple dataframe:
import numpy as np
import pandas as pd
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))
For example:
          a         b         c
0 -0.813530 -1.291862  1.330320
1 -1.066475  0.624504  1.690770
2  1.330330 -0.675750 -1.123389
3  0.400109 -1.224936 -1.704173
I then want to create a column "d" that contains the value from "c" if c is positive, and otherwise the value from "b".
I am trying:
frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)
But I get "ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')".
I tried googling how to solve this, but did not succeed. Any tips, please?
Answered by MaxU
Is this what you want?
In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0   -1.099891
1    0.582815
2    0.901591
3    0.900856
dtype: float64
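A minimal sketch of why this works where the original attempt did not, assuming the goal is to store the result in column d. With axis=1 the lambda receives one row at a time, so x['c'] > 0 is a single boolean. In the original attempt, frame['c'] > 0 is a whole Series, which `if` cannot reduce to one truth value. The fixed seed and the `raised` flag are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the run is repeatable
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))

# axis=1 passes each row to the lambda as a Series indexed by column name,
# so x['c'] > 0 is a plain boolean for that row
frame['d'] = frame.apply(lambda x: x['c'] if x['c'] > 0 else x['b'], axis=1)

# The original condition is a whole Series, so `if` raises ValueError
try:
    if frame['c'] > 0:
        pass
    raised = False
except ValueError:
    raised = True

print(frame)
print("if on a Series raised ValueError:", raised)
```

Row-wise apply calls the lambda once per row in Python, so it is usually slower on large frames than a vectorized expression.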
Answered by piRSquared
Solution
Use a vectorized approach:
frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
Explanation
This is derived from the sum of
(frame.c > 0) * frame.c # frame.c if positive
plus
(frame.c <= 0) * frame.b # frame.b if c is not positive
However,
(frame.c <= 0)
is equivalent to
(1 - (frame.c > 0))
and when combined you get
frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
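As a quick sanity check, the sketch below compares the arithmetic expression against np.where, which is a different but commonly used way to write the same selection and is not part of the original answer. It also confirms that 1 - (frame.c > 0) matches frame.c <= 0. The seed is an assumption for repeatability, and the comparison assumes the columns contain no NaN or infinite values, since both break the 0/1 mask arithmetic.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the run is repeatable
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))

# Arithmetic form from the answer: the boolean mask acts as 0 or 1
frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)

# Alternative: np.where(condition, value_if_true, value_if_false)
alt = np.where(frame.c > 0, frame.c, frame.b)

# The mask identity used in the derivation (note the inner parentheses)
mask_matches = ((1 - (frame.c > 0)) == (frame.c <= 0)).all()
same_result = np.allclose(frame['d'], alt)

print("mask identity holds:", mask_matches)
print("matches np.where:", same_result)
```

np.allclose is used rather than exact equality because b + (c - b) can differ from c by floating-point rounding.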