Notice: this page is derived from a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/37443082/

Date: 2020-08-19 19:25:00  Source: igfitidea

Using lambda if condition on different columns in Pandas dataframe

Tags: python, pandas, numpy, dataframe, lambda

Asked by PeterL

I have a simple dataframe:

import numpy as np
import pandas as pd
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))

For example:

          a         b         c
0 -0.813530 -1.291862  1.330320
1 -1.066475  0.624504  1.690770
2  1.330330 -0.675750 -1.123389
3  0.400109 -1.224936 -1.704173

I then want to create a column “d” that contains the value from “c” if c is positive, and otherwise the value from “b”.

I am trying:

frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)

But I get “ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')”.

I was trying to google how to solve this, but did not succeed. Any tip please?

Answered by MaxU

Is that what you want?

In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0   -1.099891
1    0.582815
2    0.901591
3    0.900856
dtype: float64
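
The attempt in the question fails because, inside the lambda, `frame['c'] > 0` is a whole Series, and `if` needs a single True or False. With `axis=1` the lambda receives one row at a time, so `x['c']` is a scalar. A minimal sketch on a small hand-written frame (the values are made up for illustration and are not the asker's data) that also stores the result in column d:

```python
import pandas as pd

# Small hand-written frame (illustrative values, not from the question)
frame = pd.DataFrame({
    "a": [0.5, -1.0, 1.3, 0.4],
    "b": [-1.2, 0.6, -0.7, -1.1],
    "c": [1.3, 1.7, -1.1, -1.7],
})

# axis=1 passes each row as a Series, so x["c"] is a single number here
frame["d"] = frame.apply(lambda x: x["c"] if x["c"] > 0 else x["b"], axis=1)
print(frame)
```

Row-wise `apply` calls the lambda once per row in Python, so it is usually slower than a vectorized expression on large frames.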

Answered by piRSquared

Solution

Use a vectorized approach:

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
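
For comparison, the same selection can probably be written more directly with `numpy.where`. This is not part of the answer above, just a commonly used alternative, sketched here on made-up values:

```python
import numpy as np
import pandas as pd

# Illustrative values only; the third row has c == 0, which is not positive
frame = pd.DataFrame({"b": [-1.2, 0.6, -0.7], "c": [1.3, -1.7, 0.0]})

# Take c where c > 0, otherwise take b
frame["d"] = np.where(frame["c"] > 0, frame["c"], frame["b"])
print(frame)
```

Unlike the arithmetic form, this copies the chosen values without adding or subtracting anything, so no rounding is involved.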

Explanation

This is derived from the sum of

(frame.c > 0) * frame.c  # frame.c if positive

Plus

(frame.c <= 0) * frame.b  # frame.b if c is not positive

However

(frame.c <= 0)

is equivalent to

1 - (frame.c > 0)

and when combined you get

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
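
The derivation can be checked numerically. The sketch below assumes an arbitrary seed and uses the helper names `mask`, `m`, `two_terms`, `combined` and `reference`, none of which appear in the answer. It converts the mask to 0/1 integers explicitly and compares both forms against `Series.where`:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
frame = pd.DataFrame(rng.standard_normal((4, 3)), columns=list("abc"))

mask = frame.c > 0      # boolean mask: True where c is positive
m = mask.astype(int)    # 1 where c > 0, else 0

# Sum of the two masked terms: c where positive, b elsewhere
two_terms = m * frame.c + (1 - m) * frame.b

# Combined form from the answer
combined = frame.b + m * (frame.c - frame.b)

# Direct selection: keep c where the mask holds, otherwise use b
reference = frame.c.where(mask, frame.b)

print(np.allclose(two_terms, combined), np.allclose(combined, reference))
```

The comparison uses `np.allclose` rather than exact equality because `b + (c - b)` can differ from `c` in the last floating-point bit.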