Pandas: update column values from another column if criteria
Source: http://stackoverflow.com/questions/51787247/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors and keep the same license.
Asked by sailestim
I have a DataFrame:
A B
1: 0 1
2: 0 0
3: 1 1
4: 0 1
5: 1 0
I want to update each item in column A of the DataFrame with the value from column B if the value in column A equals 0.
The DataFrame I want to get:
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
I've already tried this code:
df['A'] = df['B'].apply(lambda x: x if df['A'] == 0 else df['A'])
It raises an error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
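For context, the error comes from `df['A'] == 0` inside the lambda: it compares the whole column and yields a boolean Series, which `if` cannot reduce to a single True/False. A minimal sketch that reproduces this, assuming the sample data from the question rebuilt with a default 0-based index:

```python
import pandas as pd

# Sample data from the question (default 0-based index is an assumption)
df = pd.DataFrame({"A": [0, 0, 1, 0, 1], "B": [1, 0, 1, 1, 0]})

# Comparing a whole column gives one boolean per row, not a single bool.
condition = df["A"] == 0
print(condition.tolist())

# Using that Series where Python expects one truth value raises ValueError.
error_message = None
try:
    if condition:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The answers avoid this by either working on whole columns at once or by evaluating the condition one row at a time.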
Answered by Zero
Use np.where:
In [348]: df.A = np.where(df.A.eq(0), df.B, df.A)
In [349]: df
Out[349]:
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
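For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It assumes the sample data from the question, rebuilt with a default 0-based index, and also shows `Series.mask` as a pandas-only alternative; the variable name `masked` is just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (default 0-based index is an assumption)
df = pd.DataFrame({"A": [0, 0, 1, 0, 1], "B": [1, 0, 1, 1, 0]})

# pandas-only alternative: replace values of A with B wherever A equals 0
masked = df["A"].mask(df["A"].eq(0), df["B"])

# np.where(condition, value_if_true, value_if_false) works element-wise
df["A"] = np.where(df["A"].eq(0), df["B"], df["A"])

print(df)
print(masked.tolist())
```

Both approaches evaluate the condition for every row at once, so no Python-level `if` on a Series is involved.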
Answered by Don Thousand
df['A'] = df.apply(lambda x: x['B'] if x['A']==0 else x['A'], axis=1)
Output:
A B
1: 1 1
2: 0 0
3: 1 1
4: 1 1
5: 1 0
Answered by ysearka
You can perform this by using a mask:
import pandas as pd
df = pd.DataFrame()
df['A'] = [0,0,1,0,1]
df['B'] = [1,0,1,1,0]
mask = (df.A == 0)
df.loc[mask,'A'] = df.loc[mask,'B']
A B
0 1 1
1 0 0
2 1 1
3 1 1
4 1 0
EDIT: OK, this is actually an inefficient solution:
%timeit df.loc[mask,'A'] = df.loc[mask,'B']
%timeit df.apply(lambda x: x['B'] if x['A']==0 else x['A'], axis=1)
%timeit np.where(df.A.eq(0), df.B, df.A)
5.52 ms ± 556 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.27 ms ± 167 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
796 μs ± 89.2 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
So thanks to Zero for this efficient solution with np.where!
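The timings above were taken on a five-row frame, where fixed overhead dominates. Here is a rough sketch for repeating the comparison on a larger frame with the standard `timeit` module. The 10,000-row random data and the helper names `with_loc`, `with_apply`, and `with_where` are assumptions for illustration, not part of the original posts, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 10,000 random 0/1 rows (not from the original post)
rng = np.random.default_rng(0)
big = pd.DataFrame({
    "A": rng.integers(0, 2, 10_000),
    "B": rng.integers(0, 2, 10_000),
})
mask = big["A"] == 0

def with_loc():
    # Boolean-mask assignment on a copy, so every run starts from the same data
    out = big.copy()
    out.loc[mask, "A"] = out.loc[mask, "B"]
    return out["A"]

def with_apply():
    # Row-wise apply: calls the lambda once per row
    return big.apply(lambda row: row["B"] if row["A"] == 0 else row["A"], axis=1)

def with_where():
    # Vectorized selection over whole columns
    return np.where(big["A"].eq(0), big["B"], big["A"])

for func in (with_loc, with_apply, with_where):
    seconds = timeit.timeit(func, number=3)
    print(f"{func.__name__}: {seconds / 3:.4f} s per call")
```

Since row-wise `apply` runs a Python function for every row, it tends to fall behind the vectorized options as the row count grows, so the ranking on larger data may differ from the one shown above.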