Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/30912403/
Appending a Boolean Column to a pandas DataFrame
Asked by linqu
I am learning pandas and got stuck with this problem here.
I created a dataframe that tracks all users and the number of times they did something.
To better understand the problem I created this example:
import pandas as pd
data = [
    {'username': 'me', 'bought_apples': 2, 'bought_pears': 0},
    {'username': 'you', 'bought_apples': 1, 'bought_pears': 1}
]
df = pd.DataFrame(data)
df['bought_something'] = df['bought_apples'] > 0 or df['bought_pears'] > 0
In the last line I want to add a column that indicates whether the user has bought anything at all.
This error pops up:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I understand the point about ambiguity in a pandas Series (also explained here), but I could not relate it to my problem.
Interestingly, this works:
df['bought_something'] = df['bought_apples'] > 0
Can anyone help me?
Answered by EdChum
You can call sum row-wise and check whether the result is greater than 0:
In [105]:
df['bought_something'] = df[['bought_apples','bought_pears']].sum(axis=1) > 0
df
Out[105]:
   bought_apples  bought_pears username  bought_something
0              2             0       me              True
1              1             1      you              True
Regarding your original attempt, the error message is telling you that the truth value of a whole Series is ambiguous: Python's or needs a single True or False from its operand, and a Series holds one value per row. If you want to combine boolean conditions with OR element-wise, you need to use the bitwise operator | and wrap each condition in parentheses, because | has higher precedence than comparison operators such as >:
In [111]:
df['bought_something'] = ((df['bought_apples'] > 0) | (df['bought_pears'] > 0))
df
Out[111]:
   bought_apples  bought_pears username  bought_something
0              2             0       me              True
1              1             1      you              True
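For reference, here is a minimal self-contained sketch that puts the two approaches above side by side, using the example data from the question. The names by_sum, by_or and by_any are illustrative only, and the .any(axis=1) variant is an addition that does not appear in the original answer. Note that the sum-based test assumes the counts are never negative.

```python
import pandas as pd

data = [
    {'username': 'me', 'bought_apples': 2, 'bought_pears': 0},
    {'username': 'you', 'bought_apples': 1, 'bought_pears': 1},
]
df = pd.DataFrame(data)

cols = ['bought_apples', 'bought_pears']

# Approach 1: sum the counts row-wise and test the total.
# (Assumes counts are non-negative, otherwise values could cancel out.)
by_sum = df[cols].sum(axis=1) > 0

# Approach 2: element-wise OR of two boolean Series.
# The parentheses are required because | binds tighter than >.
by_or = (df['bought_apples'] > 0) | (df['bought_pears'] > 0)

# Not in the original answer: compare every column, then ask
# whether any of the results is True in each row.
by_any = (df[cols] > 0).any(axis=1)

df['bought_something'] = by_or
print(df)
```

All three expressions yield one boolean per row, so any of them can be assigned directly as a new column.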
Answered by Jianxun Li
The reason for that error is that you use 'or' to 'join' two boolean vectors instead of two boolean scalars. That's why it says the truth value is ambiguous.
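The difference can be seen in a small sketch. The keyword or asks Python for a single truth value from its left operand, which works for a scalar but not for a Series holding several values. The variable names and data below are illustrative only and are not taken from the question.

```python
import pandas as pd

apples = pd.Series([2, 0])
pears = pd.Series([0, 0])

# With scalars, `or` is fine: each side has one unambiguous truth value.
scalar_result = (apples.iloc[0] > 0) or (pears.iloc[0] > 0)
print(bool(scalar_result))

# With Series, `or` needs bool() of the left operand, which pandas refuses
# because a Series with several elements has no single truth value.
try:
    (apples > 0) or (pears > 0)
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)
    print(message)

# The element-wise operator | returns one boolean per row instead.
elementwise = (apples > 0) | (pears > 0)
print(elementwise.tolist())
```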

