Add new column to Python Pandas DataFrame based on multiple conditions
Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/49586471/
Asked by Abdul Rehman
I have a dataset with various columns as below:
discount tax total subtotal productid
3.98 1.06 21.06 20 3232
3.98 1.06 21.06 20 3232
3.98 6 106 100 3498
3.98 6 106 100 3743
3.98 6 106 100 3350
3.98 6 106 100 3370
46.49 3.36 66.84 63 695
Now, I need to add a new column Class and assign it the value 0 or 1 based on the following conditions:
if:
discount > 20%
no tax
total > 100
then Class should be 1,
otherwise it should be 0.
I have done it with a single condition, but I don't know how to accomplish it with multiple conditions.
Here's what I have tried:
df_full['Class'] = df_full['amount'].map(lambda x: 1 if x > 100 else 0)
I have taken a look at all the other similar questions but couldn't find a solution to my problem. I have tried everything in the above-mentioned posts but am stuck on this error:
TypeError: '>' not supported between instances of 'str' and 'int'
Following the first posted answer, I tried:
df_full['class'] = np.where( ( (df_full['discount'] > 20) & (df_full['tax'] == 0 ) & (df_full['total'] > 100) & df_full['productdiscount'] ) , 1, 0)
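The TypeError above suggests that at least one of the compared columns holds strings rather than numbers. A minimal sketch of one way to check and fix that, assuming the columns were loaded as text; the sample values here are made up, and only the column names come from the question:

```python
import pandas as pd

# Hypothetical sample: the numeric columns were read in as strings.
df_full = pd.DataFrame({
    "discount": ["3.98", "46.49"],
    "tax": ["1.06", "0"],
    "total": ["21.06", "166.84"],
})

# Convert each column used in a comparison to a numeric dtype.
# errors="coerce" turns unparseable values into NaN instead of raising.
for col in ["discount", "tax", "total"]:
    df_full[col] = pd.to_numeric(df_full[col], errors="coerce")

# The comparison now works element-wise and yields a boolean Series.
mask = df_full["total"] > 100
```

Inspecting `df_full.dtypes` before and after the conversion is a quick way to confirm which columns were stored as text.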
Answered by Gustavo Bezerra
You can apply an arbitrary function across each row of a dataframe using DataFrame.apply.
In your case, you could define a function like:
def conditions(s):
    if (s['discount'] > 20) or (s['tax'] == 0) or (s['total'] > 100):
        return 1
    else:
        return 0
And use it to add a new column to your data:
df_full['Class'] = df_full.apply(conditions, axis=1)
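To see the row-wise approach end to end, here is a small self-contained sketch. The three rows are made-up values in the shape of the question's data, and the function keeps the `or` logic used in this answer; swapping `or` for `and` would require all three tests to pass instead.

```python
import pandas as pd

# Made-up rows in the shape of the question's data.
df_full = pd.DataFrame({
    "discount": [3.98, 46.49, 3.98],
    "tax": [1.06, 3.36, 6.0],
    "total": [21.06, 66.84, 106.0],
})

def conditions(s):
    # Return 1 if any one of the three tests passes, else 0.
    if (s["discount"] > 20) or (s["tax"] == 0) or (s["total"] > 100):
        return 1
    else:
        return 0

# axis=1 passes each row to conditions() as a Series.
df_full["Class"] = df_full.apply(conditions, axis=1)
print(df_full["Class"].tolist())  # [0, 1, 1]
```

The second row qualifies through its discount and the third through its total, while the first row fails every test.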
Answered by Karl Anka
Judging by the image of your data, it is rather unclear what you mean by a discount of 20%.
However, you can likely do something like this.
df['class'] = 0  # add a class column with 0 as default value
# find all rows that fulfill your conditions and set class to 1
df.loc[(df['discount'] / df['total'] > .2) &  # if discount is more than .2 of total
       (df['tax'] == 0) &                     # if tax is 0
       (df['total'] > 100),                   # if total is > 100
       'class'] = 1                           # then set class to 1
Note that & means "and" here; if you want "or" instead, use |.
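The difference between & and | can be sketched with numpy.where on a few made-up rows. The column names follow the question; the values and the output column names `class_all` and `class_any` are purely illustrative.

```python
import numpy as np
import pandas as pd

# Made-up rows chosen so that the two combinations give different results.
df = pd.DataFrame({
    "discount": [30.0, 30.0, 5.0, 1.0],
    "tax": [0.0, 6.0, 0.0, 2.0],
    "total": [120.0, 120.0, 50.0, 50.0],
})

# Naming each mask avoids the parentheses needed when comparisons are
# combined inline, since & and | bind more tightly than > and ==.
high_discount = df["discount"] / df["total"] > 0.2
no_tax = df["tax"] == 0
big_total = df["total"] > 100

# & requires every condition to hold; | requires at least one.
df["class_all"] = np.where(high_discount & no_tax & big_total, 1, 0)
df["class_any"] = np.where(high_discount | no_tax | big_total, 1, 0)
```

Only the first row satisfies all three conditions, while every row except the last satisfies at least one.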