Notice: this page is a side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/43160484/

Time: 2020-09-14 03:19:39  Source: igfitidea

Making a new column in pandas based on conditions of other columns

Tags: python, pandas, dataframe, lambda, apply

Asked by nickm

I would like to make a new column based on an if statement that has conditionals of two or more other columns in a dataframe.

For example, column3 = True if (column1 < 10.0) and (column2 > 0.0).

I have looked around and it seems that others have used the apply method with a lambda function, but I am a bit of a novice with these.

I suppose I could make two additional columns, each set to 1 in a row where the condition on its source column is met, and then sum them to check whether all conditions are met, but this seems a bit inelegant.

If you provide an answer with apply/lambda, let's suppose the dataframe is called sample_df and the columns are col1, col2, and col3.

Thanks so much!

Accepted answer by pansen

You can use eval here for short:

import numpy as np
import pandas as pd

# create some dummy data (random, so your values will differ from the output shown)
df = pd.DataFrame(np.random.randint(0, 10, size=(5, 2)),
                  columns=["col1", "col2"])
print(df)

    col1    col2
0   1       7
1   2       3
2   4       6
3   2       5
4   5       4

df["col3"] = df.eval("col1 < 5 and col2 > 5")
print(df)

    col1    col2    col3
0   1       7       True
1   2       3       False
2   4       6       True
3   2       5       False
4   5       4       False

You can also write it without eval via (df["col1"] < 5) & (df["col2"] > 5).
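
As a minimal sketch of that eval-free form, the snippet below builds the boolean mask directly and compares it with the eval result. The fixed values are an assumption, chosen to match the printed example rather than drawn at random.

```python
import pandas as pd

# Fixed data mirroring the printed example (assumed here for reproducibility)
df = pd.DataFrame({"col1": [1, 2, 4, 2, 5], "col2": [7, 3, 6, 5, 4]})

# Element-wise comparisons combined with &; the parentheses are required
# because & binds more tightly than < and > in Python
mask = (df["col1"] < 5) & (df["col2"] > 5)
df["col3"] = mask

# The eval version yields the same boolean values
via_eval = df.eval("col1 < 5 and col2 > 5")
same = bool((mask == via_eval).all())

print(df["col3"].tolist())  # [True, False, True, False, False]
print(same)                 # True
```

Note that the plain Python keyword and does not work between two Series outside of eval; the element-wise operator & is needed there.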

You may also enhance the example with np.where to explicitly set the values for the positive and negative cases right away:

df["col4"] = np.where(df.eval("col1 < 5 and col2 > 5"), "Positive Value", "Negative Value")
print(df)

    col1    col2    col3    col4
0   1       7       True    Positive Value
1   2       3       False   Negative Value
2   4       6       True    Positive Value
3   2       5       False   Negative Value
4   5       4       False   Negative Value
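
Since the question asks about apply with a lambda, here is a rough sketch of that variant next to the vectorized one. It uses the sample_df name and the thresholds from the question; the data values are made up for illustration. The row-wise version is typically slower on large frames, so the vectorized form is usually the better default.

```python
import pandas as pd

# sample_df and the thresholds come from the question; the values are invented
sample_df = pd.DataFrame({
    "col1": [3.0, 12.0, 9.5, 10.0],
    "col2": [1.5, 2.0, -1.0, 4.0],
})

# Row-wise version: axis=1 passes each row to the lambda as a Series
sample_df["col3"] = sample_df.apply(
    lambda row: bool(row["col1"] < 10.0 and row["col2"] > 0.0), axis=1
)

# Vectorized equivalent of the same condition
vectorized = (sample_df["col1"] < 10.0) & (sample_df["col2"] > 0.0)

print(sample_df["col3"].tolist())  # [True, False, False, False]
print(vectorized.tolist())         # [True, False, False, False]
```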