How to use a conditional statement based on DataFrame boolean value in pandas

Notice: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/32713221/

python, pandas

Asked by iNoob

I know how to check a DataFrame for specific values across multiple columns. However, I can't work out how to run an if statement based on the boolean result.

For example:

Walk directories using os.walk and read a specific file into a DataFrame.

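import fnmatch, os
import pandas as pd
for root, dirs, files in os.walk(main):  # main: path of the top-level directory
    filters = '*specificfile.csv'
    for filename in fnmatch.filter(files, filters):
        # error_bad_lines=False was removed in pandas 2.0; on_bad_lines='skip' (pandas 1.3+) replaces it
        df = pd.read_csv(os.path.join(root, filename), on_bad_lines='skip')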

Now I check that DataFrame across multiple columns. The first value is the column name (column1) and the next is the specific value I am looking for in that column (banana). I then check another column (column2) for a specific value (green). If both conditions are true I want to carry out a specific task; if not, I want to do something else.

So something like:

if (df['column1']=='banana') & (df['colour']=='green'):
    do something
else: 
    do something

Accepted answer by Anand S Kumar

If you want to check whether any row of the DataFrame meets your conditions, you can call .any() on the combined condition. Example -

if ((df['column1']=='banana') & (df['colour']=='green')).any():
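The line above can be completed into a full if/else along these lines. This is only a minimal sketch: the rows of the DataFrame are made up for illustration, and only the column names (column1, colour) come from the question.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "column1": ["apple", "banana", "banana"],
    "colour": ["red", "green", "yellow"],
})

# Element-wise test: one True/False per row.
mask = (df["column1"] == "banana") & (df["colour"] == "green")

# .any() collapses the Series to a single boolean that `if` can use.
if mask.any():
    result = "found a green banana"
else:
    result = "no green banana"

print(result)
```

Keeping the mask in its own variable makes it easy to reuse the same condition elsewhere in the loop.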

Demo -

In [16]: df
Out[16]:
   A  B
0  1  2
1  3  4
2  5  6

In [17]: ((df['A']==1) & (df['B'] == 2)).any()
Out[17]: True

This is needed because your condition, ((df['column1']=='banana') & (df['colour']=='green')), returns a Series of True/False values rather than a single boolean.
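Passing such a Series straight to if is presumably what fails in the question. The sketch below, which rebuilds the small A/B DataFrame from the demo as an assumption, shows pandas raising a ValueError when a multi-row Series is used as a condition.

```python
import pandas as pd

# Same values as the A/B demo DataFrame, rebuilt here so the snippet is self-contained.
df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
condition = (df["A"] == 1) & (df["B"] == 2)

error_message = None
try:
    if condition:  # pandas refuses to reduce a multi-element Series to one bool
        pass
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

The error text itself suggests reducing the Series first, for example with .any() or .all().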

In pandas, comparing a Series against a scalar compares each row of the Series with that scalar and returns a Series of True/False values, one per row. Example -

In [19]: (df['A']==1)
Out[19]:
0     True
1    False
2    False
Name: A, dtype: bool

In [20]: (df['B'] == 2)
Out[20]:
0     True
1    False
2    False
Name: B, dtype: bool

The & operator then performs a row-wise and of the two Series. Example -

In [18]: ((df['A']==1) & (df['B'] == 2))
Out[18]:
0     True
1    False
2    False
dtype: bool
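To illustrate why & is used here instead of Python's and keyword, the following sketch (same A/B values, rebuilt as an assumption) contrasts the two. The variable names left, right, combined and used_and are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
left = df["A"] == 1
right = df["B"] == 2

# & works element-wise and returns one result per row.
combined = left & right
print(combined.tolist())

# `and` first asks for bool(left), which pandas rejects for a multi-row Series.
try:
    left and right
    used_and = "ok"
except ValueError:
    used_and = "ValueError"
print(used_and)
```

The parentheses around each comparison matter as well, because & binds more tightly than ==.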

To check whether any value in this Series is True, use .any(); to check whether all of its values are True, use .all().
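As a rough side-by-side of the two reductions, here is a sketch using the same A/B values as the demo. The branch labels are placeholders for whatever task you want to run in each case.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
condition = (df["A"] == 1) & (df["B"] == 2)

any_match = bool(condition.any())  # at least one row satisfies both tests
all_match = bool(condition.all())  # every row satisfies both tests

if all_match:
    outcome = "every row matches"
elif any_match:
    outcome = "some rows match"
else:
    outcome = "no row matches"

print(outcome)
```

Here only row 0 satisfies both tests, so .any() is True while .all() is False.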