Original source: http://stackoverflow.com/questions/32713221/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to use a conditional statement based on DataFrame boolean value in pandas
Asked by iNoob
I know how to check a DataFrame for specific values across multiple columns. However, I can't work out how to write an if statement based on the boolean result.
For example:
Walk directories using os.walk and read a specific file into a DataFrame.
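import fnmatch
import os
import pandas as pd
for root, dirs, files in os.walk(main):
    filters = '*specificfile.csv'
    for filename in fnmatch.filter(files, filters):
        # pandas >= 1.3 spelling; older versions used error_bad_lines=False
        df = pd.read_csv(os.path.join(root, filename), on_bad_lines='skip')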
Next I check that DataFrame across multiple columns. The first value is the column name (column1), and the next is the specific value I am looking for in that column (banana). I then check another column (colour) for a specific value (green). If both conditions are true I want to carry out a specific task; if not, I want to do something else.
So something like:
if (df['column1']=='banana') & (df['colour']=='green'):
    do something
else:
    do something else
Accepted answer by Anand S Kumar
If you want to check whether any row of the DataFrame meets your conditions, you can use .any() along with your condition. Example -
if ((df['column1']=='banana') & (df['colour']=='green')).any():
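To make the line above concrete, here is a minimal, self-contained sketch of the full if/else. The sample rows and the `result` variable are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "column1": ["apple", "banana", "banana"],
    "colour": ["red", "green", "yellow"],
})

# Element-wise comparisons combined row by row: one boolean per row.
mask = (df["column1"] == "banana") & (df["colour"] == "green")

# .any() collapses the boolean Series into a single truth value.
if mask.any():
    result = "found a green banana"
else:
    result = "no green banana"

print(result)  # found a green banana
```

Only the second row satisfies both conditions, which is enough for .any() to take the first branch.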
Demo -
In [16]: df
Out[16]:
   A  B
0  1  2
1  3  4
2  5  6

In [17]: ((df['A']==1) & (df['B'] == 2)).any()
Out[17]: True
.any() is needed because your condition - ((df['column1']=='banana') & (df['colour']=='green')) - returns a Series of True/False values rather than a single boolean.
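As a rough illustration of what goes wrong without .any(), the sketch below passes the boolean Series straight to an if statement. pandas raises a ValueError in that case; the exact wording of the message depends on the pandas version, so the code only captures it. The variable `error_message` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
condition = (df["A"] == 1) & (df["B"] == 2)

try:
    if condition:  # a multi-element Series has no single truth value
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# The message says the truth value of a Series is ambiguous.
print(error_message)
```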
In pandas, when you compare a Series against a scalar value, each element of the Series is compared against that scalar, and the result is a Series of True/False values, one per row. Example -
In [19]: (df['A']==1)
Out[19]:
0     True
1    False
2    False
Name: A, dtype: bool

In [20]: (df['B'] == 2)
Out[20]:
0     True
1    False
2    False
Name: B, dtype: bool
The & operator then performs a row-wise and of the two Series. Example -
In [18]: ((df['A']==1) & (df['B'] == 2))
Out[18]:
0     True
1    False
2    False
dtype: bool
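A related pitfall, sketched here as an aside rather than as part of the original answer: Python's `and` keyword is not a substitute for &. `and` asks for the truth value of its left operand first, which a multi-element Series cannot provide. The names `combined` and `and_failed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})

# & combines the two boolean Series element by element.
combined = (df["A"] == 1) & (df["B"] == 2)

# `and` needs a single truth value for the left Series,
# which pandas refuses to give, so this raises ValueError.
try:
    (df["A"] == 1) and (df["B"] == 2)
    and_failed = False
except ValueError:
    and_failed = True

print(combined.tolist(), and_failed)  # [True, False, False] True
```

The parentheses around each comparison also matter, because & binds more tightly than == in Python.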
Now, to check whether any of the values in this Series is True, you can use .any(); to check whether all the values in the Series are True, you can use .all().
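A small sketch of the difference between the two, reusing the demo DataFrame from above. The variable names are illustrative, and the extra `> 0` condition is an assumption added only to show a case where .all() is True.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 3, 5], "B": [2, 4, 6]})
mask = (df["A"] == 1) & (df["B"] == 2)  # True, False, False

any_match = bool(mask.any())  # True: at least one row matches
all_match = bool(mask.all())  # False: rows 1 and 2 do not match

# A condition that every row happens to satisfy.
all_positive = bool(((df["A"] > 0) & (df["B"] > 0)).all())  # True

print(any_match, all_match, all_positive)  # True False True
```

Use .any() when one matching row is enough to trigger the branch, and .all() when every row must satisfy the condition.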