Original question: http://stackoverflow.com/questions/38802675/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Create bool mask from filter results in Pandas
Asked by ade1e
I know how to create a mask to filter a dataframe when querying a single column:
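import pandas as pd

# 100 half-hourly timestamps starting at 2013-01-01 00:00
index = pd.date_range('2013-1-1', periods=100, freq='30min')
data = pd.DataFrame(data=list(range(100)), columns=['value'], index=index)
data['value2'] = 'A'
# Set the first 10 rows to 'B' with a single .loc call (avoids chained assignment)
data.loc[data.index[:10], 'value2'] = 'B'
data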
                     value value2
2013-01-01 00:00:00      0      B
2013-01-01 00:30:00      1      B
2013-01-01 01:00:00      2      B
2013-01-01 01:30:00      3      B
2013-01-01 02:00:00      4      B
2013-01-01 02:30:00      5      B
2013-01-01 03:00:00      6      B
I use a simple mask here:
mask = data['value'] > 4
data[mask]
                     value value2
2013-01-01 02:30:00      5      B
2013-01-01 03:00:00      6      B
2013-01-01 03:30:00      7      B
2013-01-01 04:00:00      8      B
2013-01-01 04:30:00      9      B
2013-01-01 05:00:00     10      A
My question is: how do I create a mask that involves multiple columns? For example, if I do this:
data[data['value2'] == 'A' ][data['value'] > 4]
This filters as I would expect, but how do I create a bool mask from it, as in my first example? I have provided test data for this case, but I often want to create masks on other types of data, so I'm looking for any pointers.
Answered by Kartik
Your boolean masks are boolean (obviously), so you can use boolean operations on them. The boolean operators include (but are not limited to) & and |, which combine your masks with an 'and' operation or an 'or' operation respectively. In your specific case, you need an 'and' operation, so you simply write your mask like so:
mask = (data['value2'] == 'A') & (data['value'] > 4)
This ensures that you select only those rows for which both conditions are satisfied simultaneously. By replacing the & with |, you can instead select the rows for which at least one of the two conditions is satisfied. You can select your result as usual:
data[mask]
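To make the difference between the operators concrete, here is a minimal sketch that rebuilds the question's test data and combines the two conditions with &, | and ~ (negation). The variable names is_a, above_4, both, either and neither are only illustrative. Note that each comparison is wrapped in parentheses or stored in its own variable first: & and | bind more tightly than == and >, so an unparenthesized expression would be grouped differently and typically raises an error.

```python
import pandas as pd

# Rebuild the question's test data: first 10 rows are 'B', the rest 'A'
index = pd.date_range('2013-01-01', periods=100, freq='30min')
data = pd.DataFrame({'value': list(range(100)), 'value2': 'A'}, index=index)
data.loc[data.index[:10], 'value2'] = 'B'

# Each comparison yields a boolean Series aligned with data's index
is_a = data['value2'] == 'A'
above_4 = data['value'] > 4

both = is_a & above_4      # 'and': rows where both conditions hold
either = is_a | above_4    # 'or': rows where at least one condition holds
neither = ~either          # 'not': rows where neither condition holds

# Filtering in two steps gives the same rows as the combined 'and' mask
step = data[is_a]
step = step[step['value'] > 4]

print(both.sum(), either.sum(), neither.sum())
print(data[both].equals(step))
```

Because the masks are ordinary boolean Series, they can be stored, inspected with .sum() to count matching rows, and reused to filter the frame as often as needed.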
Although this question is already answered by the answer to the question that ayhan links to in his comment, I thought that the OP was missing the idea of boolean operations.
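Since the question also asks about masks on other types of data, the same idea extends to any expression that returns a boolean Series. The sketch below uses a small made-up frame (the column name 'name' and all values are hypothetical, not from the question) and combines a list of conditions with functools.reduce, which is one possible way to 'and' together an arbitrary number of masks.

```python
from functools import reduce
import operator

import pandas as pd

# Hypothetical example data, not taken from the question
df = pd.DataFrame({
    'value': [1, 5, 8, 12, 20],
    'value2': ['A', 'B', 'A', 'C', 'A'],
    'name': ['apple', 'banana', 'avocado', 'cherry', 'apricot'],
})

# Any expression that returns a boolean Series can serve as a mask
conditions = [
    df['value'].between(5, 20),        # numeric range, inclusive on both ends by default
    df['value2'].isin(['A', 'C']),     # membership in a set of labels
    df['name'].str.startswith('a'),    # string predicate
]

# Combine all masks with 'and'; use operator.or_ instead for 'or'
mask = reduce(operator.and_, conditions)

print(df[mask])
```

Keeping the conditions in a list makes it easy to add or drop a filter without rewriting one long parenthesized expression.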