Filtering Pandas Dataframe using OR statement
Original question: http://stackoverflow.com/questions/29461185/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by Josh
I have a pandas dataframe and I want to filter the whole df based on the value of two columns in the data frame. I want to get back all rows and columns where IBRD or IMF != 0.
alldata_balance = alldata[(alldata[IBRD] !=0) or (alldata[IMF] !=0)]
but this gives me a ValueError
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
So I know I am not using the or statement correctly. Is there a way to do this?
Accepted answer by Liam Foley
From the docs:
Another common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses.
http://pandas.pydata.org/pandas-docs/version/0.15.2/indexing.html#boolean-indexing
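To illustrate why the docs insist on parentheses, here is a minimal sketch. The DataFrame `df` and its columns `a` and `b` are hypothetical and exist only for this example. In Python, `|` binds more tightly than `!=`, so leaving out the parentheses changes what gets compared.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [0, 1, 0], "b": [0, 0, 2]})

# With parentheses: each comparison is evaluated first,
# then the two boolean Series are combined element-wise.
mask = (df["a"] != 0) | (df["b"] != 0)

# Without parentheses, `|` binds tighter than `!=`, so Python reads this as
# df["a"] != (0 | df["b"]) != 0, a chained comparison that needs the truth
# value of a Series and therefore raises ValueError.
try:
    df["a"] != 0 | df["b"] != 0
    unparenthesized_failed = False
except ValueError:
    unparenthesized_failed = True
```

The parenthesized mask is a boolean Series that can be passed straight to `df[...]`.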
Try:
alldata_balance = alldata[(alldata[IBRD] !=0) | (alldata[IMF] !=0)]
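As a self-contained sketch of the line above, assuming `IBRD` and `IMF` are string column names: the values and the `country` column below are made up for illustration, and only the names `alldata`, `alldata_balance`, `IBRD` and `IMF` come from the question.

```python
import pandas as pd

# Hypothetical values; only the column names IBRD and IMF come from the question.
alldata = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "IBRD": [0, 5, 0, 3],
    "IMF": [0, 0, 7, 1],
})

# Keep rows where at least one of the two columns is non-zero.
# `|` combines the two boolean Series element-wise, which `or` cannot do.
alldata_balance = alldata[(alldata["IBRD"] != 0) | (alldata["IMF"] != 0)]
print(alldata_balance)
```

Only row A, where both columns are zero, is dropped; all columns are kept.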
Answer by Majed
You can do something like the following to get your result:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
....
....
# Filter with OR (|), then plot the counts.
# Note: factorplot was renamed to catplot in seaborn 0.9.
fg = sns.catplot(x='Retailer country', data=df1[(df1['Retailer country'] == 'United States') | (df1['Retailer country'] == 'France')], kind='count')
fg.set_xlabels('Retailer country')
plt.show()
# Filter with AND (&), then plot the counts.
fg = sns.catplot(x='Retailer country', data=df1[(df1['Retailer country'] == 'United States') & (df1['Year'] == '2013')], kind='count')
fg.set_xlabels('Retailer country')
plt.show()
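The two filters used in the plots can also be checked on their own, without plotting. The sketch below uses a hypothetical stand-in for `df1`, since the real data is not shown in the answer. It also assumes `Year` holds integers, in which case comparing against the string `'2013'` matches no rows; if the real column holds strings, the original comparison is the right one.

```python
import pandas as pd

# Hypothetical stand-in for df1; the real data is not shown in the answer.
df1 = pd.DataFrame({
    "Retailer country": ["United States", "France", "Japan", "United States"],
    "Year": [2013, 2013, 2013, 2012],  # assumed to be integers here
})

# OR: rows for either country.
us_or_fr = df1[(df1["Retailer country"] == "United States")
               | (df1["Retailer country"] == "France")]

# AND: compare against the same type the column holds.
us_2013 = df1[(df1["Retailer country"] == "United States") & (df1["Year"] == 2013)]

# Comparing an integer column with the string "2013" matches nothing.
us_2013_str = df1[(df1["Retailer country"] == "United States") & (df1["Year"] == "2013")]
```

Checking `df1["Year"].dtype` before filtering shows which form of the comparison applies.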