How to select rows based on categories in a Pandas DataFrame
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/33468566/
Asked by Skywalker326
This is really trivial, but I can't believe I have wandered around for an hour and still can't find the answer, so here you are:
import pandas as pd
df = pd.DataFrame({"cats":["a","b"], "vals":[1,2]})
df.cats = df.cats.astype("category")
df
My problem is how to select the rows whose "cats" column has the category "a". I know that df.loc[df.cats == "a"] will work, but it is based on element-wise equality. Is there a way to select based on the levels of the category?
Answered by Mike Müller
This works:
df.cats[df.cats=='a']
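As a minimal sketch of what this expression returns, the snippet below rebuilds the question's frame and applies the same boolean mask in two ways. The names mask, only_column, and whole_rows are illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({"cats": ["a", "b"], "vals": [1, 2]})
df["cats"] = df["cats"].astype("category")

# Boolean Series with one entry per row
mask = df["cats"] == "a"

# Indexing the column gives a Series with only the matching "cats" values
only_column = df["cats"][mask]

# Indexing the frame gives a DataFrame with every column of the matching rows
whole_rows = df[mask]
```

Indexing df.cats keeps only that column, so df[mask] is likely the form to use when the other columns of the matching rows are needed as well.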
UPDATE
The question was updated. New solution:
df[df.cats.cat.categories == ['a']]
Answered by Michael P.
You can query the list of categories using df.cats.cat.categories, which prints:
Index(['a', 'b'], dtype='object')
In this case, to select the rows whose category is 'a', which is df.cats.cat.categories[0], you just use:
df[df.cats == df.cats.cat.categories[0]]
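A self-contained sketch of this approach follows. It assumes a slightly larger frame than the question's (a third row is added for illustration), and the names first_category and selected are hypothetical.

```python
import pandas as pd

# Three rows but only two categories, so the two counts differ
df = pd.DataFrame({"cats": ["a", "b", "a"], "vals": [1, 2, 3]})
df["cats"] = df["cats"].astype("category")

# Look up the category label by its position in the category index
first_category = df["cats"].cat.categories[0]

# Compare the column against that label to get a row-aligned mask
selected = df[df["cats"] == first_category]
```

Because the comparison is made on the column itself, the mask has one entry per row and keeps both rows labelled 'a'.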
Answered by Aru
df[df.cats.cat.categories == df.cats.cat.categories[0]]
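Note that a comparison on df.cats.cat.categories produces one boolean per category rather than one per row, so it lines up with the frame only when the two counts happen to match, as in the two-row example. One possible way to select by level while keeping a row-aligned mask is to compare the integer codes. This is a sketch under that assumption, not the method from any of the answers, and code_of_a, row_mask, and selected are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"cats": ["a", "b", "a"], "vals": [1, 2, 3]})
df["cats"] = df["cats"].astype("category")

# Position of the category "a" within the category index
code_of_a = df["cats"].cat.categories.get_loc("a")

# cat.codes holds one integer code per row, so the mask is row-aligned
row_mask = df["cats"].cat.codes == code_of_a

selected = df[row_mask]
```

Here the selection works on the category level (its integer code) instead of comparing the element values directly, which is closer to what the question asks for.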