Adding a filter to a pandas pivot table
Original question: http://stackoverflow.com/questions/43235930/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
Asked by progster
I would like to add a filtering condition to a pivot table, like this:
(Select the values of v2 equal to 'A')
pd.pivot_table(df,index=['v1'],columns=['v2'=='A'],values=['v3'],aggfunc='count')
Is that possible?
Answered by Josh Janjua
This is an extension of Grr's answer.
Using their suggestion:
pd.pivot_table(df[df.v3 == some_value], index='v1', columns='A', values='v3', aggfunc='count')
Produces an error:
"TypeError: pivot_table() got multiple values for argument 'values'"
I made a slight tweak, and it works for me:
df[df.v3 == some_value].pivot_table(index='v1', columns='A', values='v3', aggfunc='count')
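The answer does not show the exact call that failed, so the following is only a sketch of one way this TypeError can arise: passing a DataFrame positionally to the method form df.pivot_table(...), whose first positional parameter is values. The sample data and the v2 == 'A' filter are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "v1": ["x", "x", "y", "y", "y"],
    "v2": ["A", "B", "A", "A", "B"],
    "v3": [1, 2, 3, 4, 5],
})
filtered = df[df["v2"] == "A"]

# Function form: the DataFrame is the first argument.
function_form = pd.pivot_table(
    filtered, index="v1", columns="v2", values="v3", aggfunc="count"
)
# Method form: the DataFrame is the object the method is called on.
method_form = filtered.pivot_table(
    index="v1", columns="v2", values="v3", aggfunc="count"
)
print(function_form.equals(method_form))

# Mixing the two forms: the positional DataFrame lands in the `values`
# parameter of the method and collides with the values= keyword.
error_message = ""
try:
    df.pivot_table(filtered, index="v1", columns="v2", values="v3", aggfunc="count")
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Under these assumptions the function form and the method form build the same table, and only the mixed call fails.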
To add multiple filters, combine the conditions with the & and | operators and wrap each condition in parentheses to set the precedence. Using the keywords and / or instead raises an error.
df[(df.v3 == some_value) & (df.v4 == some_value)].pivot_table(index='v1', columns='A', values='v3', aggfunc='count')
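As a minimal, self-contained sketch of combining filters, the example below uses made-up data and a hypothetical v4 column. It applies & and | to parenthesized conditions, and it also shows that the and keyword fails on a pandas Series.

```python
import pandas as pd

# Hypothetical sample data; v4 is an extra column added for illustration.
df = pd.DataFrame({
    "v1": ["x", "x", "y", "y", "y"],
    "v2": ["A", "B", "A", "A", "B"],
    "v3": [1, 2, 3, 4, 5],
    "v4": [10, 20, 10, 20, 10],
})

# Parentheses are needed because & and | bind more tightly than ==.
both = df[(df["v2"] == "A") & (df["v4"] == 10)]    # rows matching both conditions
either = df[(df["v2"] == "A") | (df["v4"] == 10)]  # rows matching at least one

table_and = both.pivot_table(index="v1", columns="v2", values="v3", aggfunc="count")
table_or = either.pivot_table(index="v1", columns="v2", values="v3", aggfunc="count")
print(table_and)
print(table_or)

# The `and` keyword asks for a single truth value of a whole Series,
# which pandas refuses to give.
and_failed = False
try:
    df[(df["v2"] == "A") and (df["v4"] == 10)]
except ValueError:
    and_failed = True
print(and_failed)
```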
Answered by Grr
If you want to filter by columns you could just pass a single column name, or list of names. For example:
pd.pivot_table(df, index='v1', columns='A', values='v3', aggfunc='count')
pd.pivot_table(df, index='v1', columns=['A', 'B', 'C'], values='v3', aggfunc='count')
If you want to filter by values you would just filter the DataFrame. For example:
pd.pivot_table(df[df.v3 == some_value], index='v1', columns='A', values='v3', aggfunc='count')
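Applied to the original question (keep only the rows where v2 equals 'A'), filtering before pivoting might look like the sketch below. The sample data is hypothetical, and only the column names v1, v2, and v3 come from the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "v1": ["x", "x", "y", "y", "y"],
    "v2": ["A", "B", "A", "A", "B"],
    "v3": [1, 2, 3, 4, 5],
})

# Boolean indexing keeps only the rows where v2 equals 'A'.
filtered = df[df["v2"] == "A"]

# The pivot then only ever sees the filtered rows, so 'B' never appears.
table = pd.pivot_table(
    filtered, index="v1", columns="v2", values="v3", aggfunc="count"
)
print(table)
```

Because the filter runs before the pivot, the resulting table has a single column for 'A'.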
Answered by Vishnu Dhas
You can also use a where condition here:
df.where(df.v3 == some_value).pivot_table(index='v1', columns='A', values='v3', aggfunc='count')
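The sketch below shows how where differs from boolean indexing: instead of dropping the rows that fail the condition, it keeps them and fills them with NaN, and the pivot then skips them. The sample data and the v2 == 'A' condition are assumptions for illustration, and the condition is passed as a plain boolean Series rather than wrapped in a list.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
df = pd.DataFrame({
    "v1": ["x", "x", "y", "y", "y"],
    "v2": ["A", "B", "A", "A", "B"],
    "v3": [1, 2, 3, 4, 5],
})

# where() keeps the original shape; rows failing the condition become NaN.
masked = df.where(df["v2"] == "A")
print(masked)

# Rows whose grouping keys are NaN are left out of the pivot.
table = masked.pivot_table(index="v1", columns="v2", values="v3", aggfunc="count")
print(table)
```

One side effect to keep in mind is that masking introduces NaN, so an integer column such as v3 is converted to floating point in masked.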