Pandas crosstab, but with values from aggregation of third column
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/39735068/
Asked by user1700890
Here is my problem:
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': [1, 0, 0, 1, 0]})
I would like to generate something like the output of the pd.crosstab function, but the value at each intersection of row and column should come from an aggregation of a third column:
      Ar   Br   Cr
one   0.5  0    0
two   1    0    0
For example, there are two rows where 'A' is 'one' and 'B' is 'Ar', and the corresponding values in column 'C' are 1 and 0. We sum those values (0+1) and divide by the number of values, so we get (0+1)/2 = 0.5. Whenever a combination is not present (like 'Cr' and 'one'), we set it to zero. Any thoughts?
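The arithmetic for a single cell can be checked by hand. The following is only a minimal sketch of that check for the ('one', 'Ar') cell; the variable names `cell` and `cell_mean` are illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': [1, 0, 0, 1, 0]})

# Select the values of C for rows where A == 'one' and B == 'Ar'.
cell = df.loc[(df['A'] == 'one') & (df['B'] == 'Ar'), 'C']

# Sum the selected values and divide by how many there are: (1 + 0) / 2.
cell_mean = cell.sum() / len(cell)
print(cell_mean)
```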
Answered by MaxU
You can use the pivot_table() method, which uses aggfunc='mean' by default:
In [46]: df.pivot_table(index='A', columns='B', values='C', fill_value=0)
Out[46]:
B     Ar  Br  Cr
A
one  0.5   0   0
two  1.0   0   0
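For completeness, here is a self-contained sketch of the approach above, next to one possible equivalent using pd.crosstab with its values and aggfunc arguments. The crosstab variant is an addition for comparison and not part of the original answer; the names `by_pivot` and `by_crosstab` are illustrative. Exact output formatting (for example 0 versus 0.0) may vary between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': ['one', 'one', 'two', 'two', 'one'],
                   'B': ['Ar', 'Br', 'Cr', 'Ar', 'Ar'],
                   'C': [1, 0, 0, 1, 0]})

# pivot_table averages C within each (A, B) group; fill_value replaces
# the combinations that never occur in the data.
by_pivot = df.pivot_table(index='A', columns='B', values='C',
                          aggfunc='mean', fill_value=0)

# crosstab also accepts values/aggfunc, but it has no fill_value argument,
# so the missing combinations are filled in afterwards.
by_crosstab = pd.crosstab(df['A'], df['B'], values=df['C'],
                          aggfunc='mean').fillna(0)

print(by_pivot)
print(by_crosstab)
```

Passing aggfunc='mean' explicitly is redundant for pivot_table, but it makes the intent visible and keeps the two calls symmetric.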