pandas: Python pivot table of counts
Original question: http://stackoverflow.com/questions/47320563/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
python pivot table of counts
Asked by Symphony
I have a dataframe df as follows:
| id | movie | value |
|----|-------|-------|
| 1 | a | 0 |
| 2 | a | 0 |
| 3 | a | 20 |
| 4 | a | 0 |
| 5 | a | 10 |
| 6 | a | 0 |
| 7 | a | 20 |
| 8 | b | 0 |
| 9 | b | 0 |
| 10 | b | 30 |
| 11 | b | 30 |
| 12 | b | 30 |
| 13 | b | 10 |
| 14 | c | 40 |
| 15 | c | 40 |
I want to create a pivot table of counts, with one row per value and one column per movie, as follows:
| Value | count(a) | count(b) | count(c) |
|-------|----------|----------|-------------|
| 0 | 4 | 2 | 0 |
| 10 | 1 | 1 | 0 |
| 20 | 2 | 0 | 0 |
| 30 | 0 | 3 | 0 |
| 40 | 0 | 0 | 2 |
I can do this very easily in Excel using Row and Column Labels. How can I do this using Python?
Answered by YOBEN_S
Use pd.crosstab:
pd.crosstab(df['value'],df['movie'])
Out[24]:
movie  a  b  c
value
0      4  2  0
10     1  1  0
20     2  0  0
30     0  3  0
40     0  0  2
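For anyone who wants to run the one-liner above end to end, here is a minimal sketch that rebuilds the question's data by hand. The variable names `counts` and `labelled` are illustrative, and the renaming step is only one possible way to get the `count(a)`-style headers the question asks for.

```python
import pandas as pd

# Rebuild the dataframe from the question (15 rows, three movies).
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": ["a"] * 7 + ["b"] * 6 + ["c"] * 2,
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

# Rows come from the first argument, columns from the second; each cell
# holds the number of rows with that (value, movie) combination.
counts = pd.crosstab(df["value"], df["movie"])
print(counts)

# Optional: relabel the columns to match the headers in the question.
labelled = counts.rename(columns=lambda m: f"count({m})")
print(labelled)
```

Combinations that never occur (for example value 30 with movie a) already come back as 0, so no fillna step is needed with this approach.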
Answered by edesz
It can be done this way with Pandas' basic pivot_table functionality and aggregate functions (you also need to import NumPy). See the answer in this question and the Pandas pivot_table documentation with examples:
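import numpy as np
df = ...
# np.count_nonzero counts the nonzero ids in each (value, movie) cell,
# which equals the row count here because no id is 0.
ndf = df.pivot_table(index=['value'],
                     columns='movie',
                     aggfunc=np.count_nonzero).reset_index().fillna(0).astype(int)
print(ndf)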
      value id
movie        a  b  c
0         0  4  2  0
1        10  1  1  0
2        20  2  0  0
3        30  0  3  0
4        40  0  0  2
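One caveat with np.count_nonzero is that a row whose id happened to be 0 would not be counted. The sketch below is a variant, not the answer's exact method: it assumes the question's data rebuilt by hand, names the `values` column explicitly, and swaps in `aggfunc="count"`, which counts non-null entries. Flattening the header at the end is an illustrative extra.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": ["a"] * 7 + ["b"] * 6 + ["c"] * 2,
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

# Count the non-null ids in each (value, movie) cell. Passing `values`
# as a single column name keeps the header to one level.
ndf = (
    df.pivot_table(index="value", columns="movie", values="id", aggfunc="count")
      .fillna(0)      # combinations that never occur come back as NaN
      .astype(int)
      .reset_index()  # turn `value` back into an ordinary column
)
ndf.columns.name = None  # drop the leftover "movie" label on the header
print(ndf)
```

The result has plain columns value, a, b and c, which may be easier to export or merge than the two-level header shown above.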
Answered by Scott Boston
Since you are familiar with pivot tables in Excel, I'll give you the Pandas pivot_table method also:
df.pivot_table('id','value','movie',aggfunc='count').fillna(0).astype(int)
Output:
movie  a  b  c
value
0      4  2  0
10     1  1  0
20     2  0  0
30     0  3  0
40     0  0  2
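As a quick sanity check, the sketch below rebuilds the question's data, spells out the three positional arguments of the pivot_table call as keywords (values, index, columns), and compares the result with pd.crosstab. The names `pivoted`, `crossed` and `same` are illustrative only.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": ["a"] * 7 + ["b"] * 6 + ["c"] * 2,
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

# Keyword form of df.pivot_table('id', 'value', 'movie', aggfunc='count').
pivoted = (
    df.pivot_table(values="id", index="value", columns="movie", aggfunc="count")
      .fillna(0)   # empty cells are NaN after pivoting
      .astype(int)
)

# pd.crosstab counts the same (value, movie) pairs directly.
crossed = pd.crosstab(df["value"], df["movie"])

# Element-wise comparison, reduced over both axes to a single flag.
same = bool((pivoted == crossed).all().all())
print(same)
```

Both tables share the same row and column labels, so the comparison is cell for cell.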