Pandas pivot_table preserve order
原文地址: http://stackoverflow.com/questions/44988213/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by Rahul Ranjan
>>> df
     A    B      C  D
0  foo  one  small  1
1  foo  one  large  2
2  foo  one  large  2
3  foo  two  small  3
4  foo  two  small  3
5  bar  one  large  4
6  bar  one  small  5
7  bar  two  small  6
8  bar  two  large  7
>>> table = pivot_table(df, values='D', index=['A', 'B'],
...                     columns=['C'], aggfunc=np.sum)
>>> table
          small  large
foo  one  1      4
     two  6      NaN
bar  one  5      4
     two  6      7
I want the output to look like the table above, but what I actually get is sorted: bar comes above foo, and so on.
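The behaviour is easy to reproduce. The sketch below rebuilds the frame from the question; as assumptions not in the original post, it imports pandas as pd, calls pd.pivot_table explicitly, and passes aggfunc="sum" in place of np.sum.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")

# With the default settings both axes come back sorted by label.
row_order = list(table.index.get_level_values("A").unique())
col_order = list(table.columns)
print(row_order)  # ['bar', 'foo']
print(col_order)  # ['large', 'small']
```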
Accepted answer by student
When pivot_table builds the result, it sorts the index alphabetically. This applies not only to foo and bar: the column labels small and large are sorted as well. If you want foo on top, you need to sort the result again with sortlevel. To get the output shown in the question, you need to sort on both A and C.
table.sortlevel(["A","B"], ascending= [False,True], sort_remaining=False, inplace=True)
table.sortlevel(["C"], axis=1, ascending=False, sort_remaining=False, inplace=True)
print(table)
Output:
C          small  large
A   B
foo one      1.0    4.0
    two      6.0    NaN
bar one      5.0    4.0
    two      6.0    7.0
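DataFrame.sortlevel was deprecated in pandas 0.20 and is not available in current releases, so the calls above fail on a recent installation. A minimal sketch of the same re-sorting with sort_index follows; the DataFrame construction and the aggfunc="sum" spelling are assumptions added to keep the example self-contained.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")

# Level A descending puts foo above bar; level B stays ascending.
table = table.sort_index(level=["A", "B"], ascending=[False, True])
# Column labels descending puts small before large.
table = table.sort_index(axis=1, ascending=False)
print(table)
```

As with the original answer, this works only because the desired order happens to be reverse alphabetical for A and C; it does not preserve an arbitrary input order.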
Update:
To remove the index names A, B and C:
table.columns.name = None
table.index.names = (None, None)
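Because pivot_table sorts its keys, one alternative to re-sorting afterwards is to make the sort order itself match the order of appearance. The following is a sketch of that idea and is not part of the original answer: it converts A and C to categoricals whose categories are listed in order of first appearance (via unique()), and it assumes a pandas version in which pivot_table accepts the observed argument.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Categories listed in order of first appearance define the sort order.
for col in ["A", "C"]:
    df[col] = pd.Categorical(df[col], categories=df[col].unique())

# Sorting now follows category order (foo, bar and small, large).
table = pd.pivot_table(df, values="D", index=["A", "B"], columns="C",
                       aggfunc="sum", observed=True)
print(table)
```

Column B is left as plain strings here, so it is still sorted alphabetically, which already matches the order wanted in the question.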
Answer by ayhan
I think pivot_table doesn't have an option to control sorting, but groupby does:
df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().unstack('C')
Out:
C          small  large
A   B
foo one      1.0    4.0
    two      6.0    NaN
bar one      5.0    4.0
    two      6.0    7.0
You pass the grouping columns to groupby, and for the ones you want to show as column values, you use unstack.
If you don't want the index names, rename them to None:
df.groupby(['A', 'B', 'C'], sort=False)['D'].sum().rename_axis([None, None, None]).unstack(level=2)
Out:
         small  large
foo one    1.0    4.0
    two    6.0    NaN
bar one    5.0    4.0
    two    6.0    7.0
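Newer pandas releases (1.3 and later) do give pivot_table a sort parameter. Here is a hedged sketch, assuming such a version: sort=False keeps the row groups in order of appearance, and the column order is set explicitly with reindex so that the example does not depend on how unsorted column labels are arranged. The DataFrame construction and aggfunc="sum" are assumptions added to make the example self-contained.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# sort=False (pandas >= 1.3) asks pivot_table not to sort the group keys.
table = pd.pivot_table(df, values="D", index=["A", "B"], columns="C",
                       aggfunc="sum", sort=False)

# Make the column order explicit: labels of C in order of first appearance.
table = table.reindex(columns=df["C"].unique())
print(table)
```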