How to sort a pivot table in Pandas

Question

Asked by Cheng

Here is the code:

import pandas as pd

test = pd.DataFrame({'country':['us','ca','ru','cn','ru','cn','us','ca','ru','cn','us','ca','ru','cn','us','ca'], 'month':[5,6,7,5,6,7,5,5,6,7,5,6,6,5,5,6], 'id':[x for x in range(16)]})
p = test.pivot_table(index=['month', 'country'], aggfunc='count')[['id']]

The output looks like this:

I'd like to sort the table by the id column, so that the largest number appears on top, like this:

               id
month country    
5     us        4
      cn        2
      ca        1

Answer 1

Accepted answer by jezrael

You need DataFrame.reset_index, DataFrame.sort_values and DataFrame.set_index:

# Parentheses let the method chain span several lines without a syntax error.
p1 = (p.reset_index()
       .sort_values(['month', 'id'], ascending=[True, False])
       .set_index(['month', 'country']))
print(p1)
               id
month country    
5     us        4
      cn        2
      ca        1
6     ca        3
      ru        3
7     cn        2
      ru        1

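The round trip through reset_index is needed because this more direct solution does not work: sort_values on id reorders all rows and ignores the month grouping :(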

p1 = p.sort_index(level='month', sort_remaining=True) \
      .sort_values('id', ascending=False)
print (p1)
               id
month country    
5     us        4
6     ca        3
      ru        3
5     cn        2
7     cn        2
5     ca        1
7     ru        1
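
As an aside, on newer pandas releases (0.23 and later, as far as I know) sort_values accepts index level names in by, so the reset_index / set_index round trip may not be required. The following is a minimal sketch under that assumption, not part of the original answer; it rebuilds the question's data so it can run on its own, and passes values='id' explicitly instead of selecting [['id']] afterwards.

```python
import pandas as pd

test = pd.DataFrame({
    'country': ['us', 'ca', 'ru', 'cn', 'ru', 'cn', 'us', 'ca',
                'ru', 'cn', 'us', 'ca', 'ru', 'cn', 'us', 'ca'],
    'month': [5, 6, 7, 5, 6, 7, 5, 5, 6, 7, 5, 6, 6, 5, 5, 6],
    'id': list(range(16)),
})
p = test.pivot_table(index=['month', 'country'], values='id', aggfunc='count')

# 'month' is an index level and 'id' is a column; both are passed to `by`.
# Assumption: the installed pandas supports index level names in `by`.
p1 = p.sort_values(['month', 'id'], ascending=[True, False])
print(p1)
```

If the installed version rejects 'month' as a sort key, the reset_index approach above remains the fallback.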

Answer 2

Answer by piRSquared

Option 1
This sorts by id within groups defined by the month level within the index.

p.groupby(
    level='month', group_keys=False
).apply(pd.DataFrame.sort_values, by='id', ascending=False)

               id
month country    
5     us        4
      cn        2
      ca        1
6     ca        3
      ru        3
7     cn        2
      ru        1

Option 2
This first sorts the entire dataframe by id, then sorts again by the month level within the index. However, I had to use sort_remaining=False so that the second sort does not also re-sort by the remaining country level, and kind='mergesort' because mergesort is a stable sort and won't disturb the pre-existing order within groups defined by the 'month' level.

p.sort_values('id', ascending=False) \
 .sort_index(level='month', sort_remaining=False, kind='mergesort')

               id
month country    
5     us        4
      cn        2
      ca        1
6     ca        3
      ru        3
7     cn        2
      ru        1
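
To see why the stable sort matters here, the two-pass idea can be sketched with plain NumPy arrays. This is only an illustration under assumed inputs: the ids and months arrays below are copied by hand from the pivot table's rows in index order, and the variable names are hypothetical.

```python
import numpy as np

# Counts and months of the pivot table rows, in their original index order.
ids = np.array([1, 2, 4, 3, 3, 2, 1])
months = np.array([5, 5, 5, 6, 6, 7, 7])

# Pass 1: order all rows by id, descending.
order = np.argsort(-ids, kind='stable')

# Pass 2: reorder by month with a stable sort, so rows sharing a month
# keep the descending-id order established in pass 1.
order = order[np.argsort(months[order], kind='stable')]

print(ids[order])
print(months[order])
```

With an unstable algorithm in the second pass, rows within the same month would not be guaranteed to keep their descending id order.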

Option 3
This uses numpy's lexsort. It works, but I don't like it because it depends on id being numeric, so that I can put a negative sign in front of it to get descending order. /shrug

import numpy as np
p.iloc[np.lexsort([-p.id.values, p.index.get_level_values('month')])]

               id
month country    
5     us        4
      cn        2
      ca        1
6     ca        3
      ru        3
7     cn        2
      ru        1
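
One possible way around the numeric-only limitation, offered as a sketch rather than as part of the original answer: pd.factorize with sort=True maps any sortable key to integer codes, and those codes can be negated. The example below assumes we want month ascending and country (a string key) descending; the data is rebuilt from the question so the block runs on its own.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'country': ['us', 'ca', 'ru', 'cn', 'ru', 'cn', 'us', 'ca',
                'ru', 'cn', 'us', 'ca', 'ru', 'cn', 'us', 'ca'],
    'month': [5, 6, 7, 5, 6, 7, 5, 5, 6, 7, 5, 6, 6, 5, 5, 6],
    'id': list(range(16)),
})
p = test.pivot_table(index=['month', 'country'], values='id', aggfunc='count')

months = p.index.get_level_values('month').to_numpy()
countries = p.index.get_level_values('country')

# sort=True makes the integer codes follow the sorted order of the labels,
# so negating the codes yields a descending order for a non-numeric key.
codes, uniques = pd.factorize(countries, sort=True)

# np.lexsort treats the LAST key as the primary one: month first, then country.
order = np.lexsort([-codes, months])
p_sorted = p.iloc[order]
print(p_sorted)
```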
