Python: How to move pandas data from index to column after multiple groupby

Question

Asked by prooffreader

I have the following pandas dataframe:

dfalph.head()

       token  year  uses  books
386  xanthos  1830     3      3
387  xanthos  1840     1      1
388  xanthos  1840     2      2
389  xanthos  1868     2      2
390  xanthos  1875     1      1

I aggregate the rows with duplicate token and year values like so:

dfalph = dfalph[['token','year','uses','books']].groupby(['token', 'year']).agg([np.sum])
dfalph.columns = dfalph.columns.droplevel(1)
dfalph.head()

               uses  books
token    year       
xanthos  1830    3     3
         1840    3     3
         1867    2     2
         1868    2     2
         1875    1     1
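
For readers who want to reproduce the aggregation step, here is a minimal sketch. The rows are hypothetical, modeled on the dfalph.head() output above, and the string "sum" is used in place of np.sum, which newer pandas versions prefer.

```python
import pandas as pd

# Hypothetical rows modeled on the question's dfalph.head()
dfalph = pd.DataFrame({
    "token": ["xanthos"] * 5,
    "year": [1830, 1840, 1840, 1868, 1875],
    "uses": [3, 1, 2, 2, 1],
    "books": [3, 1, 2, 2, 1],
})

# Passing a list to agg() produces two-level column labels such as ('uses', 'sum')
summed = dfalph.groupby(["token", "year"])[["uses", "books"]].agg(["sum"])
print(summed.columns.tolist())

# Dropping the inner level leaves plain 'uses' and 'books' column labels
summed.columns = summed.columns.droplevel(1)

# The grouping keys now live in the index, not in the columns
print(list(summed.index.names))
print(summed.columns.tolist())
```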

Instead of having the 'token' and 'year' fields in the index, I would like to return them to columns and have an integer index.

Answer 1

Accepted answer by DSM

Method #1: reset_index()

>>> g
              uses  books
               sum    sum
token   year             
xanthos 1830     3      3
        1840     3      3
        1868     2      2
        1875     1      1

[4 rows x 2 columns]
>>> g = g.reset_index()
>>> g
     token  year  uses  books
                   sum    sum
0  xanthos  1830     3      3
1  xanthos  1840     3      3
2  xanthos  1868     2      2
3  xanthos  1875     1      1

[4 rows x 4 columns]
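
Method #1 can be tried end to end with a small self-contained sketch. The data is hypothetical (modeled on the question), and .sum() is called directly so the columns stay single-level, unlike the two-level header shown in the session above.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's dfalph
dfalph = pd.DataFrame({
    "token": ["xanthos"] * 5,
    "year": [1830, 1840, 1840, 1868, 1875],
    "uses": [3, 1, 2, 2, 1],
    "books": [3, 1, 2, 2, 1],
})

# Grouping on two keys yields a MultiIndex of (token, year)
g = dfalph.groupby(["token", "year"])[["uses", "books"]].sum()

# reset_index() moves both index levels back into ordinary columns
# and replaces the index with a default integer range
flat = g.reset_index()
print(flat)
```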

Method #2: don't make the index in the first place, using as_index=False

>>> g = dfalph[['token', 'year', 'uses', 'books']].groupby(['token', 'year'], as_index=False).sum()
>>> g
     token  year  uses  books
0  xanthos  1830     3      3
1  xanthos  1840     3      3
2  xanthos  1868     2      2
3  xanthos  1875     1      1

[4 rows x 4 columns]
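
As a quick check that the two methods agree when the grouping keys are plain column names, the following sketch builds both results from hypothetical data modeled on the question and compares them.

```python
import pandas as pd

# Hypothetical sample data modeled on the question's dfalph
dfalph = pd.DataFrame({
    "token": ["xanthos"] * 5,
    "year": [1830, 1840, 1840, 1868, 1875],
    "uses": [3, 1, 2, 2, 1],
    "books": [3, 1, 2, 2, 1],
})

# Method #2: keep the grouping keys as columns from the start
by_flag = dfalph.groupby(["token", "year"], as_index=False)[["uses", "books"]].sum()

# Method #1: build the MultiIndex, then move it back into columns
by_reset = dfalph.groupby(["token", "year"])[["uses", "books"]].sum().reset_index()

print(by_flag)
print(by_reset)
```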

Answer 2

Answered by Adarsh Madrecha

I differ from the accepted answer. While there are two ways to do this, they will not necessarily produce the same output, especially when you are using Grouper in groupby:

as_index=False
reset_index()

Example df:

+---------+---------+-------------+------------+
| column1 | column2 | column_date | column_sum |
+---------+---------+-------------+------------+
| A       | M       | 26-10-2018  |          2 |
| B       | M       | 28-10-2018  |          3 |
| A       | M       | 30-10-2018  |          6 |
| B       | M       | 01-11-2018  |          3 |
| C       | N       | 03-11-2018  |          4 |
+---------+---------+-------------+------------+

They do not work the same way.

df = df.groupby(
    by=[
        'column1',
        'column2',
        pd.Grouper(key='column_date', freq='M')
    ],
    as_index=False
).sum()

The above will give:

+---------+---------+------------+
| column1 | column2 | column_sum |
+---------+---------+------------+
| A       | M       |          8 |
| B       | M       |          3 |
| B       | M       |          3 |
| C       | N       |          4 |
+---------+---------+------------+

whereas

df = df.groupby(
    by=[
        'column1',
        'column2',
        pd.Grouper(key='column_date', freq='M')
    ]
).sum().reset_index()

will give:

+---------+---------+-------------+------------+
| column1 | column2 | column_date | column_sum |
+---------+---------+-------------+------------+
| A       | M       | 31-10-2018  |          8 |
| B       | M       | 31-10-2018  |          3 |
| B       | M       | 30-11-2018  |          3 |
| C       | N       | 30-11-2018  |          4 |
+---------+---------+-------------+------------+
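
The comparison can be reproduced with the sketch below, under two assumptions that are not in the answer: the dates are parsed as day-month-year, and the "MS" (month start) frequency is used instead of "M" because the month-end alias has changed across pandas versions. The groups are therefore labeled with the first day of each month, not the last. Whether as_index=False keeps the Grouper column depends on the pandas version, so the sketch only reports it.

```python
import pandas as pd

# Example df from the answer, with dates parsed as day-month-year (an assumption)
df = pd.DataFrame({
    "column1": ["A", "B", "A", "B", "C"],
    "column2": ["M", "M", "M", "M", "N"],
    "column_date": pd.to_datetime(
        ["26-10-2018", "28-10-2018", "30-10-2018", "01-11-2018", "03-11-2018"],
        format="%d-%m-%Y",
    ),
    "column_sum": [2, 3, 6, 3, 4],
})

def make_keys():
    # Hypothetical helper: a fresh Grouper per call; "MS" buckets rows by calendar month
    return ["column1", "column2", pd.Grouper(key="column_date", freq="MS")]

# reset_index() after grouping: the month bucket comes back as a column
via_reset = df.groupby(make_keys())[["column_sum"]].sum().reset_index()

# as_index=False: the Grouper column may or may not be kept, depending on version
via_flag = df.groupby(make_keys(), as_index=False)[["column_sum"]].sum()

print(via_reset)
print("column_date kept by as_index=False:", "column_date" in via_flag.columns)
```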

Answer 3

Answered by user1809802

You need to add drop=True:

df.reset_index(drop=True)

df = df.groupby(
    by=[
        'column1',
        'column2',
        pd.Grouper(key='column_date', freq='M')
    ]
).sum().reset_index(drop=True)
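
One thing worth checking before using drop=True: it discards the index levels instead of moving them into columns, so the grouping keys do not appear in the result. A minimal sketch with hypothetical data shows the difference.

```python
import pandas as pd

# Hypothetical data, small enough to inspect by eye
df = pd.DataFrame({
    "token": ["a", "a", "b"],
    "year": [1, 1, 2],
    "uses": [1, 2, 3],
})

g = df.groupby(["token", "year"]).sum()

# Default: index levels become columns
kept = g.reset_index()

# drop=True: index levels are thrown away, only the aggregated column remains
dropped = g.reset_index(drop=True)

print(list(kept.columns))
print(list(dropped.columns))
```

If the goal is to keep token and year as columns, as in the question, the default reset_index() is the variant that does so.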
