pandas: group a DataFrame into a new DataFrame with an arange-like index

Question

Asked by Bas

I have a question, simplified in this example. Consider this pandas DataFrame, df_a:

import pandas as pd
df_a = pd.DataFrame([['1001', 34.3, 'red'], ['1001', 900.04, 'red'], ['1001', 776, 'red'], ['1003', 18.95, 'green'], ['1004', 321.2, 'blue']], columns=['id', 'amount', 'name'])

    id      amount  name
0   1001    34.30   red
1   1001    900.04  red
2   1001    776.00  red
3   1003    18.95   green
4   1004    321.20  blue

I would like to group this DataFrame by id, summing the amount into a new DataFrame, and get a new 'arange'-like index (0, 1, 2, ...). This is the result I am after:

    id      amount
0   1001    1710.34
1   1003    18.95
2   1004    321.20

But my attempts either produce a Series (I would like a DataFrame as the result):

df_a.groupby(['id'])['amount'].sum()

id
1001    1710.34
1003      18.95
1004     321.20
Name: amount, dtype: float64

or produce a DataFrame whose index is the id column:

pd.DataFrame(df_a.groupby(['id'])['amount'].sum())

        amount
id  
1001    1710.34
1003    18.95
1004    321.20

I've also tried passing the index parameter, but that doesn't work either:

pd.DataFrame(df_a.groupby(['id'])['amount'].sum(),index=df_a.index.values)

   amount
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN

Does anyone have an elegant solution for this?

Answer 1

Accepted answer by Vaishali

groupby has an as_index parameter for exactly this:

df_a.groupby('id', as_index = False)['amount'].sum()

This gives:

     id   amount
0  1001  1710.34
1  1003    18.95
2  1004   321.20
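
For reference, here is a self-contained sketch of the as_index approach, rebuilding df_a from the question. The variable name result is only illustrative and does not come from the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question
df_a = pd.DataFrame(
    [['1001', 34.3, 'red'], ['1001', 900.04, 'red'], ['1001', 776, 'red'],
     ['1003', 18.95, 'green'], ['1004', 321.2, 'blue']],
    columns=['id', 'amount', 'name'],
)

# as_index=False keeps 'id' as an ordinary column, so the result
# is a DataFrame with a default 0..n-1 integer index
result = df_a.groupby('id', as_index=False)['amount'].sum()

print(result)
print(type(result).__name__)
print(list(result.index))
```

Because the grouping key stays a column, no separate reset_index() call should be needed afterwards.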

Answer 2

Answered by student

You can try the following, chaining to_frame() and reset_index():

new_df = df_a.groupby(['id'])['amount'].sum().to_frame('amount').reset_index()
print(new_df)

Result:

     id   amount
0  1001  1710.34
1  1003    18.95
2  1004   321.20

If you use only to_frame(), i.e.

df_a.groupby(['id'])['amount'].sum().to_frame('amount')

the result keeps id as the index, as follows:

      amount
id           
1001  1710.34
1003    18.95
1004   321.20

Another way is to call reset_index() on the DataFrame built in your code above:

new_df = pd.DataFrame(df_a.groupby(['id'])['amount'].sum()).reset_index()

The output is the same as above:

     id   amount
0  1001  1710.34
1  1003    18.95
2  1004   321.20
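
As a rough check, the variants discussed on this page can be compared directly. This is a minimal sketch that assumes df_a as defined in the question; the variable names (via_as_index and so on) are made up for illustration.

```python
import pandas as pd

# Rebuild the example frame from the question
df_a = pd.DataFrame(
    [['1001', 34.3, 'red'], ['1001', 900.04, 'red'], ['1001', 776, 'red'],
     ['1003', 18.95, 'green'], ['1004', 321.2, 'blue']],
    columns=['id', 'amount', 'name'],
)

# Variant 1: keep 'id' as a column during the groupby itself
via_as_index = df_a.groupby('id', as_index=False)['amount'].sum()

# Variant 2: Series -> DataFrame via to_frame(), then move 'id' out of the index
via_to_frame = df_a.groupby(['id'])['amount'].sum().to_frame('amount').reset_index()

# Variant 3: wrap the Series in the DataFrame constructor, then reset the index
via_constructor = pd.DataFrame(df_a.groupby(['id'])['amount'].sum()).reset_index()

# DataFrame.equals compares values, dtypes, column labels and index
print(via_as_index.equals(via_to_frame))
print(via_to_frame.equals(via_constructor))
```

If both comparisons print True, the choice between the variants is mostly a matter of taste; as_index=False is the shortest.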
