Group by a Dictionary in Pandas DataFrame
Original question: http://stackoverflow.com/questions/41185560/
Notice: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Asked by tonyibm
I read data from an Excel file; it looks like this:
I want to get the sum of Q1 and Q2 for each employee, so I created a dictionary that maps each month to its quarter and want to group the columns by it:
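import pandas as pd
# Raw string, so the backslashes in the Windows path are not read as escape sequences
# sheet_name replaces the older sheetname keyword in current pandas releases
mb2 = pd.read_excel(r'C:\Users\IBM_ADMIN\Desktop\ml-1m\工作簿1.xlsx', sheet_name='Sheet3')
mapping = {'Jan':'Q1','Feb':'Q1','Mar':'Q1','Apr':'Q2','May':'Q2','Jun':'Q2'}
mb2.groupby(by=mapping,axis=1).sum()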
How can I also show the Name field in the result? I tried the following, but it didn't work:
mb2.groupby(by=(mapping,'Name'),axis=1).sum()
Accepted answer by Blackecho
A quick solution is to save the resulting DataFrame in a new variable (grouped_data in the following code) and then assign the Name column to it from the original DataFrame:
grouped_data = mb2.groupby(by=mapping,axis=1).sum()
grouped_data['Name'] = mb2['Name']
print(grouped_data)
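The approach above can be tried without the Excel file. The following is a minimal sketch, assuming a small made-up table (the names and monthly figures are hypothetical, not the asker's data). Because groupby(axis=1) is deprecated in recent pandas releases, the sketch groups the transposed month columns instead, which should give the same sums.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet
mb2 = pd.DataFrame({
    'Name': ['Tony', 'Rafal'],
    'Jan': [1, 10], 'Feb': [2, 20], 'Mar': [3, 30],
    'Apr': [4, 40], 'May': [5, 50], 'Jun': [6, 60],
})
mapping = {'Jan': 'Q1', 'Feb': 'Q1', 'Mar': 'Q1',
           'Apr': 'Q2', 'May': 'Q2', 'Jun': 'Q2'}

# Keep only the month columns, transpose so the months become the index,
# group that index through the mapping, sum, and transpose back
grouped_data = mb2[list(mapping)].T.groupby(mapping).sum().T

# Attach the Name column from the original DataFrame (rows align on the index)
grouped_data['Name'] = mb2['Name']
print(grouped_data)
```

Name ends up as the last column here; reorder with grouped_data[['Name', 'Q1', 'Q2']] if it should come first.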
EDIT: Another way of achieving the same result is to modify the mapping dictionary, adding a "Name": "Name" pair to it, and calling mb2.groupby(by=mapping,axis=1).sum() as before.
Answer by jezrael
You can use set_index first, then groupby, and finally reset_index:
df = mb2.set_index('Name').groupby(by=mapping,axis=1).sum().reset_index()
print(df)
      Name    Q1    Q2
0     Tony  1367  1835
1    Rafal  1405  2574
2     John  1550  1409
3  Katrina  1555  1509
4    Kathy  1583  1491
5      Tom  1344  1874
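For readers on newer pandas, where the axis=1 argument of groupby is deprecated, the same set_index / groupby / reset_index chain can be written with a transpose on each side of the groupby. This is a sketch under that assumption, using hypothetical figures rather than the data shown above.

```python
import pandas as pd

# Hypothetical sample data; the real values come from the asker's Excel sheet
mb2 = pd.DataFrame({
    'Name': ['Tony', 'Rafal'],
    'Jan': [1, 10], 'Feb': [2, 20], 'Mar': [3, 30],
    'Apr': [4, 40], 'May': [5, 50], 'Jun': [6, 60],
})
mapping = {'Jan': 'Q1', 'Feb': 'Q1', 'Mar': 'Q1',
           'Apr': 'Q2', 'May': 'Q2', 'Jun': 'Q2'}

df = (
    mb2.set_index('Name')   # move Name out of the columns being summed
       .T                   # months become the row index
       .groupby(mapping)    # map each month label to its quarter
       .sum()
       .T                   # back to one row per employee
       .reset_index()       # restore Name as an ordinary column
)
print(df)
```

Setting Name as the index first keeps it out of the summation entirely, so nothing has to be re-attached afterwards.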