pandas groupby: get the max of one column and the min of another

Question

Asked by lhay86

I have a dataframe as follows:

user    num1    num2
a       1       1
a       2       2
a       3       3
b       4       4
b       5       5

I want a dataframe which has the minimum from num1 for each user, and the maximum of num2 for each user.

The output should be like:

user    num1    num2
a       1       3
b       4       5

I know that if I wanted the max of both columns I could just do:

a.groupby('user')[['num1', 'num2']].max()

Is there some equivalent without having to do something like:

series_1 = a.groupby('user')['num1'].min() 
series_2 = a.groupby('user')['num2'].max()

# converting from series to df so I can do a join on user
df_1 = pd.DataFrame(np.array([series_1]).transpose(), index=series_1.index, columns=['num1']) 
df_2 = pd.DataFrame(np.array([series_2]).transpose(), index=series_2.index, columns=['num2'])

df_1.join(df_2)

Answer 1

Answered by jezrael

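Use groupby + agg with a dict; the columns may then need to be put in order, either by selecting a subset or with reindex_axis (removed in pandas 1.0, where reindex(columns=...) does the same job). Finally, add reset_index to convert the index to a column if necessary.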

df = a.groupby('user').agg({'num1':'min', 'num2':'max'})[['num1','num2']].reset_index()
print (df)
  user  num1  num2
0    a     1     3
1    b     4     5

This is the same as:

df = (a.groupby('user')
       .agg({'num1':'min', 'num2':'max'})
       .reindex(columns=['num1','num2'])  # reindex_axis was removed in pandas 1.0
       .reset_index())
print (df)
  user  num1  num2
0    a     1     3
1    b     4     5
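
As a self-contained sketch of the same idea, the snippet below rebuilds the sample frame from the question and uses named aggregation instead of a dict. This is not part of the original answer: it assumes pandas 0.25 or newer, where agg accepts output_name=(column, function) keyword pairs, and the variable name out is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
a = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b"],
    "num1": [1, 2, 3, 4, 5],
    "num2": [1, 2, 3, 4, 5],
})

# Named aggregation (assumes pandas >= 0.25): each keyword is an output
# column, each value is a (source column, aggregation function) pair.
out = (
    a.groupby("user")
     .agg(num1=("num1", "min"), num2=("num2", "max"))
     .reset_index()  # turn the 'user' index back into a regular column
)
print(out)
```

The output columns follow the order of the keyword arguments, so no separate column-reordering step should be needed with this form.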
