Python: How to group by in a Pandas DataFrame and keep columns

Question

Asked by Adrian Ribao

Given a DataFrame that logs uses of some books like this:

Name   Type   ID
Book1  ebook  1
Book2  paper  2
Book3  paper  3
Book1  ebook  1
Book2  paper  2

I need to count how many times each book appears, keeping the other columns, to get this:

Name   Type   ID    Count
Book1  ebook  1     2
Book2  paper  2     2
Book3  paper  3     1

How can this be done?

Thanks!

Answer 1

Accepted answer by EdChum

You want the following:

In [20]:
df.groupby(['Name','Type','ID']).size().reset_index(name='Count')

Out[20]:
    Name   Type  ID  Count
0  Book1  ebook   1      2
1  Book2  paper   2      2
2  Book3  paper   3      1

In your case the 'Name', 'Type' and 'ID' columns have matching values, so we can groupby on these, call size to count the rows in each group, and then reset_index to turn the group keys back into columns. (count() would produce no 'Count' column here: it only counts values in the non-key columns, and every column is a grouping key.)
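As a self-contained check of the group-and-count step, here is a minimal sketch that rebuilds the sample data from the question and counts the rows in each (Name, Type, ID) group. The variable names `df` and `counts` are only illustrative. It uses size() rather than count(), since with every column acting as a grouping key there is no remaining column for count() to tally.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

# size() counts rows per group; reset_index(name=...) turns the group keys
# back into ordinary columns and names the new column 'Count'.
counts = df.groupby(["Name", "Type", "ID"]).size().reset_index(name="Count")
print(counts)
```

Because groupby sorts the keys by default, the rows come back in key order rather than in order of first appearance.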

An alternative approach would be to add the 'Count' column using transform and then call drop_duplicates:

In [25]:
df['Count'] = df.groupby(['Name'])['ID'].transform('count')
df.drop_duplicates()

Out[25]:
    Name   Type  ID  Count
0  Book1  ebook   1      2
1  Book2  paper   2      2
2  Book3  paper   3      1
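The transform variant can be tried end to end with the sketch below, again assuming the sample data from the question. It works on a copy (`out`, an illustrative name) so the original frame is left unchanged, and it relies on each 'Name' always carrying the same 'Type' and 'ID'; otherwise drop_duplicates would keep more than one row per book.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

out = df.copy()

# transform('count') returns one value per original row, so the result
# lines up with the frame and can be assigned as a new column.
out["Count"] = out.groupby("Name")["ID"].transform("count")

# Repeated log entries are now identical rows; keep the first of each.
out = out.drop_duplicates().reset_index(drop=True)
print(out)
```

Unlike the groupby-and-size route, this keeps the rows in order of first appearance.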

Answer 2

Answer by jpobst

I think as_index=False should do the trick.

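# count() would return only the key columns here; size() counts the rows per group
# (with as_index=False the new column is named 'size' in pandas 1.1 or later)
df.groupby(['Name','Type','ID'], as_index=False).size()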
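A runnable sketch of the as_index=False idea, assuming pandas 1.1 or later, where size() on such a groupby returns a DataFrame whose count column is named 'size'. The rename to 'Count' is added here only to match the output asked for in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

# as_index=False keeps the group keys as columns, so no reset_index is needed.
counts = df.groupby(["Name", "Type", "ID"], as_index=False).size()

# Rename the automatically named 'size' column to 'Count'.
counts = counts.rename(columns={"size": "Count"})
print(counts)
```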

Answer 3

Answer by NeStack

If you have many columns in a df, it makes sense to use df.groupby(['foo']).agg(...). The .agg() function lets you choose what to do with the columns you don't want to apply operations on. If you just want to keep them, use .agg({'col1': 'first', 'col2': 'first', ...}). Instead of 'first', you can also apply 'sum', 'mean' and others.
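Applied to the book log from the question, the .agg() idea might look like the sketch below. It uses pandas named aggregation (output_name=(column, function)), which is a variant of the dict form above and is assumed here because it allows the same 'ID' column to be both kept with 'first' and counted. The name `result` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Book1", "Book2", "Book3", "Book1", "Book2"],
    "Type": ["ebook", "paper", "paper", "ebook", "paper"],
    "ID": [1, 2, 3, 1, 2],
})

result = df.groupby("Name", as_index=False).agg(
    Type=("Type", "first"),   # keep the first 'Type' seen for each book
    ID=("ID", "first"),       # keep the first 'ID' seen for each book
    Count=("ID", "count"),    # number of non-null 'ID' entries per book
)
print(result)
```

Using 'first' is only safe when the kept columns are constant within each group, as they are in this data.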
