pandas: counting duplicate rows in a DataFrame

Question

Asked by Guforu

I have a DataFrame in pandas that looks like this:

Letters Numbers
A       1
A       3
A       2
A       1
B       1
B       2
B       3
C       2
C       2

I want to count how many times each identical row occurs and save the result in a third column. For example, the output I'm looking for is:

Letters Numbers Events
A       1       2
A       2       1
A       3       1
B       1       1
B       2       1
B       3       1
C       2       2

An example of what I'm looking to do is here. The best idea I've come up with is to use value_counts(), but I think that works on only one column. Another idea is to use duplicated(). Either way, I don't want to write a for loop; I'm pretty sure a Pythonic alternative exists.

Answer 1

Answered by joris

You can group by these two columns and then compute the size of each group:

In [16]: df.groupby(['Letters', 'Numbers']).size()
Out[16]: 
Letters  Numbers
A        1          2
         2          1
         3          1
B        1          1
         2          1
         3          1
C        2          2
dtype: int64

To get a DataFrame like in your example output, you can reset the index with reset_index.
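
As a minimal sketch of that last step, the snippet below rebuilds the sample data from the question and passes name="Events" to reset_index so the counts column gets the label used in the desired output. The variable names df and counts are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Letters": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "Numbers": [1, 3, 2, 1, 1, 2, 3, 2, 2],
})

# size() returns a Series indexed by (Letters, Numbers).
# reset_index(name=...) turns the index levels back into columns
# and labels the column holding the counts.
counts = df.groupby(["Letters", "Numbers"]).size().reset_index(name="Events")
print(counts)
```

The result has one row per distinct (Letters, Numbers) pair, sorted by the group keys.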

Answer 2

Answered by EdChum

You can use a combination of groupby, transform, and then drop_duplicates:

In [84]:

df['Events'] = df.groupby('Letters')['Numbers'].transform(pd.Series.value_counts)
df.drop_duplicates()
Out[84]:
  Letters  Numbers  Events
0       A        1       2
1       A        3       1
2       A        2       1
4       B        1       1
5       B        2       1
6       B        3       1
7       C        2       2
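
transform generally expects the function to return either a scalar or a result as long as the group, which value_counts does not guarantee, so the line above may not behave the same on every pandas version. A hedged alternative sketch (not EdChum's original code) groups on both columns and uses transform("size"), which writes each group's row count back onto every row of that group. The names df and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Letters": ["A", "A", "A", "A", "B", "B", "B", "C", "C"],
    "Numbers": [1, 3, 2, 1, 1, 2, 3, 2, 2],
})

# Count the rows in each (Letters, Numbers) group and broadcast
# that count onto every row belonging to the group.
df["Events"] = df.groupby(["Letters", "Numbers"])["Numbers"].transform("size")

# Identical rows now carry identical counts, so dropping duplicates
# leaves one row per pair, in the original row order.
result = df.drop_duplicates()
print(result)
```

Unlike the groupby(...).size() approach, this keeps the original index labels and row order rather than sorting by the group keys.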
