pandas: Select unique observations in a pandas data frame

Question

Asked by Michael

I have a pandas data frame with a column uniqueid. I would like to remove all duplicates from the data frame based on this column, so that every remaining observation has a unique uniqueid.

Answer 1

Answered by cwharland

There is also the drop_duplicates() method, available on any data frame (see the pandas documentation). Pass the column or columns that define a duplicate through the subset argument.

df.drop_duplicates(subset='uniqueid', inplace=True)
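As a rough sketch of what that call does, here is a small made-up frame. Only the column name uniqueid comes from the question; the value column and the data are hypothetical. The keep argument decides which of the repeated rows survives, and without inplace=True the method returns a new frame instead of modifying df.

```python
import pandas as pd

# Hypothetical data: only the column name `uniqueid` comes from the question.
df = pd.DataFrame({
    "uniqueid": [101, 102, 102, 103],
    "value": ["a", "b", "c", "d"],
})

# Keep the first row seen for each uniqueid (the default behaviour).
first = df.drop_duplicates(subset="uniqueid", keep="first")

# Keep the last row seen for each uniqueid instead.
last = df.drop_duplicates(subset="uniqueid", keep="last")

print(first)
print(last)
```

The original index labels are preserved in the result; chaining .reset_index(drop=True) renumbers the rows if that matters downstream.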

Answer 2

Answered by TomAugspurger

Use the duplicated method

Since we only care whether uniqueid (A in my example) is duplicated, select that column and call duplicated on the resulting series. Then use ~ to flip the booleans, so that True marks the rows to keep.

In [89]: import pandas as pd

In [90]: df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

In [91]: df
Out[91]: 
   A  B
0  a  1
1  b  2
2  b  3
3  c  4

In [92]: df['A'].duplicated()
Out[92]: 
0    False
1    False
2     True
3    False
Name: A, dtype: bool

In [93]: df.loc[~df['A'].duplicated()]
Out[93]: 
   A  B
0  a  1
1  b  2
3  c  4
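
The session above keeps the first row of each repeated value. If the goal is instead to discard every row whose key appears more than once, duplicated also accepts keep=False. The sketch below reuses the same example frame; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

# Default keep='first': marks only the second and later occurrences.
mask_first = df['A'].duplicated()

# keep=False: marks every row whose value in A occurs more than once.
mask_all = df['A'].duplicated(keep=False)

deduped = df.loc[~mask_first]        # same rows as df.drop_duplicates(subset='A')
strictly_unique = df.loc[~mask_all]  # drops both 'b' rows

print(deduped)
print(strictly_unique)
```

The boolean-mask form is handy when the mask has to be combined with other conditions before filtering; for plain deduplication, drop_duplicates(subset='A') selects the same rows.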
