pandas: removing duplicates from a Series in Python

Question

Asked by mathtick

Is there a function to enforce that the index is unique, or is it only possible to handle this in Python itself, for example by converting to a dict and back?

As noted in the comments below, pandas is a Python project built on NumPy/SciPy.

Converting with to_dict and back works, but I bet this gets slow when the Series gets big.

In [24]: a = pandas.Series([1,2,3], index=[1,1,2])

In [25]: a
Out[25]: 
1    1
1    2
2    3

In [26]: a = a.to_dict()

In [27]: a
Out[27]: {1: 2, 2: 3}

In [28]: a = pandas.Series(a)

In [29]: a
Out[29]: 
1    2
2    3

Answer 1

Accepted answer by root

Use groupby and last():

In [279]: s
Out[279]: 
a    1
b    2
b    3
b    4
e    5

In [280]: grouped = s.groupby(level=0)

In [281]: grouped.first()
Out[281]: 
a    1
b    2
e    5

In [282]: grouped.last()
Out[282]: 
a    1
b    4
e    5
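
For anyone who wants to run the session above as a script, here is a minimal self-contained sketch. The Series contents are copied from the output shown. The variable names `first_per_label` and `last_per_label` are only illustrative, and the `pd.Series(...)` constructor call is my assumption about how `s` was built.

```python
import pandas as pd

# Series with a repeated index label "b", matching the session above.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "b", "b", "e"])

# Group rows that share an index label (level 0 of the index).
grouped = s.groupby(level=0)

# Keep one value per label: either the first or the last occurrence.
first_per_label = grouped.first()
last_per_label = grouped.last()

print(first_per_label)
print(last_per_label)
```

Note that groupby sorts the group labels by default, so the result may come back in a different order than the original index when the labels were not already sorted.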

Answer 2

Answered by Wes McKinney

BTW, we plan on adding a drop_duplicates method to Series, like DataFrame.drop_duplicates, in the near future.
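
A note for readers on recent pandas versions: Series.drop_duplicates compares values, not index labels, so it does not by itself make the index unique. One way to check for and drop repeated index labels is sketched below, using Index.is_unique and Index.duplicated. This is not part of the answer above, the variable names are illustrative, and it assumes a pandas version that provides Index.duplicated with the `keep` argument.

```python
import pandas as pd

# Same data as in the question: label 1 appears twice.
s = pd.Series([1, 2, 3], index=[1, 1, 2])

# Check whether the index already has unique labels.
has_unique_index = s.index.is_unique

# Boolean mask that is True for every repeated label except its last occurrence.
dupe_mask = s.index.duplicated(keep="last")

# Keep only the rows that are not flagged, i.e. the last value per label.
deduped = s[~dupe_mask]

print(has_unique_index)
print(deduped)
```

With keep="last" this keeps the same values as the to_dict round trip in the question, while keep="first" would keep the first value for each label instead.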
