Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/12962705/

Time: 2020-09-13 20:27:15  Source: igfitidea

python pandas remove duplicates in series

Tags: python, pandas

Asked by mathtick

Is there a function to enforce that the index is unique, or is it only possible to handle this in Python itself, by converting to a dict and back or something like that?

As noted in the comments below: python pandas is a project built on numpy/scipy.

Converting with to_dict and back works, but I bet this gets slow when the Series gets big.

In [24]: a = pandas.Series([1,2,3], index=[1,1,2])

In [25]: a
Out[25]: 
1    1
1    2
2    3

In [26]: a = a.to_dict()

In [27]: a
Out[27]: {1: 2, 2: 3}

In [28]: a = pandas.Series(a)

In [29]: a
Out[29]: 
1    2
2    3
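
The round trip above can also be written as a standalone script. This is a minimal sketch, assuming pandas is imported as pd; the variable names are illustrative. Index.is_unique only reports whether labels repeat, it does not enforce uniqueness.

```python
import pandas as pd

# Same data as the session above: label 1 appears twice.
a = pd.Series([1, 2, 3], index=[1, 1, 2])

# Index.is_unique reports whether any index label is repeated.
has_unique_index = a.index.is_unique

# Round-tripping through a dict keeps the last value seen for each label,
# because later keys overwrite earlier ones while the dict is built.
deduped = pd.Series(a.to_dict())
```

After the round trip, deduped.index.is_unique is true, at the cost of building an intermediate Python dict.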

Accepted answer by root

Use groupby and first() or last()

In [279]: s
Out[279]: 
a    1
b    2
b    3
b    4
e    5

In [280]: grouped = s.groupby(level=0)

In [281]: grouped.first()
Out[281]: 
a    1
b    2
e    5

In [282]: grouped.last()
Out[282]: 
a    1
b    4
e    5
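
The session above does not show how s was built. A self-contained sketch follows, assuming s holds the values 1 to 5 under the labels shown. The mask on the final line is an alternative that is not part of this answer; it relies on Index.duplicated, which is available in reasonably recent pandas versions.

```python
import pandas as pd

# Assumed construction of s: label "b" appears three times.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "b", "b", "e"])

# Group rows by index label (level 0) and keep one value per label.
grouped = s.groupby(level=0)
first_per_label = grouped.first()  # first value seen for each label
last_per_label = grouped.last()    # last value seen for each label

# Alternative, not from the answer: Index.duplicated(keep="last") marks
# every repeated label except its last occurrence, so negating the mask
# keeps exactly one row per label.
last_via_mask = s[~s.index.duplicated(keep="last")]
```

Note that groupby sorts the result by label by default, while the mask keeps the rows in their original order.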

Answer by Wes McKinney

BTW, we plan on adding a drop_duplicates method to Series, like DataFrame.drop_duplicates, in the near future.

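
For readers on a pandas version that ships Series.drop_duplicates, the sketch below shows one caveat relevant to the question: the method compares values, not index labels, so on its own it does not make the index unique. The sample data is illustrative.

```python
import pandas as pd

# Duplicate labels but distinct values: nothing is dropped.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "b", "b", "e"])
by_value = s.drop_duplicates()

# Distinct labels but a repeated value: the second 7 is dropped.
t = pd.Series([7, 7, 8], index=["x", "y", "z"])
t_dedup = t.drop_duplicates()
```

Here by_value still has three rows labelled "b", so removing duplicate index labels still needs an approach like the groupby one in the accepted answer.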