Original source: http://stackoverflow.com/questions/12962705/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
python pandas remove duplicates in series
Asked by mathtick
Is there a function to enforce that the index is unique, or is it only possible to handle this in Python itself, by converting to a dict and back or something like that?
As noted in the comments below: python pandas is a project built on numpy/scipy.
Converting with to_dict and back works, but I bet this gets slow when the data gets big.
In [24]: a = pandas.Series([1,2,3], index=[1,1,2])
In [25]: a
Out[25]:
1 1
1 2
2 3
In [26]: a = a.to_dict()
In [27]: a
Out[27]: {1: 2, 2: 3}
In [28]: a = pandas.Series(a)
In [29]: a
Out[29]:
1 2
2 3
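The round trip above keeps the last value for each repeated label. A minimal sketch of getting the same result without leaving pandas, assuming a pandas version that provides Index.is_unique and Index.duplicated (neither is mentioned in the original post, and the variable names are only illustrative):

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=[1, 1, 2])

# Check whether the index contains repeated labels.
has_duplicates = not a.index.is_unique

# Keep only the last row for each label, mirroring what the
# to_dict round trip does (later values overwrite earlier ones).
deduped = a[~a.index.duplicated(keep="last")]

print(has_duplicates)
print(deduped.to_dict())
```

Passing keep="first" instead would retain the first value seen for each label.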
Accepted answer by root
Use groupby and last()
In [279]: s
Out[279]:
a 1
b 2
b 3
b 4
e 5
In [280]: grouped = s.groupby(level=0)
In [281]: grouped.first()
Out[281]:
a 1
b 2
e 5
In [282]: grouped.last()
Out[282]:
a 1
b 4
e 5
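For anyone who wants to reproduce the session above, here is a self-contained sketch. The Series is reconstructed from the printed output, so its construction is an assumption about how s was originally built:

```python
import pandas as pd

# Reconstructed from the output shown above; the label "b" is repeated.
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "b", "b", "e"])

# Group rows that share an index label (level=0 is the only index level here).
grouped = s.groupby(level=0)

first_per_label = grouped.first()  # first value seen for each label
last_per_label = grouped.last()    # last value seen for each label

print(first_per_label)
print(last_per_label)
```

Note that groupby sorts the group labels by default, so the result is ordered by label rather than by first appearance.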
Answer by Wes McKinney
BTW, we plan on adding a drop_duplicates method to Series, like DataFrame.drop_duplicates, in the near future.
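Later pandas releases do include Series.drop_duplicates. One caveat for this question: it compares values, not index labels, so it does not by itself make the index unique. A small sketch of the difference, using made-up data and assuming a pandas version that also has Index.duplicated:

```python
import pandas as pd

# Illustrative data: the value 1 is repeated, and so is the label "b".
s = pd.Series([1, 1, 2], index=["a", "b", "b"])

# drop_duplicates removes repeated values; the repeated label "b"
# can still survive if its values differ from earlier ones.
by_value = s.drop_duplicates()

# To deduplicate by label instead, filter on the index.
by_label = s[~s.index.duplicated()]

print(by_value)
print(by_label)
```

Which of the two is appropriate depends on whether repeated values or repeated labels are the problem; the original question is about labels.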

