pandas: get rows based on distinct values in column 2
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/43694900/
Get Rows based on distinct values from Column 2
Asked by import.zee
I am new to pandas and tried searching for this on Google, but had no luck. How can I get the rows with distinct values in COL2?
For example, I have the dataframe below:
>>> df
COL1 COL2
a.com 22
b.com 45
c.com 34
e.com 45
f.com 56
g.com 22
h.com 45
I want to get the rows based on unique values in COL2, keeping the first row for each value:
>>> df
COL1 COL2
a.com 22
b.com 45
c.com 34
f.com 56
So, how can I get that? I would appreciate any help.
Answered by jezrael
Use drop_duplicates and specify COL2 as the column to check for duplicates:
df = df.drop_duplicates('COL2')
# same as:
# df = df.drop_duplicates('COL2', keep='first')
print(df)
COL1 COL2
0 a.com 22
1 b.com 45
2 c.com 34
4 f.com 56
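For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the sample data in the question, and the variable name first_rows is just an illustrative choice, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
    'COL2': [22, 45, 34, 45, 56, 22, 45],
})

# Keep the first row seen for each distinct COL2 value (keep='first' is the default).
first_rows = df.drop_duplicates(subset='COL2')
print(first_rows)
```

The original index labels (0, 1, 2, 4) are preserved, which is why the printed index has a gap.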
You can also keep only the last occurrence of each value:
df = df.drop_duplicates('COL2', keep='last')
print(df)
COL1 COL2
2 c.com 34
4 f.com 56
5 g.com 22
6 h.com 45
Or drop every row whose COL2 value appears more than once:
df = df.drop_duplicates('COL2', keep=False)
print(df)
COL1 COL2
2 c.com 34
4 f.com 56
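To see how the three keep options differ on the same data, it can help to look at the boolean mask from DataFrame.duplicated, which flags the rows that drop_duplicates would remove. The following is a sketch under the assumption that the DataFrame matches the question's sample; the kept_index dictionary is only a hypothetical helper for comparing results.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'COL1': ['a.com', 'b.com', 'c.com', 'e.com', 'f.com', 'g.com', 'h.com'],
    'COL2': [22, 45, 34, 45, 56, 22, 45],
})

kept_index = {}
for keep in ('first', 'last', False):
    # True marks a row that is treated as a duplicate under this keep rule.
    mask = df.duplicated(subset='COL2', keep=keep)
    kept = df[~mask]
    # Filtering with the inverted mask matches drop_duplicates with the same arguments.
    assert kept.equals(df.drop_duplicates(subset='COL2', keep=keep))
    kept_index[keep] = kept.index.tolist()

print(kept_index)
```

Working with the mask directly is handy when you want to inspect or count the duplicated rows before dropping them.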