pandas: Drop duplicate rows that have the same value in all columns

Question

Asked by jovicbg

I have a dataframe with about half a million rows. As far as I can see, it contains plenty of duplicate rows. How can I drop rows that have the same value in all of the columns (about 80 columns), not just in one?

df:

period_start_time    id    val1    val2    val3
06.13.2017 22:00:00  i53    32      2       10
06.13.2017 22:00:00  i32    32      2       10
06.13.2017 22:00:00  i32    4       2       8
06.13.2017 22:00:00  i32    4       2       8
06.13.2017 22:00:00  i32    4       2       8
06.13.2017 22:00:00  i20    7       7       22
06.13.2017 22:00:00  i20    7       7       22

Desired output:

period_start_time    id    val1    val2    val3
06.13.2017 22:00:00  i53    32      2       10
06.13.2017 22:00:00  i32    32      2       10
06.13.2017 22:00:00  i32    4       2       8
06.13.2017 22:00:00  i20    7       7       22

Answer 1

Answered by jezrael

Use drop_duplicates:

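# With no arguments, drop_duplicates() compares all columns
# and keeps the first occurrence of each duplicated row.
df = df.drop_duplicates()
print(df)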
     period_start_time   id  val1  val2  val3
0  06.13.2017 22:00:00  i53    32     2    10
1  06.13.2017 22:00:00  i32    32     2    10
2  06.13.2017 22:00:00  i32     4     2     8
5  06.13.2017 22:00:00  i20     7     7    22
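
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample frame from the question and checks how many rows are duplicated before dropping them. The names `n_dupes` and `deduped` are only illustrative, and the `reset_index(drop=True)` step is an optional extra that was not part of the answer above; it just renumbers the surviving rows, since `drop_duplicates()` keeps the original index labels (note the jump from 2 to 5 in the output above).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "period_start_time": ["06.13.2017 22:00:00"] * 7,
    "id": ["i53", "i32", "i32", "i32", "i32", "i20", "i20"],
    "val1": [32, 32, 4, 4, 4, 7, 7],
    "val2": [2, 2, 2, 2, 2, 7, 7],
    "val3": [10, 10, 8, 8, 8, 22, 22],
})

# duplicated() flags each row that repeats an earlier row (all columns compared)
n_dupes = int(df.duplicated().sum())

# Drop the repeats, then renumber the index (optional, hypothetical extra step)
deduped = df.drop_duplicates().reset_index(drop=True)

print(n_dupes)
print(deduped)
```

If only some columns should decide what counts as a duplicate, `drop_duplicates` also accepts a `subset` argument with a list of column names, and `keep` controls which occurrence survives (`'first'`, `'last'`, or `False` to drop all copies).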
