Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18194968/

Time: 2020-08-19 10:09:27  Source: igfitidea

python remove duplicates from 2 lists

python list duplicate-removal

Asked by michael

I am trying to remove duplicates from 2 lists, so I wrote this function:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]

b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

for i in b:
    if i in a:
        print "found " + i
        b.remove(i)

print b

But I find that a matching item that immediately follows another matched item does not get removed.

I get a result like this:

found ijk
found opq
['lmn', 'rst', '123', '456']

But I expect a result like this:

['123', '456']

How can I fix my function to do what I want?

Thank you.

Answered by Sukrit Kalra

Your problem seems to be that you're changing the list you're iterating over. Iterate over a copy of the list instead.

for i in b[:]:
    if i in a:
        b.remove(i)


>>> b
['123', '456']

But how about using a list comprehension instead?

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
>>> [elem for elem in b if elem not in a ]
['123', '456']
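
For longer lists, the comprehension above can be sped up by testing membership against a set instead of a list. The following is a minimal sketch written for Python 3, assuming the elements are hashable; the helper name `remove_common` is hypothetical and not part of the original answer.

```python
def remove_common(items, blocked):
    """Return the items that are not in blocked, keeping their original order."""
    # Assumption: every element is hashable, so it can be stored in a set.
    blocked_set = set(blocked)  # membership tests on a set are O(1) on average
    return [item for item in items if item not in blocked_set]


a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

b = remove_common(b, a)
print(b)  # ['123', '456']
```

Building a new list rather than calling `b.remove()` inside the loop sidesteps the iteration problem entirely, and repeated items in `b` are kept.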

Answered by Joran Beasley

Or use a set:

set(b).difference(a)

Be forewarned: sets do not preserve order, if that is important.
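
If the order does matter, one possible workaround is to sort the set difference by each item's position in the original list. This is only a sketch written for Python 3; `b.index` is a linear scan, so it suits short lists like the ones in the question.

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

unordered = set(b).difference(a)          # a set: iteration order is arbitrary
ordered = sorted(unordered, key=b.index)  # sort by first position in b

print(ordered)  # ['123', '456']
```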

Answered by Mario Rossi

What about:

b = set(b) - set(a)

If you need repeated items in b to also appear repeated in the result, and/or the order to be preserved, then

b = [x for x in b if x not in a]

would do.
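
To make the difference between the two approaches concrete, here is a small sketch in Python 3. The shortened lists are made up for illustration and are not the ones from the question.

```python
a = ["ijk", "lmn"]
b = ["ijk", "123", "456", "123", "lmn"]  # "123" appears twice

as_set = set(b) - set(a)                # repeats collapse, order is not kept
as_list = [x for x in b if x not in a]  # repeats and order are both kept

print(as_set == {"123", "456"})  # True
print(as_list)                   # ['123', '456', '123']
```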

Answered by Anthony Perot

You asked to remove the duplicates from both lists; here's my solution:

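from collections import OrderedDict

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

# fromkeys() drops repeats within each list while keeping first-seen order
x = OrderedDict.fromkeys(a)
y = OrderedDict.fromkeys(b)

# Iterate over a snapshot of the keys: popping from x while looping over
# x itself is unsafe (current Python 3 raises RuntimeError)
for k in list(x):
    if k in y:
        x.pop(k)
        y.pop(k)

print(list(x))
print(list(y))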

Result:

['abc', 'def', 'xyz']
['123', '456']

The nice thing here is that you keep the order of both lists' items.
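
The same two-sided result can also be sketched with plain lists and a set of shared items. This is an alternative written for Python 3, not the answer's method: the helper name `drop_shared` is hypothetical, the elements are assumed to be hashable, and unlike `OrderedDict.fromkeys` it keeps repeats that occur within a single list.

```python
def drop_shared(first, second):
    """Return copies of both lists without the items they have in common."""
    shared = set(first) & set(second)  # items present in both lists
    return (
        [x for x in first if x not in shared],
        [x for x in second if x not in shared],
    )


a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

a_only, b_only = drop_shared(a, b)
print(a_only)  # ['abc', 'def', 'xyz']
print(b_only)  # ['123', '456']
```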

Answered by 7stud

Here is what's going on. Suppose you have this list:

['a', 'b', 'c', 'd']

and you are looping over every element in the list. Suppose you are currently at index position 1:

['a', 'b', 'c', 'd']
       ^
       |
   index = 1

...and you remove the element at index position 1, giving you this:

['a',      'c', 'd']
       ^
       |
    index 1

After removing the item, the other items slide to the left, giving you this:

['a', 'c', 'd']
       ^
       |
    index 1

Then when the loop runs again, the loop increments the index to 2, giving you this:

['a', 'c', 'd']
            ^ 
            |
         index = 2

See how you skipped over 'c'? The lesson is: never delete an element from a list that you are looping over.
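
The skip described above can be reproduced in a few lines. This sketch, written for Python 3, records every element the loop actually visits; the `visited` list is added purely for illustration.

```python
items = ['a', 'b', 'c', 'd']
visited = []

for item in items:
    visited.append(item)
    if item == 'b':
        # Removing 'b' shifts 'c' into index 1, which the loop has already passed
        items.remove(item)

print(visited)  # ['a', 'b', 'd']
print(items)    # ['a', 'c', 'd']
```

The loop never sees 'c', even though 'c' is still in the list afterwards.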

Answered by Mayur Patel

One way to avoid the problem of editing a list while you iterate over it is to use a comprehension:

a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
b = [x for x in b if x not in a]

Answered by Vincenzo Pii

There are already many answers on "how can you fix it?", so this is a "how can you improve it and be more Pythonic?": since what you want is the difference between list b and list a, you should use the difference operation on sets (operations on sets):

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456", ]
>>> s1 = set(a)
>>> s2 = set(b)
>>> s2 - s1
set(['123', '456'])