Python: Get difference between two lists
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/3462143/
Asked by Max Frai
I have two lists in Python, like these:
temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']
I need to create a third list with items from the first list which aren't present in the second one. From the example I have to get:
temp3 = ['Three', 'Four']
Are there any fast ways to do this without loops and explicit checks?
Accepted answer by ars
In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']
Beware that
In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1])
where you might expect/want it to equal set([1, 3]). If you do want set([1, 3]) as your answer, you'll need to use set([1, 2]).symmetric_difference(set([2, 3])).
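To make the caveat above concrete, here is a minimal sketch contrasting the one-sided difference with the symmetric difference. The variable names `a`, `b`, `only_in_a` and `in_exactly_one` are illustrative and do not come from the original answer; the output shown assumes Python 3.

```python
a = [1, 2]
b = [2, 3]

# One-sided difference: items of a that are not in b.
only_in_a = set(a) - set(b)

# Symmetric difference: items that appear in exactly one of the two lists.
in_exactly_one = set(a) ^ set(b)  # same as set(a).symmetric_difference(b)

print(only_in_a)       # {1}
print(in_exactly_one)  # {1, 3}
```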
Answer by Maciej Kucharz
Try this:
temp3 = set(temp1) - set(temp2)
Answer by matt b
temp3 = [item for item in temp1 if item not in temp2]
Answer by aaronasterling
I'll toss this in, since none of the present solutions yield a tuple:
temp3 = tuple(set(temp1) - set(temp2))
Alternatively:
# Edited using @Mark Byers' idea. If you accept this one as the answer, just accept his instead.
s = set(temp2)  # build the set once instead of on every iteration
temp3 = tuple(x for x in temp1 if x not in s)
Like the other non-tuple-yielding answers along these lines, it preserves order.
Answer by Mark Byers
The existing solutions all offer either one or the other of:
- Faster than O(n*m) performance.
- Preserve order of input list.
But so far no solution has both. If you want both, try this:
s = set(temp2)
temp3 = [x for x in temp1 if x not in s]
Performance test
import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print(timeit.timeit('list(set(temp1) - set(temp2))', init, number=100000))
print(timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number=100000))
print(timeit.timeit('[item for item in temp1 if item not in temp2]', init, number=100000))
Results:
4.34620224079 # ars' answer
4.2770634955 # This answer
30.7715615392 # matt b's answer
Besides preserving order, the method I presented is also (slightly) faster than the set subtraction because it doesn't require constructing an unnecessary set. The performance difference would be more noticeable if the first list is considerably longer than the second and if hashing is expensive. Here's a second test demonstrating this:
init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''
Results:
11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer
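The order-preserving approach can be wrapped in a small helper. This is only a sketch: the name `ordered_difference` is hypothetical and not part of the original answer, and it assumes the list items are hashable. It also shows that, unlike set subtraction, duplicates in the first list are kept.

```python
def ordered_difference(first, second):
    """Return the items of `first` that are not in `second`, in `first`'s order."""
    exclude = set(second)  # built once; membership tests are O(1) on average
    return [x for x in first if x not in exclude]


temp1 = ['One', 'Two', 'Three', 'Four', 'Three']
temp2 = ['One', 'Two']

print(ordered_difference(temp1, temp2))  # ['Three', 'Four', 'Three']
```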
Answer by Mohammed
This could be even faster than Mark's list comprehension:
import itertools
list(itertools.filterfalse(set(temp2).__contains__, temp1))
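For readers unfamiliar with `itertools.filterfalse`: it yields the items for which the predicate returns false, so passing the set's `__contains__` method keeps exactly the items that are not in the set. A minimal sketch, assuming Python 3 (in Python 2 the function is named `itertools.ifilterfalse`); the name `in_temp2` is illustrative:

```python
import itertools

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

# Bound method: in_temp2(x) behaves like `x in set(temp2)`.
in_temp2 = set(temp2).__contains__
temp3 = list(itertools.filterfalse(in_temp2, temp1))

print(temp3)  # ['Three', 'Four']
```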
Answer by arulmr
The difference between two lists (say list1 and list2) can be found using the following simple function.
def diff(list1, list2):
    c = set(list1).union(set(list2))  # or c = set(list1) | set(list2)
    d = set(list1).intersection(set(list2))  # or d = set(list1) & set(list2)
    return list(c - d)
or
def diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))  # or return list(set(list1) ^ set(list2))
Using the above function, the difference can be found with diff(temp2, temp1) or diff(temp1, temp2). Both will give the result ['Four', 'Three'] (the order of the items is not guaranteed, because sets are unordered). You don't have to worry about the order of the list or which list is given first.
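A quick check of this claim, as a sketch: the result contains the same items regardless of argument order. Because the order of the returned list depends on set iteration order, the example sorts the results before printing them.

```python
def diff(list1, list2):
    # Items that appear in exactly one of the two lists.
    return list(set(list1) ^ set(list2))


temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

# Sort so the output does not depend on set iteration order.
print(sorted(diff(temp1, temp2)))  # ['Four', 'Three']
print(sorted(diff(temp2, temp1)))  # ['Four', 'Three']
```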
Answer by manhgd
This is another solution:
def diff(a, b):
    xa = [i for i in set(a) if i not in b]
    xb = [i for i in set(b) if i not in a]
    return xa + xb
Answer by soundcorner
You could use a naive slicing method if both lists are sorted, contain no duplicates (that is, they behave like sets), and the second list is a prefix of the first.
list1=[1,2,3,4,5]
list2=[1,2,3]
print(list1[len(list2):])
Or with native set methods:
subset=set(list1).difference(list2)
print(subset)
import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print("Naive solution:", timeit.timeit('temp1[len(temp2):]', init, number=100000))
print("Native set solution:", timeit.timeit('set(temp1).difference(temp2)', init, number=100000))
Naive solution: 0.0787101593292
Native set solution: 0.998837615564
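Note that slicing only matches the set difference when the second list is a prefix of the first. The sketch below (variable names such as `sliced_prefix` are illustrative, not from the original answer) shows one case where the two approaches agree and, using the benchmark's own inputs, one case where they do not.

```python
# Case 1: list2 is a prefix of list1, so slicing gives the difference.
list1 = [1, 2, 3, 4, 5]
list2 = [1, 2, 3]
sliced_prefix = list1[len(list2):]
set_prefix = sorted(set(list1).difference(list2))
print(sliced_prefix == set_prefix)  # True

# Case 2: the benchmark inputs; temp2 holds even numbers and is not a prefix.
temp1 = list(range(100))
temp2 = [i * 2 for i in range(50)]
sliced_bench = temp1[len(temp2):]                     # 50..99
set_bench = sorted(set(temp1).difference(temp2))      # odd numbers 1..99
print(sliced_bench == set_bench)  # False
```

So the timing comparison above measures two operations that return different results for those inputs.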
Answer by sreemanth pulagam
Single-line version of arulmr's solution:
def diff(listA, listB):
    # Union of both one-sided differences, i.e. the symmetric difference.
    return (set(listA) - set(listB)) | (set(listB) - set(listA))

