Python: Remove all the elements that occur in one list from another

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/4211209/

Date: 2020-08-18 14:47:15  Source: igfitidea

Remove all the elements that occur in one list from another

Tags: python, list

Asked by fandom

Let's say I have two lists, l1 and l2. I want to perform l1 - l2, which returns all elements of l1 that are not in l2.

I can think of a naive loop approach to doing this, but that is going to be really inefficient. What is a Pythonic and efficient way of doing this?

As an example, if I have l1 = [1,2,6,8] and l2 = [2,3,5,8], l1 - l2 should return [1,6].

Answer by Donut

Python has a language feature called list comprehensions that is perfectly suited to making this sort of thing extremely easy. The following statement does exactly what you want and stores the result in l3:

l3 = [x for x in l1 if x not in l2]

l3 will contain [1, 6].

Answer by Arkku

One way is to use sets:

>>> set([1,2,6,8]) - set([2,3,5,8])
set([1, 6])

Answer by nonot1

Use the Python set type. That would be the most Pythonic. :)

Also, since it's native, it should be the most optimized method too.

See:

http://docs.python.org/library/stdtypes.html#set

http://docs.python.org/library/sets.htm (for older Python)

# Using Python 2.7 set literal format.
# Otherwise, use: l1 = set([1,2,6,8])
#
l1 = {1,2,6,8}
l2 = {2,3,5,8}
l3 = l1 - l2

Answer by Daniel Pryden

Expanding on Donut's answer and the other answers here, you can get even better results by using a generator expression instead of a list comprehension, and by using a set data structure (since the in operator is O(n) on a list but O(1) on average on a set).

So here's a function that would work for you:

def filter_list(full_list, excludes):
    s = set(excludes)
    return (x for x in full_list if x not in s)

The result will be an iterable that lazily produces the filtered items. If you need a real list object (e.g. if you need to call len() on the result), then you can easily build a list like so:

filtered_list = list(filter_list(full_list, excludes))
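
A short usage sketch of the function above, using the lists from the question. The names l1, l2 and lazy_result are chosen here for illustration only; the point is that the generator does no work until it is consumed, and that it can be consumed only once.

```python
def filter_list(full_list, excludes):
    # Build the set once so each membership test is O(1) on average.
    s = set(excludes)
    return (x for x in full_list if x not in s)

l1 = [1, 2, 6, 8]
l2 = [2, 3, 5, 8]

lazy_result = filter_list(l1, l2)   # a generator; nothing is filtered yet
filtered_list = list(lazy_result)   # consumes the generator
print(filtered_list)                # [1, 6]

# A generator can only be consumed once: a second pass yields nothing.
second_pass = list(lazy_result)
print(second_pass)                  # []
```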

Answer by Akshay Hazari

Alternative solution:

reduce(lambda x,y : filter(lambda z: z!=y,x) ,[2,3,5,8],[1,2,6,8])
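
The one-liner above runs as written on Python 2 only. A possible Python 3 equivalent, as a sketch: reduce has to be imported from functools, and filter returns a lazy iterator, so the result is wrapped in list(). The names l1, l2 and l3 follow the question.

```python
from functools import reduce  # reduce is a builtin only in Python 2

l1 = [1, 2, 6, 8]
l2 = [2, 3, 5, 8]

# Start from l1 and drop every element equal to y, for each y in l2.
# list() is needed because filter returns a lazy iterator in Python 3.
l3 = list(reduce(lambda x, y: filter(lambda z: z != y, x), l2, l1))
print(l3)  # [1, 6]
```

Note the argument order: the list being filtered (l1) is the initial value, and the elements to remove (l2) are the sequence that reduce walks over.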

Answer by Moinuddin Quadri

As an alternative, you may also use filter with a lambda expression to get the desired result. For example:

>>> l1 = [1,2,6,8]
>>> l2 = set([2,3,5,8])

#     v  `filter` returns an iterator object (in Python 3). Here it is converted
#     v  to a `list` in order to display the resulting value
>>> list(filter(lambda x: x not in l2, l1))
[1, 6]

Performance Comparison

Here I am comparing the performance of all the answers mentioned here. As expected, Arkku's set-based operation is the fastest.

  • Arkku's set difference: first (0.124 usec per loop)

    mquadri$ python -m timeit -s "l1 = set([1,2,6,8]); l2 = set([2,3,5,8]);" "l1 - l2"
    10000000 loops, best of 3: 0.124 usec per loop
    
  • Daniel Pryden's list comprehension with set lookup: second (0.302 usec per loop)

    mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "[x for x in l1 if x not in l2]"
    1000000 loops, best of 3: 0.302 usec per loop
    
  • Donut's list comprehension on a plain list: third (0.552 usec per loop)

    mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = [2,3,5,8];" "[x for x in l1 if x not in l2]"
    1000000 loops, best of 3: 0.552 usec per loop
    
  • Moinuddin Quadri's filter: fourth (0.972 usec per loop)

    mquadri$ python -m timeit -s "l1 = [1,2,6,8]; l2 = set([2,3,5,8]);" "filter(lambda x: x not in l2, l1)"
    1000000 loops, best of 3: 0.972 usec per loop
    
  • Akshay Hazari's combination of reduce + filter: fifth (3.97 usec per loop)

    mquadri$ python -m timeit "l1 = [1,2,6,8]; l2 = [2,3,5,8];" "reduce(lambda x,y : filter(lambda z: z!=y,x) ,l1,l2)"
    100000 loops, best of 3: 3.97 usec per loop
    

PS: set does not maintain the order and removes duplicate elements from the list. Hence, do not use set difference if you need either of these.
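
The caveat above can be seen on a small input. This is only an illustrative sketch: the sample values and the names as_set, excluded and as_list are made up here and are not part of the original answers.

```python
l1 = [3, 1, 2, 3, 1]
l2 = [2]

# Set difference: duplicates collapse and ordering is not guaranteed.
as_set = set(l1) - set(l2)
print(as_set == {1, 3})   # True

# List comprehension with a set lookup: order and duplicates survive.
excluded = set(l2)
as_list = [x for x in l1 if x not in excluded]
print(as_list)            # [3, 1, 3, 1]
```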

Answer by lbsweek

Use a set comprehension {x for x in l2} or set(l2) to get a set, then use a list comprehension to get the list:

l2set = set(l2)
l3 = [x for x in l1 if x not in l2set]

Benchmark test code:

import time

l1 = list(range(1000*10 * 3))
l2 = list(range(1000*10 * 2))

l2set = {x for x in l2}

tic = time.time()
l3 = [x for x in l1 if x not in l2set]
toc = time.time()
diffset = toc-tic
print(diffset)

tic = time.time()
l3 = [x for x in l1 if x not in l2]
toc = time.time()
difflist = toc-tic
print(difflist)

print("speedup %fx"%(difflist/diffset))

Benchmark test result:

0.0015058517456054688
3.968189239501953
speedup 2635.179227x
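
As a possible variation on the benchmark above, the standard library's timeit module repeats each measurement and lets you take the best run, which tends to be less noisy than a single time.time() difference. This is a sketch only: the list sizes are reduced from the original so it finishes quickly, and the names set_time and list_time are chosen here for illustration. Timings will differ from machine to machine.

```python
import timeit

l1 = list(range(3000))
l2 = list(range(2000))
l2set = set(l2)

# Each callable is run `number` times per measurement; repeat() returns
# one total per measurement, and min() keeps the least disturbed one.
set_time = min(timeit.repeat(lambda: [x for x in l1 if x not in l2set],
                             number=3, repeat=3))
list_time = min(timeit.repeat(lambda: [x for x in l1 if x not in l2],
                              number=3, repeat=3))

print("set lookup:  %f s" % set_time)
print("list lookup: %f s" % list_time)
print("speedup %fx" % (list_time / set_time))
```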