Original source: http://stackoverflow.com/questions/3428536/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Python list subtraction operation
Asked by daydreamer
I want to do something similar to this:
>>> x = [1,2,3,4,5,6,7,8,9,0]
>>> x
[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
>>> y = [1,3,5,7,9]
>>> y
[1, 3, 5, 7, 9]
>>> x - y # (should return [2,4,6,8,0])
But this is not supported by Python lists. What is the best way of doing it?
Accepted answer by aaronasterling
Use a list comprehension:
[item for item in x if item not in y]
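As a quick check, here is that comprehension applied to the lists from the question. This is only a sketch; the name result is an assumption for illustration and does not appear in the original answer.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
y = [1, 3, 5, 7, 9]

# Keep every item of x that does not appear in y, in x's original order
result = [item for item in x if item not in y]
print(result)  # [2, 4, 6, 8, 0]
```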
If you want to use the - infix syntax, you can just do:
class MyList(list):
    def __init__(self, *args):
        super(MyList, self).__init__(args)

    def __sub__(self, other):
        return self.__class__(*[item for item in self if item not in other])
You can then use it like:
x = MyList(1, 2, 3, 4)
y = MyList(2, 5, 2)
z = x - y
But if you don't absolutely need list properties (for example, ordering), just use sets as the other answers recommend.
Answer by quantumSoup
Use set difference
>>> z = list(set(x) - set(y))
>>> z
[0, 8, 2, 4, 6]
Or you might just have x and y be sets so you don't have to do any conversions.
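Note that a set does not remember the order of x. If the order matters and x contains no duplicates, one possible sketch is to sort the difference by each item's position in x. The name z_ordered is an assumption for illustration, and because x.index scans the list, this only suits small inputs.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
y = [1, 3, 5, 7, 9]

# The set difference discards order; sorting by position in x restores it
z_ordered = sorted(set(x) - set(y), key=x.index)
print(z_ordered)  # [2, 4, 6, 8, 0]
```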
Answer by Santa
That is a "set subtraction" operation. Use the set data structure for that.
In Python 2.7:
x = {1,2,3,4,5,6,7,8,9,0}
y = {1,3,5,7,9}
print x - y
Output:
>>> print x - y
set([0, 8, 2, 4, 6])
Answer by user3435376
Try this.
def subtract_lists(a, b):
    """ Subtracts two lists. Throws ValueError if b contains items not in a """
    # Terminate if b is empty, otherwise remove b[0] from a and recurse
    return a if len(b) == 0 else [a[:i] + subtract_lists(a[i+1:], b[1:])
                                  for i in [a.index(b[0])]][0]
>>> x = [1,2,3,4,5,6,7,8,9,0]
>>> y = [1,3,5,7,9]
>>> subtract_lists(x,y)
[2, 4, 6, 8, 0]
>>> x = [1,2,3,4,5,6,7,8,9,0,9]
>>> subtract_lists(x,y)
[2, 4, 6, 8, 0, 9] #9 is only deleted once
>>>
Answer by nguyên
If duplicates and item ordering are a concern:
[i for i in a if not i in b or b.remove(i)]
a = [1,2,3,3,3,3,4]
b = [1,3]
result: [2, 3, 3, 3, 4]
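Be aware that the comprehension above calls b.remove(i), so it empties b as a side effect. A possible alternative that leaves b untouched is sketched below with collections.Counter. The function name subtract_once is hypothetical, and the sketch assumes the items are hashable.

```python
from collections import Counter

def subtract_once(a, b):
    """Remove each item of b from a once per occurrence, keeping a's order."""
    remaining = Counter(b)  # how many copies of each item still need removing
    result = []
    for item in a:
        if remaining[item] > 0:
            remaining[item] -= 1  # consume one pending removal
        else:
            result.append(item)
    return result

a = [1, 2, 3, 3, 3, 3, 4]
b = [1, 3]
print(subtract_once(a, b))  # [2, 3, 3, 3, 4]
print(b)                    # [1, 3], unchanged
```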
Answer by abarnert
For many use cases, the answer you want is:
ys = set(y)
[item for item in x if item not in ys]
This is a hybrid between aaronasterling's answer and quantumSoup's answer.
aaronasterling's version does len(y) item comparisons for each element in x, so it takes quadratic time. quantumSoup's version uses sets, so it does a single constant-time set lookup for each element in x, but because it converts both x and y into sets, it loses the order of your elements.
By converting only y into a set, and iterating x in order, you get the best of both worlds: linear time and order preservation.*
However, this still has a problem from quantumSoup's version: It requires your elements to be hashable. That's pretty much built into the nature of sets.** If you're trying to, e.g., subtract a list of dicts from another list of dicts, but the list to subtract is large, what do you do?
If you can decorate your values in some way that they're hashable, that solves the problem. For example, with a flat dictionary whose values are themselves hashable:
ys = {tuple(item.items()) for item in y}
[item for item in x if tuple(item.items()) not in ys]
If your types are a bit more complicated (e.g., often you're dealing with JSON-compatible values, which are hashable, or lists or dicts whose values are recursively the same type), you can still use this solution. But some types just can't be converted into anything hashable.
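For nested JSON-like values, one way to do that conversion is a small recursive helper. The sketch below is an assumption about how it could look, not code from the original answer: the name freeze is hypothetical, and it turns dicts into frozensets of items (so key order does not matter) and lists into tuples.

```python
def freeze(value):
    """Recursively convert dicts and lists into hashable equivalents."""
    if isinstance(value, dict):
        # frozenset ignores key order, so equal dicts freeze to equal values
        return frozenset((key, freeze(val)) for key, val in value.items())
    if isinstance(value, list):
        return tuple(freeze(val) for val in value)
    return value  # assumed to be hashable already (str, int, float, bool, None)

x = [{"a": 1, "b": [1, 2]}, {"a": 2, "b": []}, {"b": {"c": 3}, "a": 3}]
y = [{"b": [1, 2], "a": 1}, {"a": 3, "b": {"c": 3}}]

ys = {freeze(item) for item in y}
result = [item for item in x if freeze(item) not in ys]
print(result)  # [{'a': 2, 'b': []}]
```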
If your items aren't, and can't be made, hashable, but they are comparable, you can at least get log-linear time (O(N*log M), which is a lot better than the O(N*M) time of the list solution, but not as good as the O(N+M) time of the set solution) by sorting and using bisect:
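import bisect
ys = sorted(y)
def bisect_contains(seq, item):
    # bisect_left returns the leftmost insertion point, so if item is
    # present in the sorted sequence it sits exactly at that index
    index = bisect.bisect_left(seq, item)
    return index < len(seq) and seq[index] == item
[item for item in x if not bisect_contains(ys, item)]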
If your items are neither hashable nor comparable, then you're stuck with the quadratic solution.
* Note that you could also do this by using a pair of OrderedSet objects, for which you can find recipes and third-party modules. But I think this is simpler.
** The reason set lookups are constant time is that all it has to do is hash the value and see if there's an entry for that hash. If it can't hash the value, this won't work.
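The hashable and non-hashable cases can be combined into one helper that tries the set-based path first and falls back to the quadratic scan. This is only a sketch under that assumption; the function name subtract is hypothetical and does not come from the original answer.

```python
def subtract(x, y):
    """Return the items of x that are not in y, preserving x's order."""
    try:
        lookup = set(y)  # fast path: requires every item to be hashable
        return [item for item in x if item not in lookup]
    except TypeError:
        # Unhashable items somewhere: fall back to the quadratic list scan
        return [item for item in x if item not in y]

print(subtract([1, 2, 3, 4], [2, 4]))            # [1, 3]
print(subtract([{"a": 1}, {"a": 2}], [{"a": 2}]))  # [{'a': 1}]
```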
Answer by rudolfbyker
Looking up values in sets is faster than looking them up in lists:
ys = set(y)  # build the set once, not on every iteration
[item for item in x if item not in ys]
I believe this will scale slightly better than:
[item for item in x if item not in y]
Both preserve the order of the lists.
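A rough way to compare the two lookups is the standard timeit module. The sketch below uses made-up list sizes and hypothetical function names (with_list, with_set); the set is built once, outside the comprehension, since rebuilding it for every element would cancel the benefit. Actual timings depend on the machine, so none are shown.

```python
import timeit

x = list(range(2000))
y = list(range(0, 2000, 2))
ys = set(y)  # built once, outside the comprehension

def with_list():
    return [item for item in x if item not in y]

def with_set():
    return [item for item in x if item not in ys]

# Both approaches give the same result, in the same order
assert with_list() == with_set()

print("list lookup:", timeit.timeit(with_list, number=5))
print("set lookup: ", timeit.timeit(with_set, number=5))
```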
Answer by Joao Nicolau
This example subtracts two lists:
# List of pairs of points (named pairs to avoid shadowing the built-in list)
pairs = []
pairs.append([(602, 336), (624, 365)])
pairs.append([(635, 336), (654, 365)])
pairs.append([(642, 342), (648, 358)])
pairs.append([(644, 344), (646, 356)])
pairs.append([(653, 337), (671, 365)])
pairs.append([(728, 13), (739, 32)])
pairs.append([(756, 59), (767, 79)])
itens_to_remove = []
itens_to_remove.append([(642, 342), (648, 358)])
itens_to_remove.append([(644, 344), (646, 356)])
print("Initial List Size: ", len(pairs))
for a in itens_to_remove:
    # Iterate over a copy so that removing from pairs does not skip elements
    for b in pairs[:]:
        if a == b:
            pairs.remove(b)
print("Final List Size: ", len(pairs))
Answer by Hamid Zafar
The answer provided by @aaronasterling looks good; however, it is not compatible with the default interface of list: x = MyList(1, 2, 3, 4) vs. x = MyList([1, 2, 3, 4]). Thus, the code below can be used as a more list-friendly alternative:
class MyList(list):
    def __init__(self, *args):
        super(MyList, self).__init__(*args)

    def __sub__(self, other):
        return self.__class__([item for item in self if item not in other])
Example:
x = MyList([1, 2, 3, 4])
y = MyList([2, 5, 2])
z = x - y
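For reference, here is a self-contained sketch of the same idea that also prints the results. It assumes Python 3 and omits the pass-through __init__, which the subclass inherits from list anyway; treat it as an illustration rather than the answer's exact code.

```python
class MyList(list):
    def __sub__(self, other):
        # Build the result with the same class so it can be subtracted again
        return self.__class__(item for item in self if item not in other)

x = MyList([1, 2, 3, 4])
y = MyList([2, 5, 2])
z = x - y
print(z)                      # [1, 3, 4]
print(isinstance(z, MyList))  # True
print(z - [1])                # [3, 4]
```

Because the right-hand operand is only used with the in operator, it can be a plain list, a set, or any other container.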
Answer by Eds_k
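I think this is faster (note that ^ is the symmetric difference, which equals a minus b only when every element of b is also in a, as in this example):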
In [1]: a = [1,2,3,4,5]
In [2]: b = [2,3,4,5]
In [3]: c = set(a) ^ set(b)
In [4]: c
Out[4]: {1}
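The difference between the two set operators shows up as soon as b contains an element that a lacks. A minimal sketch with made-up values:

```python
a = [1, 2, 3]
b = [3, 4]

difference = set(a) - set(b)  # items of a that are not in b
symmetric = set(a) ^ set(b)   # items that are in exactly one of a or b

print(difference)  # {1, 2}
print(symmetric)   # {1, 2, 4}
```

For list subtraction, the - operator is the one that matches the question.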

