Python: how to check whether all items in a list are in another list?

Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/15147751/

Time: 2020-08-18 13:28:21  Source: igfitidea

How to check if all items in a list are there in another list?

python

Asked by pogo

I have two lists, say:

List1 = ['a','c','c']
List2 = ['x','b','a','x','c','y','c']

Now I want to find out whether all elements of List1 are in List2. In this case, they all are. I can't use the subset function because the lists can contain repeated elements. I could use a for loop to count the number of occurrences of each item in List1 and check that it is less than or equal to the number of occurrences in List2. Is there a better way to do this?

Thanks.

Accepted answer by poke

When the number of occurrences doesn't matter, you can still use the subset functionality by creating a set on the fly:

>>> list1 = ['a', 'c', 'c']
>>> list2 = ['x', 'b', 'a', 'x', 'c', 'y', 'c']
>>> set(list1) <= set(list2)  # <= is the subset test; < would require a proper subset
True

If you need to check if each element shows up at least as many times in the second list as in the first list, you can make use of the Counter type and define your own subset relation:

>>> from collections import Counter
>>> def counterSubset(list1, list2):
...     c1, c2 = Counter(list1), Counter(list2)
...     for k, n in c1.items():
...         if n > c2[k]:
...             return False
...     return True

>>> counterSubset(list1, list2)
True
>>> counterSubset(list1 + ['a'], list2)
False
>>> counterSubset(list1 + ['z'], list2)
False

If you already have counters (which might be a useful alternative to store your data anyway), you can also just write this as a single line:

>>> all(n <= c2[k] for k, n in c1.items())
True
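
The one-liner above assumes that c1 and c2 already exist. A minimal self-contained sketch of the same check follows; the helper name counts_fit is hypothetical and not part of the original answer:

```python
from collections import Counter

def counts_fit(c1, c2):
    """Return True if every count in c1 is at most the matching count in c2."""
    # Indexing a Counter with a missing key returns 0 instead of raising KeyError.
    return all(n <= c2[k] for k, n in c1.items())

c1 = Counter(['a', 'c', 'c'])
c2 = Counter(['x', 'b', 'a', 'x', 'c', 'y', 'c'])

fits = counts_fit(c1, c2)                      # True
extra_a = counts_fit(c1 + Counter('a'), c2)    # False: two 'a' needed, one available
unknown = counts_fit(Counter('z'), c2)         # False: 'z' never occurs in c2
```

Because a missing key counts as zero, items that never occur in the second list need no special handling.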

Answer by jeffam217

This returns True if all the items in List1 are in List2 (it does not take the number of occurrences into account):

def list1InList2(list1, list2):
    for item in list1:
        if item not in list2:
            return False
    return True

Answer by shantanoo

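def check_subset(list1, list2):
    remaining = list(list2)  # work on a copy so the caller's list2 is not mutated
    try:
        for x in list1:
            remaining.remove(x)  # raises ValueError once x is no longer available
        return 'all elements in list1 are in list2'
    except ValueError:
        return 'some elements in list1 are not in list2'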

Answer by DevPlayer

Be aware of the following:

>>> listA = ['a', 'a', 'b', 'b', 'b', 'c']
>>> listB = ['b', 'a', 'a', 'b', 'c', 'd']
>>> all(item in listB for item in listA)
True

If you read the "all" line as you would in English, this is not wrong but can be misleading, as listA has a third 'b' but listB does not.
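
To make the difference concrete, here is a small sketch that runs the membership test and a count-aware test on the same two lists. The Counter-based comparison is only one possible way to take occurrences into account, and the variable names are illustrative:

```python
from collections import Counter

listA = ['a', 'a', 'b', 'b', 'b', 'c']
listB = ['b', 'a', 'a', 'b', 'c', 'd']

# Membership only: each distinct value of listA occurs somewhere in listB.
membership_ok = all(item in listB for item in listA)

# Count aware: Counter subtraction keeps only positive counts, so an empty
# result means listB covers every occurrence found in listA.
missing = Counter(listA) - Counter(listB)
counts_ok = not missing
```

Here membership_ok is True while counts_ok is False, and missing holds the single uncovered 'b'.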

This also has the same issue:

def list1InList2(list1, list2):
    for item in list1:
        if item not in list2:
            return False
    return True

Just a note. The following returns False:

>>> tupA = (1,2,3,4,5,6,7,8,9)
>>> tupB = (1,2,3,4,5,6,6,7,8,9)
>>> set(tupA) < set(tupB)
False

Converting the tuples to lists makes no difference. This has nothing to do with strings versus ints: < tests for a proper subset, and here the two sets are equal because the duplicate 6 collapses, so the result is False.
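
The False result comes from the operator, not from the element type. A short sketch of how < and <= differ on sets, using the same tuples (the variable names are illustrative):

```python
tupA = (1, 2, 3, 4, 5, 6, 7, 8, 9)
tupB = (1, 2, 3, 4, 5, 6, 6, 7, 8, 9)

setA, setB = set(tupA), set(tupB)

same_elements = setA == setB    # True: the duplicate 6 collapses inside a set
proper_subset = setA < setB     # False: a proper subset must be strictly smaller
subset = setA <= setB           # True: equal sets are subsets of each other
```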

This works but has the same issue of not keeping count of element occurrences:

>>> set(tupA).issubset(set(tupB))
True

Using sets is not a comprehensive solution for matching elements that occur multiple times.

But here is a one-liner solution, an adaptation of shantanoo's answer without try/except:

all(True if sequenceA.count(item) <= sequenceB.count(item) else False for item in sequenceA)

The builtin all() wraps a generator expression (not a list comprehension) that uses a ternary conditional operator. The ternary is redundant, since the comparison already yields a bool. Python is awesome! Note that the "<=" should not be "==".

With this solution, sequenceA and sequenceB can be tuples, lists, or other "sequences" that have a "count" method. The elements in both sequences can be of most types. I would not use this with dicts as it is now, hence the use of "sequence" instead of "iterable".
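
Because count() scans the whole sequence on every call, the one-liner does roughly quadratic work on long inputs. One possible variant, sketched here with a hypothetical helper name, checks each distinct item only once. It assumes the items are hashable, which the original one-liner does not require:

```python
def counts_covered(sequenceA, sequenceB):
    """Return True if each item occurs in sequenceB at least as often as in sequenceA."""
    # set(sequenceA) visits each distinct item once; this requires hashable items.
    return all(sequenceA.count(item) <= sequenceB.count(item)
               for item in set(sequenceA))

listA = ['a', 'a', 'b', 'b', 'b', 'c']
listB = ['b', 'a', 'a', 'b', 'c', 'd']

covered = counts_covered(listA, listB)    # False: the third 'b' is not covered
covered_mixed = counts_covered(('a', 'c', 'c'),
                               ['x', 'b', 'a', 'x', 'c', 'y', 'c'])    # True
```

The second call mixes a tuple and a list, which works because both provide a count method.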

Answer by fferri

A solution using Counter and its builtin difference operator (note that - is a proper multiset difference that keeps only positive counts, not an element-wise subtraction):

from collections import Counter

def is_subset(l1, l2):
    c1, c2 = Counter(l1), Counter(l2)
    return not c1 - c2

Test:

>>> List1 = ['a','c','c']
>>> List2 = ['x','b','a','x','c','y','c']
>>> is_subset(List1, List2)
True
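
To see why "not c1 - c2" works, it can help to compare the - operator with the in-place subtract() method. The sketch below adds a 'z' to the first list purely for illustration:

```python
from collections import Counter

c1 = Counter(['a', 'c', 'c', 'z'])
c2 = Counter(['x', 'b', 'a', 'x', 'c', 'y', 'c'])

# Operator form: a multiset difference that keeps only positive counts.
leftover = c1 - c2

# Method form: subtracts in place and keeps zero and negative counts.
raw = Counter(c1)
raw.subtract(c2)
```

leftover contains only the uncovered 'z', so it is truthy exactly when something is missing. raw still carries entries such as 'x' with a count of -2, so its truthiness says nothing about the subset relation.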

Answer by abarnert

I can't use the subset function because I can have repeated elements in lists.

What this means is that you want to treat your lists as multisets rather than sets. The usual way to handle multisets in Python is with collections.Counter:

A Counter is a dict subclass for counting hashable objects. It is an unordered collection where elements are stored as dictionary keys and their counts are stored as dictionary values. Counts are allowed to be any integer value including zero or negative counts. The Counter class is similar to bags or multisets in other languages.

And, while you can implement subset for multisets (implemented with Counter) by looping and comparing counts, as in poke's answer, this is unnecessary, just as looping and testing "in" is unnecessary for sets (implemented with set or frozenset).

The Counter type already implements all the set operators, extended in the obvious way for multisets. [1] So you can just write subset in terms of those operators, and it will work for both set and Counter out of the box.
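
As a quick illustration of what "extended in the obvious way" means, here is a sketch of the four operators on two small, made-up Counters:

```python
from collections import Counter

c1 = Counter('aab')     # a: 2, b: 1
c2 = Counter('abbc')    # a: 1, b: 2, c: 1

intersection = c1 & c2    # minimum of each count: a: 1, b: 1
union = c1 | c2           # maximum of each count: a: 2, b: 2, c: 1
difference = c1 - c2      # positive remainders only: a: 1
total = c1 + c2           # sum of counts: a: 3, b: 3, c: 1
```

Each result is itself a Counter, and entries whose count would be zero or negative are dropped.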

With (multi)set difference: [2]

def is_subset(c1, c2):
    return not c1 - c2

Or with (multi)set intersection:

def is_subset(c1, c2):
    return c1 & c2 == c1
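
A runnable sketch that puts both variants side by side on the lists from the question. The two function names are hypothetical, chosen here only so that both versions can coexist:

```python
from collections import Counter

def is_subset_by_difference(c1, c2):
    # Nothing is left of c1 once c2 has been taken away.
    return not c1 - c2

def is_subset_by_intersection(c1, c2):
    # Intersecting with c2 loses nothing from c1.
    return c1 & c2 == c1

list1 = ['a', 'c', 'c']
list2 = ['x', 'b', 'a', 'x', 'c', 'y', 'c']

by_difference = is_subset_by_difference(Counter(list1), Counter(list2))      # True
by_intersection = is_subset_by_intersection(Counter(list1), Counter(list2))  # True

# The same functions accept plain sets, since set defines - and & as well.
sets_ok = is_subset_by_difference({'a', 'c'}, {'a', 'b', 'c'})               # True
```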


[1] You may be wondering why, if Counter implements the set operators, it doesn't just implement < and <= for proper subset and subset. Although I can't find the email thread, I'm pretty sure this was discussed, and the answer was that "the set operators" are defined as the specific set of operators defined in the initial version of collections.abc.Set (which has since been expanded, IIRC…), not all operators that set happens to include for convenience, in the exact same way that Counter doesn't have named methods like intersection, which are friendlier to other types than &, just because set does. (Since Python 3.10, Counter does support <, <=, > and >= as multiset inclusion tests.)

[2] This depends on the fact that collections in Python are expected to be falsy when empty and truthy otherwise. This is documented for the builtin types, and the fact that bool tests fall back to len is explained in the data model documentation, but it's ultimately just a convention, so "quasi-collections" like numpy arrays can violate it if they have a good reason. It holds for "real collections" like Counter, OrderedDict, etc. If you're really worried about that, you can write len(c1 - c2) == 0, but note that this is against the spirit of PEP 8.