Python: how to test if a list contains another list?
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/3847386/
How to test if a list contains another list?
Asked by None
How can I test if a list contains another list (i.e., as a contiguous subsequence)? Say there was a function called contains:
contains([1,2], [-1, 0, 1, 2]) # Returns [2, 3] (contains returns [start, end])
contains([1,3], [-1, 0, 1, 2]) # Returns False
contains([1, 2], [[1, 2], 3]) # Returns False
contains([[1, 2]], [[1, 2], 3]) # Returns [0, 0]
Edit:
contains([2, 1], [-1, 0, 1, 2]) # Returns False
contains([-1, 1, 2], [-1, 0, 1, 2]) # Returns False
contains([0, 1, 2], [-1, 0, 1, 2]) # Returns [1, 3]
Accepted answer by Dave Kirby
Here is my version:
def contains(small, big):
    for i in xrange(len(big)-len(small)+1):
        for j in xrange(len(small)):
            if big[i+j] != small[j]:
                break
        else:
            return i, i+len(small)
    return False
It returns a tuple of (start, end+1), since I think that is more Pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists, so it should be reasonably efficient.
One point of interest for newbies is that it uses the else clause on the for statement. This is not something I use very often, but it can be invaluable in situations like this.
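For readers who have not met the else clause on a for loop, here is a minimal sketch (written for Python 3; the function name first_even is made up for illustration and is not part of the answer). The else branch runs only when the loop finishes without hitting break:

```python
def first_even(numbers):
    """Return the first even number, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            found = n
            break  # leaving via break skips the else clause
    else:
        # Runs only when the loop ended without hitting break.
        found = None
    return found


print(first_even([1, 3, 4, 5]))  # 4
print(first_even([1, 3, 5]))     # None
```

In the answer's code the same mechanism means "every item of small matched", because the inner loop never hit break.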
This is the same problem as finding a substring in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.
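As a rough illustration of that idea, here is a sketch of the simpler Boyer-Moore-Horspool variant (not full Boyer-Moore, and not code from the answer). It is written for Python 3, the name horspool_find is hypothetical, and it assumes every item is hashable because the shift table is a dict, so it would not work for the nested-list examples in the question:

```python
def horspool_find(small, big):
    """Return (start, end + 1) of the first match of small in big, else False.

    Assumption: items are hashable, since the shift table is a dict.
    """
    m, n = len(small), len(big)
    if m == 0:
        return 0, 0
    # For each item of small except the last one: distance from its
    # last occurrence to the end of the pattern (always at least 1).
    shift = {item: m - 1 - k for k, item in enumerate(small[:-1])}
    i = 0
    while i <= n - m:
        # Compare the window right to left.
        j = m - 1
        while j >= 0 and big[i + j] == small[j]:
            j -= 1
        if j < 0:
            return i, i + m
        # Skip ahead based on the item under the pattern's last position.
        i += shift.get(big[i + m - 1], m)
    return False


print(horspool_find([1, 2], [-1, 0, 1, 2]))  # (2, 4)
print(horspool_find([1, 3], [-1, 0, 1, 2]))  # False
```

Whether this beats the plain nested loop depends on the data; the skips only pay off when the lists are long and mismatches are common.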
Answer by Thomas O
If all items are unique, you can use sets.
>>> items = set([-1, 0, 1, 2])
>>> set([1, 2]).issubset(items)
True
>>> set([1, 3]).issubset(items)
False
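One caveat worth seeing in code: a subset test only checks membership, not order or adjacency. A small sketch (Python 3 set literals) using the cases from the question's edit, which are supposed to return False:

```python
items = {-1, 0, 1, 2}

# Both of these should be False for a contiguous-sublist test,
# but a subset test ignores order and gaps.
print({2, 1}.issubset(items))      # True
print({-1, 1, 2}.issubset(items))  # True
```

So the set approach answers "are all these items present?" rather than "does this sublist appear as a run?".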
Answer by eumiro
After OP's edit:
def contains(small, big):
    for i in xrange(1 + len(big) - len(small)):
        if small == big[i:i+len(small)]:
            return i, i + len(small) - 1
    return False
Answer by martineau
This works and is fairly fast, since it does the linear searching using the built-in list.index() method and == operator:
def contains(sub, pri):
    M, N = len(pri), len(sub)
    i, LAST = 0, M-N+1
    while True:
        try:
            found = pri.index(sub[0], i, LAST) # find first elem in sub
        except ValueError:
            return False
        if pri[found:found+N] == sub:
            return [found, found+N-1]
        else:
            i = found+1
Answer by intuited
I tried to make this as efficient as possible.
It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.
Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub.
Then it just compares successive values of sequence with the generator yields, resetting the generator if they don't match.
When the generator runs out of values, i.e. reaches the end of sub without being reset, that means that we've found our match.
Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find, except that it returns False instead of -1.
As a further note: I think that the second value of the returned tuple should, in keeping with Python standards, normally be one higher, i.e. "string"[0:2] == "st". But the spec says otherwise, so that's how this works.
It depends on whether this is meant to be a general-purpose routine or is implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.
def reiterator(sub):
    """Yield elements of a sequence, resetting if sent ``True``."""
    it = iter(sub)
    while True:
        if (yield it.next()):
            it = iter(sub)

def find_in_sequence(sub, sequence):
    """Find a subsequence in a sequence.

    >>> find_in_sequence([2, 1], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
    (1, 3)
    >>> find_in_sequence("subsequence",
    ...                  "This sequence contains a subsequence.")
    (25, 35)
    >>> find_in_sequence("subsequence", "This one doesn't.")
    False
    """
    start = None
    sub_items = reiterator(sub)
    sub_item = sub_items.next()
    for index, item in enumerate(sequence):
        if item == sub_item:
            if start is None: start = index
        else:
            # NOTE: after a mismatch the current item is not re-compared with
            # sub[0], so a match overlapping a failed partial match is missed,
            # e.g. find_in_sequence([1, 2], [1, 1, 2]) returns False.
            start = None
        try:
            sub_item = sub_items.send(start is None)
        except StopIteration:
            # If the subsequence is depleted, we win!
            return (start, index)
    return False
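The "general-purpose routine plus a wrapper" idea mentioned in the prose could look roughly like the sketch below. It is written for Python 3, it does not use the generator technique, and the names find_span and contains are chosen here for illustration. It assumes sub and sequence are the same kind of sequence, so that a slice compares equal to sub:

```python
def find_span(sub, sequence):
    """General-purpose routine: return a half-open (start, stop) span, or None."""
    n = len(sub)
    for start in range(len(sequence) - n + 1):
        if sequence[start:start + n] == sub:
            return start, start + n
    return None


def contains(sub, sequence):
    """Spec-shaped wrapper: [start, end] with an inclusive end, or False."""
    span = find_span(sub, sequence)
    if span is None:
        return False
    start, stop = span
    return [start, stop - 1]


print(contains([1, 2], [-1, 0, 1, 2]))      # [2, 3]
print(contains([[1, 2]], [[1, 2], 3]))      # [0, 0]
print(contains([1, 3], [-1, 0, 1, 2]))      # False
```

Keeping the half-open span in the inner function means sequence[start:stop] gives back the match directly, while only the wrapper carries the question's inclusive-end convention.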
Answer by jfs
Here's a straightforward algorithm that uses list methods:
#!/usr/bin/env python

def list_find(what, where):
    """Find `what` list in the `where` list.

    Return index in `where` where `what` starts
    or -1 if no such index.

    >>> f = list_find
    >>> f([2, 1], [-1, 0, 1, 2])
    -1
    >>> f([-1, 1, 2], [-1, 0, 1, 2])
    -1
    >>> f([0, 1, 2], [-1, 0, 1, 2])
    1
    >>> f([1,2], [-1, 0, 1, 2])
    2
    >>> f([1,3], [-1, 0, 1, 2])
    -1
    >>> f([1, 2], [[1, 2], 3])
    -1
    >>> f([[1, 2]], [[1, 2], 3])
    0
    """
    if not what: # empty list is always found
        return 0
    try:
        index = 0
        while True:
            index = where.index(what[0], index)
            if where[index:index+len(what)] == what:
                return index # found
            index += 1 # try next position
    except ValueError:
        return -1 # not found

def contains(what, where):
    """Return [start, end+1] if found else empty list."""
    i = list_find(what, where)
    return [i, i + len(what)] if i >= 0 else [] #NOTE: bool([]) == False

if __name__=="__main__":
    import doctest; doctest.testmod()
Answer by ChessMaster
I think this one is fast...
def issublist(subList, myList, start=0):
    if not subList: return 0
    lenList, lensubList = len(myList), len(subList)
    try:
        while lenList - start >= lensubList:
            start = myList.index(subList[0], start)
            for i in xrange(lensubList):
                if myList[start+i] != subList[i]:
                    break
            else:
                return start, start + lensubList - 1
            start += 1
        return False
    except (ValueError, IndexError):
        # ValueError: index() found no further candidate.
        # IndexError: a candidate near the end ran past myList.
        return False
Answer by 9000
May I humbly suggest the Rabin-Karp algorithm if the big list is really big. The link even contains almost-usable code in almost-Python.
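The linked code is not reproduced on this page, so the following is a separate sketch of the Rabin-Karp idea applied to lists, written for Python 3. The function name rabin_karp_find and the base and mod parameters are choices made here for illustration; the sketch assumes every item is hashable, with hash(item) standing in for a character code:

```python
def rabin_karp_find(small, big, base=257, mod=(1 << 61) - 1):
    """Return the start index of the first match of small in big, or -1.

    Assumption: every item is hashable.
    """
    m, n = len(small), len(big)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the window's leading item
    target = window = 0
    for k in range(m):
        target = (target * base + hash(small[k])) % mod
        window = (window * base + hash(big[k])) % mod
    for i in range(n - m + 1):
        # Hashes can collide, so confirm with a real comparison.
        if window == target and big[i:i + m] == small:
            return i
        if i + m < n:
            # Roll the window: drop big[i], append big[i + m].
            window = ((window - hash(big[i]) * high) * base
                      + hash(big[i + m])) % mod
    return -1


print(rabin_karp_find([1, 2], [-1, 0, 1, 2]))  # 2
print(rabin_karp_find([1, 3], [-1, 0, 1, 2]))  # -1
```

The slice comparison only runs when the rolling hashes agree, which is where the expected speed-up over slicing at every position would come from.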
Answer by Oleksiy
If we refine the problem to testing whether a list contains another list as a contiguous sequence, the answer could be the following one-liner:
def contains(subseq, inseq):
    return any(inseq[pos:pos + len(subseq)] == subseq
               for pos in range(0, len(inseq) - len(subseq) + 1))
Here are the unit tests I used to tune up this one-liner:
Answer by Bart Mensfort
Smallest code:
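def contains(a, b):
    # a is the list searched in, b the list searched for. Comparing string forms
    # can give false positives, e.g. contains([11, 2], [1, 2]) is True.
    return str(a)[1:-1].find(str(b)[1:-1]) >= 0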

