Notice: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/19416786/

Time: 2020-08-19 13:42:16  Source: igfitidea

Remove duplicate tuples from a list if they are exactly the same including order of items

python list duplicates tuples itertools

Asked by 5813

I know questions similar to this have been asked many times on Stack Overflow, but I need to remove duplicate tuples from a list only when they are exactly the same, meaning their elements match and appear in the same order. In other words, (4,3,5) and (3,4,5) would both be present in the output, while if the list contained (3,3,5) twice, only one copy would be in the output.

Specifically, my code is:

import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = []

for x in itertools.combinations(x,3):
    y.append(x)
print(y)

The output is quite lengthy. For example, it should contain both (1,2,1) and (1,1,2), but only one (1,2,2).

Accepted answer by 5813

set will take care of that:

>>> a = [(1,2,2), (2,2,1), (1,2,2), (4,3,5), (3,3,5), (3,3,5), (3,4,5)]
>>> set(a)
set([(1, 2, 2), (2, 2, 1), (3, 4, 5), (3, 3, 5), (4, 3, 5)])
>>> list(set(a))
[(1, 2, 2), (2, 2, 1), (3, 4, 5), (3, 3, 5), (4, 3, 5)]
>>>

set will remove only exact duplicates.
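One thing set does not keep is the original order of the list. If that order matters, a minimal sketch (assuming Python 3.7 or later, where dict preserves insertion order) is to deduplicate through dict.fromkeys instead. The name unique_in_order is only illustrative.

```python
a = [(1, 2, 2), (2, 2, 1), (1, 2, 2), (4, 3, 5), (3, 3, 5), (3, 3, 5), (3, 4, 5)]

# dict keys are unique and keep first-seen order, so only exact
# duplicate tuples are dropped and the remaining ones stay in place.
unique_in_order = list(dict.fromkeys(a))

print(unique_in_order)
# [(1, 2, 2), (2, 2, 1), (4, 3, 5), (3, 3, 5), (3, 4, 5)]
```

Tuples compare element by element, so (4, 3, 5) and (3, 4, 5) are different keys and both survive.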

Answer by TankorSmash

Using a set should probably work. A set is basically a container that doesn't contain any duplicated elements.

Python also includes a data type for sets. A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = set()

# Use a separate loop variable so the input list x is not rebound.
for combo in itertools.combinations(x,3):
    y.add(combo)
print(y)
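The set operations named in the quoted documentation can be sketched on tuples like this. The sets seen and other are made-up sample data, and sorted() is used only to give the printed results a stable order.

```python
seen = {(1, 2, 2), (2, 2, 1)}
other = {(2, 2, 1), (3, 4, 5)}

# Membership testing: tuples match only if the items AND their order match.
print((1, 2, 2) in seen)      # True
print((2, 1, 2) in seen)      # False

# Union, intersection, difference, symmetric difference.
print(sorted(seen | other))   # [(1, 2, 2), (2, 2, 1), (3, 4, 5)]
print(sorted(seen & other))   # [(2, 2, 1)]
print(sorted(seen - other))   # [(1, 2, 2)]
print(sorted(seen ^ other))   # [(1, 2, 2), (3, 4, 5)]
```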

Answer by xiaowl

No need to write a for loop: combinations returns an iterator, which set() can consume directly.

import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = list(set(itertools.combinations(x,3)))
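Because a set has no defined order, the list built this way can come out in any order. If a reproducible order is wanted, one possible variation (not part of the original answer) is to use sorted() instead of list():

```python
import itertools

x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5]

# sorted() accepts any iterable, so the set can be passed in directly;
# tuples are ordered lexicographically.
y = sorted(set(itertools.combinations(x, 3)))

print(len(y))   # 29
print(y[:3])    # [(1, 1, 1), (1, 1, 2), (1, 1, 3)]
```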

Answer by sashkello

What you need is unique permutations rather than combinations:

y = list(set(itertools.permutations(x,3)))

That is, (1,2,2) and (2,1,2) are considered the same combination, and only one of them will be returned. They are, however, different permutations. Use set() to remove duplicates.

If afterwards you want to sort elements within each tuple and also have the whole list sorted, you can do:

y = [tuple(sorted(q)) for q in y]
y.sort()
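A small sketch of what those two lines do, using a hand-picked list rather than the full output. One thing to keep in mind: sorting inside each tuple makes tuples that differed only in order equal again, so they show up as repeats unless the list is deduplicated once more.

```python
y = [(2, 1, 2), (1, 2, 2), (3, 1, 2)]

# Sort the items inside each tuple, then sort the list of tuples.
normalized = [tuple(sorted(q)) for q in y]
normalized.sort()

print(normalized)  # [(1, 2, 2), (1, 2, 2), (1, 2, 3)]
```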

Answer by Tim Peters

This will probably do what you want, but it's vast overkill. It's a low-level prototype for a generator that may be added to itertools some day. It's low level to ease re-implementing it in C. Where N is the length of the iterable input, it requires worst-case space O(N) and does at most N*(N-1)//2 element comparisons, regardless of how many anagrams are generated. Both of those are optimal ;-)

You'd use it like so:

>>> x = [1,1,1,2,2,2,3,3,3,4,4,5]
>>> for t in anagrams(x, 3):
...     print(t)
(1, 1, 1)
(1, 1, 2)
(1, 1, 3)
(1, 1, 4)
(1, 1, 5)
(1, 2, 1)
...

There will be no duplicates in the output. Note: this is Python 3 code. It needs a few changes to run under Python 2.

import operator

class ENode:
    def __init__(self, initial_index=None):
        self.indices = [initial_index]
        self.current = 0
        self.prev = self.next = self

    def index(self):
        "Return current index."
        return self.indices[self.current]

    def unlink(self):
        "Remove self from list."
        self.prev.next = self.next
        self.next.prev = self.prev

    def insert_after(self, x):
        "Insert node x after self."
        x.prev = self
        x.next = self.next
        self.next.prev = x
        self.next = x

    def advance(self):
        """Advance the current index.

        If we're already at the end, remove self from list.

        .restore() undoes everything .advance() did."""

        assert self.current < len(self.indices)
        self.current += 1
        if self.current == len(self.indices):
            self.unlink()

    def restore(self):
        "Undo what .advance() did."
        assert self.current <= len(self.indices)
        if self.current == len(self.indices):
            self.prev.insert_after(self)
        self.current -= 1

def build_equivalence_classes(items, equal):
    ehead = ENode()
    for i, elt in enumerate(items):
        e = ehead.next
        while e is not ehead:
            if equal(elt, items[e.indices[0]]):
                # Add (index of) elt to this equivalence class.
                e.indices.append(i)
                break
            e = e.next
        else:
            # elt not equal to anything seen so far:  append
            # new equivalence class.
            e = ENode(i)
            ehead.prev.insert_after(e)
    return ehead

def anagrams(iterable, count=None, equal=operator.__eq__):
    def perm(i):
        if i:
            e = ehead.next
            assert e is not ehead
            while e is not ehead:
                result[count - i] = e.index()
                e.advance()
                yield from perm(i-1)
                e.restore()
                e = e.next
        else:
            yield tuple(items[j] for j in result)

    items = tuple(iterable)
    if count is None:
        count = len(items)
    if count > len(items):
        return

    ehead = build_equivalence_classes(items, equal)
    result = [None] * count
    yield from perm(count)
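For comparison, here is a much shorter sketch of the same idea. It is not the prototype above: it swaps the linked list of equivalence classes for a collections.Counter, so it assumes the items are hashable and does not accept a custom equal function. The name unique_permutations is made up for this example.

```python
import itertools
from collections import Counter

def unique_permutations(iterable, count=None):
    """Yield each distinct ordering of `count` items exactly once."""
    remaining = Counter(iterable)      # how many copies of each value are left
    total = sum(remaining.values())
    if count is None:
        count = total
    if count > total:
        return
    result = []

    def backtrack():
        if len(result) == count:
            yield tuple(result)
            return
        # Try each distinct value once per position, never each copy.
        for item in list(remaining):
            if remaining[item] == 0:
                continue
            remaining[item] -= 1
            result.append(item)
            yield from backtrack()
            result.pop()
            remaining[item] += 1

    yield from backtrack()

x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5]
perms = list(unique_permutations(x, 3))
print(perms[:6])
# [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4), (1, 1, 5), (1, 2, 1)]
print(set(perms) == set(itertools.permutations(x, 3)))  # True
```

Like the prototype, this never builds the full list of permutations and filters it afterwards, so duplicates are never generated in the first place.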

Answer by Shashank

You were really close. Just get permutations, not combinations. Order matters in permutations, and it does not in combinations. Thus (1, 2, 2) is a distinct permutation from (2, 2, 1). However, (1, 2, 2) is considered a single combination of one 1 and two 2s, so (2, 2, 1) is not a distinct combination from (1, 2, 2).

You can convert your list y to a set so that you remove duplicates...

import itertools

x = [1,1,1,2,2,2,3,3,3,4,4,5]
y = []

# Use a separate loop variable so the input list x is not rebound.
for perm in itertools.permutations(x,3):
    y.append(perm)
print(set(y))

And voila, you are done. :)
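To make the difference concrete, the following sketch compares the two on the list from the question. The variable names combos and perms are only for illustration.

```python
import itertools

x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5]

combos = set(itertools.combinations(x, 3))
perms = set(itertools.permutations(x, 3))

print(len(combos), len(perms))   # 29 111
# combinations keeps items in input order, and x is sorted,
# so an out-of-order tuple such as (1, 2, 1) never appears there.
print((1, 2, 1) in combos)       # False
print((1, 2, 1) in perms)        # True
# Every distinct combination is also one of the distinct permutations.
print(combos <= perms)           # True
```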
