Remove duplicates from a List<T> in C#
Disclaimer: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA license, link the original address, and attribute it to the original authors (not me): Stack Overflow
Original address: http://stackoverflow.com/questions/47752/
Asked by JC Grubbs
Does anyone have a quick method for de-duplicating a generic List in C#?
Accepted answer by Jason Baker
Perhaps you should consider using a HashSet.
From the MSDN link:
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */
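Applied to the actual question, the same HashSet idea collapses to a couple of lines; a minimal sketch (the sample values are illustrative, not from the original answer):

```csharp
using System;
using System.Collections.Generic;

class HashSetDedup
{
    static void Main()
    {
        var withDupes = new List<int> { 1, 2, 2, 3, 3, 3 };

        // HashSet<T>.Add silently ignores values the set already contains,
        // so building a set from the list drops the duplicates.
        var unique = new List<int>(new HashSet<int>(withDupes));

        Console.WriteLine(unique.Count); // 3
    }
}
```

Bear in mind that HashSet<T> makes no ordering guarantee; the order-preserving variants further down cover that case.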
Answered by Lasse V. Karlsen
Sort it, then compare each element with its neighbour, since the duplicates will clump together.
Something like this:
list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
    if (list[index] == list[index - 1])
    {
        if (index < list.Count - 1)
            (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
        list.RemoveAt(list.Count - 1);
        index--;
    }
    else
        index--;
}
Notes:
- Comparison is done from back to front, to avoid having to re-sort the list after each removal
- This example now uses C# value tuples to do the swapping; substitute appropriate code if you can't use that
- The end result is no longer sorted
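Traced on a small input (the sample values are illustrative), the snippet above removes every duplicate but, per the notes, leaves the survivors unsorted:

```csharp
using System;
using System.Collections.Generic;

class SortDedupDemo
{
    static void Main()
    {
        var list = new List<int> { 5, 1, 3, 5, 3, 1, 1 };

        list.Sort(); // duplicates are now adjacent: 1 1 1 3 3 5 5
        Int32 index = list.Count - 1;
        while (index > 0)
        {
            if (list[index] == list[index - 1])
            {
                // Swap the duplicate to the end (O(1)) and chop it off,
                // rather than a costly RemoveAt in the middle of the list.
                if (index < list.Count - 1)
                    (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
                list.RemoveAt(list.Count - 1);
            }
            index--;
        }

        // The three distinct values 1, 3 and 5 survive; order not guaranteed.
        Console.WriteLine(list.Count); // 3
    }
}
```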
Answered by Tom Hawtin - tackline
In Java (I assume C# is more or less identical):
list = new ArrayList<T>(new HashSet<T>(list));
If you really wanted to mutate the original list:
List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);
To preserve order, simply replace HashSet with LinkedHashSet.
Answered by Motti
If you don't care about the order, you can just shove the items into a HashSet; if you do want to maintain the order, you can do something like this:
var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t))
        unique.Add(t);
Or the Linq way:
var hs = new HashSet<T>();
var unique = list.Where(x => hs.Add(x)).ToList();
// Note: list.All(x => hs.Add(x)) would not work here, because All()
// short-circuits and stops at the first duplicate (the first false).
Edit: The HashSet method is O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) is O(N*lgN) time and O(1) space, so it's not so clear to me (as it was at first glance) that the sorting way is inferior (my apologies for the temporary down vote...)
Answered by ljs
How about:
var noDupes = list.Distinct().ToList();
In .Net 3.5?
Answered by Factor Mystic
If you're using .Net 3+, you can use Linq.
List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();
Answered by Keith
As kronoz said, in .Net 3.5 you can use Distinct().
In .Net 2 you could mimic it:
public IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
{
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check alg used as example
    foreach (T item in input)
        if (passedValues.Add(item)) // True if item is new
            yield return item;
}
This could be used to dedupe any collection and will return the values in the original order.
It's normally much quicker to filter a collection (as both Distinct() and this sample do) than it would be to remove items from it.
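As a usage sketch (the input values are illustrative), the iterator streams each value only the first time it appears:

```csharp
using System;
using System.Collections.Generic;

class DedupCollectionDemo
{
    // Same helper as above: yields an item only the first time it is seen.
    static IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
    {
        var passedValues = new HashSet<T>();
        foreach (T item in input)
            if (passedValues.Add(item)) // true only for values not seen yet
                yield return item;
    }

    static void Main()
    {
        var input = new List<string> { "b", "a", "b", "c", "a" };

        // Materialise the lazy sequence into a list; order is preserved.
        var deduped = new List<string>(DedupCollection(input));

        Console.WriteLine(string.Join(",", deduped)); // b,a,c
    }
}
```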
Answered by Even Mien
Simply initialize a HashSet with a List of the same type:
var noDupes = new HashSet<T>(withDupes);
Or, if you want a List returned:
var noDupsList = new HashSet<T>(withDupes).ToList();
Answered by Geoff Taylor
An extension method might be a decent way to go... something like this:
public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
    return listToDeduplicate.Distinct().ToList();
}
And then call it like this, for example:
List<int> myFilteredList = unfilteredList.Deduplicate();
Answered by Bhasin
Another way in .Net 2.0:
static void Main(string[] args)
{
    List<string> alpha = new List<string>();
    for (char a = 'a'; a <= 'd'; a++)
    {
        alpha.Add(a.ToString());
        alpha.Add(a.ToString());
    }

    Console.WriteLine("Data :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

    // Caution: this mutates the list while ForEach is iterating it,
    // which throws InvalidOperationException on .Net 4.5 and later.
    alpha.ForEach(delegate(string v)
    {
        if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
            alpha.Remove(v);
    });

    Console.WriteLine("Unique Result :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });
    Console.ReadKey();
}