从 C# 中的 List<T> 中删除重复项
声明：本页面是 StackOverFlow 热门问题的中英对照翻译，遵循 CC BY-SA 4.0 协议。如果您需要使用它，必须同样遵循 CC BY-SA 许可，注明原文地址和作者信息，同时你必须将它归于原作者（不是我）：StackOverFlow
原文地址: http://stackoverflow.com/questions/47752/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me):
StackOverFlow
Remove duplicates from a List<T> in C#
提问by JC Grubbs
Anyone have a quick method for de-duplicating a generic List in C#?
有没有什么快速方法可以对 C# 中的泛型 List 去重？
采纳答案by Jason Baker
Perhaps you should consider using a HashSet.
也许您应该考虑使用HashSet。
From the MSDN link:
从 MSDN 链接:
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */
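The MSDN sample demonstrates set operations rather than the question's scenario. Applied directly to a List<T>, a minimal sketch (names are illustrative; it relies on the element type's Equals/GetHashCode) could look like:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DedupeExample
{
    static void Main()
    {
        var withDupes = new List<int> { 3, 1, 2, 3, 2, 1 };

        // The HashSet<int> constructor discards duplicates as it copies,
        // in O(n) average time. Iteration order is not guaranteed.
        var noDupes = new HashSet<int>(withDupes).ToList();

        Console.WriteLine(noDupes.Count); // 3 distinct values remain
    }
}
```

Note that a HashSet<T> makes no ordering promise; if the original order matters, see the order-preserving answers below.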
回答by Lasse V. Karlsen
Sort it, then scan adjacent pairs, as the duplicates will clump together.
对列表排序，然后逐对检查相邻元素，因为重复项会聚在一起。
Something like this:
像这样的东西:
list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
    if (list[index] == list[index - 1])
    {
        if (index < list.Count - 1)
            (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
        list.RemoveAt(list.Count - 1);
        index--;
    }
    else
        index--;
}
Notes:
笔记:
- Comparison is done from back to front, to avoid having to re-sort the list after each removal
- This example now uses C# value tuples to do the swapping; substitute with appropriate code if you can't use that
- The end result is no longer sorted
- 比较是从后往前进行的，以免每次删除后都要重新排序列表
- 此示例现在使用 C# 值元组进行交换；如果不能使用值元组，请用适当的代码替换
- 最终结果不再排序
回答by Tom Hawtin - tackline
In Java (I assume C# is more or less identical):
在 Java 中(我假设 C# 或多或少相同):
list = new ArrayList<T>(new HashSet<T>(list));
If you really wanted to mutate the original list:
如果你真的想改变原始列表:
List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);
To preserve order, simply replace HashSet with LinkedHashSet.
要保持顺序,只需将 HashSet 替换为 LinkedHashSet。
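C# has no LinkedHashSet, but the same order-preserving pass can be sketched with a HashSet<T> used as a "seen" filter (this mirrors the loop in other answers here; names are illustrative):

```csharp
using System;
using System.Collections.Generic;

class OrderPreservingDedup
{
    static void Main()
    {
        var list = new List<string> { "b", "a", "b", "c", "a" };

        // Keep the first occurrence of each value, in encounter order,
        // like Java's LinkedHashSet would.
        var seen = new HashSet<string>();
        var noDupes = new List<string>();
        foreach (var s in list)
            if (seen.Add(s)) // Add returns false for a duplicate
                noDupes.Add(s);

        Console.WriteLine(string.Join(" ", noDupes)); // b a c
    }
}
```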
回答by Motti
If you don't care about the order you can just shove the items into a HashSet; if you do want to maintain the order you can do something like this:
如果你不在乎顺序，可以直接把元素放进一个 HashSet；如果你确实想保持顺序，可以这样做：
var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t))
        unique.Add(t);
Or the Linq way:
或 Linq 方式:
var hs = new HashSet<T>();
list.All(x => hs.Add(x)); // All() is used here only to drive the iteration; hs ends up holding the distinct items
Edit: The HashSet method is O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) is O(N*lgN) time and O(1) space, so it's not so clear to me (as it was at first glance) that the sorting way is inferior (my apologies for the temporary down vote...)
编辑：HashSet 方法是 O(N) 时间和 O(N) 空间，而先排序再去重（如 @lassevk 等人所建议的）是 O(N*lgN) 时间和 O(1) 空间，所以我现在不像乍看之下那样确定排序方式更差（为我暂时的反对票道歉……）
回答by ljs
How about:
怎么样:
var noDupes = list.Distinct().ToList();
In .net 3.5?
在 .net 3.5 中?
回答by Factor Mystic
If you're using .Net 3+, you can use Linq.
如果您使用 .Net 3+,则可以使用 Linq。
List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();
回答by Keith
As kronoz said, in .Net 3.5 you can use Distinct().
正如 kronoz 所说，在 .Net 3.5 中您可以使用 Distinct()。
In .Net 2 you could mimic it:
在 .Net 2 中,你可以模仿它:
public IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
{
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check alg used as example
    foreach (T item in input)
        if (passedValues.Add(item)) // True if item is new
            yield return item;
}
This could be used to dedupe any collection and will return the values in the original order.
这可用于对任何集合进行重复数据删除,并将按原始顺序返回值。
It's normally much quicker to filter a collection (as both Distinct() and this sample do) than it would be to remove items from it.
通常，过滤集合（Distinct() 和本示例都是这样做的）比从集合中删除项目要快得多。
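A possible call site for a helper of this shape, declared as an extension method in a static class (the class and variable names here are illustrative, not part of the original answer):

```csharp
using System;
using System.Collections.Generic;

static class Extensions
{
    // Same technique as the answer's helper: streaming dedupe in original order.
    public static IEnumerable<T> DedupCollection<T>(this IEnumerable<T> input)
    {
        var passedValues = new HashSet<T>();
        foreach (T item in input)
            if (passedValues.Add(item))
                yield return item;
    }
}

class Demo
{
    static void Main()
    {
        var data = new List<int> { 1, 1, 2, 3, 2 };
        Console.WriteLine(string.Join(" ", data.DedupCollection())); // 1 2 3
    }
}
```

Because of yield return, nothing is filtered until the sequence is actually enumerated.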
回答by Even Mien
Simply initialize a HashSet with a List of the same type:
只需使用相同类型的 List 初始化 HashSet :
var noDupes = new HashSet<T>(withDupes);
Or, if you want a List returned:
或者,如果您希望返回一个 List:
var noDupsList = new HashSet<T>(withDupes).ToList();
回答by Geoff Taylor
An extension method might be a decent way to go... something like this:
扩展方法可能是一个不错的方法......像这样:
public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
    return listToDeduplicate.Distinct().ToList();
}
And then call like this, for example:
然后像这样调用,例如:
List<int> myFilteredList = unfilteredList.Deduplicate();
回答by Bhasin
Another way in .Net 2.0
.Net 2.0 中的另一种方式
static void Main(string[] args)
{
    List<string> alpha = new List<string>();
    for (char a = 'a'; a <= 'd'; a++)
    {
        alpha.Add(a.ToString());
        alpha.Add(a.ToString());
    }

    Console.WriteLine("Data :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

    // Note: this pass is O(n^2), and mutating the list from inside ForEach
    // throws on newer runtimes; it is shown as written for .NET 2.0.
    alpha.ForEach(delegate(string v)
    {
        if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
            alpha.Remove(v);
    });

    Console.WriteLine("Unique Result :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });
    Console.ReadKey();
}