Remove duplicates from a List<T> in C#
Disclaimer: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. If you reuse it, you must follow the same CC BY-SA license, link the original address, and attribute it to the original authors (not me): Stack Overflow
Original address: http://stackoverflow.com/questions/47752/
Asked by JC Grubbs
Does anyone have a quick method for de-duplicating a generic List in C#?
Accepted answer by Jason Baker
Perhaps you should consider using a HashSet.
From the MSDN link:
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        HashSet<int> evenNumbers = new HashSet<int>();
        HashSet<int> oddNumbers = new HashSet<int>();

        for (int i = 0; i < 5; i++)
        {
            // Populate numbers with just even numbers.
            evenNumbers.Add(i * 2);

            // Populate oddNumbers with just odd numbers.
            oddNumbers.Add((i * 2) + 1);
        }

        Console.Write("evenNumbers contains {0} elements: ", evenNumbers.Count);
        DisplaySet(evenNumbers);

        Console.Write("oddNumbers contains {0} elements: ", oddNumbers.Count);
        DisplaySet(oddNumbers);

        // Create a new HashSet populated with even numbers.
        HashSet<int> numbers = new HashSet<int>(evenNumbers);
        Console.WriteLine("numbers UnionWith oddNumbers...");
        numbers.UnionWith(oddNumbers);

        Console.Write("numbers contains {0} elements: ", numbers.Count);
        DisplaySet(numbers);
    }

    private static void DisplaySet(HashSet<int> set)
    {
        Console.Write("{");
        foreach (int i in set)
        {
            Console.Write(" {0}", i);
        }
        Console.WriteLine(" }");
    }
}

/* This example produces output similar to the following:
 * evenNumbers contains 5 elements: { 0 2 4 6 8 }
 * oddNumbers contains 5 elements: { 1 3 5 7 9 }
 * numbers UnionWith oddNumbers...
 * numbers contains 10 elements: { 0 2 4 6 8 1 3 5 7 9 }
 */
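Applied to the actual question, the same HashSet idea collapses to a couple of lines; a minimal sketch (the sample values are illustrative, not from the original answer):

```csharp
using System;
using System.Collections.Generic;

class HashSetDedup
{
    static void Main()
    {
        var withDupes = new List<int> { 1, 2, 2, 3, 3, 3 };

        // HashSet<T>.Add silently ignores values the set already contains,
        // so building a set from the list drops the duplicates.
        var unique = new List<int>(new HashSet<int>(withDupes));

        Console.WriteLine(unique.Count); // 3
    }
}
```

Bear in mind that HashSet<T> makes no ordering guarantee; the order-preserving variants further down cover that case.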
Answered by Lasse V. Karlsen
Sort it, then compare each element with its neighbour, since the duplicates will clump together.
Something like this:
list.Sort();
Int32 index = list.Count - 1;
while (index > 0)
{
    if (list[index] == list[index - 1])
    {
        if (index < list.Count - 1)
            (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
        list.RemoveAt(list.Count - 1);
        index--;
    }
    else
        index--;
}
Notes:
- Comparison is done from back to front, to avoid having to re-sort the list after each removal
- This example now uses C# value tuples to do the swapping; substitute appropriate code if you can't use that
- The end result is no longer sorted
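Traced on a small input (the sample values are illustrative), the snippet above removes every duplicate but, per the notes, leaves the survivors unsorted:

```csharp
using System;
using System.Collections.Generic;

class SortDedupDemo
{
    static void Main()
    {
        var list = new List<int> { 5, 1, 3, 5, 3, 1, 1 };

        list.Sort(); // duplicates are now adjacent: 1 1 1 3 3 5 5
        Int32 index = list.Count - 1;
        while (index > 0)
        {
            if (list[index] == list[index - 1])
            {
                // Swap the duplicate to the end (O(1)) and chop it off,
                // rather than a costly RemoveAt in the middle of the list.
                if (index < list.Count - 1)
                    (list[index], list[list.Count - 1]) = (list[list.Count - 1], list[index]);
                list.RemoveAt(list.Count - 1);
            }
            index--;
        }

        // The three distinct values 1, 3 and 5 survive; order not guaranteed.
        Console.WriteLine(list.Count); // 3
    }
}
```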
Answered by Tom Hawtin - tackline
In Java (I assume C# is more or less identical):
list = new ArrayList<T>(new HashSet<T>(list));
If you really wanted to mutate the original list:
List<T> noDupes = new ArrayList<T>(new HashSet<T>(list));
list.clear();
list.addAll(noDupes);
To preserve order, simply replace HashSet with LinkedHashSet.
Answered by Motti
If you don't care about the order, you can just shove the items into a HashSet; if you do want to maintain the order, you can do something like this:
var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t))
        unique.Add(t);
Or the Linq way:
var hs = new HashSet<T>();
var unique = list.Where(x => hs.Add(x)).ToList();
// Note: list.All(x => hs.Add(x)) would not work here, because All()
// short-circuits and stops at the first duplicate (the first false).
Edit: The HashSet method is O(N) time and O(N) space, while sorting and then making unique (as suggested by @lassevk and others) is O(N*lgN) time and O(1) space, so it's not so clear to me (as it was at first glance) that the sorting way is inferior (my apologies for the temporary down vote...)
Answered by ljs
How about:
var noDupes = list.Distinct().ToList();
In .Net 3.5?
Answered by Factor Mystic
If you're using .Net 3+, you can use Linq.
List<T> withDupes = LoadSomeData();
List<T> noDupes = withDupes.Distinct().ToList();
Answered by Keith
As kronoz said, in .Net 3.5 you can use Distinct().
In .Net 2 you could mimic it:
public IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
{
    var passedValues = new HashSet<T>();

    // Relatively simple dupe check alg used as example
    foreach (T item in input)
        if (passedValues.Add(item)) // True if item is new
            yield return item;
}
This could be used to dedupe any collection and will return the values in the original order.
It's normally much quicker to filter a collection (as both Distinct() and this sample do) than it would be to remove items from it.
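As a usage sketch (the input values are illustrative), the iterator streams each value only the first time it appears:

```csharp
using System;
using System.Collections.Generic;

class DedupCollectionDemo
{
    // Same helper as above: yields an item only the first time it is seen.
    static IEnumerable<T> DedupCollection<T>(IEnumerable<T> input)
    {
        var passedValues = new HashSet<T>();
        foreach (T item in input)
            if (passedValues.Add(item)) // true only for values not seen yet
                yield return item;
    }

    static void Main()
    {
        var input = new List<string> { "b", "a", "b", "c", "a" };

        // Materialise the lazy sequence into a list; order is preserved.
        var deduped = new List<string>(DedupCollection(input));

        Console.WriteLine(string.Join(",", deduped)); // b,a,c
    }
}
```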
Answered by Even Mien
Simply initialize a HashSet with a List of the same type:
var noDupes = new HashSet<T>(withDupes);
Or, if you want a List returned:
var noDupsList = new HashSet<T>(withDupes).ToList();
Answered by Geoff Taylor
An extension method might be a decent way to go... something like this:
public static List<T> Deduplicate<T>(this List<T> listToDeduplicate)
{
    return listToDeduplicate.Distinct().ToList();
}
And then call it like this, for example:
List<int> myFilteredList = unfilteredList.Deduplicate();
Answered by Bhasin
Another way in .Net 2.0:
static void Main(string[] args)
{
    List<string> alpha = new List<string>();
    for (char a = 'a'; a <= 'd'; a++)
    {
        alpha.Add(a.ToString());
        alpha.Add(a.ToString());
    }

    Console.WriteLine("Data :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });

    // Caution: this mutates the list while ForEach is iterating it,
    // which throws InvalidOperationException on .Net 4.5 and later.
    alpha.ForEach(delegate(string v)
    {
        if (alpha.FindAll(delegate(string t) { return t == v; }).Count > 1)
            alpha.Remove(v);
    });

    Console.WriteLine("Unique Result :");
    alpha.ForEach(delegate(string t) { Console.WriteLine(t); });
    Console.ReadKey();
}