C#: Find object data duplicates in List of objects
Source: http://stackoverflow.com/questions/619612/
This page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Question by Chris Conway
Using C# 3 and .NET Framework 3.5, I have a Person class:
public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}
and I've got a List of them:
List<Person> persons = GetPersons();
How can I get all the Person objects in persons whose SSN is not unique in the list, remove them from the persons list, and ideally add them to another list called List<Person> dupes?
The original list might look something like this:
persons = new List<Person>();
persons.Add(new Person { Id = 1,
                         FirstName = "Chris",
                         LastName = "Columbus",
                         SSN = 111223333 }); // Is a dupe
persons.Add(new Person { Id = 1,
                         FirstName = "E.E.",
                         LastName = "Cummings",
                         SSN = 987654321 });
persons.Add(new Person { Id = 1,
                         FirstName = "John",
                         LastName = "Steinbeck",
                         SSN = 111223333 }); // Is a dupe
persons.Add(new Person { Id = 1,
                         FirstName = "Yogi",
                         LastName = "Berra",
                         SSN = 123456789 });
And the end result would have Cummings and Berra in the original persons list and would have Columbus and Steinbeck in a list called dupes.
Many thanks!
Accepted answer by gcores
This gets you the duplicated SSNs:
var duplicatedSSN =
    from p in persons
    group p by p.SSN into g
    where g.Count() > 1
    select g.Key;
The duplicated list would be like:
var duplicated = persons.FindAll( p => duplicatedSSN.Contains(p.SSN) );
And then just iterate over the duplicates and remove them.
duplicated.ForEach( dup => persons.Remove(dup) );
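The three steps above can be folded into one helper. What follows is a minimal sketch rather than the answerer's code: the names DuplicateExtractor and ExtractDuplicates are hypothetical, and the sketch assumes it is acceptable to copy the duplicated SSNs into a HashSet<int> so the deferred LINQ query runs once instead of once per Contains call.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper; the names are illustrative and not from the answer.
public static class DuplicateExtractor
{
    // Removes every person whose SSN occurs more than once from `persons`
    // and returns the removed persons in their original order.
    public static List<Person> ExtractDuplicates(List<Person> persons)
    {
        // Materialize the duplicated SSNs once; the query is deferred otherwise.
        var duplicatedSsns = new HashSet<int>(
            from p in persons
            group p by p.SSN into g
            where g.Count() > 1
            select g.Key);

        // Collect the duplicates first, then drop them from the source list.
        List<Person> dupes = persons.FindAll(p => duplicatedSsns.Contains(p.SSN));
        persons.RemoveAll(p => duplicatedSsns.Contains(p.SSN));
        return dupes;
    }
}
```

Calling DuplicateExtractor.ExtractDuplicates(persons) on the sample list from the question should leave Cummings and Berra in persons and return Columbus and Steinbeck.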
Answer by Steven Evers
Well, if you implement IComparable<Person> like so:
int IComparable<Person>.CompareTo(Person person)
{
    return this.SSN.CompareTo(person.SSN);
}
then a comparison like the following will work:
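for (int i = 0; i < persons.Count; i++)
{
    // Start at i + 1 so each pair is compared once and never with itself.
    for (int j = i + 1; j < persons.Count; j++)
    {
        // == would compare references; CompareTo is implemented explicitly,
        // so it has to be called through the interface.
        if (((IComparable<Person>)persons[i]).CompareTo(persons[j]) == 0)
        {
            // persons[i] and persons[j] have the same SSN: duplicate
        }
    }
}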
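The pairwise comparison is quadratic in the list size. If Person is comparable by SSN, one alternative is to sort a copy and compare neighbours only. This is a rough sketch under several assumptions: SortedDuplicateScan and FindDuplicates are hypothetical names, CompareTo is implemented as a public method here (not explicitly) to keep the calls short, and the list contains no null entries.

```csharp
using System;
using System.Collections.Generic;

public class Person : IComparable<Person>
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }

    // Orders persons by SSN only; assumes `other` is not null.
    public int CompareTo(Person other)
    {
        return SSN.CompareTo(other.SSN);
    }
}

// Hypothetical helper; the names are illustrative.
public static class SortedDuplicateScan
{
    // Returns the persons whose SSN appears more than once.
    // The input list is not modified.
    public static List<Person> FindDuplicates(List<Person> persons)
    {
        var sorted = new List<Person>(persons); // copy so the caller's order is kept
        sorted.Sort();                          // uses IComparable<Person>.CompareTo

        var dupes = new List<Person>();
        for (int i = 0; i < sorted.Count; i++)
        {
            // After sorting, equal SSNs sit next to each other.
            bool sameAsPrevious = i > 0 && sorted[i].CompareTo(sorted[i - 1]) == 0;
            bool sameAsNext = i + 1 < sorted.Count && sorted[i].CompareTo(sorted[i + 1]) == 0;
            if (sameAsPrevious || sameAsNext)
            {
                dupes.Add(sorted[i]);
            }
        }
        return dupes;
    }
}
```

List<T>.Sort is not a stable sort, so the relative order of persons sharing an SSN in the result is not guaranteed.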
Answer by Mike Marshall
Traverse the list and keep a hash table of SSN/count pairs. Then enumerate the table and remove the items whose SSN has a count greater than 1.
// SSN is an int on Person, so the table is keyed by int.
Dictionary<int, int> ssnTable = new Dictionary<int, int>();
foreach (Person person in persons)
{
    // TryGetValue leaves count at 0 when the SSN has not been seen yet,
    // which avoids using an exception for control flow.
    int count;
    ssnTable.TryGetValue(person.SSN, out count);
    ssnTable[person.SSN] = count + 1;
}
// traverse ssnTable here and remove items where value of entry (item count) > 1
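The final traversal is left as a comment above. One way it might be completed is sketched below. The names SsnCounts, CountBySsn and RemoveDuplicates are hypothetical, and the sketch filters the list against the count table instead of enumerating the table itself, which is an assumption about what the answer intends.

```csharp
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper; the names are illustrative.
public static class SsnCounts
{
    // Builds a table mapping each SSN to the number of persons that have it.
    public static Dictionary<int, int> CountBySsn(IEnumerable<Person> persons)
    {
        var counts = new Dictionary<int, int>();
        foreach (Person person in persons)
        {
            int count;
            counts.TryGetValue(person.SSN, out count); // count stays 0 for a new SSN
            counts[person.SSN] = count + 1;
        }
        return counts;
    }

    // Removes every person whose SSN count is greater than 1 and returns them.
    public static List<Person> RemoveDuplicates(List<Person> persons)
    {
        Dictionary<int, int> counts = CountBySsn(persons);
        List<Person> dupes = persons.FindAll(p => counts[p.SSN] > 1);
        persons.RemoveAll(p => counts[p.SSN] > 1);
        return dupes;
    }
}
```

Both passes are linear in the number of persons, because each dictionary lookup takes roughly constant time.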
Answer by Graeme Bradbury
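// Requires Person to override Equals/GetHashCode to compare by SSN; otherwise Distinct() compares references.
List<Person> actualPersons = persons.Distinct().ToList();
// Except() is a set difference and would return nothing here, so filter by instance instead.
List<Person> duplicatePersons = persons.Where(p => !actualPersons.Any(a => object.ReferenceEquals(a, p))).ToList();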
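Distinct() compares Person instances by reference unless the type overrides Equals and GetHashCode, so it will not treat two different objects with the same SSN as equal. One way to make it compare by SSN without touching Person is to pass an IEqualityComparer<Person>. The sketch below assumes that approach; SsnComparer, SsnDistinct and OnePerSsn are hypothetical names that do not appear in the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical comparer: two persons are equal when their SSNs match.
public sealed class SsnComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (object.ReferenceEquals(x, y)) return true;
        if (object.ReferenceEquals(x, null) || object.ReferenceEquals(y, null)) return false;
        return x.SSN == y.SSN;
    }

    // Must agree with Equals: equal SSNs produce equal hash codes.
    public int GetHashCode(Person person)
    {
        return object.ReferenceEquals(person, null) ? 0 : person.SSN.GetHashCode();
    }
}

// Hypothetical helper; the names are illustrative.
public static class SsnDistinct
{
    // Returns one person per distinct SSN.
    public static List<Person> OnePerSsn(IEnumerable<Person> persons)
    {
        return persons.Distinct(new SsnComparer()).ToList();
    }
}
```

Note that this keeps one person per SSN, which differs from the question's goal of moving every person with a repeated SSN into dupes.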
Answer by Chris Conway
Thanks to gcores for getting me started down a correct path. Here's what I ended up doing:
var duplicatedSSN =
    from p in persons
    group p by p.SSN into g
    where g.Count() > 1
    select g.Key;

var duplicates = new List<Person>();
foreach (var dupeSSN in duplicatedSSN)
{
    foreach (var person in persons.FindAll(p => p.SSN == dupeSSN))
        duplicates.Add(person);
}
duplicates.ForEach(dup => persons.Remove(dup));
Answer by Chris Conway
Does persons have to be a List<Person>? What if it were a Dictionary<int, Person>?
using System.Linq; // needed for ToList() below; belongs at the top of the file

var persons = new Dictionary<int, Person>();
...
// For each person you want to add to the list:
var person = new Person
{
    ...
};
// Keyed by SSN, so a second person with the same SSN is skipped.
if (!persons.ContainsKey(person.SSN))
{
    persons.Add(person.SSN, person);
}
// If you absolutely, positively got to have a List:
List<Person> personsList = persons.Values.ToList();
If you are working with unique instances of Person (as opposed to different instances that might happen to have the same properties), you might get better performance with a HashSet.
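To illustrate the instance-based case, here is a small sketch. It assumes Person does not override Equals or GetHashCode, so HashSet<Person> falls back to reference equality; InstanceSet and DistinctInstances are hypothetical names.

```csharp
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper; the names are illustrative.
public static class InstanceSet
{
    // Returns each distinct Person instance once, in first-seen order.
    // Two different objects with the same SSN are both kept.
    public static List<Person> DistinctInstances(IEnumerable<Person> source)
    {
        var seen = new HashSet<Person>(); // default comparer: reference equality here
        var result = new List<Person>();
        foreach (Person person in source)
        {
            // Add returns false when this exact instance is already in the set.
            if (seen.Add(person))
            {
                result.Add(person);
            }
        }
        return result;
    }
}
```

This only removes repeated references to the same object. Detecting persons that merely share an SSN still needs a key-based approach such as the dictionary above.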
Answer by Peter Ombwa
Based on the recommendation by @gcores above.
If you want to add one person for each duplicated SSN back to the list of persons, then add the following lines:
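// SSN is an int on Person, so the grouping key is int.
IEnumerable<IGrouping<int, Person>> query = duplicated.GroupBy(d => d.SSN, d => d);
foreach (IGrouping<int, Person> duplicateGroup in query)
{
    // Put the first person of each duplicated SSN back into the list.
    persons.Add(duplicateGroup.First());
}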
My assumption here is that you may want to remove only the extra copies and keep the original record that the duplicates derive from.
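If keeping the first person per SSN is the goal from the start, a single pass over the list may be enough, with no need to remove and re-add. The sketch below is one such variant under that assumption; FirstPerSsn and KeepFirstPerSsn are hypothetical names and the approach is not taken from the answer above.

```csharp
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int SSN { get; set; }
}

// Hypothetical helper; the names are illustrative.
public static class FirstPerSsn
{
    // Keeps the first person seen for each SSN in `persons` and
    // returns the later persons that repeated an SSN.
    public static List<Person> KeepFirstPerSsn(List<Person> persons)
    {
        var seen = new HashSet<int>();
        var kept = new List<Person>();
        var extras = new List<Person>();
        foreach (Person person in persons)
        {
            // Add returns false when the SSN was already seen.
            if (seen.Add(person.SSN))
            {
                kept.Add(person);
            }
            else
            {
                extras.Add(person);
            }
        }

        // Replace the contents of the original list with the kept persons.
        persons.Clear();
        persons.AddRange(kept);
        return extras;
    }
}
```

On the sample list from the question, this should leave Columbus, Cummings and Berra in persons and return Steinbeck.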