list C# LINQ 在列表中查找重复项

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/18547354/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-11 02:04:19  来源:igfitidea点击:

C# LINQ find duplicates in List

linqlistduplicate-removal

提问by Mirko Arcese

Using LINQ, from a List<int>, how can I retrieve a list that contains entries repeated more than once and their values?

使用 LINQ,List<int>如何从 a 中检索包含重复多次的条目及其值的列表?

回答by Save

The easiest way to solve the problem is to group the elements based on their value, and then pick a representative of the group if there are more than one element in the group. In LINQ, this translates to:

解决问题的最简单方法是根据元素的值对元素进行分组,如果组中的元素不止一个,则选择一个代表该组的元素。在 LINQ 中,这转化为:

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .Select(y => y.Key)
              .ToList();

If you want to know how many times the elements are repeated, you can use:

如果想知道元素重复了多少次,可以使用:

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .Select(y => new { Element = y.Key, Counter = y.Count() })
              .ToList();

This will return a Listof an anonymous type, and each element will have the properties Elementand Counter, to retrieve the information you need.

这将返回List匿名类型的 a ,并且每个元素都将具有属性ElementCounter以检索您需要的信息。

And lastly, if it's a dictionary you are looking for, you can use

最后,如果它是您要查找的字典,则可以使用

var query = lst.GroupBy(x => x)
              .Where(g => g.Count() > 1)
              .ToDictionary(x => x.Key, y => y.Count());

This will return a dictionary, with your element as key, and the number of times it's repeated as value.

这将返回一个字典,以您的元素作为键,并将其重复的次数作为值。

回答by maxbeaudoin

Find out if an enumerable contains any duplicate:

找出可枚举是否包含任何重复项

var anyDuplicate = enumerable.GroupBy(x => x.Key).Any(g => g.Count() > 1);

Find out if allvalues in an enumerable are unique:

找出枚举中的所有值是否都是唯一的

var allUnique = enumerable.GroupBy(x => x.Key).All(g => g.Count() == 1);

回答by HuBeZa

Another way is using HashSet:

另一种方法是使用HashSet

var hash = new HashSet<int>();
var duplicates = list.Where(i => !hash.Add(i));

If you want unique values in your duplicates list:

如果您想在重复列表中使用唯一值:

var myhash = new HashSet<int>();
var mylist = new List<int>(){1,1,2,2,3,3,3,4,4,4};
var duplicates = mylist.Where(item => !myhash.Add(item)).Distinct().ToList();

Here is the same solution as a generic extension method:

这是与通用扩展方法相同的解决方案:

public static class Extensions
{
  public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector, IEqualityComparer<TKey> comparer)
  {
    var hash = new HashSet<TKey>(comparer);
    return source.Where(item => !hash.Add(selector(item))).ToList();
  }

  public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer)
  {
    return source.GetDuplicates(x => x, comparer);      
  }

  public static IEnumerable<TSource> GetDuplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
  {
    return source.GetDuplicates(selector, null);
  }

  public static IEnumerable<TSource> GetDuplicates<TSource>(this IEnumerable<TSource> source)
  {
    return source.GetDuplicates(x => x, null);
  }
}

回答by Alex Siepman

You can do this:

你可以这样做:

var list = new[] {1,2,3,1,4,2};
var duplicateItems = list.Duplicates();

With these extension methods:

使用这些扩展方法:

public static class Extensions
{
    public static IEnumerable<TSource> Duplicates<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    {
        var grouped = source.GroupBy(selector);
        var moreThan1 = grouped.Where(i => i.IsMultiple());
        return moreThan1.SelectMany(i => i);
    }

    public static IEnumerable<TSource> Duplicates<TSource, TKey>(this IEnumerable<TSource> source)
    {
        return source.Duplicates(i => i);
    }

    public static bool IsMultiple<T>(this IEnumerable<T> source)
    {
        var enumerator = source.GetEnumerator();
        return enumerator.MoveNext() && enumerator.MoveNext();
    }
}

Using IsMultiple() in the Duplicates method is faster than Count() because this does not iterate the whole collection.

在 Duplicates 方法中使用 IsMultiple() 比 Count() 更快,因为它不会迭代整个集合。

回答by Ricardo Figueiredo

I created a extention to response to this you could includ it in your projects, I think this return the most case when you search for duplicates in List or Linq.

我创建了一个扩展来对此做出回应,您可以将其包含在您的项目中,我认为当您在 List 或 Linq 中搜索重复项时,这会返回最多的情况。

Example:

例子:

//Dummy class to compare in list
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public Person(int id, string name, string surname)
    {
        this.Id = id;
        this.Name = name;
        this.Surname = surname;
    }
}


//The extention static class
public static class Extention
{
    public static IEnumerable<T> getMoreThanOnceRepeated<T>(this IEnumerable<T> extList, Func<T, object> groupProps) where T : class
    { //Return only the second and next reptition
        return extList
            .GroupBy(groupProps)
            .SelectMany(z => z.Skip(1)); //Skip the first occur and return all the others that repeats
    }
    public static IEnumerable<T> getAllRepeated<T>(this IEnumerable<T> extList, Func<T, object> groupProps) where T : class
    {
        //Get All the lines that has repeating
        return extList
            .GroupBy(groupProps)
            .Where(z => z.Count() > 1) //Filter only the distinct one
            .SelectMany(z => z);//All in where has to be retuned
    }
}

//how to use it:
void DuplicateExample()
{
    //Populate List
    List<Person> PersonsLst = new List<Person>(){
    new Person(1,"Ricardo","Figueiredo"), //fist Duplicate to the example
    new Person(2,"Ana","Figueiredo"),
    new Person(3,"Ricardo","Figueiredo"),//second Duplicate to the example
    new Person(4,"Margarida","Figueiredo"),
    new Person(5,"Ricardo","Figueiredo")//third Duplicate to the example
    };

    Console.WriteLine("All:");
    PersonsLst.ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        All:
        1 -> Ricardo Figueiredo
        2 -> Ana Figueiredo
        3 -> Ricardo Figueiredo
        4 -> Margarida Figueiredo
        5 -> Ricardo Figueiredo
        */

    Console.WriteLine("All lines with repeated data");
    PersonsLst.getAllRepeated(z => new { z.Name, z.Surname })
        .ToList()
        .ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        All lines with repeated data
        1 -> Ricardo Figueiredo
        3 -> Ricardo Figueiredo
        5 -> Ricardo Figueiredo
        */
    Console.WriteLine("Only Repeated more than once");
    PersonsLst.getMoreThanOnceRepeated(z => new { z.Name, z.Surname })
        .ToList()
        .ForEach(z => Console.WriteLine("{0} -> {1} {2}", z.Id, z.Name, z.Surname));
    /* OUTPUT:
        Only Repeated more than once
        3 -> Ricardo Figueiredo
        5 -> Ricardo Figueiredo
        */
}

回答by LAV VISHWAKARMA

To find the duplicate values only :

仅查找重复值:

var duplicates = list.GroupBy(x => x.Key).Any(g => g.Count() > 1);

Eg. var list = new[] {1,2,3,1,4,2};

例如。var list = new[] {1,2,3,1,4,2};

so group by will group the numbers by their keys and will maintain the count(number of times it repeated) with it. After that, we are just checking the values who have repeated more than once.

所以 group by 将按它们的键对数字进行分组,并用它维护计数(重复的次数)。之后,我们只是检查重复多次的值。

To find the uniuqe values only :

仅查找 uniuqe 值:

var unique = list.GroupBy(x => x.Key).All(g => g.Count() == 1);

Eg. var list = new[] {1,2,3,1,4,2};

例如。var list = new[] {1,2,3,1,4,2};

so group by will group the numbers by their keys and will maintain the count(number of times it repeated) with it. After that, we are just checking the values who have repeated only once means are unique.

所以 group by 将按它们的键对数字进行分组,并用它维护计数(重复的次数)。之后,我们只是检查只重复一次的值是否唯一。

回答by GeoB

Complete set of Linq to SQL extensions of Duplicates functions checked in MS SQL Server. Without using .ToList() or IEnumerable. These queries executing in SQL Server rather than in memory.. The results only return at memory.

在 MS SQL Server 中检查的 Duplicates 函数的完整 Linq to SQL 扩展集。不使用 .ToList() 或 IEnumerable。这些查询在 SQL Server 中而不是在内存中执行。. 结果只在内存中返回。

public static class Linq2SqlExtensions {

    public class CountOfT<T> {
        public T Key { get; set; }
        public int Count { get; set; }
    }

    public static IQueryable<TKey> Duplicates<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(s => s.Key);

    public static IQueryable<TSource> GetDuplicates<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).SelectMany(s => s);

    public static IQueryable<CountOfT<TKey>> DuplicatesCounts<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(y => new CountOfT<TKey> { Key = y.Key, Count = y.Count() });

    public static IQueryable<Tuple<TKey, int>> DuplicatesCountsAsTuble<TSource, TKey>(this IQueryable<TSource> source, Expression<Func<TSource, TKey>> groupBy)
        => source.GroupBy(groupBy).Where(w => w.Count() > 1).Select(s => Tuple.Create(s.Key, s.Count()));
}

回答by Aykut Gündo?du

there is an answer but i did not understand why is not working;

有一个答案,但我不明白为什么不起作用;

var anyDuplicate = enumerable.GroupBy(x => x.Key).Any(g => g.Count() > 1);

my solution is like that in this situation;

在这种情况下,我的解决方案就是这样;

var duplicates = model.list
                    .GroupBy(s => s.SAME_ID)
                    .Where(g => g.Count() > 1).Count() > 0;
if(duplicates) {
    doSomething();
}