C#: Quickest way to compare two generic lists for differences

Question

Asked by Frank

What is the quickest (and least resource-intensive) way to compare two massive lists (>50,000 items) and, as a result, get two lists like the ones below:

items that show up in the first list but not in the second
items that show up in the second list but not in the first


Currently I'm working with List or IReadOnlyCollection and solving this with a LINQ query:

var onlyInList = list.Where(i => !list2.Contains(i)).ToList();
var onlyInList2 = list2.Where(i => !list.Contains(i)).ToList();

But this doesn't perform as well as I would like. Any ideas for making this quicker and less resource-intensive, given that I need to process a lot of lists?

Answer 1

Accepted answer by Jon Skeet

Use Except:


var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();

I suspect there are approaches which would actually be marginally faster than this, but even this will be vastly faster than your O(N * M) approach.

If you want to combine these, you could create a method with the above and then a return statement:


return !firstNotSecond.Any() && !secondNotFirst.Any();

One point to note is that there is a difference in results between the original code in the question and the solution here: any duplicate elements which are only in one list will only be reported once with my code, whereas they'd be reported as many times as they occur in the original code.

For example, with lists of [1, 2, 2, 2, 3] and [1], the "elements in list1 but not list2" result in the original code would be [2, 2, 2, 3]. With my code it would just be [2, 3]. In many cases that won't be an issue, but it's worth being aware of.
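If the duplicate-preserving behaviour of the original query matters, one option is to keep the Where filter but test membership against a HashSet<T> rather than against the other list. The following is a minimal sketch, not part of the answer above; the sample data and the names set1 and set2 are chosen purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data taken from the example in the answer text.
var list1 = new List<int> { 1, 2, 2, 2, 3 };
var list2 = new List<int> { 1 };

// Build each lookup set once; Contains on a HashSet is O(1) on average.
var set1 = new HashSet<int>(list1);
var set2 = new HashSet<int>(list2);

// Where keeps every occurrence, so duplicates survive (unlike Except).
var firstNotSecond = list1.Where(i => !set2.Contains(i)).ToList();
var secondNotFirst = list2.Where(i => !set1.Contains(i)).ToList();

Console.WriteLine(string.Join(", ", firstNotSecond)); // 2, 2, 2, 3
Console.WriteLine(secondNotFirst.Count);              // 0
```

This uses the element type's default equality, just as Except does without a comparer.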

Answer 2

Answer by Tim Schmelter

More efficient would be using Enumerable.Except:


var inListButNotInList2 = list.Except(list2);
var inList2ButNotInList = list2.Except(list);

This method is implemented using deferred execution. That means you could write, for example:

var first10 = inListButNotInList2.Take(10);

It is also efficient since it internally uses a Set<T> to compare the objects. It works by first collecting all distinct values from the second sequence, and then streaming the results of the first, checking that they haven't been seen before.
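The behaviour described above can be sketched roughly as follows. This is an illustration only, not the actual library source: ExceptSketch is a hypothetical name, and a plain HashSet<T> stands in for the internal set type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical re-implementation, for illustration only.
static IEnumerable<T> ExceptSketch<T>(IEnumerable<T> first, IEnumerable<T> second)
{
    // Step 1: collect all distinct values of the second sequence.
    var seen = new HashSet<T>(second);

    // Step 2: stream the first sequence. Add returns false when the value
    // is already in the set, i.e. it was in 'second' or already yielded.
    foreach (var item in first)
    {
        if (seen.Add(item))
            yield return item;
    }
}

var list = new List<int> { 1, 2, 2, 2, 3 };
var list2 = new List<int> { 1 };

var result = ExceptSketch(list, list2).ToList();
Console.WriteLine(string.Join(", ", result)); // 2, 3
```

Because the iterator adds each yielded item to the set, a value that occurs several times in the first sequence is returned only once, which matches the duplicate handling of Except.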

Answer 3

Answer by Ali Issa

Try it this way:

var difList = list1.Where(a => !list2.Any(a1 => a1.id == a.id))
            .Union(list2.Where(a => !list1.Any(a1 => a1.id == a.id)));
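The query above rescans the other list for every element, so it is still O(N * M). When the comparison is by a key such as id, one possible variation is to collect the keys into hash sets first. This is a sketch under assumptions: the answer does not show the element type, so value tuples with an id field are used here as a stand-in, and ids1 and ids2 are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed element shape: anything with an 'id' to compare on.
var list1 = new List<(int id, string name)> { (1, "a"), (2, "b"), (3, "c") };
var list2 = new List<(int id, string name)> { (2, "b"), (4, "d") };

// Collect the ids of each list once.
var ids1 = new HashSet<int>(list1.Select(a => a.id));
var ids2 = new HashSet<int>(list2.Select(a => a.id));

// Same shape as the query above, with set lookups instead of Any.
var difList = list1.Where(a => !ids2.Contains(a.id))
    .Union(list2.Where(a => !ids1.Contains(a.id)))
    .ToList();

Console.WriteLine(string.Join(", ", difList.Select(a => a.id))); // 1, 3, 4
```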

Answer 4

Answer by Pius Hermit

Not exactly for this problem, but here's some code to compare lists of equal (not identical!) objects:

using System;
using System.Collections.Generic;
using System.Linq;

public class EquatableList<T> : List<T>, IEquatable<EquatableList<T>> where T : IEquatable<T>
{
    /// <summary>
    /// True, if this contains element with equal property-values
    /// </summary>
    /// <param name="element">element of Type T</param>
    /// <returns>True, if this contains element</returns>
    public new Boolean Contains(T element)
    {
        return this.Any(t => t.Equals(element));
    }

    /// <summary>
    /// True, if list is equal to this
    /// </summary>
    /// <param name="list">list</param>
    /// <returns>True, if instance equals list</returns>
    public Boolean Equals(EquatableList<T> list)
    {
        if (list == null) return false;
        return this.All(list.Contains) && list.All(this.Contains);
    }
}

Answer 5

Answer by e.gad

If you want the results to be case insensitive, the following will work:


List<string> list1 = new List<string> { "a.dll", "b1.dll" };
List<string> list2 = new List<string> { "A.dll", "b2.dll" };

var firstNotSecond = list1.Except(list2, StringComparer.OrdinalIgnoreCase).ToList();
var secondNotFirst = list2.Except(list1, StringComparer.OrdinalIgnoreCase).ToList();

firstNotSecond would contain b1.dll

secondNotFirst would contain b2.dll

Answer 6

Answer by Sathish

I have used this code to compare two lists containing millions of records.

This method does not take much time:

    //Method to compare two list of string
    private List<string> Contains(List<string> list1, List<string> list2)
    {
        List<string> result = new List<string>();

        result.AddRange(list1.Except(list2, StringComparer.OrdinalIgnoreCase));
        result.AddRange(list2.Except(list1, StringComparer.OrdinalIgnoreCase));

        return result;
    }

Answer 7

Answer by Jibz

It may look funny, but it works for me:

string.Join("",List1) != string.Join("", List2)
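One thing to keep in mind with this trick is that joining with an empty separator can make different lists look the same, and that it only tells you whether the lists differ, not how. The sketch below uses made-up sample data to show the collision and, as one possible alternative, Enumerable.SequenceEqual for an order-sensitive element-by-element check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample data: different elements, same concatenation.
var List1 = new List<string> { "ab", "c" };
var List2 = new List<string> { "a", "bc" };

// Both joins produce "abc", so the comparison reports "no difference".
bool joinSaysDifferent = string.Join("", List1) != string.Join("", List2);

// SequenceEqual compares the lists element by element, in order.
bool sequencesDiffer = !List1.SequenceEqual(List2);

Console.WriteLine(joinSaysDifferent); // False
Console.WriteLine(sequencesDiffer);   // True
```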

Answer 8

Answer by Fajoui El Mahdi

This is the best solution you'll find:

var list3 = list1.Where(l => list2.ToList().Contains(l));

Answer 9

Answer by Ali Khaleghi Karsalari

If only a combined result is needed, this will work too:

var set1 = new HashSet<T>(list1);
var set2 = new HashSet<T>(list2);
var areEqual = set1.SetEquals(set2);

where T is the type of the list elements.
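As a concrete illustration with T = int (the sample values are made up), the snippet below shows that SetEquals ignores both order and duplicates. Since SetEquals accepts any IEnumerable<T>, the second argument does not have to be a HashSet.

```csharp
using System;
using System.Collections.Generic;

var list1 = new List<int> { 1, 2, 2, 3 };
var list2 = new List<int> { 3, 2, 1 };

var set1 = new HashSet<int>(list1);
var set2 = new HashSet<int>(list2);

// Same distinct elements, so the sets are equal despite order and duplicates.
bool areEqual = set1.SetEquals(set2);

// An extra element on either side makes the comparison fail.
bool withExtra = set1.SetEquals(new[] { 1, 2, 3, 4 });

Console.WriteLine(areEqual);  // True
Console.WriteLine(withExtra); // False
```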

Answer 10

Answer by Devon Parsons

using System; // needed for IEquatable<T>
using System.Collections.Generic;
using System.Linq;

namespace YourProject.Extensions
{
    public static class ListExtensions
    {
        public static bool SetwiseEquivalentTo<T>(this List<T> list, List<T> other)
            where T: IEquatable<T>
        {
            if (list.Except(other).Any())
                return false;
            if (other.Except(list).Any())
                return false;
            return true;
        }
    }
}

Sometimes you only need to know if two lists are different, and not what those differences are. In that case, consider adding this extension method to your project. Note that your listed objects should implement IEquatable!

Usage:


public sealed class Car : IEquatable<Car>
{
    public Price Price { get; }
    public List<Component> Components { get; }

    ...
    public override bool Equals(object obj)
        => obj is Car other && Equals(other);

    public bool Equals(Car other)
        => Price == other.Price
            && Components.SetwiseEquivalentTo(other.Components);

    public override int GetHashCode()
        => Components.Aggregate(
            Price.GetHashCode(),
            (code, next) => code ^ next.GetHashCode()); // Bitwise XOR
}

Whatever the Component class is, the methods shown here for Car should be implemented almost identically.

It's very important to note how we've written GetHashCode. In order to properly implement IEquatable, Equals and GetHashCode must operate on the instance's properties in a logically compatible way.

Two lists with the same contents are still different objects, and will produce different hash codes. Since we want these two lists to be treated as equal, we must let GetHashCode produce the same value for each of them. We can accomplish this by delegating the hash code to every element in the list, and using the standard bitwise XOR to combine them all. XOR is order-agnostic, so it doesn't matter if the lists are sorted differently. It only matters that they contain nothing but equivalent members.
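One caveat worth checking in your own code: an Except-based comparison ignores duplicates, while a plain XOR over all elements does not, because a value XORed with itself cancels out. The sketch below illustrates this with integers, whose hash code is the value itself. XorAll and XorDistinct are hypothetical helper names, and hashing only the distinct elements is just one possible way to keep the hash in line with a setwise comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// XOR over every element, mirroring the Aggregate call shown above (seed 0).
static int XorAll<T>(IEnumerable<T> items) where T : notnull
    => items.Aggregate(0, (code, next) => code ^ next.GetHashCode());

// Hypothetical variant: XOR over distinct elements only.
static int XorDistinct<T>(IEnumerable<T> items) where T : notnull
    => items.Distinct().Aggregate(0, (code, next) => code ^ next.GetHashCode());

var a = new List<int> { 5, 5, 7 };
var b = new List<int> { 7, 5 };

// Neither Except finds anything, so the lists are setwise equivalent.
bool setwiseEqual = !a.Except(b).Any() && !b.Except(a).Any();

Console.WriteLine(setwiseEqual);                     // True
Console.WriteLine(XorAll(a) == XorAll(b));           // False
Console.WriteLine(XorDistinct(a) == XorDistinct(b)); // True
```

If your lists never contain duplicates, the two helpers produce the same value and the distinction does not arise.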

Note: the strange name is meant to signal that the method does not consider the order of the elements in the list. If you do care about the order of the elements in the list, this method is not for you!
