C#: Compare two arrays of different lengths and show the differences

Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/1022986/

Time: 2020-08-06 05:43:51  Source: igfitidea

Compare Two Arrays Of Different Lengths and Show Differences

Tags: c#, arrays

Asked by Sean

Problem:
I have two arrays that can possibly be different lengths. I need to iterate through both arrays and find similarities, additions, and deletions.

What's the fastest and most efficient way to accomplish this in C#?

Edit: The arrays are pre-sorted and can contain anywhere from 50 to 100 items. Also, there aren't any constraints on speed or memory usage (however, no one likes a memory hog ;)

For example:

String[] Foo_Old = {"test1", "test2", "test3"};
String[] Foo_New = {"test1", "test2", "test4", "test5"};

AND

String[] Bar_Old = {"test1", "test2", "test4"};
String[] Bar_New = {"test1", "test3"};


Differences:

(with respect to the Foo_New array)

[Same]    "test1"
[Same]    "test2"
[Removed] "test3"
[Added]   "test4"
[Added]   "test5"

(with respect to the Bar_New array)

[Same]    "test1"
[Removed] "test2"
[Removed] "test4"
[Added]   "test3"

Accepted answer by JP Alioto

You can use Except and Intersect...

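// Requires: using System; using System.Linq;
var Foo_Old = new[] { "test1", "test2", "test3" };
var Foo_New = new[] { "test1", "test2", "test4", "test5" };

var diff = Foo_New.Except( Foo_Old );     // in new but not in old: added
var inter = Foo_New.Intersect( Foo_Old ); // in both: same
var rem = Foo_Old.Except(Foo_New);        // in old but not in new: removed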

foreach (var s in diff)
{
    Console.WriteLine("Added " + s);
}

foreach (var s in inter)
{
    Console.WriteLine("Same " + s);
}

foreach (var s in rem)
{
    Console.WriteLine("Removed " + s);
}
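
One detail worth keeping in mind with this approach: Except and Intersect are set operations, so each distinct value is reported at most once even if it is repeated in the input. The sketch below illustrates that with made-up input containing duplicates. The variable names and the explicit StringComparer.Ordinal argument are assumptions added for illustration (not part of the answer), and it is written with C# 9 or later top-level statements.

```csharp
using System;
using System.Linq;

// Hypothetical input with repeated items, to show the set semantics.
string[] oldItems = { "test1", "test1", "test2", "test3" };
string[] newItems = { "test1", "test2", "test4", "test4" };

// Except and Intersect yield each distinct value at most once.
// StringComparer.Ordinal makes the equality rule explicit (an assumption here).
string[] added   = newItems.Except(oldItems, StringComparer.Ordinal).ToArray();
string[] same    = newItems.Intersect(oldItems, StringComparer.Ordinal).ToArray();
string[] removed = oldItems.Except(newItems, StringComparer.Ordinal).ToArray();

Console.WriteLine("Added:   " + string.Join(", ", added));    // test4 appears once
Console.WriteLine("Same:    " + string.Join(", ", same));
Console.WriteLine("Removed: " + string.Join(", ", removed));
```

If repeated items need to be counted individually, a set-based diff like this is probably not the right tool.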

Answer by Rob Rolnick

Since your arrays are sorted, you should be able to walk through both arrays simultaneously and, in a single pass, determine whether each element is in the other array. (This is similar to the merge step in merge sort.) You can see a sample of that below:

string[] oldVersion = { "test1", "test2", "test3" };
string[] newVersion = { "test1", "test2", "test4", "test5" };

int oldIndex = 0, newIndex = 0;

while ((oldIndex < oldVersion.Length) && (newIndex < newVersion.Length)) {
   int comparison = oldVersion[oldIndex].CompareTo(newVersion[newIndex]);

   if (comparison < 0)
      Console.WriteLine("[Removed]\t" + oldVersion[oldIndex++]);
   else if (comparison > 0)
      Console.WriteLine("[Added]\t\t" + newVersion[newIndex++]);
   else {
      Console.WriteLine("[Same]\t\t" + oldVersion[oldIndex++]);
      newIndex++;
   }
}

while (oldIndex < oldVersion.Length)
   Console.WriteLine("[Removed]\t" + oldVersion[oldIndex++]);

while (newIndex < newVersion.Length)
   Console.WriteLine("[Added]\t\t" + newVersion[newIndex++]);
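
The same single pass can be wrapped in a reusable function. What follows is only a sketch under some assumptions: the helper name DiffSorted, the tuple result shape, and the explicit comparer parameter are all hypothetical. It also assumes both inputs are already sorted by that same comparer, and it uses C# 9 or later top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the answer): one merge-style pass over two
// inputs that are already sorted according to `comparer`.
static List<(string Action, T Value)> DiffSorted<T>(
    IReadOnlyList<T> oldItems, IReadOnlyList<T> newItems, IComparer<T> comparer)
{
    var result = new List<(string Action, T Value)>();
    int i = 0, j = 0;

    while (i < oldItems.Count && j < newItems.Count)
    {
        int c = comparer.Compare(oldItems[i], newItems[j]);
        if (c < 0) result.Add(("Removed", oldItems[i++]));      // only in old
        else if (c > 0) result.Add(("Added", newItems[j++]));   // only in new
        else { result.Add(("Same", oldItems[i++])); j++; }      // in both
    }

    // Whatever is left over exists in only one of the inputs.
    while (i < oldItems.Count) result.Add(("Removed", oldItems[i++]));
    while (j < newItems.Count) result.Add(("Added", newItems[j++]));
    return result;
}

string[] barOld = { "test1", "test2", "test4" };
string[] barNew = { "test1", "test3" };

var diff = DiffSorted<string>(barOld, barNew, StringComparer.Ordinal);
foreach (var (action, value) in diff)
    Console.WriteLine($"[{action}] {value}");
```

Note that the output follows sort order, so for the Bar arrays "[Added] test3" is listed between the two removals rather than at the end.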

Alternatively you'd need to go through one array, and for each element in this array, do a single pass of the other array looking for a match.
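
For comparison, the nested search described above might look roughly like the sketch below. It does not need sorted input, but it scans the other array once per element. The variable names and the use of Array.IndexOf for the inner scan are choices made here for illustration, and the code uses C# 9 or later top-level statements.

```csharp
using System;
using System.Collections.Generic;

string[] oldItems = { "test1", "test2", "test3" };
string[] newItems = { "test1", "test2", "test4", "test5" };
var lines = new List<string>();

// For each old item, scan the new array: found means Same, otherwise Removed.
foreach (string item in oldItems)
{
    bool found = Array.IndexOf(newItems, item) >= 0;
    lines.Add((found ? "[Same] " : "[Removed] ") + item);
}

// Items of the new array that never occur in the old one were added.
foreach (string item in newItems)
{
    if (Array.IndexOf(oldItems, item) < 0)
        lines.Add("[Added] " + item);
}

foreach (string line in lines)
    Console.WriteLine(line);
```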

Edit: JP has a good suggestion on how to do this using the framework. However, assuming the arrays are sorted, the benefit of my approach is that a single pass finds all the results; you do not need three passes.

Answer by Sam Saffron

I wrote this a while back:

Usage:

foreach (var diff in Foo_Old.Diff(Foo_New)){
   Console.WriteLine ("{0} action performed on {1}",diff.DiffAction,diff.Value);
}

Implementation:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LinqExtensions {

    enum DiffAction {
       Added,
       Removed,
       Same
    }

    class DiffPair<T> {
        public T Value { get; set; }
        public DiffAction DiffAction { get; set; }
    }

    static class DiffExtension {
        public static IEnumerable<DiffPair<T>> Diff<T>
                 (
                     this IEnumerable<T> original,
                     IEnumerable<T> target 
                 ) {

            Dictionary<T, DiffAction> results = new Dictionary<T, DiffAction>();

            foreach (var item in original) {
                results[item] = DiffAction.Removed;
            }

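            foreach (var item in target) {
                DiffAction existing;
                if (!results.TryGetValue(item, out existing)) {
                    // Not in the original and not yet seen in target: added.
                    results[item] = DiffAction.Added;
                } else if (existing == DiffAction.Removed) {
                    // Present in the original: same. A repeated added item
                    // stays Added instead of being relabeled as Same.
                    results[item] = DiffAction.Same;
                }
            }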
            return results.Select(
                pair => new DiffPair<T> {
                    Value=pair.Key, 
                    DiffAction = pair.Value
                });
        }
    }

}
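
Because the enumeration order of a Dictionary is not something to rely on, the results of a dictionary-based diff may not come back in input order. If order matters, one option is to record first-seen order alongside the dictionary. The sketch below is a hypothetical variant, not the author's code: the name DiffInOrder and the string labels are made up, it uses value tuples instead of the DiffPair class, and it assumes C# 9 or later top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant: dictionary bookkeeping plus a list that records the
// order in which values were first seen.
static List<(T Value, string Action)> DiffInOrder<T>(
    IEnumerable<T> original, IEnumerable<T> target) where T : notnull
{
    var actions = new Dictionary<T, string>();
    var order = new List<T>();

    foreach (T item in original)
    {
        if (!actions.ContainsKey(item)) order.Add(item);
        actions[item] = "Removed";            // assume removed until seen in target
    }

    foreach (T item in target)
    {
        if (actions.TryGetValue(item, out var action))
        {
            if (action == "Removed") actions[item] = "Same";  // present in both
        }
        else
        {
            order.Add(item);
            actions[item] = "Added";          // first time seen, only in target
        }
    }

    var result = new List<(T Value, string Action)>();
    foreach (T value in order) result.Add((value, actions[value]));
    return result;
}

string[] fooOld = { "test1", "test2", "test3" };
string[] fooNew = { "test1", "test2", "test4", "test5" };

var ordered = DiffInOrder<string>(fooOld, fooNew);
foreach (var (value, action) in ordered)
    Console.WriteLine($"{action} action performed on {value}");
```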

Answer by Cade Roux

I went ahead and hand-coded one, then compared it against the example in the accepted answer; the hand-coded version performs a little better. I handled outputting my strings a little differently. Other factors to consider include whether Except makes a sorted copy of the array (since it cannot assume the input is sorted) or whether it uses some kind of hash or a linear search (it is actually restricted to IEnumerable, so for very large arrays that are already sorted this could be a problem). You could change mine to compare IEnumerable (which is more general) instead of IComparable[].

// Requires: using System; using System.Diagnostics; using System.Linq;
static void ArrayCompare(IComparable[] Old, IComparable[] New)
{
    int lpOld = 0;
    int lpNew = 0;
    int OldLength = Old.Length;
    int NewLength = New.Length;
    while (lpOld < OldLength || lpNew < NewLength)
    {
        int compare;

        if (lpOld >= OldLength) compare = 1;
        else if (lpNew >= NewLength) compare = -1;
        else compare = Old[lpOld].CompareTo(New[lpNew]);

        if (compare < 0)
        {
            Debug.WriteLine(string.Format("[Removed] {0}", Old[lpOld].ToString()));
            lpOld++;
        }
        else if (compare > 0)
        {
            Debug.WriteLine(string.Format("[Added] {0}", New[lpNew].ToString()));
            lpNew++;
        }
        else
        {
            Debug.WriteLine(string.Format("[Same] {0}", Old[lpOld].ToString()));
            lpOld++;
            lpNew++;
        }
    }
}

static void ArrayCompare2(IComparable[] Old, IComparable[] New) {
    var diff = New.Except( Old );
    var inter = New.Intersect( Old );
    var rem = Old.Except(New);

    foreach (var s in diff)
    {
        Debug.WriteLine("Added " + s);
    }

    foreach (var s in inter)
    {
        Debug.WriteLine("Same " + s);
    }

    foreach (var s in rem)
    {
        Debug.WriteLine("Removed " + s);
    }
}

static void Main(string[] args)
{
    String[] Foo_Old = {"test1", "test2", "test3"};
    String[] Foo_New = {"test1", "test2", "test4", "test5"};
    String[] Bar_Old = {"test1", "test2", "test4"};
    String[] Bar_New = {"test1", "test3"};

    Stopwatch w1 = new Stopwatch();
    w1.Start();
    for (int lp = 0; lp < 10000; lp++)
    {
        ArrayCompare(Foo_Old, Foo_New);
        ArrayCompare(Bar_Old, Bar_New);
    }
    w1.Stop();

    Stopwatch w2 = new Stopwatch();
    w2.Start();
    for (int lp = 0; lp < 10000; lp++)
    {
        ArrayCompare2(Foo_Old, Foo_New);
        ArrayCompare2(Bar_Old, Bar_New);
    }
    w2.Stop();

    Debug.WriteLine(w1.Elapsed.ToString());
    Debug.WriteLine(w2.Elapsed.ToString());
}
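
The suggestion above about comparing IEnumerable instead of IComparable[] could look something like the following. This is a sketch rather than the author's code: the name CompareSorted, the IComparable<T> constraint, and returning the lines instead of writing them to Debug are all assumptions made for illustration. It assumes both sequences are sorted in the order defined by CompareTo, and it uses C# 9 or later top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: the same merge pass, driven by enumerators so that it
// accepts any sorted IEnumerable<T> instead of arrays.
static IEnumerable<string> CompareSorted<T>(
    IEnumerable<T> oldItems, IEnumerable<T> newItems) where T : IComparable<T>
{
    using IEnumerator<T> o = oldItems.GetEnumerator();
    using IEnumerator<T> n = newItems.GetEnumerator();
    bool hasOld = o.MoveNext();
    bool hasNew = n.MoveNext();

    while (hasOld || hasNew)
    {
        int compare;
        if (!hasOld) compare = 1;          // old exhausted: rest of new was added
        else if (!hasNew) compare = -1;    // new exhausted: rest of old was removed
        else compare = o.Current.CompareTo(n.Current);

        if (compare < 0)
        {
            yield return "[Removed] " + o.Current;
            hasOld = o.MoveNext();
        }
        else if (compare > 0)
        {
            yield return "[Added] " + n.Current;
            hasNew = n.MoveNext();
        }
        else
        {
            yield return "[Same] " + o.Current;
            hasOld = o.MoveNext();
            hasNew = n.MoveNext();
        }
    }
}

var fooOld = new[] { "test1", "test2", "test3" };
var fooNew = new List<string> { "test1", "test2", "test4", "test5" };

var report = new List<string>(CompareSorted<string>(fooOld, fooNew));
foreach (string line in report)
    Console.WriteLine(line);
```

Since only one element of each sequence is held at a time, this form also works with sequences that are streamed rather than materialized as arrays.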