C++ 在 O(n) 时间和 O(1) 空间中查找重复项

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/5739024/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-28 18:46:33  来源:igfitidea点击:

Finding duplicates in O(n) time and O(1) space

c++calgorithm

提问by Zaki

Input: Given an array of n elements which contains elements from 0 to n-1, with any of these numbers appearing any number of times.

输入:给定一个由 n 个元素组成的数组,其中包含从 0 到 n-1 的元素,这些数字中的任何一个出现任意次数。

Goal : To find these repeating numbers in O(n) and using only constant memory space.

目标:在 O(n) 中找到这些重复的数字,并且只使用恒定的内存空间。

For example, let n be 7 and array be {1, 2, 3, 1, 3, 0, 6}, the answer should be 1 & 3. I checked similar questions here but the answers used some data structures like HashSetetc.

例如,让 n 为 7,数组为 {1, 2, 3, 1, 3, 0, 6},答案应该是 1 & 3。我在这里检查了类似的问题,但答案使用了一些数据结构,例如HashSet等。

Any efficient algorithm for the same?

任何有效的算法?

回答by caf

This is what I came up with, which doesn't require the additional sign bit:

这就是我想出的,它不需要额外的符号位:

for i := 0 to n - 1
    while A[A[i]] != A[i] 
        swap(A[i], A[A[i]])
    end while
end for

for i := 0 to n - 1
    if A[i] != i then 
        print A[i]
    end if
end for

The first loop permutes the array so that if element xis present at least once, then one of those entries will be at position A[x].

第一个循环对数组进行置换,以便如果元素x至少出现一次,则其中一个条目将位于 位置A[x]

Note that it may not look O(n) at first blush, but it is - although it has a nested loop, it still runs in O(N)time. A swap only occurs if there is an isuch that A[i] != i, and each swap sets at least one element such that A[i] == i, where that wasn't true before. This means that the total number of swaps (and thus the total number of executions of the whileloop body) is at most N-1.

请注意,乍一看它可能不是 O(n),但确实如此 - 尽管它有一个嵌套循环,但它仍然O(N)及时运行。交换仅在存在i这样的情况下发生A[i] != i,并且每个交换设置至少一个元素使得A[i] == i,而这在以前不是真的。这意味着交换的总数(以及while循环体的执行总数)最多为N-1

The second loop prints the values of xfor which A[x]doesn't equal x- since the first loop guarantees that if xexists at least once in the array, one of those instances will be at A[x], this means that it prints those values of xwhich are not present in the array.

第二个循环打印xfor whichA[x]不等于的值x- 因为第一个循环保证如果x数组中至少存在一次,其中一个实例将在A[x],这意味着它打印那些x不存在的值数组。

(Ideone link so you can play with it)

(Ideone 链接,所以你可以玩它)

回答by j_random_hacker

caf's brilliant answerprints each number that appears k times in the array k-1 times. That's useful behaviour, but the question arguably calls for each duplicate to be printed once only, and he alludes to the possibility of doing this without blowing the linear time/constant space bounds. This can be done by replacing his second loop with the following pseudocode:

caf 的精彩答案打印了在数组 k-1 次中出现 k 次的每个数字。这是有用的行为,但可以说这个问题要求每个副本只打印一次,他暗示这样做的可能性不会超出线性时间/恒定空间界限。这可以通过用以下伪代码替换他的第二个循环来完成:

for (i = 0; i < N; ++i) {
    if (A[i] != i && A[A[i]] == A[i]) {
        print A[i];
        A[A[i]] = i;
    }
}

This exploits the property that after the first loop runs, if any value mappears more than once, then one of those appearances is guaranteed to be in the correct position, namely A[m]. If we are careful we can use that "home" location to store information about whether any duplicates have been printed yet or not.

这利用了在第一个循环运行后的属性,如果任何值m出现不止一次,那么这些出现中的一个保证在正确的位置,即A[m]。如果我们小心,我们可以使用该“家”位置来存储有关是否已打印任何重复项的信息。

In caf's version, as we went through the array, A[i] != iimplied that A[i]is a duplicate. In my version, I rely on a slightly different invariant: that A[i] != i && A[A[i]] == A[i]implies that A[i]is a duplicate that we haven't seen before. (If you drop the "that we haven't seen before" part, the rest can be seen to be implied by the truth of caf's invariant, and the guarantee that all duplicates have some copy in a home location.) This property holds at the outset (after caf's 1st loop finishes) and I show below that it's maintained after each step.

在 caf 的版本中,当我们遍历数组时,A[i] != i暗示它A[i]是重复的。在我的版本中,我依赖一个稍微不同的不变量:这A[i] != i && A[A[i]] == A[i]意味着它A[i]我们以前从未见过的重复项。(如果你去掉“我们以前没有见过的”部分,其余的可以被认为是由 caf 的不变量的真实性所暗示的,并且保证所有的重复都在一个家庭位置有一些副本。)这个属性在一开始(在 caf 的第一个循环完成之后),我在下面展示了它在每一步之后都得到维护。

As we go through the array, success on the A[i] != ipart of the test implies that A[i]could bea duplicate that hasn't been seen before. If we haven't seen it before, then we expect A[i]'s home location to point to itself -- that's what's tested for by the second half of the ifcondition. If that's the case, we print it and alter the home location to point back to this first found duplicate, creating a 2-step "cycle".

当我们遍历数组时,A[i] != i测试部分的成功意味着它A[i]可能是以前从未见过的重复。如果我们之前没有见过它,那么我们希望A[i]的 home 位置指向它自己——这就是if条件的后半部分所测试的。如果是这种情况,我们将打印它并更改主位置以指向第一个找到的副本,创建一个两步“循环”。

To see that this operation doesn't alter our invariant, suppose m = A[i]for a particular position isatisfying A[i] != i && A[A[i]] == A[i]. It's obvious that the change we make (A[A[i]] = i) will work to prevent other non-home occurrences of mfrom being output as duplicates by causing the 2nd half of their ifconditions to fail, but will it work when iarrives at the home location, m? Yes it will, because now, even though at this new iwe find that the 1st half of the ifcondition, A[i] != i, is true, the 2nd half tests whether the location it points to is a home location and finds that it isn't. In this situation we no longer know whether mor A[m]was the duplicate value, but we know that either way, it has already been reported, because these 2-cycles are guaranteed not to appear in the result of caf's 1st loop. (Note that if m != A[m]then exactly one of mand A[m]occurs more than once, and the other does not occur at all.)

为了看到这个操作不会改变我们的不变量,假设m = A[i]一个特定的位置i满足A[i] != i && A[A[i]] == A[i]。很明显,我们所做的更改 ( A[A[i]] = i) 将m通过导致其if条件的第二半失败来防止其他非本地事件作为重复输出,但是当i到达本地位置时它会起作用m吗?是的,它会,因为现在,即使在这个新iif条件下,我们发现条件的第一半A[i] != i为真,第二半测试它指向的位置是否是家庭位置,发现它不是。在这种情况下,我们已经不知道是否mA[m]为重复的值,但我们知道,无论哪种方式,它已经被报道了,因为这两个周期保证不会出现在 caf 的第一个循环的结果中。(注意,如果m != A[m]随后的只有一个mA[m]发生不止一次,而其他没有出现在所有。)

回答by Prasoon Saurav

Here is the pseudocode

这是伪代码

for i <- 0 to n-1:
   if (A[abs(A[i])]) >= 0 :
       (A[abs(A[i])]) = -(A[abs(A[i])])
   else
      print i
end for

Sample code in C++

C++ 中的示例代码

回答by hoha

For relatively small N we can use div/mod operations

对于相对较小的 N,我们可以使用 div/mod 操作

n.times do |i|
  e = a[i]%n
  a[e] += n
end

n.times do |i| 
  count = a[i]/n
  puts i if count > 1
end

Not C/C++ but anyway

不是 C/C++ 但无论如何

http://ideone.com/GRZPI

http://ideone.com/GRZPI

回答by CAFxX

Not really pretty but at least it's easy to see the O(N) and O(1) properties. Basically we scan the array and, for each number we see if the corresponding position has been flagged already-seen-once (N) or already-seen-multiple-times (N+1). If it is flagged already-seen-once, we print it and flag it already-seen-multiple-times. If it is not flagged, we flag it already-seen-once and we move the original value of the corresponding index to the current position (flagging is a destructive operation).

不是很漂亮,但至少很容易看到 O(N) 和 O(1) 属性。基本上我们扫描数组,对于每个数字,我们看到相应的位置是否被标记为已经见过一次 (N) 或已经见过多次 (N+1)。如果它被标记为已经见过一次,我们打印它并标记它已经见过多次。如果它没有被标记,我们标记它已经见过一次,我们将相应索引的原始值移动到当前位置(标记是一种破坏性操作)。

for (i=0; i<a.length; i++) {
  value = a[i];
  if (value >= N)
    continue;
  if (a[value] == N)  {
    a[value] = N+1; 
    print value;
  } else if (a[value] < N) {
    if (value > i)
      a[i--] = a[value];
    a[value] = N;
  }
}

or, better yet (faster, despite the double loop):

或者,更好(更快,尽管有双循环):

for (i=0; i<a.length; i++) {
  value = a[i];
  while (value < N) {
    if (a[value] == N)  {
      a[value] = N+1; 
      print value;
      value = N;
    } else if (a[value] < N) {
      newvalue = value > i ? a[value] : N;
      a[value] = N;
      value = newvalue;
    }
  }
}

回答by Anshul garg

One solution in C is:

C中的一种解决方案是:

#include <stdio.h>

int finddup(int *arr,int len)
{
    int i;
    printf("Duplicate Elements ::");
    for(i = 0; i < len; i++)
    {
        if(arr[abs(arr[i])] > 0)
          arr[abs(arr[i])] = -arr[abs(arr[i])];
        else if(arr[abs(arr[i])] == 0)
        {
             arr[abs(arr[i])] = - len ;
        }
        else
          printf("%d ", abs(arr[i]));
    }

}
int main()
{   
    int arr1[]={0,1,1,2,2,0,2,0,0,5};
    finddup(arr1,sizeof(arr1)/sizeof(arr1[0]));
    return 0;
}

It is O(n) time and O(1) space complexity.

时间复杂度为 O(n),空间复杂度为 O(1)。

回答by Ivan Voroshilin

Let's assume that we present this array as a uni-directional graph data structure - each number is a vertex and its index in the array points to another vertex forming an edge of the graph.

让我们假设我们将此数组表示为单向图数据结构 - 每个数字都是一个顶点,其在数组中的索引指向另一个顶点,形成图的边。

For even more simplicity we have indices 0 to n-1 and range of number from 0..n-1. e.g.

为了更简单,我们有索引 0 到 n-1 和从 0..n-1 的数字范围。例如

   0  1  2  3  4 
 a[3, 2, 4, 3, 1]

0(3) --> 3(3) is a cycle.

0(3) --> 3(3) 是一个循环。

Answer: Just traverse the array relying on indices. if a[x] = a[y] then it's a cycle and thus duplicate. Skip to the next index and continue again and so forth until the end of of an array. Complexity: O(n) time and O(1) space.

答:只需根据索引遍历数组即可。如果 a[x] = a[y] 那么它是一个循环,因此是重复的。跳到下一个索引并再次继续,直到数组结束。复杂度:O(n) 时间和 O(1) 空间。

回答by CrazyPro007

I have created one sample playground app in swift for finding duplicates in 0(n) time complexity and constant extra space. Please check the url Finding Duplicates

我已经快速创建了一个示例 Playground 应用程序,用于在 0(n) 时间复杂度和恒定额外空间中查找重复项。请检查 url查找重复项

IMPAbove solution worked when an array contains elements from 0 to n-1 , with any of these numbers appearing any number of times.

当数组包含从 0 到 n-1 的元素且这些数字中的任何一个出现任意次数时,IMPAbove 解决方案有效。

回答by user12704811

private static void printRepeating(int arr[], int size) {
        int i = 0;
        int j = 1;
        while (i < (size - 1)) {
            if (arr[i] == arr[j]) {
                System.out.println(arr[i] + " repeated at index " + j);
                j = size;
            }
            j++;
            if (j >= (size - 1)) {
                i++;
                j = i + 1;
            }
        }

    }

回答by Eli

static void findrepeat()
{
    int[] arr = new int[7] {0,2,1,0,0,4,4};

    for (int i = 0; i < arr.Length; i++)
    {
        if (i != arr[i])
        {
            if (arr[i] == arr[arr[i]])
            {
                Console.WriteLine(arr[i] + "!!!");
            }

            int t = arr[i];
            arr[i] = arr[arr[i]];
            arr[t] = t;
        }
    }

    for (int j = 0; j < arr.Length; j++)
    {
        Console.Write(arr[j] + " ");
    }
    Console.WriteLine();

    for (int j = 0; j < arr.Length; j++)
    {
        if (j == arr[j])
        {
            arr[j] = 1;
        }
        else
        {
            arr[arr[j]]++;
            arr[j] = 0;
        }
    }

    for (int j = 0; j < arr.Length; j++)
    {
        Console.Write(arr[j] + " ");
    }
    Console.WriteLine();
}