C++ 在 O(n) 时间和 O(1) 空间中查找重复项
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/5739024/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Finding duplicates in O(n) time and O(1) space
提问by Zaki
Input: Given an array of n elements which contains elements from 0 to n-1, with any of these numbers appearing any number of times.
输入:给定一个由 n 个元素组成的数组,其中包含从 0 到 n-1 的元素,这些数字中的任何一个出现任意次数。
Goal : To find these repeating numbers in O(n) and using only constant memory space.
目标:在 O(n) 中找到这些重复的数字,并且只使用恒定的内存空间。
For example, let n be 7 and array be {1, 2, 3, 1, 3, 0, 6}, the answer should be 1 & 3.
I checked similar questions here but the answers used some data structures like HashSet
etc.
例如,让 n 为 7,数组为 {1, 2, 3, 1, 3, 0, 6},答案应该是 1 & 3。我在这里检查了类似的问题,但答案使用了一些数据结构,例如HashSet
等。
Any efficient algorithm for the same?
任何有效的算法?
回答by caf
This is what I came up with, which doesn't require the additional sign bit:
这就是我想出的,它不需要额外的符号位:
for i := 0 to n - 1
while A[A[i]] != A[i]
swap(A[i], A[A[i]])
end while
end for
for i := 0 to n - 1
if A[i] != i then
print A[i]
end if
end for
The first loop permutes the array so that if element x
is present at least once, then one of those entries will be at position A[x]
.
第一个循环对数组进行置换,以便如果元素x
至少出现一次,则其中一个条目将位于 位置A[x]
。
Note that it may not look O(n) at first blush, but it is - although it has a nested loop, it still runs in O(N)
time. A swap only occurs if there is an i
such that A[i] != i
, and each swap sets at least one element such that A[i] == i
, where that wasn't true before. This means that the total number of swaps (and thus the total number of executions of the while
loop body) is at most N-1
.
请注意,乍一看它可能不是 O(n),但确实如此 - 尽管它有一个嵌套循环,但它仍然O(N)
及时运行。交换仅在存在i
这样的情况下发生A[i] != i
,并且每个交换设置至少一个元素使得A[i] == i
,而这在以前不是真的。这意味着交换的总数(以及while
循环体的执行总数)最多为N-1
。
The second loop prints the values of x
for which A[x]
doesn't equal x
- since the first loop guarantees that if x
exists at least once in the array, one of those instances will be at A[x]
, this means that it prints those values of x
which are not present in the array.
第二个循环打印x
for whichA[x]
不等于的值x
- 因为第一个循环保证如果x
数组中至少存在一次,其中一个实例将在A[x]
,这意味着它打印那些x
不存在的值数组。
回答by j_random_hacker
caf's brilliant answerprints each number that appears k times in the array k-1 times. That's useful behaviour, but the question arguably calls for each duplicate to be printed once only, and he alludes to the possibility of doing this without blowing the linear time/constant space bounds. This can be done by replacing his second loop with the following pseudocode:
caf 的精彩答案打印了在数组 k-1 次中出现 k 次的每个数字。这是有用的行为,但可以说这个问题要求每个副本只打印一次,他暗示这样做的可能性不会超出线性时间/恒定空间界限。这可以通过用以下伪代码替换他的第二个循环来完成:
for (i = 0; i < N; ++i) {
if (A[i] != i && A[A[i]] == A[i]) {
print A[i];
A[A[i]] = i;
}
}
This exploits the property that after the first loop runs, if any value m
appears more than once, then one of those appearances is guaranteed to be in the correct position, namely A[m]
. If we are careful we can use that "home" location to store information about whether any duplicates have been printed yet or not.
这利用了在第一个循环运行后的属性,如果任何值m
出现不止一次,那么这些出现中的一个保证在正确的位置,即A[m]
。如果我们小心,我们可以使用该“家”位置来存储有关是否已打印任何重复项的信息。
In caf's version, as we went through the array, A[i] != i
implied that A[i]
is a duplicate. In my version, I rely on a slightly different invariant: that A[i] != i && A[A[i]] == A[i]
implies that A[i]
is a duplicate that we haven't seen before. (If you drop the "that we haven't seen before" part, the rest can be seen to be implied by the truth of caf's invariant, and the guarantee that all duplicates have some copy in a home location.) This property holds at the outset (after caf's 1st loop finishes) and I show below that it's maintained after each step.
在 caf 的版本中,当我们遍历数组时,A[i] != i
暗示它A[i]
是重复的。在我的版本中,我依赖一个稍微不同的不变量:这A[i] != i && A[A[i]] == A[i]
意味着它A[i]
是我们以前从未见过的重复项。(如果你去掉“我们以前没有见过的”部分,其余的可以被认为是由 caf 的不变量的真实性所暗示的,并且保证所有的重复都在一个家庭位置有一些副本。)这个属性在一开始(在 caf 的第一个循环完成之后),我在下面展示了它在每一步之后都得到维护。
As we go through the array, success on the A[i] != i
part of the test implies that A[i]
could bea duplicate that hasn't been seen before. If we haven't seen it before, then we expect A[i]
's home location to point to itself -- that's what's tested for by the second half of the if
condition. If that's the case, we print it and alter the home location to point back to this first found duplicate, creating a 2-step "cycle".
当我们遍历数组时,A[i] != i
测试部分的成功意味着它A[i]
可能是以前从未见过的重复。如果我们之前没有见过它,那么我们希望A[i]
的 home 位置指向它自己——这就是if
条件的后半部分所测试的。如果是这种情况,我们将打印它并更改主位置以指向第一个找到的副本,创建一个两步“循环”。
To see that this operation doesn't alter our invariant, suppose m = A[i]
for a particular position i
satisfying A[i] != i && A[A[i]] == A[i]
. It's obvious that the change we make (A[A[i]] = i
) will work to prevent other non-home occurrences of m
from being output as duplicates by causing the 2nd half of their if
conditions to fail, but will it work when i
arrives at the home location, m
? Yes it will, because now, even though at this new i
we find that the 1st half of the if
condition, A[i] != i
, is true, the 2nd half tests whether the location it points to is a home location and finds that it isn't. In this situation we no longer know whether m
or A[m]
was the duplicate value, but we know that either way, it has already been reported, because these 2-cycles are guaranteed not to appear in the result of caf's 1st loop. (Note that if m != A[m]
then exactly one of m
and A[m]
occurs more than once, and the other does not occur at all.)
为了看到这个操作不会改变我们的不变量,假设m = A[i]
一个特定的位置i
满足A[i] != i && A[A[i]] == A[i]
。很明显,我们所做的更改 ( A[A[i]] = i
) 将m
通过导致其if
条件的第二半失败来防止其他非本地事件作为重复输出,但是当i
到达本地位置时它会起作用m
吗?是的,它会,因为现在,即使在这个新i
的if
条件下,我们发现条件的第一半A[i] != i
为真,第二半测试它指向的位置是否是家庭位置,发现它不是。在这种情况下,我们已经不知道是否m
或A[m]
为重复的值,但我们知道,无论哪种方式,它已经被报道了,因为这两个周期保证不会出现在 caf 的第一个循环的结果中。(注意,如果m != A[m]
随后的只有一个m
和A[m]
发生不止一次,而其他没有出现在所有。)
回答by Prasoon Saurav
Here is the pseudocode
这是伪代码
for i <- 0 to n-1:
if (A[abs(A[i])]) >= 0 :
(A[abs(A[i])]) = -(A[abs(A[i])])
else
print i
end for
回答by hoha
For relatively small N we can use div/mod operations
对于相对较小的 N,我们可以使用 div/mod 操作
n.times do |i|
e = a[i]%n
a[e] += n
end
n.times do |i|
count = a[i]/n
puts i if count > 1
end
Not C/C++ but anyway
不是 C/C++ 但无论如何
回答by CAFxX
Not really pretty but at least it's easy to see the O(N) and O(1) properties. Basically we scan the array and, for each number we see if the corresponding position has been flagged already-seen-once (N) or already-seen-multiple-times (N+1). If it is flagged already-seen-once, we print it and flag it already-seen-multiple-times. If it is not flagged, we flag it already-seen-once and we move the original value of the corresponding index to the current position (flagging is a destructive operation).
不是很漂亮,但至少很容易看到 O(N) 和 O(1) 属性。基本上我们扫描数组,对于每个数字,我们看到相应的位置是否被标记为已经见过一次 (N) 或已经见过多次 (N+1)。如果它被标记为已经见过一次,我们打印它并标记它已经见过多次。如果它没有被标记,我们标记它已经见过一次,我们将相应索引的原始值移动到当前位置(标记是一种破坏性操作)。
for (i=0; i<a.length; i++) {
value = a[i];
if (value >= N)
continue;
if (a[value] == N) {
a[value] = N+1;
print value;
} else if (a[value] < N) {
if (value > i)
a[i--] = a[value];
a[value] = N;
}
}
or, better yet (faster, despite the double loop):
或者,更好(更快,尽管有双循环):
for (i=0; i<a.length; i++) {
value = a[i];
while (value < N) {
if (a[value] == N) {
a[value] = N+1;
print value;
value = N;
} else if (a[value] < N) {
newvalue = value > i ? a[value] : N;
a[value] = N;
value = newvalue;
}
}
}
回答by Anshul garg
One solution in C is:
C中的一种解决方案是:
#include <stdio.h>
int finddup(int *arr,int len)
{
int i;
printf("Duplicate Elements ::");
for(i = 0; i < len; i++)
{
if(arr[abs(arr[i])] > 0)
arr[abs(arr[i])] = -arr[abs(arr[i])];
else if(arr[abs(arr[i])] == 0)
{
arr[abs(arr[i])] = - len ;
}
else
printf("%d ", abs(arr[i]));
}
}
int main()
{
int arr1[]={0,1,1,2,2,0,2,0,0,5};
finddup(arr1,sizeof(arr1)/sizeof(arr1[0]));
return 0;
}
It is O(n) time and O(1) space complexity.
时间复杂度为 O(n),空间复杂度为 O(1)。
回答by Ivan Voroshilin
Let's assume that we present this array as a uni-directional graph data structure - each number is a vertex and its index in the array points to another vertex forming an edge of the graph.
让我们假设我们将此数组表示为单向图数据结构 - 每个数字都是一个顶点,其在数组中的索引指向另一个顶点,形成图的边。
For even more simplicity we have indices 0 to n-1 and range of number from 0..n-1. e.g.
为了更简单,我们有索引 0 到 n-1 和从 0..n-1 的数字范围。例如
0 1 2 3 4
a[3, 2, 4, 3, 1]
0(3) --> 3(3) is a cycle.
0(3) --> 3(3) 是一个循环。
Answer: Just traverse the array relying on indices. if a[x] = a[y] then it's a cycle and thus duplicate. Skip to the next index and continue again and so forth until the end of of an array. Complexity: O(n) time and O(1) space.
答:只需根据索引遍历数组即可。如果 a[x] = a[y] 那么它是一个循环,因此是重复的。跳到下一个索引并再次继续,直到数组结束。复杂度:O(n) 时间和 O(1) 空间。
回答by CrazyPro007
I have created one sample playground app in swift for finding duplicates in 0(n) time complexity and constant extra space. Please check the url Finding Duplicates
我已经快速创建了一个示例 Playground 应用程序,用于在 0(n) 时间复杂度和恒定额外空间中查找重复项。请检查 url查找重复项
IMPAbove solution worked when an array contains elements from 0 to n-1 , with any of these numbers appearing any number of times.
当数组包含从 0 到 n-1 的元素且这些数字中的任何一个出现任意次数时,IMPAbove 解决方案有效。
回答by user12704811
private static void printRepeating(int arr[], int size) {
int i = 0;
int j = 1;
while (i < (size - 1)) {
if (arr[i] == arr[j]) {
System.out.println(arr[i] + " repeated at index " + j);
j = size;
}
j++;
if (j >= (size - 1)) {
i++;
j = i + 1;
}
}
}
回答by Eli
static void findrepeat()
{
int[] arr = new int[7] {0,2,1,0,0,4,4};
for (int i = 0; i < arr.Length; i++)
{
if (i != arr[i])
{
if (arr[i] == arr[arr[i]])
{
Console.WriteLine(arr[i] + "!!!");
}
int t = arr[i];
arr[i] = arr[arr[i]];
arr[t] = t;
}
}
for (int j = 0; j < arr.Length; j++)
{
Console.Write(arr[j] + " ");
}
Console.WriteLine();
for (int j = 0; j < arr.Length; j++)
{
if (j == arr[j])
{
arr[j] = 1;
}
else
{
arr[arr[j]]++;
arr[j] = 0;
}
}
for (int j = 0; j < arr.Length; j++)
{
Console.Write(arr[j] + " ");
}
Console.WriteLine();
}