Accessing an array out of bounds gives no error, why?
Original question: http://stackoverflow.com/questions/1239938/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by seg.server.fault
I am assigning values in a C++ program out of the bounds like this:
#include <iostream>
using namespace std;

int main()
{
    int array[2];
    array[0] = 1;
    array[1] = 2;
    array[3] = 3;
    array[4] = 4;
    cout << array[3] << endl;
    cout << array[4] << endl;
    return 0;
}
The program prints 3 and 4. It should not be possible. I am using g++ 4.3.3.
Here are the compile and run commands:
$ g++ -W -Wall errorRange.cpp -o errorRange
$ ./errorRange
3
4
Only when assigning array[3000]=3000 does it give me a segmentation fault.
If gcc doesn't check array bounds, how can I be sure that my program is correct, given that this can lead to serious issues later?
I replaced the above code with
vector<int> vint(2);
vint[0] = 0;
vint[1] = 1;
vint[2] = 2;
vint[5] = 5;
cout << vint[2] << endl;
cout << vint[5] << endl;
and this one also produces no error.
Answer by jalf
Welcome to every C/C++ programmer's bestest friend: Undefined Behavior.
There is a lot that is not specified by the language standard, for a variety of reasons. This is one of them.
In general, whenever you encounter undefined behavior, anything might happen. The application may crash, it may freeze, it may eject your CD-ROM drive or make demons come out of your nose. It may format your harddrive or email all your porn to your grandmother.
It may even, if you are really unlucky, appear to work correctly.
The language simply says what should happen if you access the elements within the bounds of an array. It is left undefined what happens if you go out of bounds. It might seem to work today, on your compiler, but it is not legal C or C++, and there is no guarantee that it'll still work the next time you run the program. Or that it hasn't overwritten essential data even now, and you just haven't encountered the problems that it is going to cause. Yet.
As for why there is no bounds checking, there are a couple of aspects to the answer:
- An array is a leftover from C. C arrays are about as primitive as you can get. Just a sequence of elements with contiguous addresses. There is no bounds checking because it is simply exposing raw memory. Implementing a robust bounds-checking mechanism would have been almost impossible in C.
- In C++, bounds-checking is possible on class types. But an array is still the plain old C-compatible one. It is not a class. Further, C++ is also built on another rule which makes bounds-checking non-ideal. The C++ guiding principle is "you don't pay for what you don't use". If your code is correct, you don't need bounds-checking, and you shouldn't be forced to pay for the overhead of runtime bounds-checking.
- So C++ offers the std::vector class template, which allows both. operator[] is designed to be efficient. The language standard does not require that it performs bounds checking (although it does not forbid it either). A vector also has the at() member function, which is guaranteed to perform bounds-checking. So in C++, you get the best of both worlds if you use a vector. You get array-like performance without bounds-checking, and you get the ability to use bounds-checked access when you want it.
Answer by Richard Corden
Using g++, you can add the command line option -fstack-protector-all.
On your example it resulted in the following:
> g++ -o t -fstack-protector-all t.cc
> ./t
3
4
/bin/bash: line 1: 15450 Segmentation fault ./t
It doesn't really help you find or solve the problem, but at least the segfault will let you know that something is wrong.
Answer by Arkaitz Jimenez
g++ does not check array bounds. You may be overwriting something with 3 and 4, but nothing really important; if you try higher indices you'll get a crash.
You are just overwriting parts of the stack that are not used. You could continue until you reach the end of the space allocated for the stack, and it would crash eventually.
EDIT: You have no way of dealing with that. A static code analyzer could perhaps reveal failures like this one, but this case is very simple; similar (but more complex) failures may go undetected even by static analyzers.
Answer by jkeys
It's undefined behavior as far as I know. Run a larger program with that and it will crash somewhere along the way. Bounds checking is not a part of raw arrays (or even of std::vector's operator[]).
Use std::vector with std::vector::iterators instead so you don't have to worry about it.
Edit:
Just for fun, run this and see how long until you crash:
int main()
{
    int array[1];
    for (int i = 0; i != 100000; i++)
    {
        array[i] = i;
    }
    return 0; //will be lucky to ever reach this
}
Edit2:
Don't run that.
Edit3:
OK, here is a quick lesson on arrays and their relationships with pointers:
When you use array indexing, you are really using a pointer in disguise that is automatically dereferenced: array[1] means *(array + 1). This is why array[1] returns the value stored at that position directly, with no explicit dereference needed.
When you have a pointer to an array, like this:
int array[5];
int *ptr = array;
Then the "array" in the second declaration is really decaying to a pointer to the first element of the array. This is equivalent to:
int *ptr = &array[0];
When you try to access beyond what you allocated, you are really just using a pointer to other memory (which C++ won't complain about). Taking my example program above, that is equivalent to this:
int main()
{
    int array[1];
    int *ptr = array;
    for (int i = 0; i != 100000; i++)
    {
        *ptr++ = i; // advance the pointer once per iteration, like array[i]
    }
    return 0; //will be lucky to ever reach this
}
The compiler won't complain because in programming, you often have to communicate with other programs, especially the operating system. This is done with pointers quite a bit.
Answer by Arpegius
Hint
If you want fast fixed-size arrays with range checking, try boost::array (also std::tr1::array from <tr1/array>; it will be a standard container in the next C++ specification). It's much faster than std::vector. It reserves memory in place, on the stack or inside the enclosing class instance, just like int array[], rather than on the heap.
This is simple sample code:
#include <iostream>
#include <stdexcept> // std::out_of_range
#include <boost/array.hpp>

int main()
{
    boost::array<int,2> array;
    array.at(0) = 1; // checking index is inside range
    array[1] = 2;    // no error check, as fast as int array[2];
    try
    {
        // index is inside range
        std::cout << "array.at(0) = " << array.at(0) << std::endl;
        // index is outside range, throwing exception
        std::cout << "array.at(2) = " << array.at(2) << std::endl;
        // never comes here
        std::cout << "array.at(1) = " << array.at(1) << std::endl;
    }
    catch(const std::out_of_range& r)
    {
        std::cout << "Something goes wrong: " << r.what() << std::endl;
    }
    return 0;
}
This program will print:
array.at(0) = 1
Something goes wrong: array<>: index out of range
Answer by Paul Dixon
You are certainly overwriting your stack, but the program is simple enough that the effects of this go unnoticed.
Answer by Karl Voigtland
C or C++ will not check the bounds of an array access.
You are allocating the array on the stack. Indexing the array via array[3] is equivalent to *(array + 3), where array decays to a pointer to &array[0]. Since the array has only two elements, this results in undefined behavior.
One way to catch this sometimes in C is to use a static checker, such as splint. If you run:
splint +bounds array.c
on
int main(void)
{
    int array[1];
    array[1] = 1;
    return 0;
}
then you will get the warning:
array.c: (in function main)
array.c:5:9: Likely out-of-bounds store:
    array[1]
    Unable to resolve constraint:
    requires 0 >= 1
    needed to satisfy precondition:
    requires maxSet(array @ array.c:5:9) >= 1
  A memory write may write to an address beyond the allocated buffer.
Answer by Todd Stout
Run this through Valgrind and you might see an error.
As Falaina pointed out, Valgrind does not detect many instances of stack corruption. I just tried the sample under Valgrind, and it does indeed report zero errors. However, Valgrind can be instrumental in finding many other types of memory problems; it's just not particularly useful in this case unless you modify your build to include the --stack-check option. If you build and run the sample as
g++ --stack-check -W -Wall errorRange.cpp -o errorRange
valgrind ./errorRange
Valgrind will report an error.
Answer by John Bode
Undefined behavior working in your favor. Whatever memory you're clobbering apparently isn't holding anything important. Note that C and C++ do not do bounds checking on arrays, so stuff like that isn't going to be caught at compile or run time.
Answer by Nathan Clark
When you declare the array with int array[2], space for 2 integers is allocated; but the identifier array simply points to the beginning of that space. When you then access array[3] and array[4], the compiler simply increments that address to point to where those values would be if the array were long enough. Try accessing something like array[42] without initializing it first, and you'll end up getting whatever value happened to already be in memory at that location.
Edit:
More info on pointers/arrays: http://home.netcom.com/~tjensen/ptr/pointers.htm