C++: large 2D arrays cause a segmentation fault

Question

Asked by Thomas L Holaday

I am writing some C++ code in Linux where I have declared a few 2D arrays like so:

 double x[5000][500], y[5000][500], z[5000][500];

Compilation produces no errors, but when I run the program it says "segmentation fault".

When I reduce the size of the array from 5000 to 50, the program runs fine. How can I protect myself against this problem?

Answer 1

Answered by Thomas L Holaday

If your program looks like this ...

int main(int, char **) {
   double x[5000][500],y[5000][500],z[5000][500];
   // ...
   return 0;
}

... then you are overflowing the stack. The fastest way to fix this is to add the word static.

int main(int, char **) {
   static double x[5000][500],y[5000][500],z[5000][500];
   // ...
   return 0;
}

The second fastest way to fix this is to move the declaration out of the function:

double x[5000][500],y[5000][500],z[5000][500];
int main(int, char **) {
   // ...
   return 0;
}

The third fastest way to fix this is to allocate the memory on the heap:

int main(int, char **) {
   double **x = new double*[5000];
   double **y = new double*[5000];
   double **z = new double*[5000];
   for (size_t i = 0; i < 5000; i++) {
      x[i] = new double[500];
      y[i] = new double[500];
      z[i] = new double[500];
   }
   // ...
   for (size_t i = 5000; i > 0; ) {
      delete[] z[--i];
      delete[] y[i];
      delete[] x[i];
   }
   delete[] z;
   delete[] y;
   delete[] x;

   return 0;
}

The fourth fastest way is to allocate them on the heap using std::vector. It is fewer lines in your file but more lines in the compilation unit, and you must either think of a meaningful name for your derived vector types or tuck them into an anonymous namespace so they won't pollute the global namespace:

#include <vector>
using std::vector;
namespace { 
  struct Y : public vector<double> { Y() : vector<double>(500) {} };
  struct XY : public vector<Y> { XY() : vector<Y>(5000) {} } ;
}
int main(int, char **) {
  XY x, y, z;
  // ...
  return 0;
}

The fifth fastest way is to allocate them on the heap, but use templates so the dimensions are not so remote from the objects:

#include <vector>
using namespace std;
namespace {
  template <size_t N>
  struct Y : public vector<double> { Y() : vector<double>(N) {} };
  template <size_t N1, size_t N2>
  struct XY : public vector< Y<N2> > { XY() : vector< Y<N2> > (N1) {} } ;
}
int main(int, char **) {
  XY<5000,500> x, y, z;
  XY<500,50> mini_x, mini_y, mini_z;
  // ...
  return 0;
}

The most performant way is to allocate the two-dimensional arrays as one-dimensional arrays, and then use index arithmetic.
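
The one-dimensional approach can be sketched roughly as follows. This is a minimal illustration, not code from the original answer: the class name Grid2D and its members are hypothetical, and it assumes row-major storage, where element (r, c) lives at offset r * cols + c in a single heap-allocated buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): a fixed-size 2D grid
// stored as one contiguous heap allocation in row-major order.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Index arithmetic: element (r, c) is at offset r * cols_ + c.
    double& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);  // debug-only bounds check
        return data_[r * cols_ + c];
    }
    double operator()(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    std::size_t size() const { return data_.size(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;  // the only allocation, lives on the heap
};
```

With this sketch, the declaration from the question would become something like Grid2D x(5000, 500), y(5000, 500), z(5000, 500); inside main, with elements accessed as x(i, j). Each grid is one allocation instead of 5001, and the rows are adjacent in memory.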

All the above assumes that you have some reason, a good one or a poor one, for wanting to craft your own multidimensional array mechanism. If you have no reason, and expect to use multidimensional arrays again, strongly consider installing a library:

For something that plays nicely with the STL, use the Boost Multidimensional Array library.
For speed, use Blitz++.

Answer 2

Answered by xtofl

These arrays are on the stack. Stacks are quite limited in size. You have probably run into a ... stack overflow :)

If you want to avoid this, you need to put them on the free store:

double* x = new double[5000 * 500];

But you had better get into the good habit of using the standard containers, which wrap all this for you:

std::vector< std::vector<double> > x( 5000, std::vector<double>(500) );
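
As a rough sketch of the container approach, the snippet below builds a 5000 x 500 style matrix of doubles with nested vectors. The names Matrix and make_matrix are hypothetical and introduced only for illustration. It relies on the (count, value) constructor of std::vector, which takes the number of elements first and the value to copy second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector< std::vector<double> > Matrix;

// Hypothetical helper: build a rows x cols matrix of zeros on the heap.
Matrix make_matrix(std::size_t rows, std::size_t cols) {
    // (count, value): `rows` copies of a row holding `cols` zero-initialized doubles.
    Matrix m(rows, std::vector<double>(cols));
    assert(m.size() == rows);
    return m;
}
```

A call such as Matrix x = make_matrix(5000, 500); then allows the familiar x[i][j] syntax, and the memory is released automatically when x goes out of scope.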

Plus: even if the stack fits the arrays, you still need room for functions to put their frames on it.

Answer 3

Answered by Benoît

You may want to try Boost.Multi_array:

#include <boost/multi_array.hpp>

typedef boost::multi_array<double, 2> Double2d;
Double2d x(boost::extents[5000][500]);
Double2d y(boost::extents[5000][500]);
Double2d z(boost::extents[5000][500]);

The actual large memory chunk will be allocated on the heap and automatically deallocated when necessary.

Answer 4

Answered by Norman Ramsey

Your declaration should appear at top level, outside any procedure or method.

By far the easiest way to diagnose a segfault in C or C++ code is to use valgrind. If one of your arrays is at fault, valgrind will pinpoint exactly where and how. If the fault lies elsewhere, it will tell you that, too.

valgrind can be used on any x86 binary but will give more information if you compile with gcc -g.

Answer 5

Answered by Robert S. Barnes

One reservation about always using vector: as far as I understand it, if you walk off the end of the array it just allocates a larger array and copies everything over, which might create subtle and hard-to-find errors when you are really trying to work with a fixed-size array. At least with a real array you'll segfault if you walk off the end, making the error easier to catch.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {

typedef double (*array5k_t)[5000];

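// The cast is required when compiling as C++ (calloc returns void*).
array5k_t array5k = (array5k_t)calloc(5000, sizeof(double)*5000);

// Out-of-bounds write: undefined behavior, which typically shows up as a segfault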
array5k[5000][5001] = 10;

return 0;
}
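
For comparison, here is a small sketch of how std::vector treats an out-of-range index. As specified by the standard library, a vector grows only through calls such as push_back or resize. Indexing past the end with operator[] is undefined behavior rather than a reallocation, and at() throws std::out_of_range. The helper names at_throws and demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns true if reading v.at(i) throws std::out_of_range.
bool at_throws(const std::vector<double>& v, std::size_t i) {
    try {
        (void)v.at(i);  // bounds-checked access
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Hypothetical demo of checked access on a fixed-size vector.
void demo() {
    std::vector<double> v(5000);
    assert(!at_throws(v, 4999));  // last valid index
    assert(at_throws(v, 5000));   // one past the end: exception, no growth
    assert(v.size() == 5000);     // the failed access did not resize the vector
    v.push_back(1.0);             // only explicit insertion grows it
    assert(v.size() == 5001);
}
```

Using at() in place of operator[] therefore gives a deterministic error on an off-by-one index, at the cost of a bounds check per access.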

Answer 6

Answered by Charlie

Looks to me like you have an honest-to-Spolsky stack overflow!

Try compiling your program with gcc's -fstack-check option. If your arrays are too big to allocate on the stack, you'll get a StorageError exception.

I think it's a good bet, though, as 5000*500*3 doubles (8 bytes each) comes to around 60 MB, which is far more than a typical default stack limit (commonly 8 MB on Linux). You'll have to allocate your big arrays on the heap.
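
The arithmetic can be checked in code. The sketch below computes the bytes required and compares them with the current soft stack limit. It assumes a POSIX system such as Linux, where getrlimit with RLIMIT_STACK is available, and the helper names bytes_needed and fits_in_stack are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/resource.h>  // POSIX getrlimit; assumes Linux or another POSIX system

// Bytes needed for `count` arrays of rows x cols doubles.
std::size_t bytes_needed(std::size_t count, std::size_t rows, std::size_t cols) {
    return count * rows * cols * sizeof(double);
}

// Hypothetical helper: true if `bytes` is below the current soft stack limit.
// This ignores the stack space the rest of the program needs, so it is optimistic.
bool fits_in_stack(std::size_t bytes) {
    struct rlimit limit;
    int rc = getrlimit(RLIMIT_STACK, &limit);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
    if (limit.rlim_cur == RLIM_INFINITY) return true;
    return static_cast<rlim_t>(bytes) < limit.rlim_cur;
}
```

With 8-byte doubles, bytes_needed(3, 5000, 500) evaluates to 60,000,000. Passing that value to fits_in_stack shows whether the three arrays could fit as local variables under the limit in effect.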

Answer 7

Answered by Tom

Another option, besides the ones above, is to raise the maximum stack size before running the program:

ulimit -s stack_area

Here stack_area is the new limit. In bash it is given in units of 1024 bytes.
