C++: Do zero-initialized variables in the .bss section occupy space in the ELF file?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/610682/

Time: 2020-08-27 16:14:15  Source: igfitidea

Tags: c++, storage, elf, segments

Asked by Wouter Lievens

If I understand correctly, the .bss section in ELF files is used to allocate space for zero-initialized variables. Our tool chain produces ELF files, hence my question: does the .bss section actually have to contain all those zeroes? It seems such an awful waste of space that when, say, I allocate a global ten megabyte array, it results in ten megabytes of zeroes in the ELF file. What am I getting wrong here?

Answered by Johannes Schaub - litb

It has been some time since I worked with ELF, but I think I still remember this stuff. No, it does not physically contain those zeros. If you look at an ELF file's program headers, you will see that each header has two numbers: one is the size in the file, and the other is the size the segment has when allocated in virtual memory (readelf -l ./a.out):

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
  INTERP         0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x00454 0x00454 R E 0x1000
  LOAD           0x000454 0x08049454 0x08049454 0x00104 0x61bac RW  0x1000
  DYNAMIC        0x000468 0x08049468 0x08049468 0x000d0 0x000d0 RW  0x4
  NOTE           0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

Headers of type LOAD are the ones that are copied into virtual memory when the file is loaded for execution. Other headers contain other information, like the shared libraries that are needed. As you see, FileSiz and MemSiz differ significantly for the header that contains the .bss section (the second LOAD one):

0x00104 (file-size) 0x61bac (mem-size)

For this example code:

int a[100000];
int main() { }
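
The two numbers above can be checked with a little arithmetic. The following is only a sketch: the constants are copied from the second LOAD line of the readelf output, the names (kFileSiz, kMemSiz, zero_fill_bytes, kArrayBytes) are made up for illustration, and a 4-byte int is assumed, as on the 32-bit x86 target shown.

```cpp
#include <cassert>
#include <cstddef>

// Values copied from the second LOAD line of the readelf output above.
constexpr std::size_t kFileSiz = 0x00104;
constexpr std::size_t kMemSiz  = 0x61bac;

// Bytes of the segment that exist only in memory and have no backing in the file.
// (Requires C++14 or later because of the assert inside a constexpr function.)
constexpr std::size_t zero_fill_bytes(std::size_t file_size, std::size_t mem_size) {
    assert(mem_size >= file_size);  // a segment is never smaller in memory than in the file
    return mem_size - file_size;
}

// Size of `int a[100000]`, ASSUMING a 4-byte int.
constexpr std::size_t kArrayBytes = 100000 * 4;

static_assert(zero_fill_bytes(kFileSiz, kMemSiz) == 0x61aa8, "400040 bytes are memory-only");
static_assert(kArrayBytes == 0x61a80, "the array itself is 400000 bytes");
static_assert(zero_fill_bytes(kFileSiz, kMemSiz) >= kArrayBytes, "the array fits in the gap");
```

So roughly 400 KB of the segment is described by the header but takes no space in the file; the few extra bytes beyond the array are other zero-filled data and padding.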

The ELF specification says that the part of a segment by which the mem-size exceeds the file-size is simply filled with zeros in virtual memory. The segment-to-section mapping of the second LOAD header is like this:

03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss

So there are some other sections in there too. Some are for C++ constructors/destructors, and .jcr is the same thing for Java. Then it contains a copy of the .dynamic section and other stuff useful for dynamic linking (I believe this is the place that contains the needed shared libraries, among other things). After that comes the .data section, which contains initialized globals and local static variables. At the end, the .bss section appears, which is filled with zeros at load time because the file-size does not cover it.
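
The same FileSiz/MemSiz gap can be observed from inside a running program. This is a minimal sketch, assuming Linux with glibc, whose dl_iterate_phdr reports the program headers of each loaded object (the main program first). The helper names visit_main_program and largest_zero_fill are hypothetical.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info, ElfW, PT_LOAD (glibc-specific)
#include <cassert>
#include <cstddef>

int a[100000];  // the same zero-initialized global as in the example above

// Callback for dl_iterate_phdr: records the largest (MemSiz - FileSiz) gap
// among the PT_LOAD segments of the first object visited, which is the main program.
static int visit_main_program(dl_phdr_info* info, std::size_t, void* data) {
    auto* largest = static_cast<std::size_t*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr)& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD) continue;
        assert(ph.p_memsz >= ph.p_filesz);  // required by the ELF format
        const std::size_t gap = static_cast<std::size_t>(ph.p_memsz - ph.p_filesz);
        if (gap > *largest) *largest = gap;
    }
    return 1;  // non-zero stops the iteration after the main program
}

// Number of bytes the loader had to zero-fill in the most "bss-heavy" segment.
std::size_t largest_zero_fill() {
    std::size_t largest = 0;
    dl_iterate_phdr(visit_main_program, &largest);
    return largest;
}
```

Calling largest_zero_fill() from main and comparing the result with sizeof(a) should show that the array is covered by the memory-only part of a LOAD segment rather than by file contents.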

By the way, you can see which output section a particular symbol is going to be placed in by using the -M linker option. For gcc, you use -Wl,-M to pass the option through to the linker. The output below shows that a is allocated within .bss. It may help you verify that your uninitialized objects really end up in .bss and not somewhere else:

.bss            0x08049560    0x61aa0
 [many input .o files...]
 *(COMMON) 
 *fill*         0x08049568       0x18 00
 COMMON         0x08049580    0x61a80 /tmp/cc2GT6nS.o
                0x08049580                a
                0x080ab000                . = ALIGN ((. != 0x0)?0x4:0x1) 
                0x080ab000                . = ALIGN (0x4) 
                0x080ab000                . = ALIGN (0x4) 
                0x080ab000                _end = .

Older GCC versions (before GCC 10, which switched the default to -fno-common) keep uninitialized globals in a COMMON section by default, for compatibility with old compilers that allow globals to be defined twice in a program without multiple-definition errors. Use -fno-common to make GCC use the .bss section for object files. This makes no difference for the final linked executable because, as you see, the data ends up in a .bss output section anyway; that is controlled by the linker script, which you can display with ld -verbose. But that shouldn't scare you, it's just an internal detail. See the gcc manpage.

Answered by D.Shawley

The .bss section in an ELF file is used for static data which is not initialized programmatically but is guaranteed to be set to zero at runtime. Here's a little example that will explain the difference.

int main() {
    static int bss_test1[100];
    static int bss_test2[100] = {0};
    return 0;
}

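In this case bss_test1 is placed into .bss since it is uninitialized. bss_test2, however, is placed into the .data segment along with a bunch of zeros (at least with the toolchain used here; current GCC releases place zero-initialized objects in .bss as well unless -fno-zero-initialized-in-bss is given). The runtime loader basically allocates the amount of space reserved for .bss and zeroes it out before any userland code begins executing.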

You can see the difference using objdump, nm, or similar utilities:

moozletoots$ objdump -t a.out | grep bss_test
08049780 l     O .bss   00000190              bss_test1.3
080494c0 l     O .data  00000190              bss_test2.4

This is usually one of the first surprises that embedded developers run into: never initialize statics to zero explicitly. The runtime loader (usually) takes care of that. As soon as you initialize anything explicitly, you are telling the compiler/linker to include the data in the executable image.
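
Whichever section the toolchain picks, the values a program observes are the same, because the language guarantees zero-initialization of static storage. The sketch below reuses the two arrays from the example. The helpers all_zero and statics_are_zero are hypothetical names added for illustration.

```cpp
#include <cassert>
#include <cstddef>

static int bss_test1[100];        // no initializer: zero-initialized static storage
static int bss_test2[100] = {0};  // explicit zero initializer

// True when every element in [p, p + n) is zero.
bool all_zero(const int* p, std::size_t n) {
    assert(p != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] != 0) return false;
    }
    return true;
}

// Both arrays read as zeros at runtime; only their placement in the image may differ.
bool statics_are_zero() {
    return all_zero(bss_test1, 100) && all_zero(bss_test2, 100);
}
```

The difference discussed above is therefore purely about image size, which is what objdump or nm reveals, not about program behavior.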

Answered by mouviciel

A .bss section is not stored in an executable file. Of the most common sections (.text, .data, .bss), only .text (actual code) and .data (initialized data) are present in an ELF file.

Answered by mouviciel

That is correct: .bss is not physically present in the file. Only the information about its size is present, so that the dynamic loader can allocate the .bss section for the application program. As a rule of thumb, only LOAD and TLS segments get memory in the application program; the rest are used by the dynamic loader.

As for static executable files, the .bss section is also given space in the executable.

This is common in embedded applications where there is no loader.

Suman
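
For loader-less targets, another arrangement seen in practice is for the startup code to clear the .bss region itself before main runs, so the zeros need not be stored in the image. The following is only a hosted sketch of that idea, not any particular toolchain's startup code: clear_region and demo_clear are hypothetical names, and on a real target the region bounds would come from symbols defined in the linker script.

```cpp
#include <cassert>
#include <cstddef>

// Clears [begin, end) byte by byte, the way a bare-metal startup routine
// might clear .bss before calling main().
void clear_region(unsigned char* begin, unsigned char* end) {
    assert(begin <= end);
    while (begin != end) {
        *begin++ = 0;
    }
}

// Hosted stand-in: a local buffer plays the role of the .bss region.
// On a target, begin/end would be linker-script symbols (names vary by toolchain).
bool demo_clear() {
    unsigned char fake_bss[64];
    for (std::size_t i = 0; i < sizeof fake_bss; ++i) fake_bss[i] = 0xAA;  // simulate RAM garbage
    clear_region(fake_bss, fake_bss + sizeof fake_bss);
    for (std::size_t i = 0; i < sizeof fake_bss; ++i) {
        if (fake_bss[i] != 0) return false;
    }
    return true;
}
```

Whether the zeros are stored in the image or written at startup is a property of the specific toolchain and linker script, so the map file is the place to check.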