What does the brk() system call do?
Note: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/6988487/
Asked by nik
According to the Linux programmer's manual:
brk() and sbrk() change the location of the program break, which defines the end of the process's data segment.
What does "data segment" mean here? Is it just the data segment, or the data, BSS, and heap combined?
According to wiki:
Sometimes the data, BSS, and heap areas are collectively referred to as the "data segment".
I see no reason for changing the size of just the data segment. If it means the data, BSS, and heap collectively, then it makes sense, as the heap will get more space.
Which brings me to my second question. In all the articles I have read so far, the author says that the heap grows upward and the stack grows downward. What they do not explain is what happens when the heap occupies all the space between the heap and the stack.
Accepted answer by zwol
In the diagram you posted, the "break" (the address manipulated by brk and sbrk) is the dotted line at the top of the heap.
The documentation you've read describes this as the end of the "data segment" because in traditional (pre-shared-libraries, pre-mmap) Unix the data segment was contiguous with the heap; before program start, the kernel would load the "text" and "data" blocks into RAM starting at address zero (actually a little above address zero, so that the NULL pointer genuinely didn't point to anything) and set the break address to the end of the data segment. The first call to malloc would then use sbrk to move the break up and create the heap between the top of the data segment and the new, higher break address, as shown in the diagram, and subsequent calls to malloc would use it to make the heap bigger as necessary.
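To make the idea of "moving the break up" concrete, here is a minimal sketch for Linux with glibc. The helper name grow_break is made up for illustration; it only relies on sbrk(0) returning the current break and sbrk(n) raising it by n bytes.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare sbrk() */
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: raise the break by n bytes and report how far it
   actually moved, or -1 if the kernel refused the request. */
intptr_t grow_break(intptr_t n) {
    char *before = sbrk(0);          /* sbrk(0) reads the break without moving it */
    if (sbrk(n) == (void *)-1)       /* sbrk signals failure with (void *)-1 */
        return -1;
    char *after = sbrk(0);
    return after - before;
}
```

In a single-threaded program where nothing else touches the break between the two reads, grow_break(4096) should report 4096: the bytes between the old and new break are the newly created heap space.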
Meanwhile, the stack starts at the top of memory and grows down. The stack doesn't need explicit system calls to make it bigger; either it starts off with as much RAM allocated to it as it can ever have (this was the traditional approach) or there is a region of reserved addresses below the stack, to which the kernel automatically allocates RAM when it notices an attempt to write there (this is the modern approach). Either way, there may or may not be a "guard" region at the bottom of the address space that can be used for stack. If this region exists (all modern systems do this) it is permanently unmapped; if either the stack or the heap tries to grow into it, you get a segmentation fault. Traditionally, though, the kernel made no attempt to enforce a boundary; the stack could grow into the heap, or the heap could grow into the stack, and either way they would scribble over each other's data and the program would crash. If you were very lucky it would crash immediately.
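On Linux, how far the stack is allowed to grow automatically is bounded by the RLIMIT_STACK resource limit. The following is a small sketch, not part of the original answer, that reads the soft limit with getrlimit; the helper name stack_soft_limit is an assumption made for illustration.

```c
#include <sys/resource.h>

/* Hypothetical helper: store the soft stack-size limit (in bytes) in *out.
   Returns 0 on success and -1 on failure, mirroring getrlimit(). */
int stack_soft_limit(rlim_t *out) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;   /* may be RLIM_INFINITY if the stack is unlimited */
    return 0;
}
```

A stack that tries to grow past this limit gets a segmentation fault rather than silently running into other mappings.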
I'm not sure where the number 512GB in this diagram comes from. It implies a 64-bit virtual address space, which is inconsistent with the very simple memory map you have there. A real 64-bit address space looks more like this:
Legend: t: text, d: data, b: BSS
This is not remotely to scale, and it shouldn't be interpreted as exactly how any given OS does things (after I drew it I discovered that Linux actually puts the executable much closer to address zero than I thought it did, and the shared libraries at surprisingly high addresses). The black regions of this diagram are unmapped (any access causes an immediate segfault), and they are gigantic relative to the gray areas. The light-gray regions are the program and its shared libraries (there can be dozens of shared libraries); each has an independent text and data segment (and "bss" segment, which also contains global data but is initialized to all-bits-zero rather than taking up space in the executable or library on disk). The heap is no longer necessarily contiguous with the executable's data segment; I drew it that way, but it looks like Linux, at least, doesn't do that. The stack is no longer pegged to the top of the virtual address space, and the distance between the heap and the stack is so enormous that you don't have to worry about crossing it.
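One way to see the layout on a real system is to print one representative address from each region. This is only a rough sketch for Linux with glibc; the names print_layout, initialized_global and zeroed_global are invented for the example, and the exact addresses vary from run to run because of ASLR.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare sbrk() */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int initialized_global = 42;   /* stored in the data segment */
int zeroed_global;             /* stored in BSS */

/* Hypothetical helper: print one address per region.
   Returns 1 if the stack sits above the current break, 0 otherwise. */
int print_layout(void) {
    int on_stack = 0;
    void *on_heap = malloc(16);
    void *brk_now = sbrk(0);

    printf("text  %p\n", (void *)(uintptr_t)print_layout);
    printf("data  %p\n", (void *)&initialized_global);
    printf("bss   %p\n", (void *)&zeroed_global);
    printf("heap  %p\n", on_heap);
    printf("break %p\n", brk_now);
    printf("stack %p\n", (void *)&on_stack);

    free(on_heap);
    return (uintptr_t)&on_stack > (uintptr_t)brk_now;
}
```

On a typical 64-bit Linux system the gap between the "break" and "stack" lines is many terabytes of address space, which is the point the paragraph above makes.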
The break is still the upper limit of the heap. However, what I didn't show is that there could be dozens of independent allocations of memory off there in the black somewhere, made with mmap instead of brk. (The OS will try to keep these far away from the brk area so they don't collide.)
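As a sketch of the difference, the following maps an anonymous region with mmap and checks that the program break did not move. It assumes Linux with glibc; the helper name mmap_leaves_break_alone is hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers for sbrk() and MAP_ANONYMOUS */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes of anonymous memory, touch it, unmap it,
   and report whether the break stayed put (1) or moved (0).
   Returns -1 if mmap itself failed. */
int mmap_leaves_break_alone(size_t len) {
    void *before = sbrk(0);
    char *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;
    region[0] = 1;                 /* the mapping is ordinary, writable memory */
    void *after = sbrk(0);
    munmap(region, len);
    return before == after;
}
```

Unlike the brk heap, each such mapping can be released independently with munmap, in any order.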
Answer by Brian Gordon
I can answer your second question. malloc will fail and return a null pointer. That's why you always check for a null pointer when dynamically allocating memory.
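A minimal sketch of that check follows. The wrapper name checked_alloc and its error message are made up for illustration; the only library behavior it relies on is that malloc returns NULL on failure.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: allocate n bytes and report failure on stderr.
   Returns NULL on failure so the caller can still decide how to recover. */
void *checked_alloc(size_t n, const char *what) {
    void *p = malloc(n);
    if (p == NULL)
        fprintf(stderr, "out of memory allocating %zu bytes for %s\n", n, what);
    return p;
}
```

A caller would write something like `char *buf = checked_alloc(len, "line buffer"); if (buf == NULL) return -1;` and free the buffer when done.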
Answer by Anders Abel
The heap is placed last in the program's data segment. brk() is used to change (expand) the size of the heap. When the heap cannot grow any more, any malloc call will fail.
Answer by monchalve
The data segment is the portion of memory that holds all your static data, read in from the executable at launch and usually zero-filled.
Answer by R.. GitHub STOP HELPING ICE
There is a specially designated anonymous private memory mapping (traditionally located just beyond the data/bss, but modern Linux will actually adjust the location with ASLR). In principle it's no better than any other mapping you could create with mmap, but Linux has some optimizations that make it possible to expand the end of this mapping (using the brk syscall) upwards with reduced locking cost relative to what mmap or mremap would incur. This makes it attractive for malloc implementations to use when implementing the main heap.
Answer by luser droog
You can use brk and sbrk yourself to avoid the "malloc overhead" everyone's always complaining about. But you can't easily use this method in conjunction with malloc, so it's only appropriate when you don't have to free anything. Because you can't. Also, you should avoid any library calls which may use malloc internally. For example, strlen is probably safe, but fopen probably isn't.
Call sbrk just like you would call malloc. It returns a pointer to the old break, which is the start of the newly allocated bytes, and raises the break by the requested amount.
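#include <unistd.h>  /* declares sbrk() */

/* Returns the old break (start of n new bytes), or (void *)-1 on failure. */
void *myallocate(int n) {
    return sbrk(n);
}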
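Here is a slightly more defensive sketch of the same idea, assuming Linux with glibc. The name sbrk_alloc is hypothetical; it differs from the function above only in reporting failure as NULL, the way malloc does.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare sbrk() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical variant of myallocate: reports failure as NULL, like malloc. */
void *sbrk_alloc(size_t n) {
    if (n > (size_t)INTPTR_MAX)      /* sbrk takes a signed increment */
        return NULL;
    void *p = sbrk((intptr_t)n);
    return p == (void *)-1 ? NULL : p;
}
```

Two consecutive calls return adjacent blocks: if a = sbrk_alloc(32) and b = sbrk_alloc(32) with nothing else moving the break in between, then b points exactly 32 bytes past a. Note that nothing here rounds sizes up, so keeping later blocks aligned is the caller's job.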
While you can't free individual allocations (because there's no malloc overhead, remember), you can free the entire space by calling brk with the value returned by the first call to sbrk, thus rewinding the break.
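#include <unistd.h>  /* declares brk() and sbrk() */

void *memorypool;  /* break address recorded before any pool allocation */

/* Record the current break; sbrk(0) returns it without moving it. */
void initmemorypool(void) {
    memorypool = sbrk(0);
}

/* Free everything allocated since initmemorypool() by rewinding the break. */
void resetmemorypool(void) {
    brk(memorypool);
}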
You could even stack these regions, discarding the most recent region by rewinding the break to the region's start.
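A rough sketch of such stacked regions, assuming Linux with glibc, might look like the following. The names region_push and region_pop and the fixed depth of 16 are invented for the example.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare brk() and sbrk() */
#include <unistd.h>

#define MAX_REGIONS 16                     /* arbitrary depth for this sketch */
static void *region_start[MAX_REGIONS];    /* break address at each push */
static int region_depth = 0;

/* Open a new region by remembering where the break is now.
   Returns 0 on success, -1 if the stack of regions is full. */
int region_push(void) {
    if (region_depth == MAX_REGIONS)
        return -1;
    void *mark = sbrk(0);
    if (mark == (void *)-1)
        return -1;
    region_start[region_depth++] = mark;
    return 0;
}

/* Discard the most recent region by rewinding the break to its start.
   Returns 0 on success, -1 if no region is open or brk() failed. */
int region_pop(void) {
    if (region_depth == 0)
        return -1;
    return brk(region_start[--region_depth]);
}
```

Everything allocated with sbrk after a region_push() disappears at the matching region_pop(), while allocations made in earlier regions stay valid.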
One more thing ...
sbrk is also useful in code golf because it's 2 characters shorter than malloc.
Answer by skanzariya
malloc uses the brk system call to allocate memory.
#include <stdlib.h>  /* malloc, free */

int main(void) {
    char *a = malloc(10);  /* small request, served from the brk heap */
    free(a);
    return 0;
}
Run this simple program under strace and you will see it make the brk system call.
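The same effect can be observed from inside the process by comparing the break before and after an allocation. This is a sketch under the assumption of glibc's malloc, which typically serves small requests from the brk heap and large ones (above its mmap threshold, 128 KiB by default) with mmap; other allocators may behave differently. The helper name break_growth_for_malloc is hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* asks glibc headers to declare sbrk() */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: how many bytes the break moved while malloc(n) ran.
   Returns -1 if the allocation failed. */
intptr_t break_growth_for_malloc(size_t n) {
    char *before = sbrk(0);
    void *p = malloc(n);
    if (p == NULL)
        return -1;
    char *after = sbrk(0);     /* measure before freeing, since free may trim */
    free(p);
    return after - before;
}
```

Called with a small size early in a program, this usually reports a growth much larger than the request, because malloc extends the heap in big steps and then carves later allocations out of it. Called with a size of several megabytes, it usually reports no growth at all, because the block came from mmap.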