C语言 malloc() 内部是如何实现的?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3479330/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How is malloc() implemented internally?
提问by bodacydo
Can anyone explain how malloc()works internally?
谁能解释一下malloc()内部是如何工作的?
I have sometimes done strace programand I see a lot of sbrksystem calls, doing man sbrktalks about it being used in malloc()but not much more.
我有时做过strace program,我看到很多sbrk系统调用,man sbrk谈论它正在被使用,malloc()但不多。
采纳答案by DarkDust
The sbrksystem call moves the "border" of the data segment. This means it moves a border of an area in which a program may read/write data (letting it grow or shrink, although AFAIK no mallocreally gives memory segments back to the kernel with that method). Aside from that, there's also mmapwhich is used to map files into memory but is also used to allocate memory (if you need to allocate shared memory, mmapis how you do it).
该sbrk系统调用将数据段的“边界”。这意味着它移动了程序可以读/写数据的区域的边界(让它增长或缩小,尽管 AFAIK 没有malloc真正使用该方法将内存段返回给内核)。除此之外,还有mmapwhich 用于将文件映射到内存但也用于分配内存(如果您需要分配共享内存,mmap则是如何分配的)。
So you have two methods of getting more memory from the kernel: sbrkand mmap. There are various strategies on how to organize the memory that you've got from the kernel.
因此,您有两种从内核获取更多内存的方法:sbrk和mmap. 关于如何组织您从内核获得的内存,有多种策略。
One naive way is to partition it into zones, often called "buckets", which are dedicated to certain structure sizes. For example, a mallocimplementation could create buckets for 16, 64, 256 and 1024 byte structures. If you ask mallocto give you memory of a given size it rounds that number up to the next bucket size and then gives you an element from that bucket. If you need a bigger area malloccould use mmapto allocate directly with the kernel. If the bucket of a certain size is empty malloccould use sbrkto get more space for a new bucket.
一种简单的方法是将其划分为专用于某些结构尺寸的区域,通常称为“桶”。例如,一个malloc实现可以为 16、64、256 和 1024 字节结构创建桶。如果您要求malloc为您提供给定大小的内存,它将将该数字向上舍入到下一个存储桶大小,然后为您提供该存储桶中的一个元素。如果需要更大的区域malloc可以用mmap内核直接分配。如果某个大小的桶是空的,则malloc可以用来sbrk为新桶获得更多空间。
There are various mallocdesigns and there is propably no one true way of implementing mallocas you need to make a compromise between speed, overhead and avoiding fragmentation/space effectiveness. For example, if a bucket runs out of elements an implementation might get an element from a bigger bucket, split it up and add it to the bucket that ran out of elements. This would be quite space efficient but would not be possible with every design. If you just get another bucket via sbrk/mmapthat might be faster and even easier, but not as space efficient. Also, the design must of course take into account that "free" needs to make space available to mallocagain somehow. You don't just hand out memory without reusing it.
有各种各样的malloc设计,并且可能没有一种真正的实现方式,malloc因为您需要在速度、开销和避免碎片/空间效率之间做出妥协。例如,如果一个桶用完元素,一个实现可能会从更大的桶中获取一个元素,将其拆分并将其添加到用完元素的桶中。这将非常节省空间,但并非每种设计都可行。如果您只是通过sbrk/获得另一个存储桶,mmap那可能会更快,甚至更容易,但空间效率不高。此外,设计当然必须考虑到“免费”需要以malloc某种方式再次提供可用空间。你不会只分发内存而不重用它。
If you're interested, the OpenSER/Kamailio SIP proxy has two mallocimplementations (they need their own because they make heavy use of shared memory and the system mallocdoesn't support shared memory). See: https://github.com/OpenSIPS/opensips/tree/master/mem
如果您有兴趣,OpenSER/Kamailio SIP 代理有两个malloc实现(它们需要自己的实现,因为它们大量使用共享内存并且系统malloc不支持共享内存)。参见:https: //github.com/OpenSIPS/opensips/tree/master/mem
Then you could also have a look at the GNU libc mallocimplementation, but that one is very complicated, IIRC.
然后你也可以看看GNU libc mallocimplementation,但那个非常复杂,IIRC。
回答by doron
Simplistically malloc and free work like this:
简单地 malloc 和 free 工作是这样的:
malloc provides access to a process's heap. The heap is a construct in the C core library (commonly libc) that allows objects to obtain exclusive access to some space on the process's heap.
malloc 提供对进程堆的访问。堆是 C 核心库(通常是 libc)中的一种构造,它允许对象获得对进程堆上某些空间的独占访问权。
Each allocation on the heap is called a heap cell. This typically consists of a header that hold information on the size of the cell as well as a pointer to the next heap cell. This makes a heap effectively a linked list.
堆上的每个分配都称为堆单元。这通常由一个包含单元大小信息的标头以及指向下一个堆单元的指针组成。这使得堆有效地成为链表。
When one starts a process, the heap contains a single cell that contains all the heap space assigned on startup. This cell exists on the heap's free list.
当启动一个进程时,堆包含一个单元格,其中包含启动时分配的所有堆空间。此单元格存在于堆的空闲列表中。
When one calls malloc, memory is taken from the large heap cell, which is returned by malloc. The rest is formed into a new heap cell that consists of all the rest of the memory.
当调用 malloc 时,内存会从 malloc 返回的大堆单元中获取。其余部分形成一个新的堆单元,由所有其余内存组成。
When one frees memory, the heap cell is added to the end of the heap's free list. Subsequent mallocs walk the free list looking for a cell of suitable size.
当释放内存时,堆单元被添加到堆的空闲列表的末尾。随后的 malloc 遍历空闲列表以寻找合适大小的单元格。
As can be expected the heap can get fragmented and the heap manager may from time to time, try to merge adjacent heap cells.
可以预料,堆可能会碎片化,堆管理器可能会不时尝试合并相邻的堆单元。
When there is no memory left on the free list for a desired allocation, malloc calls brk or sbrk which are the system calls requesting more memory pages from the operating system.
当空闲列表上没有剩余内存用于所需分配时,malloc 会调用 brk 或 sbrk,它们是系统调用,从操作系统请求更多内存页面。
Now there are a few modification to optimize heap operations.
现在有一些修改来优化堆操作。
- For large memory allocations (typically > 512 bytes, the heap manager may go straight to the OS and allocate a full memory page.
- The heap may specify a minimum size of allocation to prevent large amounts of fragmentation.
- The heap may also divide itself into bins one for small allocations and one for larger allocations to make larger allocations quicker.
- There are also clever mechanisms for optimizing multi-threaded heap allocation.
- 对于大内存分配(通常 > 512 字节,堆管理器可能会直接进入操作系统并分配完整的内存页面。
- 堆可以指定最小分配大小以防止大量碎片。
- 堆也可以将自己分成若干箱,一个用于小分配,另一个用于较大分配,以更快地进行较大的分配。
- 还有一些用于优化多线程堆分配的巧妙机制。
回答by mgalgs
It's also important to realize that simply moving the program break pointer around with brkand sbrkdoesn't actually allocatethe memory, it just sets up the address space. On Linux, for example, the memory will be "backed" by actual physical pages when that address range is accessed, which will result in a page fault, and will eventually lead to the kernel calling into the page allocator to get a backing page.
同样重要的是要意识到,简单地移动程序中断指针brk而sbrk不是实际分配内存,它只是设置地址空间。例如,在 Linux 上,当访问该地址范围时,内存将被实际物理页面“支持”,这将导致页面错误,并最终导致内核调用页面分配器以获取支持页面。

