Python 的列表是如何实现的?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3917574/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How is Python's List Implemented?
提问by Greg
Is it a linked list, an array? I searched around and only found people guessing. My C knowledge isn't good enough to look at the source code.
是链表还是数组?我四处寻找,只发现人们在猜测。我的 C 知识不足以查看源代码。
采纳答案by Amber
It's a dynamic array. Practical proof: Indexing takes (of course with extremely small differences (0.0013 μsecs!)) the same time regardless of index:
这是一个动态数组。实际证明:无论索引如何,索引都需要(当然有极小的差异(0.0013 微秒!))相同的时间:
...>python -m timeit --setup="x = [None]*1000" "x[500]"
10000000 loops, best of 3: 0.0579 usec per loop
...>python -m timeit --setup="x = [None]*1000" "x[0]"
10000000 loops, best of 3: 0.0566 usec per loop
I would be astounded if IronPython or Jython used linked lists - they would ruin the performance of many many widely-used libraries built on the assumption that lists are dynamic arrays.
如果 IronPython 或 Jython 使用链表,我会感到震惊——它们会破坏许多基于列表是动态数组的假设而构建的广泛使用的库的性能。
回答by NullUserException
This is implementation dependent, but IIRC:
这取决于实现,但 IIRC:
- CPython uses an array of pointers
- Jython uses an
ArrayList - IronPython apparently also uses an array. You can browse the source codeto find out.
- CPython 使用指针数组
- Jython 使用一个
ArrayList - IronPython 显然也使用数组。您可以浏览源代码进行查找。
Thus they all have O(1) random access.
因此它们都有 O(1) 随机访问。
回答by Amber
In CPython, lists are arrays of pointers. Other implementations of Python may choose to store them in different ways.
在 CPython 中,列表是指针数组。Python 的其他实现可能会选择以不同的方式存储它们。
回答by Fred Foo
The C code is pretty simple, actually. Expanding one macro and pruning some irrelevant comments, the basic structure is in listobject.h, which defines a list as:
实际上,C 代码非常简单。扩展一个宏并修剪一些不相关的注释,基本结构是 in listobject.h,它定义了一个列表:
typedef struct {
PyObject_HEAD
Py_ssize_t ob_size;
/* Vector of pointers to list elements. list[0] is ob_item[0], etc. */
PyObject **ob_item;
/* ob_item contains space for 'allocated' elements. The number
* currently in use is ob_size.
* Invariants:
* 0 <= ob_size <= allocated
* len(list) == ob_size
* ob_item == NULL implies ob_size == allocated == 0
*/
Py_ssize_t allocated;
} PyListObject;
PyObject_HEADcontains a reference count and a type identifier. So, it's a vector/array that overallocates. The code for resizing such an array when it's full is in listobject.c. It doesn't actually double the array, but grows by allocating
PyObject_HEAD包含一个引用计数和一个类型标识符。所以,它是一个过度分配的向量/数组。用于在此类数组已满时调整其大小的代码在listobject.c. 它实际上并没有将数组加倍,而是通过分配来增长
new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
new_allocated += newsize;
to the capacity each time, where newsizeis the requested size (not necessarily allocated + 1because you can extendby an arbitrary number of elements instead of append'ing them one by one).
每次的容量,newsize请求的大小在哪里(不一定allocated + 1是因为您可以extend通过任意数量的元素而不是append一个一个地将它们连接起来)。
See also the Python FAQ.
另请参阅Python 常见问题解答。
回答by ravi77o
According to the documentation,
根据文档,
Python's lists are really variable-length arrays, not Lisp-style linked lists.
Python 的列表实际上是变长数组,而不是 Lisp 风格的链表。
回答by RussellStewart
As others have stated above, the lists (when appreciably large) are implemented by allocating a fixed amount of space, and, if that space should fill, allocating a larger amount of space and copying over the elements.
正如上面其他人所说,列表(当相当大时)是通过分配固定数量的空间来实现的,如果该空间应该填满,则分配更多的空间并复制元素。
To understand why the method is O(1) amortized, without loss of generality, assume we have inserted a = 2^n elements, and we now have to double our table to 2^(n+1) size. That means we're currently doing 2^(n+1) operations. Last copy, we did 2^n operations. Before that we did 2^(n-1)... all the way down to 8,4,2,1. Now, if we add these up, we get 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^n) = O(a) total insertions (i.e. O(1) amortized time). Also, it should be noted that if the table allows deletions the table shrinking has to be done at a different factor (e.g 3x)
为了理解为什么该方法是 O(1) 分摊的,不失一般性,假设我们已经插入了 a = 2^n 个元素,我们现在必须将我们的表加倍到 2^(n+1) 大小。这意味着我们目前正在进行 2^(n+1) 次操作。最后一个副本,我们做了 2^n 次操作。在此之前,我们做了 2^(n-1)... 一直到 8,4,2,1。现在,如果我们把这些加起来,我们得到 1 + 2 + 4 + 8 + ... + 2^(n+1) = 2^(n+2) - 1 < 4*2^n = O(2^ n) = O(a) 总插入次数(即 O(1) 分摊时间)。另外,应该注意的是,如果表允许删除,则必须以不同的因子(例如 3x)进行表收缩
回答by Lesya
I would suggest Laurent Luce's article "Python list implementation". Was really useful for me because the author explains how the list is implemented in CPython and uses excellent diagrams for this purpose.
我建议Laurent Luce 的文章“Python 列表实现”。对我来说真的很有用,因为作者解释了列表是如何在 CPython 中实现的,并为此使用了出色的图表。
List object C structure
A list object in CPython is represented by the following C structure.
ob_itemis a list of pointers to the list elements. allocated is the number of slots allocated in memory.typedef struct { PyObject_VAR_HEAD PyObject **ob_item; Py_ssize_t allocated; } PyListObject;It is important to notice the difference between allocated slots and the size of the list. The size of a list is the same as
len(l). The number of allocated slots is what has been allocated in memory. Often, you will see that allocated can be greater than size. This is to avoid needing callingrealloceach time a new elements is appended to the list.
列表对象 C 结构
CPython 中的列表对象由以下 C 结构表示。
ob_item是指向列表元素的指针列表。分配的是内存中分配的插槽数。typedef struct { PyObject_VAR_HEAD PyObject **ob_item; Py_ssize_t allocated; } PyListObject;重要的是要注意分配的插槽和列表大小之间的差异。列表的大小与
len(l). 已分配槽的数量是在内存中分配的数量。通常,您会看到分配的可能大于大小。这是为了避免realloc每次将新元素附加到列表时都需要调用。
...
...
Append
We append an integer to the list:
l.append(1). What happens?We continue by adding one more element:
l.append(2).list_resizeis called with n+1 = 2 but because the allocated size is 4, there is no need to allocate more memory. Same thing happens when we add 2 more integers:l.append(3),l.append(4). The following diagram shows what we have so far.
附加
我们继续添加一个元素:
l.append(2)。list_resize以 n+1 = 2 调用,但由于分配的大小为 4,因此无需分配更多内存。当我们再添加 2 个整数时会发生同样的事情:l.append(3),l.append(4)。下图显示了我们目前所拥有的。
...
...
Insert
Let's insert a new integer (5) at position 1:
l.insert(1,5)and look at what happens internally.
插入
...
...
Pop
When you pop the last element:
l.pop(),listpop()is called.list_resizeis called insidelistpop()and if the new size is less than half of the allocated size then the list is shrunk.You can observe that slot 4 still points to the integer but the important thing is the size of the list which is now 4. Let's pop one more element. In
list_resize(), size – 1 = 4 – 1 = 3 is less than half of the allocated slots so the list is shrunk to 6 slots and the new size of the list is now 3.You can observe that slot 3 and 4 still point to some integers but the important thing is the size of the list which is now 3.
流行音乐
当您弹出最后一个元素时:
l.pop(),listpop()被调用。list_resize在内部调用listpop(),如果新大小小于分配大小的一半,则列表会缩小。您可以观察到插槽 4 仍然指向整数,但重要的是列表的大小,现在是 4。让我们再弹出一个元素。在 中
list_resize(),size – 1 = 4 – 1 = 3 小于已分配插槽的一半,因此列表缩小到 6 个插槽,并且列表的新大小现在为 3。
...
...
RemovePython list object has a method to remove a specific element:
l.remove(5).
回答by hasib
A list in Python is something like an array, where you can store multiple values. List is mutable that means you can change it. The more important thing you should know, when we create a list, Python automatically creates a reference_id for that list variable. If you change it by assigning others variable the main list will be change. Let's try with a example:
Python 中的列表类似于数组,您可以在其中存储多个值。List 是可变的,这意味着您可以更改它。你应该知道的更重要的事情是,当我们创建一个列表时,Python 会自动为该列表变量创建一个 reference_id。如果您通过分配其他变量来更改它,则主列表将更改。让我们用一个例子试试:
list_one = [1,2,3,4]
my_list = list_one
#my_list: [1,2,3,4]
my_list.append("new")
#my_list: [1,2,3,4,'new']
#list_one: [1,2,3,4,'new']
We append my_listbut our main list has changed. That mean's list didn't assign as a copy list assign as its reference.
我们追加my_list但我们的主要列表已更改。这意味着的列表没有分配为副本列表分配为其参考。
回答by gaurav
In CPython list is implemented as dynamic array, and therefore when we append at that time not only one macro is added but some more space is allocated so that everytime new space should not be added.
在 CPython 中,列表被实现为动态数组,因此当我们 append 时,不仅添加了一个宏,而且分配了更多空间,因此不应每次添加新空间。


