How many bytes per element are there in a Python list (tuple)?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/135664/

Time: 2020-11-03 19:32:49  Source: igfitidea

How many bytes per element are there in a Python list (tuple)?

python, memory-management

Asked by jfs

For example, how much memory is required to store a list of one million (32-bit) integers?

alist = range(1000000) # or list(range(1000000)) in Python 3.0

Accepted answer by jfs

Useful links:

How to get memory size/usage of python object

Memory sizes of python objects?

if you put data into dictionary, how do we calculate the data size?

However, they don't give a definitive answer. The ways to go:

  1. Measure memory consumed by Python interpreter with/without the list (use OS tools).

  2. Use a third-party extension module which defines some sort of sizeof(PyObject).

Update:

Recipe 546530: Size of Python objects (revised)

import asizeof

N = 1000000
print asizeof.asizeof(range(N)) / N
# -> 20 (python 2.5, WinXP, 32-bit Linux)
# -> 33 (64-bit Linux)

Answer by Dan Lenski

"It depends." Python allocates space for lists in such a way as to achieve amortized constant time for appending elements to the list.

In practice, what this means with the current implementation is... the list always has space allocated for a power-of-two number of elements. So range(1000000) will actually allocate a list big enough to hold 2^20 elements (~1.049 million).
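The over-allocation described here can be observed rather than assumed. The following is a minimal sketch, assuming CPython, where sys.getsizeof reflects the allocated capacity of a list; the helper name growth_points is made up for illustration. The exact growth pattern is an implementation detail that varies between interpreter versions, so it is worth checking on your own build.

```python
import sys

def growth_points(n_appends):
    """Return (length, shallow size) pairs recorded each time the list's allocation grows."""
    items = []
    last_size = sys.getsizeof(items)
    points = [(0, last_size)]
    for _ in range(n_appends):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last_size:
            # The reported size only changes when the list reallocates.
            points.append((len(items), size))
            last_size = size
    return points

points = growth_points(1000)
for length, size in points[:10]:
    print(length, size)
```

Far fewer size changes than appends are reported, which is what makes appending amortized constant time.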

This is only the space required to store the list structure itself (which is an array of pointers to the Python objects for each element). A 32-bit system will require 4 bytes per element, a 64-bit system will use 8 bytes per element.
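A quick way to check the per-element pointer cost on your own interpreter is sketched below. It assumes CPython, where sys.getsizeof on a list covers the pointer array but not the referenced objects; the helper name shallow_bytes_per_element is made up for illustration.

```python
import struct
import sys

# Size of a C pointer on this build: 4 on 32-bit, 8 on 64-bit.
POINTER_SIZE = struct.calcsize("P")

def shallow_bytes_per_element(n):
    """Shallow list size per element, excluding the empty-list header."""
    items = [None] * n
    return (sys.getsizeof(items) - sys.getsizeof([])) / n

per_element = shallow_bytes_per_element(1_000_000)
print(POINTER_SIZE, per_element)
```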

Furthermore, you need space to store the actual elements. This varies widely. For small integers (-5 to 256 currently), no additional space is needed, but for larger numbers Python allocates a new object for each integer, which takes 10-100 bytes and tends to fragment memory.
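Both points can be checked interactively. This sketch assumes CPython: the size of an int object grows with its magnitude, and small integers are shared objects while larger ones are allocated separately. The integers are built with int(...) at run time so that compile-time constant sharing does not affect the identity check. The byte counts depend on the interpreter version and platform, so none are shown here.

```python
import sys

# Object size as a function of magnitude.
for value in (1, 2**30, 2**64, 2**200):
    print(value.bit_length(), sys.getsizeof(value))

# Small integers are cached, so two separately created values are the same object.
a = int("100")
b = int("100")
# Larger integers get a fresh object each time.
c = int("100000")
d = int("100000")
print(a is b, c is d)
```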

Bottom line: it's complicated and Python lists are not a good way to store large homogeneous data structures. For that, use the array module or, if you need to do vectorized math, use NumPy.
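As a rough illustration of the difference, the sketch below compares a list of integers with an array of C ints from the standard array module. It assumes CPython; the list total adds the shallow list size to the size of every int object, which is only an estimate of the real footprint.

```python
import sys
from array import array

N = 100_000
as_list = list(range(N))
as_array = array("i", as_list)  # "i" is a signed C int, typically 4 bytes

# The list stores pointers; the int objects live elsewhere and are counted separately.
list_total = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
# The array stores the raw values inline in one buffer.
array_total = sys.getsizeof(as_array)

print("list  bytes/element:", list_total / N)
print("array bytes/element:", array_total / N)
```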

PS- Tuples, unlike lists, are not designed to have elements progressively appended to them. I don't know how the allocator works, but don't even think about using it for large data structures :-)

Answer by Constantin

Addressing the "tuple" part of the question

Declaration of CPython's PyTuple in a typical build configuration boils down to this:

struct PyTuple {
  size_t refcount; // tuple's reference count
  typeobject *type; // tuple type object
  size_t n_items; // number of items in tuple
  PyObject *items[1]; // contains space for n_items elements
};

The size of a PyTuple instance is fixed during its construction and cannot be changed afterwards. The number of bytes occupied by a PyTuple can be calculated as

sizeof(size_t) x 2 + sizeof(void*) x (n_items + 1).
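The formula can be evaluated for the current platform with the struct module. This is a sketch only: tuple_shallow_size is a made-up helper name, and sys.getsizeof may report a somewhat larger figure than the formula because it can include per-object garbage-collector bookkeeping that the simplified struct leaves out.

```python
import struct
import sys

SIZE_T = struct.calcsize("N")   # sizeof(size_t)
POINTER = struct.calcsize("P")  # sizeof(void*)

def tuple_shallow_size(n_items):
    """Evaluate the formula above for a tuple with n_items elements."""
    return SIZE_T * 2 + POINTER * (n_items + 1)

# Print the formula's value next to what the interpreter reports.
for n in (0, 1, 10, 1_000_000):
    print(n, tuple_shallow_size(n), sys.getsizeof(tuple(range(n))))
```

Whatever the fixed overhead turns out to be, each additional item costs one pointer.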

This gives the shallow size of the tuple. To get the full size you also need to add the total number of bytes consumed by the object graph rooted in the PyTuple::items[] array.

It's worth noting that tuple construction routines make sure that only single instance of empty tuple is ever created (singleton).
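The singleton behaviour is easy to observe with an identity check. This is a sketch assuming CPython; other implementations are not required to share the empty tuple.

```python
# Three different ways of producing an empty tuple.
empty_a = tuple()
empty_b = tuple([])
empty_c = (1, 2)[:0]

# Identity (not equality) shows whether they are the same object.
print(empty_a is empty_b, empty_b is empty_c)
```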

References: Python.h, object.h, tupleobject.h, tupleobject.c

Answer by Constantin

A new function, getsizeof(), takes a Python object and returns the amount of memory used by the object, measured in bytes. Built-in objects return correct results; third-party extensions may not, but can define a __sizeof__() method to return the object's size.

kveretennicov@nosignal:~/py/r26rc2$ ./python
Python 2.6rc2 (r26rc2:66712, Sep  2 2008, 13:11:55) 
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
>>> import sys
>>> sys.getsizeof(range(1000000))
4000032
>>> sys.getsizeof(tuple(range(1000000)))
4000024

Obviously, the returned numbers don't include the memory consumed by the contained objects (sys.getsizeof(1) == 12).
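To include the contained objects, the shallow sizes have to be summed over everything the container references. The following is a minimal sketch, not a complete solution: total_size is a made-up helper that only descends into the built-in containers listed in the code, and it counts each object once even if it is referenced several times.

```python
import sys

def total_size(obj, seen=None):
    """Shallow size of obj plus the sizes of objects reachable through built-in containers."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen) for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

numbers = list(range(1000, 2000))
print(sys.getsizeof(numbers), total_size(numbers))
```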

Answer by HenryR

This is implementation specific, I'm pretty sure. Certainly it depends on the internal representation of integers - you can't assume they'll be stored as 32-bit since Python gives you arbitrarily large integers so perhaps small ints are stored more compactly.

On my Python (2.5.1 on Fedora 9 on Core 2 Duo) the VmSize before allocation is 6896kB, after is 22684kB. After assigning one million more elements, VmSize goes to 38340kB. This very roughly indicates around 16000kB for 1000000 integers, which is around 16 bytes per integer. That suggests a lot of overhead for the list. I'd take these numbers with a large pinch of salt.
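A similar before/after measurement can be made from inside the interpreter. Note that this sketch swaps in a different technique: it uses the standard tracemalloc module (Python 3.4+), which tracks allocations made by Python itself rather than the process VmSize, so its figures will not match OS-level numbers exactly. The helper name bytes_per_element is made up for illustration.

```python
import tracemalloc

def bytes_per_element(n):
    """Estimate memory per element of list(range(n)) from traced allocations."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    data = list(range(n))
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    # len(data) keeps the list alive until the second reading has been taken.
    return (after - before) / len(data)

print(bytes_per_element(1_000_000))
```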
