Original question: http://stackoverflow.com/questions/17574076/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
What is the difference between len() and sys.getsizeof() methods in Python?
Asked by Balamurugan
When I ran the code below, I got 3 and 36 as the answers, respectively.
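import sys  # needed for sys.getsizeof()

x = "abd"
print(len(x))            # number of characters: 3
print(sys.getsizeof(x))  # memory size in bytes: 36 in the asker's run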
Can someone explain to me what the difference between them is?
Accepted answer by Martijn Pieters
They are not the same thing at all.
len() queries for the number of items contained in a container. For a string, that's the number of characters:
Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
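As a quick illustration of the documentation quoted above, here is a minimal sketch that calls len() on a few built-in containers. The sample values are made up for this example.

```python
# len() counts items, whatever the items are.
samples = {
    "string": "abd",            # 3 characters
    "tuple": (1, 2),            # 2 elements
    "list": [[1, 2, 3], []],    # 2 elements (each nested list counts once)
    "dict": {"a": 1, "b": 2},   # 2 keys
}

lengths = {name: len(value) for name, value in samples.items()}
for name, n in lengths.items():
    print(name, n)
```

Note that the nested list counts as a single item: len() never looks inside the elements.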
sys.getsizeof(), on the other hand, returns the memory size of the object:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
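To make the contrast concrete, the sketch below (assuming CPython; the sample objects are arbitrary) shows that sys.getsizeof() accepts any object, including one such as an integer that has no length at all.

```python
import sys

# sys.getsizeof() accepts any object, including ones that have no len().
number_size = sys.getsizeof(42)
string_size = sys.getsizeof("abd")
empty_list_size = sys.getsizeof([])

try:
    len(42)
    int_has_len = True
except TypeError:
    # Integers are not containers, so len() rejects them.
    int_has_len = False

# The exact byte counts depend on the interpreter version and build.
print(number_size, string_size, empty_list_size)
print(int_has_len)
```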
Python string objects are not simple sequences of characters, 1 byte per character.
Specifically, the sys.getsizeof() function includes the garbage collector overhead, if any:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
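The relationship between getsizeof() and __sizeof__ can be observed directly. This is a rough sketch assuming CPython: it compares the two values for a string (not tracked by the garbage collector) and a list (tracked). The variable names are chosen for illustration only.

```python
import gc
import sys

text = "abd"
items = []

# Strings are not tracked by the cyclic garbage collector, so nothing is added.
str_overhead = sys.getsizeof(text) - text.__sizeof__()

# Lists are tracked, so getsizeof() reports more than __sizeof__().
list_overhead = sys.getsizeof(items) - items.__sizeof__()

print(gc.is_tracked(text), str_overhead)
print(gc.is_tracked(items), list_overhead)
```

The size of the extra overhead for tracked objects is an implementation detail and may differ between interpreter versions.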
String objects do not need to be tracked (they cannot create circular references), but string objects do need more memory than just the bytes per character. In Python 2, the __sizeof__ method returns (in C code):
Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);
where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE is basically the same as len(), and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, but it's PyStringObject_SIZE that is confusing you; on my Mac that size is 37 bytes:
>>> sys.getsizeof('')
37
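The same "header plus one byte per character" arithmetic can be checked from Python. The sketch below assumes CPython and uses a bytes literal, which is the Python 2 byte string and the Python 3 bytes type; predicted_size is a hypothetical helper written for this example, and the header is measured on the running interpreter rather than hard-coded, since it varies by platform.

```python
import sys

def predicted_size(data):
    """Header size plus one byte per item, mirroring the C expression above."""
    header = sys.getsizeof(b"")  # measured on this interpreter, not hard-coded
    itemsize = 1                 # byte strings use 1 byte per character
    return header + len(data) * itemsize

data = b"abd"
# Both numbers should agree: the header plus len(data) bytes.
print(predicted_size(data), sys.getsizeof(data))
```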
For unicode strings, the per-character size goes up to 2 or 4 (depending on compilation options). On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
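The content-dependent width on Python 3.3 and newer can be estimated by growing a string and measuring how much sys.getsizeof() increases. This is a minimal sketch assuming CPython 3.3+; bytes_per_char is a hypothetical helper, and the three sample characters are picked to fall into different storage widths.

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate per-character storage by growing a string of one repeated character."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

widths = {
    "ASCII 'a'": bytes_per_char("a"),
    "euro sign": bytes_per_char("\u20ac"),
    "emoji": bytes_per_char("\U0001F600"),
}
for label, width in widths.items():
    print(label, width)
```

Subtracting two sizes cancels the fixed header, so only the per-character cost remains.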