What is the difference between the len() and sys.getsizeof() functions in Python?
Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/17574076/
Asked by Balamurugan
When I ran the code below, I got 3 and 36 as the answers, respectively.
import sys

x = "abd"
print(len(x))            # number of characters in the string
print(sys.getsizeof(x))  # size of the string object in bytes
Can someone explain the difference between them?
Accepted answer by Martijn Pieters
They are not the same thing at all.
len() queries for the number of items contained in a container. For a string, that is the number of characters:
Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
sys.getsizeof(), on the other hand, returns the memory size of the object:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
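The contrast between the two quoted definitions can be seen side by side. This is a minimal sketch, assuming CPython 3; the `samples` and `lengths` names are only illustrative, and the byte counts it prints vary by Python version and platform.

```python
import sys

# A few built-in containers that support len().
samples = ["abd", [1, 2, 3], (1, 2, 3), {"a": 1, "b": 2}]

for obj in samples:
    # len() counts items; getsizeof() reports bytes used by the object itself.
    print(type(obj).__name__, len(obj), sys.getsizeof(obj))

lengths = [len(obj) for obj in samples]
```

The item counts are the same on every build, while the byte sizes are an implementation detail of the interpreter.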
Python string objects are not simple sequences of characters, 1 byte per character.
Specifically, the sys.getsizeof() function includes the garbage collector overhead, if any:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
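The relationship described in that quote can be checked directly by subtracting `__sizeof__()` from `sys.getsizeof()`. This is a rough sketch, assuming CPython 3; the variable names are made up for illustration, and the exact overhead for the list depends on the interpreter build.

```python
import sys

text = "abd"       # str: not managed by the cyclic garbage collector
items = [1, 2, 3]  # list: managed by the cyclic garbage collector

# Extra bytes that getsizeof() adds on top of __sizeof__().
str_overhead = sys.getsizeof(text) - text.__sizeof__()
list_overhead = sys.getsizeof(items) - items.__sizeof__()

print(str_overhead)   # no collector overhead is added for a string
print(list_overhead)  # a positive, build-dependent number of bytes
```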
String objects do not need to be tracked (they cannot create circular references), but string objects do need more memory than just the bytes per character. In Python 2, the __sizeof__ method returns (in C code):
Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);
where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE is basically the same as len(), and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, but it's PyStringObject_SIZE that is confusing you; on my Mac that size is 37 bytes:
>>> sys.getsizeof('')
37
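The same "header plus one byte per item" arithmetic can be reproduced on Python 3 with `bytes`, which is the closest counterpart of the Python 2 byte string. This is a sketch under the assumption of CPython 3; `header` and `extra` are illustrative names, and the header size itself differs between builds.

```python
import sys

header = sys.getsizeof(b"")  # fixed per-object cost; varies by build

lengths = (0, 1, 3, 10)
# Bytes used beyond the fixed header for each length.
extra = [sys.getsizeof(b"x" * n) - header for n in lengths]

for n, grown in zip(lengths, extra):
    # Each additional byte of data adds one byte on top of the header.
    print(n, grown)
```

In other words, the reported size is the fixed header plus `len()` times the per-item size, which is why a three-character string reports far more than 3 bytes.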
For unicode strings the per-character size goes up to 2 or 4 (depending on compilation options). On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
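The content-dependent storage on Python 3.3 and newer can be observed by measuring how much one extra character adds to a string. This is a minimal sketch, assuming CPython 3.3 or later; `bytes_per_char` is a hypothetical helper written for this example, not a standard library function.

```python
import sys

def bytes_per_char(ch, n=100):
    """Measure how many bytes one more copy of ch adds to a string."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

ascii_cost = bytes_per_char("a")            # ASCII character
latin1_cost = bytes_per_char("\u00e9")      # Latin-1 character (e with acute)
bmp_cost = bytes_per_char("\u20ac")         # euro sign, needs 2 bytes
astral_cost = bytes_per_char("\U0001F600")  # emoji outside the BMP, needs 4 bytes

print(ascii_cost, latin1_cost, bmp_cost, astral_cost)
```

Taking the difference between two lengths cancels out the fixed header, which is itself larger for non-ASCII strings.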

