Python C API: Assigning PyObjects to a dictionary causes memory leak
Original question: http://stackoverflow.com/questions/43236127/
Notice: this page reproduces a popular StackOverflow question and its accepted answer under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me): StackOverflow
Asked by Zwackelmann
I am writing a C++ wrapper for Python using the Python C API. In my case I have to make large amounts of byte-oriented data accessible to the Python script. For this purpose I use the PyByteArray_FromStringAndSize function to produce a Python bytearray (https://docs.python.org/2.7/c-api/bytearray.html).
When I return this bytearray directly, I do not see any problems. However, when I add the bytearray to a Python dict, the bytearray's memory is not released once the dict is destroyed.
This can be solved by calling Py_DECREF on the bytearray object after adding it to the Python dict.
Below is a complete working example of my code. It contains a function dummyArrPlain that returns the plain bytearray and a function dummyArrInDict that returns a bytearray in a dict. The second function produces a memory leak unless Py_DECREF(pyData); is called.
My question is: why is Py_DECREF necessary at this point? Intuitively, I would have expected Py_DECREF to be called once the dict is destroyed.
I also assign values to a dict like this:
PyDict_SetItem(dict, PyString_FromString("i"), PyInt_FromLong(i));
Will this also produce a memory leak if Py_DECREF is not called on the created string and integer objects?
This is my dummy C++ wrapper:
#include <python2.7/Python.h>

static char module_docstring[] = "This is a module causing a memory leak";

static PyObject *dummyArrPlain(PyObject *self, PyObject *args);
static PyObject *dummyArrInDict(PyObject *self, PyObject *args);

static PyMethodDef module_methods[] = {
    {"dummy_arr_plain", dummyArrPlain, METH_VARARGS, "returns a plain dummy bytearray"},
    {"dummy_arr_in_dict", dummyArrInDict, METH_VARARGS, "returns a dummy bytearray in a dict"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initlibdummy(void)
{
    PyObject *m = Py_InitModule("libdummy", module_methods);
    if (m == NULL)
        return;
}

static PyObject *dummyArrPlain(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }
    PyObject * pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;
    return pyData;
}

static PyObject *dummyArrInDict(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }
    PyObject * pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;
    PyObject *dict = PyDict_New();
    PyDict_SetItem(dict, PyString_FromString("data"), pyData);
    // memory leak without Py_DECREF(pyData);
    return dict;
}
And a dummy Python script using the wrapper:
import libdummy
import time

while True:
    a = libdummy.dummy_arr_in_dict()
    time.sleep(0.01)
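Instead of watching the process grow in an endless loop, the leak can also be measured. The following is only a sketch under stated assumptions: it uses tracemalloc, which exists in Python 3.4 and later (the module above targets Python 2.7), and the functions leaky and clean are hypothetical pure-Python stand-ins for dummy_arr_in_dict and dummy_arr_plain, not calls into libdummy. The helper name retained_bytes is made up for this illustration.

```python
import tracemalloc


def retained_bytes(func, iterations=20):
    """Return how many traced bytes remain allocated after calling func repeatedly."""
    tracemalloc.start()
    try:
        start, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()  # the result is discarded, like rebinding `a` in the loop above
        end, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return end - start


_kept = []  # simulates references that are never released


def leaky():
    # Hypothetical stand-in for a leaking call: the buffer stays referenced.
    _kept.append(bytearray(100000))


def clean():
    # Hypothetical stand-in for a correct call: the buffer dies with the result.
    return bytearray(100000)


leaky_growth = retained_bytes(leaky)
clean_growth = retained_bytes(clean)
print(leaky_growth, clean_growth)
```

With a real extension, func would be the wrapped C function. A retained size that grows with the number of iterations points to references that are never released.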
Accepted answer by CristiFati
It's a matter of [Python 2.0.Docs]: Ownership rules. I'm going to use Python 2.7.10 as the example (pretty old, but I don't think the behavior has changed significantly along the way).
PyByteArray_FromStringAndSize (bytearrayobject.c: 168) creates a new object (using PyObject_New) and allocates memory for its buffer as well.
By default, the refcount of that object (or better: of any newly created object) is 1 (set by _Py_NewReference), so that when the user calls del on it, or at program exit, the refcount is decreased, and when it reaches 0 the object is deallocated.
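The "deallocated when the refcount reaches 0" rule can be observed from Python itself. This is a minimal sketch, assuming CPython (where deallocation happens immediately when the last reference goes away). The Buffer subclass and the variable names are made up for the illustration; a subclass is used so that the object supports weak references, which let us check whether it still exists without holding a reference to it.

```python
import weakref


class Buffer(bytearray):
    """Hypothetical bytearray subclass, used here so the object can be weakly referenced."""


buf = Buffer(b"\x00" * 16)
probe = weakref.ref(buf)  # a weak reference does not add to the refcount
alias = buf               # second strong reference

del buf                   # one strong reference is left (alias)
alive_with_alias = probe() is not None

del alias                 # last strong reference gone, object is deallocated
alive_after_last = probe() is not None

print(alive_with_alias, alive_after_last)
```

As long as any strong reference is left, the object survives. A reference that is never released, like the extra one in dummyArrInDict, keeps it alive for good.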
This is the behavior on the flow where the object is returned directly.
But in dummyArrInDict's case, PyDict_SetItem (indirectly) does a Py_INCREF of pyData (it does other stuff too, but only this is relevant in the current situation), ending up with a refcount of 2. When the dict is later destroyed, it releases only its own reference, so the refcount drops to 1 instead of 0 and the bytearray is never deallocated: hence the memory leak.
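The same bookkeeping is visible at the Python level. The sketch below uses sys.getrefcount, whose absolute values include a temporary reference held by the call itself and can vary between interpreter versions, so only the differences matter. The names payload and holder are illustrative and play the roles of pyData and dict.

```python
import sys

payload = bytearray(10)            # plays the role of pyData
before = sys.getrefcount(payload)

holder = {"data": payload}         # the dict takes its own reference
inside = sys.getrefcount(payload)

del holder                         # destroying the dict releases only that reference
after = sys.getrefcount(payload)

print(inside - before, after - before)
```

The count goes up by one when the object is stored and returns to its earlier value when the dict goes away. The reference held by the creator (here the name payload, in the C code the pointer pyData) still has to be released separately.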
It's basically the same thing that you're doing with data: you allocate memory for it, and when you no longer need it, you free it (because you're not returning it, you only use it temporarily).
Note: It's safer to use the X macros (e.g. [Python 2.Docs]: Py_XDECREF), especially since you're not testing the returned PyObjects for NULL.
For more details, also take a look at [Python 2.Docs]: C API Reference.