pandas: Memory leak using pandas dataframe

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/14224068/

Time: 2020-09-13 20:34:37  Source: igfitidea

Memory leak using pandas dataframe

python pandas memory-leaks memory-leak-detector objgraph

Asked by sebpiq

I am using pandas.DataFrame in multi-threaded code (actually a custom subclass of DataFrame called Sound). I have noticed that I have a memory leak: the memory usage of my program increases gradually over about 10 minutes until it reaches ~100% of my computer's memory and crashes.

I used objgraph to try to track down this leak, and found that the number of MyDataFrame instances keeps going up when it shouldn't: every thread creates an instance in its run method, makes some calculations, saves the result to a file and exits, so no references should be kept.
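
The instance-count check described above can be sketched with the standard library alone. This is a minimal illustration, not the code from the question: the plain `Sound` class is a hypothetical stand-in for the DataFrame subclass, and `count_instances` and `worker` are names made up for this example. It counts live instances through `gc.get_objects()`, which is similar in spirit to what `objgraph.count("MyDataFrame")` reports.

```python
import gc
import threading


class Sound:
    """Hypothetical stand-in for the DataFrame subclass in the question."""

    def __init__(self, n):
        self.samples = [0.0] * n


def count_instances(cls):
    """Count live instances of exactly cls that the garbage collector tracks."""
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def worker():
    sound = Sound(1000)   # create an instance
    sum(sound.samples)    # do some work; the local reference dies on return


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

gc.collect()  # clear any reference cycles before counting
live = count_instances(Sound)
print("live Sound instances:", live)
```

If the count stays above zero after every thread has finished, something is still holding a reference. If it drops to zero while process memory keeps growing, the growth is probably happening below the level of Python objects.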

Using objgraph, I found that all the data frames in memory have a similar reference graph:

[Image: objgraph reference graph]

I have no idea whether that's normal or not, but it looks like this is what is keeping my objects in memory. Any ideas, advice, or insight?

Accepted answer by Wes McKinney

Confirmed that there's some kind of memory leak going on in the indexing infrastructure. It's not caused by the above reference graph. Let's move the discussion to GitHub (SO is for Q&A):

https://github.com/pydata/pandas/issues/2659

EDIT: this actually appears not to be a memory leak at all, but may instead have to do with OS memory allocation issues. Please have a look at the GitHub issue for more information.
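
One rough way to tell a Python-level leak from allocator behaviour is to compare what the interpreter itself still holds before and after dropping references. The sketch below is an assumption-laden illustration, not something taken from the answer or the GitHub issue: it uses the standard-library `tracemalloc` module and a made-up workload of small `bytearray` blocks.

```python
import gc
import tracemalloc

tracemalloc.start()

baseline, _ = tracemalloc.get_traced_memory()

# Allocate roughly 10 MB in many small blocks, then drop every reference.
blocks = [bytearray(1024) for _ in range(10_000)]
held, _ = tracemalloc.get_traced_memory()

del blocks
gc.collect()
released, peak = tracemalloc.get_traced_memory()

tracemalloc.stop()

print(f"held while referenced: {held - baseline} bytes")
print(f"still held after del:  {released - baseline} bytes")
```

If the traced figure falls back near the baseline while the process size reported by the OS stays high, that would be consistent with memory being freed by Python but not handed back to the operating system, rather than with objects being kept alive.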
