Python - Working around memory leaks

Note: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same license, cite the original source, and attribute it to the original authors (not me): StackOverflow

Original source: http://stackoverflow.com/questions/1641231/
Asked by Casebash
I have a Python program that runs a series of experiments, with no data intended to be stored from one test to another. My code contains a memory leak which I am completely unable to find (I've looked at the other threads on memory leaks). Due to time constraints, I have had to give up on finding the leak, but if I were able to isolate each experiment, the program would probably run long enough to produce the results I need.
- Would running each test in a separate thread help?
- Are there any other methods of isolating the effects of a leak?
Detail on the specific situation
- My code has two parts: an experiment runner and the actual experiment code.
- Although no globals are shared between the code for running all the experiments and the code used by each experiment, some classes/functions are necessarily shared.
- The experiment runner isn't just a simple for loop that can be easily put into a shell script. It first decides which tests need to be run given the configuration parameters, then runs the tests, then outputs the data in a particular way.
- I tried manually calling the garbage collector in case the issue was simply that garbage collection wasn't being run, but this did not work (a minimal sketch of that approach follows this list).
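For reference, a minimal sketch of that kind of manual collection between tests; run_experiment and the configuration values are hypothetical stand-ins for the actual experiment code:

import gc

def run_experiment(config):
    pass   # hypothetical stand-in for the real experiment code

for config in (1, 2, 3):         # hypothetical test configurations
    run_experiment(config)
    unreachable = gc.collect()   # force a full collection between tests
    print("after test %s: %d unreachable objects collected" % (config, unreachable))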
Update
Gnibbler's answer has actually allowed me to find out that my ClosenessCalculation objects, which store all of the data used during each calculation, are not being killed off. I then used that to manually delete some links, which seems to have fixed the memory issues.
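As an illustration of that kind of fix, here is a hypothetical sketch; ClosenessCalculation is the class named above, but the attributes and the back-reference are invented:

class ClosenessCalculation(object):
    def __init__(self, experiment, data):
        self.experiment = experiment   # invented back-reference that keeps the object reachable
        self.data = data               # the large per-calculation data

    def teardown(self):
        # manually delete the links so the object and its data become collectable
        self.experiment = None
        self.data = None

Calling teardown() (or otherwise breaking the links) after each calculation lets the garbage collector reclaim the data even if something else still holds a reference to the calculator.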
Answered by John La Rooy
You can use something like this to help track down memory leaks:
>>> from collections import defaultdict
>>> from gc import get_objects
>>> before = defaultdict(int)
>>> after = defaultdict(int)
>>> for i in get_objects():
... before[type(i)] += 1
...
Now suppose the test leaks some memory:
>>> leaked_things = [[x] for x in range(10)]
>>> for i in get_objects():
... after[type(i)] += 1
...
>>> print([(k, after[k] - before[k]) for k in after if after[k] - before[k]])
[(<class 'list'>, 11)]
11, because we have leaked one list containing 10 more lists.
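If you need to take these snapshots repeatedly between tests, the same idea can be wrapped into a small helper; this is a sketch building on the answer's approach, not code from the original answer:

from collections import defaultdict
from gc import get_objects

def snapshot():
    # count live objects by type
    counts = defaultdict(int)
    for obj in get_objects():
        counts[type(obj)] += 1
    return counts

def growth(before, after):
    # report only the types whose instance count grew
    return [(k, after[k] - before[k]) for k in after if after[k] - before[k] > 0]

before = snapshot()
leaked_things = [[x] for x in range(10)]   # simulate a leak, as above
print(growth(before, snapshot()))          # expect roughly [(<class 'list'>, 11)]

Note that the snapshots themselves allocate a few objects, so tiny counts are noise; the types whose counts keep growing across many runs are the interesting ones.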
Answered by Alex Martelli
Threads would not help. If you must give up on finding the leak, then the only solution to contain its effect is running a new process once in a while (e.g., when a test has left overall memory consumption too high for your liking -- you can determine VM size easily by reading /proc/self/status on Linux, and other similar approaches exist on other OS's).
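A minimal sketch of that check on Linux (the 300 MB threshold is an arbitrary example):

def vm_size_kb():
    # parse the VmSize field from /proc/self/status (Linux only)
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])   # the value is reported in kB
    return 0

MEMORY_LIMIT_KB = 300 * 1024   # arbitrary example threshold
if vm_size_kb() > MEMORY_LIMIT_KB:
    print("memory too high; time to hand over to a fresh process")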
Make sure the overall script takes an optional parameter to tell it what test number (or other test identification) to start from, so that when one instance of the script decides it's taking up too much memory, it can tell its successor where to restart from.
Or, more solidly, make sure that as each test is completed its identification is appended to some file with a well-known name. When the program starts, it begins by reading that file and thus knows what tests have already been run. This architecture is more solid because it also covers the case where the program crashes during a test; of course, to fully automate recovery from such crashes, you'll want a separate watchdog program and process in charge of starting a fresh instance of the test program when it determines the previous one has crashed (it could use subprocess for the purpose). It also needs a way to tell when the sequence is finished: e.g., a normal exit from the test program could mean the whole sequence is done, while any crash or exit with a status != 0 would signify the need to start a new fresh instance.
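A sketch of such a watchdog under those conventions; run_tests.py and the checkpoint mechanism are invented for illustration, and the exit-code convention is the one described above:

import subprocess
import sys

# the test program is assumed to read a well-known checkpoint file on
# startup and skip any test whose identification is already recorded there
while True:
    status = subprocess.call([sys.executable, "run_tests.py"])
    if status == 0:
        break   # normal exit: the whole test sequence is finished
    # any crash or nonzero exit: start a fresh instance, which resumes
    # from the last test recorded in the checkpoint file (a real watchdog
    # would also guard against a test that crashes every single time)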
If these architectures appeal but you need further help implementing them, just comment to this answer and I'll be happy to supply example code -- I don't want to do it "preemptively" in case there are as-yet-unexpressed issues that make the architectures unsuitable for you. (It might also help to know what platforms you need to run on).
Answered by Dmitry Rubanovich
I had the same problem with a third-party C library which was leaking. The cleanest workaround that I could think of was to fork and wait. The advantage is that you don't even have to create a separate process after each run; you can define the size of your batch.
Here's a general solution (if you ever find the leak, the only change you need to make is to have run() call run_single_process() instead of run_forked(), and you'll be done):
import os, sys

batchSize = 20

class Runner(object):
    def __init__(self, dataFeedGenerator, dataProcessor):
        self._dataFeed = dataFeedGenerator
        self._caller = dataProcessor

    def run(self):
        self.run_forked()

    def run_forked(self):
        dataFeed = self._dataFeed
        dataSubFeed = []
        for dataMorsel in dataFeed:
            dataSubFeed.append(dataMorsel)
            if len(dataSubFeed) == batchSize:   # a full batch is ready
                self._runBatch(dataSubFeed)
                dataSubFeed = []
        if dataSubFeed:                         # don't silently drop a final partial batch
            self._runBatch(dataSubFeed)

    def _runBatch(self, dataSubFeed):
        self._dataFeed = dataSubFeed
        self.fork()
        if self._child_pid == 0:    # in the child: process just this batch
            self.run_single_process()
        self.endBatch()

    def run_single_process(self):
        for dataMorsel in self._dataFeed:
            self._caller(dataMorsel)

    def fork(self):
        self._child_pid = os.fork()

    def endBatch(self):
        if self._child_pid != 0:
            os.waitpid(self._child_pid, 0)   # in the parent: wait for the child batch
        else:
            sys.exit()                       # exit from the child when done
This isolates the memory leak to the child processes, and the leak can never accumulate for more than batchSize runs at a time.
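Usage might look like this sketch; the data feed and the processor are invented placeholders, and os.fork() means this is Unix-only:

def process(item):
    # stand-in for the real (leaky) per-item work
    print("processing", item)

feed = (x for x in range(100))   # any generator of work items
Runner(feed, process).run()      # leaks are confined to 20-item child processes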
Answered by paxdiablo
I would simply refactor the experiments into individual functions (if they're not like that already), then accept an experiment number from the command line and call the single experiment function.
Then just bodgy up a shell script as follows:
#!/bin/bash
for expnum in 1 2 3 4 5 6 7 8 9 10 11 ; do
    python yourProgram ${expnum} otherParams
done
That way, you can leave most of your code as-is and this will clear out any memory leaks you think you have in between each experiment.
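On the Python side, the dispatch might look like this sketch; the experiment functions and the registry are hypothetical:

import sys

def experiment_1():
    pass   # ... the actual experiment code

def experiment_2():
    pass   # ... the actual experiment code

EXPERIMENTS = {1: experiment_1, 2: experiment_2}   # hypothetical registry

if __name__ == "__main__":
    expnum = int(sys.argv[1])   # experiment number passed in by the shell script
    EXPERIMENTS[expnum]()       # the process exits afterwards, freeing everything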
Of course, the best solution is always to find and fix the root cause of a problem but, as you've already stated, that's not an option for you.
Although it's hard to imagine a memory leak in Python, I'll take your word on that one - you may want to at least consider the possibility that you're mistaken there, however. Consider raising that in a separate question, something that we can work on at low priority (as opposed to this quick-fix version).
Update: Making this community wiki since the question has changed somewhat from the original. I'd delete the answer but for the fact I still think it's useful - you could do the same to your experiment runner as I proposed the bash script for; you just need to ensure that the experiments are separate processes so that memory leaks don't occur (if the memory leaks are in the runner, you're going to have to do root cause analysis and fix the bug properly).