Keras + Tensorflow and Multiprocessing in Python

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/42504669/

Date: 2020-08-19 21:47:57  Source: igfitidea

Keras + Tensorflow and Multiprocessing in Python

python, tensorflow, neural-network, keras, python-multiprocessing

Asked by John Cast

I'm using Keras with Tensorflow as backend.

I am trying to save a model in my main process and then load/run (i.e. call model.predict) within another process.

I'm currently just trying the naive approach from the docs to save/load the model: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model.
So basically:

  1. model.save() in the main process
  2. model = load_model() in the child process
  3. model.predict() in the child process

However, it simply hangs on the load_model call.

Searching around, I've discovered this potentially related answer suggesting that Keras can only be utilized in one process: using multiprocessing with theano, but I am unsure if this is true (I can't seem to find much on this).

Is there a way to accomplish my goal? A high level description or short example is greatly appreciated.

Note: I've attempted approaches along the lines of passing a graph to the process, but failed since it seems tensorflow graphs aren't picklable (related SO post for that here: Tensorflow: Passing a session to a python multiprocess). If there is indeed a way to pass the tensorflow graph/model to the child process then I am open to that as well.
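
The pickling failure is not specific to TensorFlow: pickle refuses any object that wraps an OS-level resource such as a lock or a C handle, which is what a session/graph holds. A tiny stdlib illustration of the same failure mode (no TensorFlow required):

```python
import pickle
import threading

# A thread lock, like a TensorFlow session, wraps an OS-level resource
# and therefore cannot be serialized for transfer to another process.
try:
    pickle.dumps(threading.Lock())
except TypeError as exc:
    print("not picklable:", exc)
```

This is why the usual workarounds serialize something else (a saved-model file on disk, or architecture plus weights) instead of the live graph object.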

Thanks!

Answered by Marcin Możejko

From my experience, the problem lies in loading Keras in one process and then spawning a new process after keras has already been loaded into your main environment. But for some applications (e.g. training a mixture of Keras models) it's simply better to have all of these things in one process. So what I advise is the following (a little bit cumbersome, but working for me) approach:

  1. DO NOT LOAD KERAS INTO YOUR MAIN ENVIRONMENT. If you want to load Keras / Theano / TensorFlow, do it only in the function environment. E.g. don't do this:

    import keras
    
    def training_function(...):
        ...
    

    but do the following:

    def training_function(...):
        import keras
        ...
    
  2. Run the work connected with each model in a separate process: I usually create workers which do the job (e.g. training, tuning, scoring) and run them in separate processes. What is nice about this is that the whole memory used by such a process is completely freed when the process finishes. This helps with the many memory problems you usually come across when using multiprocessing, or even when running multiple models in one process. So it looks e.g. like this:

    def _training_worker(train_params):
        import keras
        model = obtain_model(train_params)
        model.fit(train_params)
        send_message_to_main_process(...)
    
    def train_new_model(train_params):
        training_process = multiprocessing.Process(target=_training_worker, args=(train_params,))
        training_process.start()
        get_message_from_training_process(...)
        training_process.join()
    
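The helpers send_message_to_main_process and get_message_from_training_process are left undefined above; one way to realize them is a multiprocessing.Queue. Below is a minimal sketch under that assumption, with the deferred Keras import and the actual model.fit left as comments and a stand-in message in their place:

```python
import multiprocessing as mp

def _training_worker(train_params, queue):
    # The deferred Keras import and the real training would go here:
    # import keras
    # model = obtain_model(train_params); model.fit(...)
    queue.put({"status": "done", "params": train_params})  # stand-in result

def train_new_model(train_params):
    queue = mp.Queue()
    training_process = mp.Process(target=_training_worker,
                                  args=(train_params, queue))
    training_process.start()
    message = queue.get()      # blocks until the worker reports back
    training_process.join()    # all of the worker's memory is released here
    return message

if __name__ == "__main__":
    print(train_new_model({"lr": 0.01}))
```

Reading from the queue before join() avoids the classic deadlock where the child blocks on a full queue while the parent blocks on join.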

A different approach is to simply prepare different scripts for different model actions. But this may cause memory errors, especially when your models are memory-consuming. NOTE that for this reason it's better to make your execution strictly sequential.
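
For this separate-scripts variant, each action can be launched as its own short-lived interpreter via subprocess, so memory is fully returned to the OS between steps and execution stays strictly sequential. The inline -c snippets below are hypothetical stand-ins for real train/score scripts:

```python
import subprocess
import sys

# Each step runs in a fresh Python process; check=True makes a failing
# step raise immediately, keeping the sequence strictly sequential.
for step in ["print('training done')", "print('scoring done')"]:
    result = subprocess.run([sys.executable, "-c", step],
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())
```

In practice each "-c" snippet would be replaced by the path to a script that loads Keras, does one action, and exits.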

Answered by VictorLi

I created a simple example to show how to run a Keras model in multiple processes with multiple GPUs. Hope this sample helps: https://github.com/yuanyuanli85/Keras-Multiple-Process-Prediction

Answered by Mark

I created a decorator that fixed my code.

from multiprocessing import Pipe, Process

def child_process(func):
    """Makes the function run as a separate process."""
    def wrapper(*args, **kwargs):
        def worker(conn, func, args, kwargs):
            conn.send(func(*args, **kwargs))
            conn.close()
        parent_conn, child_conn = Pipe()
        p = Process(target=worker, args=(child_conn, func, args, kwargs))
        p.start()
        ret = parent_conn.recv()
        p.join()
        return ret
    return wrapper

@child_process
def keras_stuff():
    """ Keras stuff here"""