Clearing Tensorflow GPU memory after model execution

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/39758094/
Asked by David Parks
I've trained 3 models and am now running code that loads each of the 3 checkpoints in sequence and runs predictions using them. I'm using the GPU.
When the first model is loaded it pre-allocates the entire GPU memory (which I want, for working through the first batch of data). But it doesn't unload the memory when it's finished. When the second model is loaded, even using both tf.reset_default_graph() and with tf.Graph().as_default(), the GPU memory is still fully consumed by the first model, and the second model is then starved of memory.
Is there a way to resolve this, other than using Python subprocesses or multiprocessing to work around the problem (the only solution I've found via Google searches)?
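For context, a minimal sketch of the loading pattern described in the question, assuming TF1-style checkpoints (the checkpoint paths and the prediction step are hypothetical):

import tensorflow as tf

def predict(checkpoint_path):
    # build each model in its own graph so the graph definitions don't collide
    with tf.Graph().as_default():
        saver = tf.train.import_meta_graph(checkpoint_path + ".meta")
        with tf.Session() as sess:
            saver.restore(sess, checkpoint_path)
            # ... run predictions with sess.run(...) ...

# hypothetical checkpoint paths
for ckpt in ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]:
    predict(ckpt)  # GPU memory from earlier models is NOT returned here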
Accepted answer by Oliver Wilken
A GitHub issue from June 2016 (https://github.com/tensorflow/tensorflow/issues/1727) indicates that there is the following problem:
currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down.
Thus the only workaround would be to use processes and shut them down after the computation.
Example Code:
import tensorflow as tf
import multiprocessing
import numpy as np

def run_tensorflow():
    n_input = 10000
    n_classes = 1000

    # Create model
    def multilayer_perceptron(x, weight):
        # Hidden layer with RELU activation
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Store layers weight & bias
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))

    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)
        for i in range(100):
            batch_x = np.random.rand(10, 10000)
            batch_y = np.random.rand(10, 1000)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print("finished doing stuff with tensorflow!")

if __name__ == "__main__":
    # option 1: execute code with an extra process
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()
    # wait until user presses enter key
    input()

    # option 2: just execute the function
    run_tensorflow()
    # wait until user presses enter key
    input()
So if you call the function run_tensorflow() within a process you created and then shut the process down (option 1), the memory is freed. If you just run run_tensorflow() directly (option 2), the memory is not freed after the function call.
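Applied to the question's three-checkpoint scenario, a minimal sketch (the checkpoint paths and the body of the prediction function are hypothetical) that runs each model in its own process so the GPU memory is returned between models:

import multiprocessing

def predict_with_checkpoint(checkpoint_path):
    # hypothetical: build the graph, restore checkpoint_path, run predictions
    ...

if __name__ == "__main__":
    for ckpt in ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]:
        p = multiprocessing.Process(target=predict_with_checkpoint, args=(ckpt,))
        p.start()
        p.join()  # all GPU memory is freed when the child process exits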
Answer by TanLingxiao
I use numba to release the GPU; with TensorFlow itself I could not find an effective method.
import tensorflow as tf
from numba import cuda

a = tf.constant([1.0, 2.0, 3.0], shape=[3], name='a')
b = tf.constant([1.0, 2.0, 3.0], shape=[3], name='b')

with tf.device('/gpu:1'):
    c = a + b

TF_CONFIG = tf.ConfigProto(
    gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.1),
    allow_soft_placement=True)
sess = tf.Session(config=TF_CONFIG)
sess.run(tf.global_variables_initializer())

i = 1
while i < 1000:
    i = i + 1
    print(sess.run(c))
sess.close()  # if you don't use numba, the GPU memory can't be released

# release the GPU with numba's CUDA bindings
cuda.select_device(1)
cuda.close()

# the GPU can now be used again by a new session
with tf.device('/gpu:1'):
    c = a + b

TF_CONFIG = tf.ConfigProto(
    gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.5),
    allow_soft_placement=True)
sess = tf.Session(config=TF_CONFIG)
sess.run(tf.global_variables_initializer())

while True:
    print(sess.run(c))
Answer by hitesh kumar
You can use the numba library to release all the GPU memory:
pip install numba
from numba import cuda
device = cuda.get_current_device()
device.reset()
This will release all the memory.
Answer by Yaroslav Bulatov
GPU memory allocated by tensors is released (back into the TensorFlow memory pool) as soon as the tensor is no longer needed (before the .run call terminates). GPU memory allocated for variables is released when variable containers are destroyed. In the case of DirectSession (i.e., sess = tf.Session("")), that happens when the session is closed or explicitly reset (added in 62c159ff).
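A minimal sketch of the variable-container point, assuming the TF1 tf.container() and tf.Session.reset() APIs (the container name is made up):

import tensorflow as tf

# variables created inside a named container can be released as a group
with tf.container("my_experiment"):  # hypothetical container name
    v = tf.Variable(tf.random_normal([1000, 1000]))

sess = tf.Session("")
sess.run(tf.global_variables_initializer())

# to release the variables' GPU memory: close the session, or reset the
# container explicitly on the session target
sess.close()
tf.Session.reset("", containers=["my_experiment"])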
Answer by liviaerxin
There now seem to be two ways to handle iterative model training, or the case where a futures-based multiprocessing pool serves the training and the pool's processes are not killed when a future finishes. You can apply either method in the training process to release GPU memory while preserving the main process.
- Call a subprocess to run the model training. When one training phase completes, the subprocess exits and frees its memory. It's easy to get the return value (see the subprocess sketch after this list).
- Call multiprocessing.Process(p) to run the model training (p.start()); p.join() indicates that the process has exited and freed its memory.
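For the first bullet, a minimal sketch of the subprocess variant (the script name, flags, and JSON-on-stdout convention are hypothetical):

import json
import subprocess

# run one training phase in a separate Python process; its GPU memory is
# freed when the process exits, and the return value comes back via stdout
out = subprocess.run(
    ["python", "train_phase.py", "--phase", "1"],  # hypothetical script/flags
    capture_output=True, text=True, check=True)
result = json.loads(out.stdout)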
Here is a helper function that uses multiprocessing.Process to open a new process to run your Python function and return its value, instead of using subprocess:
import os
import sys
import traceback
import logging
from multiprocessing import Process, Queue

logger = logging.getLogger(__name__)

# open a new process to run a function and return its result (or error)
def process_run(func, *args):
    def wrapper_func(queue, *args):
        try:
            logger.info('run with process id: {}'.format(os.getpid()))
            result = func(*args)
            error = None
        except Exception:
            result = None
            ex_type, ex_value, tb = sys.exc_info()
            error = ex_type, ex_value, ''.join(traceback.format_tb(tb))
        queue.put((result, error))

    def process(*args):
        queue = Queue()
        p = Process(target=wrapper_func, args=[queue] + list(args))
        p.start()
        result, error = queue.get()
        p.join()
        return result, error

    result, error = process(*args)
    return result, error
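A hypothetical usage of the helper, where each call trains one phase in a fresh process and the GPU memory is released when that process exits:

def train_one_phase(phase_id):
    # hypothetical: build the model, train one phase, return its metrics
    return {"phase": phase_id, "loss": 0.0}

for phase_id in range(3):
    result, error = process_run(train_one_phase, phase_id)
    if error is not None:
        break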