Python: How to prevent tensorflow from allocating the totality of a GPU memory?

Disclaimer: this page is an English rendering of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/34199233/

How to prevent tensorflow from allocating the totality of a GPU memory?

python, tensorflow, nvidia-titan

Asked by Fabien C.

I work in an environment in which computational resources are shared, i.e., we have a few server machines equipped with a few Nvidia Titan X GPUs each.

For small to moderate size models, the 12 GB of the Titan X are usually enough for 2–3 people to run training concurrently on the same GPU. If the models are small enough that a single model does not take full advantage of all the computational units of the GPU, this can actually result in a speedup compared with running one training process after the other. Even in cases where the concurrent access to the GPU does slow down the individual training time, it is still nice to have the flexibility of having multiple users simultaneously train on the GPU.

The problem with TensorFlow is that, by default, it allocates the full amount of available GPU memory when it is launched. Even for a small two-layer neural network, I see that all 12 GB of the GPU memory are used up.

Is there a way to make TensorFlow only allocate, say, 4 GB of GPU memory, if one knows that this is enough for a given model?

Accepted answer by mrry

You can set the fraction of GPU memory to be allocated when you construct a tf.Session by passing a tf.GPUOptions as part of the optional config argument:

# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

The per_process_gpu_memory_fraction acts as a hard upper bound on the amount of GPU memory that will be used by the process on each GPU on the same machine. Currently, this fraction is applied uniformly to all of the GPUs on the same machine; there is no way to set this on a per-GPU basis.
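
Since the fraction applies to every GPU the process can see, a common workaround (not part of this answer, so take it as a hedged sketch) is to restrict which GPUs are visible with the standard CUDA_VISIBLE_DEVICES environment variable, and then apply the fraction to that device:

import os

# Expose only GPU 0 to this process; the fraction below then applies to it alone.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import tensorflow as tf

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))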

Answer by Sergey Demyanov

config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)

https://github.com/tensorflow/tensorflow/issues/1578

Answer by Lerner Zhang

Shameless plug: if you install the GPU-supported TensorFlow, the session will first allocate all of the GPU memory, whether you set it to use only the CPU or the GPU. I would add the tip that even if you set the graph to use the CPU only, you should set the same configuration (as answered above) to prevent the unwanted GPU occupation.

And in an interactive interface like IPython you should also set that configuration, otherwise it will allocate all the memory and leave almost none for others. This is sometimes hard to notice.
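
A minimal sketch of this tip (the allow_growth config mirrors the answers above; using tf.InteractiveSession is just one common way a session gets created from IPython):

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # do not grab all GPU memory up front

# Even for a CPU-only graph, pass the same config so the session
# does not reserve the whole GPU on a GPU-enabled TensorFlow build.
sess = tf.InteractiveSession(config=config)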

Answer by user1767754

Here is an excerpt from the book Deep Learning with TensorFlow:

In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as it is needed by the process. TensorFlow provides two configuration options on the session to control this. The first is the allow_growth option, which attempts to allocate only as much GPU memory as runtime allocations require: it starts out allocating very little memory, and as sessions get run and more GPU memory is needed, the GPU memory region used by the TensorFlow process is extended.

1) Allow growth: (more flexible)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

The second method is the per_process_gpu_memory_fraction option, which determines the fraction of the overall amount of memory that each visible GPU should be allocated. Note: no release of memory is needed; it can even worsen memory fragmentation when done.

2) Allocate fixed memory:

To allocate only 40% of the total memory of each GPU:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)

Note: that's only useful, though, if you truly want to bound the amount of GPU memory available to the TensorFlow process.

Answer by Urs

All the answers above assume execution with a sess.run() call, which is becoming the exception rather than the rule in recent versions of TensorFlow.

When using the tf.Estimator framework (TensorFlow 1.4 and above), the way to pass the fraction along to the implicitly created MonitoredTrainingSession is:

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
trainingConfig = tf.estimator.RunConfig(session_config=conf, ...)
tf.estimator.Estimator(model_fn=..., 
                       config=trainingConfig)

Similarly in Eager mode (TensorFlow 1.5 and above),

import tensorflow.contrib.eager as tfe  # eager-mode module in TF 1.x

opts = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
conf = tf.ConfigProto(gpu_options=opts)
tfe.enable_eager_execution(config=conf)

Edit (11-04-2018): As an example, if you are to use tf.contrib.gan.train, then you can use something similar to below:

tf.contrib.gan.gan_train(........, config=conf)

Answer by Khan

I tried to train U-Net on the VOC dataset, but because of the huge image size the memory ran out. I tried all the tips above, and even tried with batch size == 1, yet to no improvement. Sometimes the TensorFlow version also causes memory issues. Try using:

pip install tensorflow-gpu==1.8.0

Answer by Imran Ud Din

Well, I am new to TensorFlow. I have a GeForce 740M or similar GPU with 2 GB of RAM, and I was running an MNIST-style handwriting example for a native language, with training data of 38,700 images and 4,300 test images, trying to get precision, recall and F1 using the following code, since sklearn was not giving me precise results. Once I added this to my existing code I started getting GPU errors.

# predicted and actual are assumed to be 0/1 tensors of the same shape.
TP = tf.count_nonzero(predicted * actual)
TN = tf.count_nonzero((predicted - 1) * (actual - 1))
FP = tf.count_nonzero(predicted * (actual - 1))
FN = tf.count_nonzero((predicted - 1) * actual)

prec = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * prec * recall / (prec + recall)

Plus, my model was heavy, I guess; I was getting the memory error after 147 or 148 epochs. Then I thought, why not create functions for the tasks? I don't know whether it works this way in TensorFlow, but I thought that if a local variable is used and goes out of scope it may release memory, so I defined the above elements for training and testing in modules. I was able to reach 10,000 epochs without any issues. I hope this helps.

Answer by Theo

Updated for TensorFlow 2.0 Alpha and beyond

From the 2.0 Alpha docs, the answer is now just one line before you do anything with TensorFlow:

import tensorflow as tf
tf.config.gpu.set_per_process_memory_growth(True)

Answer by Mey Khalili

You can use

TF_FORCE_GPU_ALLOW_GROWTH=true

in your environment variables.
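
For example, a minimal sketch of setting it from Python before TensorFlow initializes its GPU allocator (exporting the variable in your shell before launching the script works just as well; the exact placement here is an assumption, the variable name comes from the TensorFlow source quoted below):

import os

# Must be set before TensorFlow creates its GPU allocator.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import tensorflow as tf  # imported after the variable is set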

In the tensorflow code:

bool GPUBFCAllocator::GetAllowGrowthValue(const GPUOptions& gpu_options) {
  const char* force_allow_growth_string =
      std::getenv("TF_FORCE_GPU_ALLOW_GROWTH");
  if (force_allow_growth_string == nullptr) {
    return gpu_options.allow_growth();
  }
  // ... (the rest of the function parses the string value)
}

Answer by mx_muc

Tensorflow 2.0 Beta and (probably) beyond

The API changed again. It can now be found in:

tf.config.experimental.set_memory_growth(
    device,
    enable
)
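
For example, a minimal usage sketch in the spirit of the TensorFlow GPU guide referenced below, enabling memory growth on every visible GPU before any tensors are created:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    # Must be called before the GPUs are initialized.
    tf.config.experimental.set_memory_growth(gpu, True)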

Aliases:

  • tf.compat.v1.config.experimental.set_memory_growth
  • tf.compat.v2.config.experimental.set_memory_growth

References:

See also: Tensorflow - Use a GPU: https://www.tensorflow.org/guide/gpu
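
That guide also describes putting a hard cap on how much memory TensorFlow may allocate on a GPU, which is the rough TF 2.x counterpart of per_process_gpu_memory_fraction. A minimal sketch, assuming the experimental virtual-device API is available in your TF 2.x build and that a ~4 GB limit is what you want:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    # Restrict TensorFlow to at most ~4 GB on the first GPU.
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=4096)])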

For Tensorflow 2.0 Alpha, see: this answer
