Python: How to check if PyTorch is using the GPU?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/48152674/
How to check if pytorch is using the GPU?
Asked by vinzee
I would like to know if pytorch is using my GPU. It's possible to detect with nvidia-smi if there is any activity from the GPU during the process, but I want something written in a Python script.
Is there a way to do so?
Answered by vinzee
This is going to work:
In [1]: import torch
In [2]: torch.cuda.current_device()
Out[2]: 0
In [3]: torch.cuda.device(0)
Out[3]: <torch.cuda.device at 0x7efce0b03be0>
In [4]: torch.cuda.device_count()
Out[4]: 1
In [5]: torch.cuda.get_device_name(0)
Out[5]: 'GeForce GTX 950M'
In [6]: torch.cuda.is_available()
Out[6]: True
This tells me that the GPU GeForce GTX 950M is being used by PyTorch.
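For convenience, these checks can be wrapped into a small standalone script; a minimal sketch (the helper name cuda_report is my own, not from the answer):

import torch

def cuda_report():
    # print a short summary of CUDA availability and the active device
    if not torch.cuda.is_available():
        print('CUDA is not available; PyTorch will fall back to the CPU.')
        return
    idx = torch.cuda.current_device()
    print('CUDA available, device count:', torch.cuda.device_count())
    print('Current device index:', idx)
    print('Device name:', torch.cuda.get_device_name(idx))

cuda_report()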
Answered by MBT
As it hasn't been proposed here, I'm adding a method using torch.device, as this is quite handy, also when initializing tensors on the correct device.
import torch

# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()
# Additional info when using CUDA
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3, 1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_cached(0)/1024**3, 1), 'GB')
Output:
Using device: cuda
Tesla K80
Memory Usage:
Allocated: 0.3 GB
Cached: 0.6 GB
As mentioned above, using device it is possible to:
- Move tensors to the respective device: torch.rand(10).to(device)
- Create a tensor directly on the device: torch.rand(10, device=device)
This makes switching between CPU and GPU comfortable without changing the actual code.
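For example, a fully device-agnostic snippet (a minimal sketch) could look like this:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.rand(10, device=device)  # created directly on the device
y = torch.rand(10).to(device)      # created on the CPU, then moved
print((x + y).device)              # cuda:0 if a GPU is available, else cpu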
Edit:
As there have been some questions and confusion about the cached and allocated memory, I'm adding some additional information about it:
- torch.cuda.max_memory_cached(device=None) returns the maximum GPU memory managed by the caching allocator in bytes for a given device.
- torch.cuda.memory_allocated(device=None) returns the current GPU memory usage by tensors in bytes for a given device.
You can either directly hand over a device as specified further above in the post, or you can leave it None and it will use the current_device().
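As a side note, in newer PyTorch releases memory_cached/max_memory_cached have been renamed to memory_reserved/max_memory_reserved, so the old names may emit deprecation warnings. A minimal sketch of both call styles (explicit device vs. None):

import torch

if torch.cuda.is_available():
    dev = torch.device('cuda:0')
    # explicit device argument
    print(torch.cuda.memory_allocated(dev))
    # omitting the device (i.e. None) falls back to the current device
    print(torch.cuda.memory_allocated())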
Answered by kmario23
After you start running the training loop, if you want to manually watch from the terminal whether your program is utilizing the GPU resources and to what extent, then you can simply use watch, as in:
$ watch -n 2 nvidia-smi
This will continuously update the usage stats every 2 seconds until you press ctrl+c.
If you need more control over the GPU stats, you can use a more sophisticated version of nvidia-smi with --query-gpu=.... Below is a simple illustration of this:
$ watch -n 3 nvidia-smi --query-gpu=index,gpu_name,memory.total,memory.used,memory.free,temperature.gpu,pstate,utilization.gpu,utilization.memory --format=csv
which would output the requested stats in CSV format.
Note: there should not be any space between the comma-separated query names in --query-gpu=...; otherwise those values will be ignored and no stats will be returned.
Also, you can check whether your installation of PyTorch detects your CUDA installation correctly by doing:
In [13]: import torch
In [14]: torch.cuda.is_available()
Out[14]: True
A True status means that PyTorch is configured correctly and is using the GPU, although you still have to move/place the tensors with the necessary statements in your code.
If you want to do this inside Python code, then look into this module:
https://github.com/jonsafari/nvidia-ml-py or on PyPI here: https://pypi.python.org/pypi/nvidia-ml-py/
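For instance, a minimal sketch using those bindings (assuming the package is installed and at least one NVIDIA device is visible):

import pynvml  # provided by the nvidia-ml-py package

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
print('GPU memory used: %d MiB of %d MiB' % (mem.used // 1024**2, mem.total // 1024**2))
print('GPU utilization: %d%%' % util.gpu)
pynvml.nvmlShutdown()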
Answered by TimeSeam
On the official site, on the Get Started page, check GPU support for PyTorch as below:
import torch
torch.cuda.is_available()
Reference: PyTorch | Get Started
Answered by prosti
From a practical standpoint, just one minor digression:
import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
This dev now knows whether it is cuda or cpu.
And there is a difference in how you deal with a model and with tensors when moving to cuda. It is a bit strange at first.
import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
t1 = torch.randn(1,2)
t2 = torch.randn(1,2).to(dev)
print(t1) # tensor([[-0.2678, 1.9252]])
print(t2) # tensor([[ 0.5117, -3.6247]], device='cuda:0')
t1.to(dev)
print(t1) # tensor([[-0.2678, 1.9252]])
print(t1.is_cuda) # False
t1=t1.to(dev)
print(t1) # tensor([[-0.2678, 1.9252]], device='cuda:0')
print(t1.is_cuda) # True
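# M below is assumed to be some torch.nn.Module subclass defined elsewhere;
# unlike tensors, module.to() moves the parameters in place, so no reassignment is needed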
model = M() # not on cuda
model.to(dev) # is on cuda (all parameters)
print(next(model.parameters()).is_cuda) #True
This is all a bit tricky, but understanding it once helps you deal with it quickly, with less debugging.
Answered by Jadiel de Armas
To check if there is a GPU available:
torch.cuda.is_available()
If the above function returns False,
- you either have no GPU,
- or the Nvidia drivers have not been installed so the OS does not see the GPU,
- or the GPU is being hidden by the environment variable CUDA_VISIBLE_DEVICES. When the value of CUDA_VISIBLE_DEVICES is -1, then all your devices are being hidden. You can check that value in code with the line os.environ['CUDA_VISIBLE_DEVICES'] (see the sketch after this list).
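A quick way to inspect that variable from Python (a minimal sketch; the variable may be unset, so a default is used):

import os
print(os.environ.get('CUDA_VISIBLE_DEVICES', '<not set>'))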
If the above function returns True, that does not necessarily mean that you are using the GPU. In Pytorch you can allocate tensors to devices when you create them. By default, tensors get allocated to the cpu. To check where your tensor is allocated, do:
# assuming that 'a' is a tensor created somewhere else
a.device # returns the device where the tensor is allocated
Note that you cannot operate on tensors allocated on different devices. To see how to allocate a tensor to the GPU, see here: https://pytorch.org/docs/stable/notes/cuda.html
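For instance, mixing devices raises a RuntimeError; a minimal sketch (only meaningful when a GPU is present):

import torch

if torch.cuda.is_available():
    a = torch.rand(3)                 # allocated on the CPU
    b = torch.rand(3, device='cuda')  # allocated on the GPU
    try:
        a + b                         # operands live on different devices
    except RuntimeError as e:
        print('Cross-device op failed:', e)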
Answered by Bram Vanroy
Almost all answers here reference torch.cuda.is_available(). However, that's only one side of the coin. It tells you whether the GPU (actually CUDA) is available, not whether it's actually being used. In a typical setup, you would set your device with something like this:
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
but in larger environments (e.g. research) it is also common to give the user more options, so based on input they can disable CUDA, specify CUDA IDs, and so on. In such cases, whether or not the GPU is used is not based only on whether it is available. After the device has been set to a torch device, you can check its type property to verify whether it's CUDA or not.
if device.type == 'cuda':
# do something
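A common pattern for giving users that choice is a command-line flag; a minimal sketch (the --no-cuda flag name is my own, not from the answer):

import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument('--no-cuda', action='store_true', help='disable CUDA even if it is available')
args = parser.parse_args()

use_cuda = torch.cuda.is_available() and not args.no_cuda
device = torch.device('cuda' if use_cuda else 'cpu')
print('Using device:', device)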
Answered by DSBLR
Simply run the following command from a command prompt or Linux environment:
python -c 'import torch; print(torch.cuda.is_available())'
The above should print True.
python -c 'import torch; print(torch.rand(2,3).cuda())'
This one should print the following:
tensor([[0.7997, 0.6170, 0.7042],
        [0.4174, 0.1494, 0.0516]], device='cuda:0')
Answered by mithunpaul
FWIW: if you are here because your pytorch always gives False for torch.cuda.is_available(), that's probably because you installed a pytorch version without GPU support (e.g., you coded on a laptop, then tested on a server). The solution is to uninstall pytorch and install it again with the right command from the pytorch downloads page. Also refer to this pytorch issue.
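One quick way to tell whether the installed build has CUDA support at all (a minimal sketch; torch.version.cuda is None on CPU-only builds):

$ python -c 'import torch; print(torch.__version__, torch.version.cuda)'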
Answered by litesaber
Create a tensor on the GPU as follows:
$ python
>>> import torch
>>> print(torch.rand(3,3).cuda())
Do not quit; open another terminal and check whether the Python process is using the GPU using:
$ nvidia-smi