Python: How to check if PyTorch is using the GPU?
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must follow the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/48152674/
How to check if pytorch is using the GPU?
Asked by vinzee
I would like to know if pytorch is using my GPU. It's possible to detect with nvidia-smi if there is any activity from the GPU during the process, but I want something written in a Python script.
Is there a way to do so?
Answered by vinzee
This is going to work:
In [1]: import torch
In [2]: torch.cuda.current_device()
Out[2]: 0
In [3]: torch.cuda.device(0)
Out[3]: <torch.cuda.device at 0x7efce0b03be0>
In [4]: torch.cuda.device_count()
Out[4]: 1
In [5]: torch.cuda.get_device_name(0)
Out[5]: 'GeForce GTX 950M'
In [6]: torch.cuda.is_available()
Out[6]: True
This tells me that the GPU GeForce GTX 950M is being used by PyTorch.
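For convenience, these checks can be wrapped into a small standalone script; a minimal sketch (the helper name cuda_report is my own, not from the answer):

import torch

def cuda_report():
    # print a short summary of CUDA availability and the active device
    if not torch.cuda.is_available():
        print('CUDA is not available; PyTorch will fall back to the CPU.')
        return
    idx = torch.cuda.current_device()
    print('CUDA available, device count:', torch.cuda.device_count())
    print('Current device index:', idx)
    print('Device name:', torch.cuda.get_device_name(idx))

cuda_report()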
Answered by MBT
As it hasn't been proposed here, I'm adding a method using torch.device, as this is quite handy, also when initializing tensors on the correct device.
import torch

# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()
# Additional info when using CUDA
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3, 1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_cached(0)/1024**3, 1), 'GB')
Output:
Using device: cuda
Tesla K80
Memory Usage:
Allocated: 0.3 GB
Cached: 0.6 GB
As mentioned above, using device it is possible to:
- Move tensors to the respective device: torch.rand(10).to(device)
- Create a tensor directly on the device: torch.rand(10, device=device)
This makes switching between CPU and GPU comfortable without changing the actual code.
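For example, a fully device-agnostic snippet (a minimal sketch) could look like this:

import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.rand(10, device=device)  # created directly on the device
y = torch.rand(10).to(device)      # created on the CPU, then moved
print((x + y).device)              # cuda:0 if a GPU is available, else cpu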
Edit:
As there have been some questions and confusion about the cached and allocated memory, I'm adding some additional information about it:
- torch.cuda.max_memory_cached(device=None) returns the maximum GPU memory managed by the caching allocator in bytes for a given device.
- torch.cuda.memory_allocated(device=None) returns the current GPU memory usage by tensors in bytes for a given device.
You can either directly hand over a device as specified further above in the post, or you can leave it None and it will use the current_device().
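As a side note, in newer PyTorch releases memory_cached/max_memory_cached have been renamed to memory_reserved/max_memory_reserved, so the old names may emit deprecation warnings. A minimal sketch of both call styles (explicit device vs. None):

import torch

if torch.cuda.is_available():
    dev = torch.device('cuda:0')
    # explicit device argument
    print(torch.cuda.memory_allocated(dev))
    # omitting the device (i.e. None) falls back to the current device
    print(torch.cuda.memory_allocated())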
Answered by kmario23
After you start running the training loop, if you want to manually watch from the terminal whether your program is utilizing the GPU resources and to what extent, then you can simply use watch, as in:
$ watch -n 2 nvidia-smi
This will continuously update the usage stats every 2 seconds until you press ctrl+c.
If you need more control over the GPU stats, you can use a more sophisticated version of nvidia-smi with --query-gpu=.... Below is a simple illustration of this:
$ watch -n 3 nvidia-smi --query-gpu=index,gpu_name,memory.total,memory.used,memory.free,temperature.gpu,pstate,utilization.gpu,utilization.memory --format=csv
which would output the requested stats in CSV format.
Note: there should not be any space between the comma-separated query names in --query-gpu=...; otherwise those values will be ignored and no stats will be returned.
Also, you can check whether your installation of PyTorch detects your CUDA installation correctly by doing:
In [13]: import torch
In [14]: torch.cuda.is_available()
Out[14]: True
A True status means that PyTorch is configured correctly and is using the GPU, although you still have to move/place the tensors with the necessary statements in your code.
If you want to do this inside Python code, then look into this module:
https://github.com/jonsafari/nvidia-ml-py or on PyPI here: https://pypi.python.org/pypi/nvidia-ml-py/
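For instance, a minimal sketch using those bindings (assuming the package is installed and at least one NVIDIA device is visible):

import pynvml  # provided by the nvidia-ml-py package

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU
mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
util = pynvml.nvmlDeviceGetUtilizationRates(handle)
print('GPU memory used: %d MiB of %d MiB' % (mem.used // 1024**2, mem.total // 1024**2))
print('GPU utilization: %d%%' % util.gpu)
pynvml.nvmlShutdown()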
Answered by TimeSeam
On the official site, on the Get Started page, check GPU support for PyTorch as below:
import torch
torch.cuda.is_available()
Reference: PyTorch | Get Started
Answered by prosti
From a practical standpoint, just one minor digression:
import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
This dev now knows whether it is cuda or cpu.
And there is a difference in how you deal with a model and with tensors when moving to cuda. It is a bit strange at first.
import torch
dev = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
t1 = torch.randn(1,2)
t2 = torch.randn(1,2).to(dev)
print(t1) # tensor([[-0.2678, 1.9252]])
print(t2) # tensor([[ 0.5117, -3.6247]], device='cuda:0')
t1.to(dev)
print(t1) # tensor([[-0.2678, 1.9252]])
print(t1.is_cuda) # False
t1=t1.to(dev)
print(t1) # tensor([[-0.2678, 1.9252]], device='cuda:0')
print(t1.is_cuda) # True
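# M below is assumed to be some torch.nn.Module subclass defined elsewhere;
# unlike tensors, module.to() moves the parameters in place, so no reassignment is needed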
model = M() # not on cuda
model.to(dev) # is on cuda (all parameters)
print(next(model.parameters()).is_cuda) #True
This is all a bit tricky, but understanding it once helps you deal with it quickly, with less debugging.
Answered by Jadiel de Armas
To check if there is a GPU available:
torch.cuda.is_available()
If the above function returns False,
- you either have no GPU,
- or the Nvidia drivers have not been installed so the OS does not see the GPU,
- or the GPU is being hidden by the environment variable CUDA_VISIBLE_DEVICES. When the value of CUDA_VISIBLE_DEVICES is -1, then all your devices are being hidden. You can check that value in code with the line os.environ['CUDA_VISIBLE_DEVICES'] (see the sketch after this list).
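A quick way to inspect that variable from Python (a minimal sketch; the variable may be unset, so a default is used):

import os
print(os.environ.get('CUDA_VISIBLE_DEVICES', '<not set>'))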
If the above function returns True, that does not necessarily mean that you are using the GPU. In Pytorch you can allocate tensors to devices when you create them. By default, tensors get allocated to the cpu. To check where your tensor is allocated, do:
# assuming that 'a' is a tensor created somewhere else
a.device # returns the device where the tensor is allocated
Note that you cannot operate on tensors allocated on different devices. To see how to allocate a tensor to the GPU, see here: https://pytorch.org/docs/stable/notes/cuda.html
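For instance, mixing devices raises a RuntimeError; a minimal sketch (only meaningful when a GPU is present):

import torch

if torch.cuda.is_available():
    a = torch.rand(3)                 # allocated on the CPU
    b = torch.rand(3, device='cuda')  # allocated on the GPU
    try:
        a + b                         # operands live on different devices
    except RuntimeError as e:
        print('Cross-device op failed:', e)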
Answered by Bram Vanroy
Almost all answers here reference torch.cuda.is_available(). However, that's only one side of the coin. It tells you whether the GPU (actually CUDA) is available, not whether it's actually being used. In a typical setup, you would set your device with something like this:
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
but in larger environments (e.g. research) it is also common to give the user more options, so based on input they can disable CUDA, specify CUDA IDs, and so on. In such cases, whether or not the GPU is used is not based only on whether it is available. After the device has been set to a torch device, you can check its type property to verify whether it's CUDA or not.
if device.type == 'cuda':
# do something
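A common pattern for giving users that choice is a command-line flag; a minimal sketch (the --no-cuda flag name is my own, not from the answer):

import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument('--no-cuda', action='store_true', help='disable CUDA even if it is available')
args = parser.parse_args()

use_cuda = torch.cuda.is_available() and not args.no_cuda
device = torch.device('cuda' if use_cuda else 'cpu')
print('Using device:', device)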
Answered by DSBLR
Simply run the following command from a command prompt or Linux environment:
python -c 'import torch; print(torch.cuda.is_available())'
The above should print True.
python -c 'import torch; print(torch.rand(2,3).cuda())'
This one should print the following:
tensor([[0.7997, 0.6170, 0.7042],
        [0.4174, 0.1494, 0.0516]], device='cuda:0')
Answered by mithunpaul
FWIW: if you are here because your pytorch always gives False for torch.cuda.is_available(), that's probably because you installed a pytorch version without GPU support (e.g., you coded on a laptop, then tested on a server). The solution is to uninstall pytorch and install it again with the right command from the pytorch downloads page. Also refer to this pytorch issue.
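One quick way to tell whether the installed build has CUDA support at all (a minimal sketch; torch.version.cuda is None on CPU-only builds):

$ python -c 'import torch; print(torch.__version__, torch.version.cuda)'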
Answered by litesaber
Create a tensor on the GPU as follows:
$ python
>>> import torch
>>> print(torch.rand(3,3).cuda())
Do not quit; open another terminal and check whether the Python process is using the GPU using:
$ nvidia-smi