Python: How to do gradient clipping in PyTorch?

Question

Asked by Gulzar

What is the correct way to perform gradient clipping in pytorch?

I have an exploding gradients problem, and I need to program my way around it.

Answer 1

Accepted answer by a_guest

clip_grad_norm (which is actually deprecated in favor of clip_grad_norm_, following the more consistent syntax of a trailing _ when in-place modification is performed) clips the norm of the overall gradient, computed as if the gradients of all parameters passed to the function were concatenated, as can be seen from the documentation:

The norm is computed over all gradients together, as if they were concatenated into a single vector. Gradients are modified in-place.
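
To make the quoted sentence concrete, here is a minimal sketch that computes the global norm by hand and compares it with what clip_grad_norm_ returns. The toy model, the scaling factor and max_norm are assumptions chosen only for illustration; the 1e-6 term mirrors the small constant PyTorch adds to the denominator.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy model, used only for illustration.
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
# Scale the output up so the gradients are large.
loss = (model(torch.randn(16, 4)) * 100.0).pow(2).mean()
loss.backward()

# Global L2 norm: flatten every gradient and treat them as one long vector.
flat = torch.cat([p.grad.flatten() for p in model.parameters()])
manual_norm = flat.norm(2)

max_norm = 1.0  # assumed threshold
# Returns the total norm measured before clipping; gradients are scaled in-place.
returned_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

# After clipping, every gradient has been multiplied by the same factor,
# so the direction of the overall gradient is unchanged.
clipped = torch.cat([p.grad.flatten() for p in model.parameters()])
clipped_norm = clipped.norm(2)
scale = torch.clamp(max_norm / (manual_norm + 1e-6), max=1.0)
```

Because all gradients share one scaling factor, norm clipping shrinks the step without changing its direction.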

From your example it looks like you want clip_grad_value_ instead, which has a similar syntax and also modifies the gradients in-place:

torch.nn.utils.clip_grad_value_(model.parameters(), clip_value)
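
As a rough, self-contained illustration of value clipping, the sketch below checks that each gradient element is clamped independently. The toy model and the threshold of 0.5 are assumptions, not part of the original answer.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(3, 1)  # hypothetical toy model
loss = (model(torch.randn(8, 3)) * 50.0).pow(2).mean()
loss.backward()

clip_value = 0.5  # assumed threshold
before = [p.grad.clone() for p in model.parameters()]
torch.nn.utils.clip_grad_value_(model.parameters(), clip_value)

# Each element is clamped independently into [-clip_value, clip_value],
# so the direction of the overall gradient can change (unlike norm clipping).
for g_before, p in zip(before, model.parameters()):
    assert torch.equal(p.grad, g_before.clamp(-clip_value, clip_value))
```

Value clipping bounds every component separately, which is simple but can distort the gradient direction when only a few components are large.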

Another option is to register a backward hook. The hook takes the current gradient as input and may return a tensor that is used in place of the previous gradient, i.e. it modifies it. The hook is called each time a gradient has been computed, so there is no need to clip manually once the hook has been registered:

for p in model.parameters():
    p.register_hook(lambda grad: torch.clamp(grad, -clip_value, clip_value))
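
A small sketch of the hook approach end to end, assuming a toy model and an arbitrary clip_value. It also keeps the handles returned by register_hook so the hooks can be removed later, which the answer above does not cover.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(3, 1)  # hypothetical toy model
clip_value = 0.1         # assumed threshold

# register_hook returns a handle; keeping it allows removing the hook later.
handles = [
    p.register_hook(lambda grad: torch.clamp(grad, -clip_value, clip_value))
    for p in model.parameters()
]

loss = (model(torch.randn(8, 3)) * 50.0).pow(2).mean()
loss.backward()  # the hooks clamp each gradient before it is stored in p.grad

max_abs_grad = max(p.grad.abs().max().item() for p in model.parameters())

# Detach the hooks once clipping is no longer wanted.
for h in handles:
    h.remove()
```

Note that the hook clamps the gradient produced by each backward() call, so if gradients are accumulated over several calls the accumulated sum in p.grad can still exceed clip_value.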

Answer 2

Answer by Rahul

A more complete example:

optimizer.zero_grad()
loss, hidden = model(data, hidden, targets)
loss.backward()

# Clip after backward() and before step(), so the optimizer sees the clipped gradients
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()

Source: https://github.com/pytorch/pytorch/issues/309
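
The snippet above depends on a model, data and args defined elsewhere. A runnable sketch of the same ordering might look like the following, where the network, the random data, the optimizer settings and max_norm (standing in for args.clip) are all assumptions made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the model, data and args.clip in the snippet above.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()
max_norm = 0.25  # plays the role of args.clip; the value is arbitrary

inputs = torch.randn(64, 10)
targets = torch.randn(64, 1)

pre_clip_norms = []
for step in range(5):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()

    # Clip after backward() and before step(); the return value is the
    # total gradient norm measured before clipping.
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    pre_clip_norms.append(float(total_norm))

    optimizer.step()
```

Logging the returned pre-clip norm over a few steps is one way to pick a threshold: if it is almost always far below max_norm, clipping rarely triggers.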

Answer 3

Answer by Gulzar

Reading through the forum discussion gave this:

clipping_value = 1  # arbitrary value of your choosing
# clip_grad_norm_ (trailing underscore) replaces the deprecated clip_grad_norm
torch.nn.utils.clip_grad_norm_(model.parameters(), clipping_value)

I'm sure there is more depth to it than only this code snippet.
