Python PyTorch: how to set .requires_grad to False
Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the CC BY-SA license, cite the original link and author information, and attribute it to the original authors (not me): StackOverflow
Original link: http://stackoverflow.com/questions/51748138/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share them, but you must attribute them to the original authors (not me): StackOverflow
pytorch how to set .requires_grad False
Asked by Qian Wang
I want to freeze some of my model. Following the official docs:
import torch
import torch.nn as nn

with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.eval()
    print(linear.weight.requires_grad)
But it prints True instead of False. If I want to set the model in eval mode, what should I do?
Answered by iacolippo
requires_grad=False
If you want to freeze part of your model and train the rest, you can set requires_grad of the parameters you want to freeze to False.
For example, if you only want to keep the convolutional part of VGG16 fixed:
import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False
By switching the requires_grad flags to False, no intermediate buffers will be saved, until the computation gets to some point where one of the inputs of the operation requires the gradient.
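As a follow-up (my addition, not part of the original answer), the frozen parameters are usually also kept out of the optimizer, so only the unfrozen part gets updated; the choice of SGD and the learning rate below are illustrative assumptions:

import torch
import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False  # freeze the convolutional backbone

# Hand only the still-trainable parameters to the optimizer, so the frozen
# backbone can never be updated. lr=1e-3 is purely illustrative.
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=1e-3)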
torch.no_grad()
Using the context manager torch.no_grad is a different way to achieve that goal: in the no_grad context, all the results of the computations will have requires_grad=False, even if the inputs have requires_grad=True. Notice that you won't be able to backpropagate the gradient to layers before the no_grad. For example:
x = torch.randn(2, 2)
x.requires_grad = True
lin0 = nn.Linear(2, 2)
lin1 = nn.Linear(2, 2)
lin2 = nn.Linear(2, 2)
x1 = lin0(x)
with torch.no_grad():
    x2 = lin1(x1)
x3 = lin2(x2)
x3.sum().backward()
print(lin0.weight.grad, lin1.weight.grad, lin2.weight.grad)
outputs:
(None, None, tensor([[-1.4481, -1.1789],
[-1.4481, -1.1789]]))
Here lin1.weight.requires_grad was True, but the gradient wasn't computed because the operation was done in the no_grad context.
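A quick check (my addition, continuing the snippet above) makes the cut in the graph visible:

print(x1.requires_grad)  # True:  computed outside no_grad, still part of the autograd graph
print(x2.requires_grad)  # False: computed inside no_grad, so backward() stops here
# That is why lin0 and lin1 receive no gradient while lin2 does.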
model.eval()
If your goal is not to finetune, but to set your model in inference mode, the most convenient way is to use the torch.no_grad context manager. In this case you also have to set your model to evaluation mode; this is achieved by calling eval() on the nn.Module, for example:
model = torchvision.models.vgg16(pretrained=True)
model.eval()
This operation sets the attribute self.training of the layers to False; in practice this will change the behavior of operations like Dropout or BatchNorm that must behave differently at training and test time.
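A minimal sketch (my addition, not from the original answer) of how self.training changes what Dropout does:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(5)

drop.train()    # self.training = True: roughly half the entries are zeroed at random
print(drop(x))  # e.g. tensor([2., 0., 0., 2., 2.]) - survivors are rescaled by 1/(1-p)

drop.eval()     # self.training = False: Dropout becomes the identity
print(drop(x))  # tensor([1., 1., 1., 1., 1.])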
Answered by Salih Karagoz
Here is the way:
linear = nn.Linear(1, 1)
for param in linear.parameters():
    param.requires_grad = False

with torch.no_grad():
    linear.eval()
    print(linear.weight.requires_grad)
OUTPUT: False
Answered by benjaminplanche
To complete @Salih_Karagoz's answer, you also have the torch.set_grad_enabled() context (documented further in the PyTorch docs), which can be used to easily switch between train/eval modes:
linear = nn.Linear(1, 1)
is_train = False

with torch.set_grad_enabled(is_train):
    linear.eval()
    print(linear.weight.requires_grad)
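A small sketch (my addition, not from the answer) of using the same flag to toggle gradient tracking for a forward pass:

import torch
import torch.nn as nn

linear = nn.Linear(1, 1)
x = torch.randn(1, 1)

for is_train in (True, False):
    with torch.set_grad_enabled(is_train):
        y = linear(x)
    # the output tracks gradients only when is_train is True
    print(is_train, y.requires_grad)  # prints "True True", then "False False"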
Answered by prosti
Nice. The trick is to check that when you define a Linear layer, by default the parameters will have requires_grad=True, because we would like to learn, right?
l = nn.Linear(1, 1)
p = l.parameters()
for _ in p:
    print(_)

# Parameter containing:
# tensor([[-0.3258]], requires_grad=True)
# Parameter containing:
# tensor([0.6040], requires_grad=True)
The other construct,
with torch.no_grad():
means you cannot learn in here.
So your code just shows that you are capable of learning, even though you are in torch.no_grad(), where learning is forbidden.
with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.eval()
    print(linear.weight.requires_grad)  # True
If you really plan to turn off requires_grad for the weight parameter, you can do it also with:
linear.weight.requires_grad_(False)
or
linear.weight.requires_grad = False
So your code may become like this:
with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.weight.requires_grad_(False)
    linear.eval()
    print(linear.weight.requires_grad)
If you plan to toggle requires_grad for all params in a module:
l = nn.Linear(1, 1)
for _ in l.parameters():
    _.requires_grad_(False)
    print(_)
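As a side note (my addition, assuming a reasonably recent PyTorch version), nn.Module.requires_grad_ toggles the flag for every parameter of a module in one call:

l = nn.Linear(1, 1)
l.requires_grad_(False)  # same effect as the loop above
print(all(not p.requires_grad for p in l.parameters()))  # True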