Python: How to set .requires_grad to False in PyTorch

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use or share it, you must do so under the same license and attribute it to the original authors (not me) on StackOverflow. Original question: http://stackoverflow.com/questions/51748138/

pytorch how to set .requires_grad False

python, pytorch, gradient-descent

Asked by Qian Wang

I want to set some of my model frozen. Following the official docs:

with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.eval()
    print(linear.weight.requires_grad)

But it prints True instead of False. If I want to set the model in eval mode, what should I do?

Answered by iacolippo

requires_grad=False

If you want to freeze part of your model and train the rest, you can set requires_grad of the parameters you want to freeze to False.

For example, if you only want to keep the convolutional part of VGG16 fixed:

import torchvision

model = torchvision.models.vgg16(pretrained=True)
for param in model.features.parameters():
    param.requires_grad = False
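
As a side note (my own sketch, not part of the original answer), a common companion step is to pass only the still-trainable parameters to the optimizer, so the frozen convolutional features are never touched by the updates:

import torch

# assumes `model` from the snippet above, with model.features frozen
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable_params, lr=1e-3, momentum=0.9)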

By switching the requires_grad flags to False, no intermediate buffers will be saved until the computation reaches a point where one of the inputs of an operation requires the gradient.

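For illustration (a small sketch of my own, not part of the original answer), freezing an early layer this way leaves its .grad untouched, while later, trainable layers still receive gradients:

import torch
import torch.nn as nn

lin_frozen = nn.Linear(2, 2)
lin_train = nn.Linear(2, 2)
for p in lin_frozen.parameters():
    p.requires_grad = False

out = lin_train(lin_frozen(torch.randn(1, 2)))
out.sum().backward()
print(lin_frozen.weight.grad)  # None: no gradient buffers were kept for the frozen layer
print(lin_train.weight.grad)   # a 2x2 tensor: this layer can still be trained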

torch.no_grad()

Using the context manager torch.no_grad is a different way to achieve that goal: in the no_grad context, all the results of the computations will have requires_grad=False, even if the inputs have requires_grad=True. Notice that you won't be able to backpropagate the gradient to layers before the no_grad. For example:

import torch
import torch.nn as nn

x = torch.randn(2, 2)
x.requires_grad = True

lin0 = nn.Linear(2, 2)
lin1 = nn.Linear(2, 2)
lin2 = nn.Linear(2, 2)
x1 = lin0(x)
with torch.no_grad():    
    x2 = lin1(x1)
x3 = lin2(x2)
x3.sum().backward()
print(lin0.weight.grad, lin1.weight.grad, lin2.weight.grad)

outputs:

(None, None, tensor([[-1.4481, -1.1789],
         [-1.4481, -1.1789]]))

Here lin1.weight.requires_grad was True, but the gradient wasn't computed because the operation was done in the no_grad context.

model.eval()

If your goal is not to finetune, but to set your model in inference mode, the most convenient way is to use the torch.no_grad context manager. In this case you also have to set your model to evaluation mode; this is achieved by calling eval() on the nn.Module, for example:

model = torchvision.models.vgg16(pretrained=True)
model.eval()

This operation sets the attribute self.training of the layers to False; in practice this will change the behavior of operations like Dropout or BatchNorm that must behave differently at training and test time.

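As a quick illustration (my own sketch, not from the original answer), Dropout acts as the identity after eval() but randomly zeroes activations in training mode:

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 6)

drop.train()     # training mode: roughly half the entries are zeroed (and the rest scaled by 2)
print(drop(x))
drop.eval()      # eval mode: dropout is a no-op
print(drop(x))   # tensor([[1., 1., 1., 1., 1., 1.]])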

Answered by Salih Karagoz

Here is one way:

import torch
import torch.nn as nn

linear = nn.Linear(1, 1)

for param in linear.parameters():
    param.requires_grad = False

with torch.no_grad():
    linear.eval()
    print(linear.weight.requires_grad)

OUTPUT: False

Answered by benjaminplanche

To complement @Salih_Karagoz's answer, you also have the torch.set_grad_enabled() context (documented in the PyTorch docs), which can be used to easily switch between train/eval modes:

linear = nn.Linear(1,1)

is_train = False
with torch.set_grad_enabled(is_train):
    linear.eval()
    print(linear.weight.requires_grad)
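
Note that torch.set_grad_enabled can also be called as a plain function rather than as a context manager, which toggles gradient tracking globally from that point on (a small aside of my own, not part of the original answer):

import torch
import torch.nn as nn

torch.set_grad_enabled(False)   # disable autograd globally
linear = nn.Linear(1, 1)
y = linear(torch.randn(1, 1))
print(y.requires_grad)          # False: no graph was built for this computation
torch.set_grad_enabled(True)    # re-enable autograd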

Answered by Meiqi

This tutorial may help.

In short, I think a good approach to this question could be:

linear = nn.Linear(1,1)

for param in linear.parameters():
    param.requires_grad = False

linear.eval()
print(linear.weight.requires_grad)

Answered by prosti

Nice. The trick is that when you define a Linear layer, by default its parameters will have requires_grad=True, because we would like to learn, right?

l = nn.Linear(1, 1)
p = l.parameters()
for _ in p:
    print (_)

# Parameter containing:
# tensor([[-0.3258]], requires_grad=True)
# Parameter containing:
# tensor([0.6040], requires_grad=True)    

The other construct,

with torch.no_grad():

means that you cannot learn inside it.

So your code just shows that the parameters are still capable of learning, even though you are inside torch.no_grad(), where learning is forbidden.

with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.eval()
    print(linear.weight.requires_grad)  # True

If you really plan to turn off requires_grad for the weight parameter, you can also do it with:

linear.weight.requires_grad_(False)

or

linear.weight.requires_grad = False

So your code may become like this:

with torch.no_grad():
    linear = nn.Linear(1, 1)
    linear.weight.requires_grad_(False)
    linear.eval()
    print(linear.weight.requires_grad)

If you plan to switch off requires_grad for all the params in a module:

l = nn.Linear(1, 1)
for _ in l.parameters():
    _.requires_grad_(False)
    print(_)
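
As a final aside (not in the original answer, but available in recent PyTorch versions), nn.Module also exposes requires_grad_ directly, which flips the flag for every parameter of the module in one call:

import torch.nn as nn

l = nn.Linear(1, 1)
l.requires_grad_(False)   # sets requires_grad=False on all parameters of the module
print(all(not p.requires_grad for p in l.parameters()))   # True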