Python: Which parameters should be used for early stopping?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/43906048/

Date: 2020-08-19 23:31:03  Source: igfitidea

Which parameters should be used for early stopping?

python, keras, deep-learning, conv-neural-network

Asked by AizuddinAzman

I'm training a neural network for my project using Keras. Keras provides a function for early stopping. May I know which parameters should be observed to prevent my neural network from overfitting by using early stopping?


Answered by umutto

Early stopping


Early stopping is basically stopping the training once your validation loss starts to increase (or, in other words, once validation accuracy starts to decrease). According to the documentation, it is used as follows:


keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=0,
                              verbose=0, mode='auto')

The values depend on your implementation (problem, batch size, etc.), but generally, to prevent overfitting, I would use the following:


  1. Monitor the validation loss (you need cross validation, or at least separate train/test sets) by setting the monitor argument to 'val_loss'.
  2. min_delta is a threshold that decides whether the loss at some epoch counts as an improvement or not. If the difference in loss is below min_delta, it is quantified as no improvement. It is better to leave it at 0, since we are interested in when the loss becomes worse.
  3. The patience argument is the number of epochs to wait before stopping once your loss starts to increase (stops improving). This depends on your implementation: if you use very small batches or a large learning rate, your loss will zig-zag (and accuracy will be noisier), so it is better to set a larger patience. If you use large batches and a small learning rate, your loss will be smoother, so you can use a smaller patience. Either way, I'll leave it at 2 to give the model more of a chance.
  4. verbose decides what to print; leave it at the default (0).
  5. The mode argument depends on the direction of your monitored quantity (is it supposed to be decreasing or increasing?). Since we monitor the loss, we can use min. But let's leave that to Keras and set it to auto.

So I would use something like the following, and experiment by plotting the loss with and without early stopping.


keras.callbacks.EarlyStopping(monitor='val_loss',
                              min_delta=0,
                              patience=2,
                              verbose=0, mode='auto')
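To make the roles of min_delta, patience and mode concrete, the stopping rule these arguments implement can be sketched in plain Python. This is a simplified illustration of the rule described above, not Keras's actual implementation, and the function name should_stop is my own:

```python
def should_stop(monitored, min_delta=0.0, patience=2, mode="min"):
    """Replay per-epoch values of the monitored quantity (e.g. val_loss)
    and report whether this early-stopping rule would have triggered."""
    if mode == "min":  # convert "lower is better" into "higher is better"
        monitored = [-v for v in monitored]
    best = float("-inf")
    wait = 0  # epochs elapsed since the last improvement
    for value in monitored:
        if value - best > min_delta:  # improved by more than min_delta
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:  # no improvement for `patience` epochs
                return True
    return False

# val_loss falls, then rises for two epochs -> stop (patience=2)
print(should_stop([0.9, 0.8, 0.7, 0.75, 0.8]))            # True
# val_loss is still falling -> keep training
print(should_stop([0.9, 0.8, 0.7]))                       # False
# tiny improvements below min_delta count as no improvement
print(should_stop([0.50, 0.499, 0.498], min_delta=0.01))  # True
```

The third call shows why leaving min_delta at 0 matters: any positive min_delta makes small genuine improvements count as stagnation.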


In case there is any ambiguity about how callbacks work, I'll try to explain more. Once you call fit(..., callbacks=[es]) on your model, Keras calls predetermined functions on the given callback objects. These functions can be called on_train_begin, on_train_end, on_epoch_begin, on_epoch_end, on_batch_begin and on_batch_end. The early stopping callback is called at the end of every epoch; it compares the best monitored value with the current one and stops if its conditions are met (has the number of epochs since the best monitored value was observed exceeded the patience argument, is the difference from the last value bigger than min_delta, and so on).

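That lifecycle can be illustrated with a toy stand-in (these are not Keras's real classes; the names ToyEarlyStopping and the toy fit() below are my own): a training loop simply fires the hooks at the right moments, and the callback decides after each epoch whether to stop.

```python
class Callback:
    """Minimal stand-in for a Keras-style callback base class."""
    def on_train_begin(self): pass
    def on_epoch_end(self, epoch, logs): pass
    def on_train_end(self): pass

class ToyEarlyStopping(Callback):
    def __init__(self, patience=2):
        self.patience = patience
        self.best = float("inf")
        self.wait = 0
        self.stop_training = False

    def on_epoch_end(self, epoch, logs):
        if logs["val_loss"] < self.best:       # improvement: reset the counter
            self.best, self.wait = logs["val_loss"], 0
        else:                                  # stagnation: count it
            self.wait += 1
            if self.wait >= self.patience:
                self.stop_training = True

def fit(epochs, val_losses, callbacks):
    """Toy fit(): runs up to `epochs` epochs, firing hooks after each one."""
    for cb in callbacks:
        cb.on_train_begin()
    ran = 0
    for epoch in range(epochs):
        # ... one epoch of actual training would happen here ...
        ran += 1
        for cb in callbacks:
            cb.on_epoch_end(epoch, {"val_loss": val_losses[epoch]})
        if any(getattr(cb, "stop_training", False) for cb in callbacks):
            break
    for cb in callbacks:
        cb.on_train_end()
    return ran

es = ToyEarlyStopping(patience=2)
# loss improves for 3 epochs, then stalls for 2 -> stops after epoch 5
print(fit(10, [0.9, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7], [es]))  # 5
```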

As pointed out by @BrentFaust in the comments, the model's training will continue until either the early stopping conditions are met or the epochs parameter (default=10) of fit() is exhausted. Setting an early stopping callback will not make the model train beyond its epochs parameter, so calling fit() with a larger epochs value benefits more from the early stopping callback.

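A short self-contained sketch of that interaction (a toy loop with made-up loss values, not Keras itself; epochs_actually_run is my own name): training always stops at whichever comes first, the epochs cap or the patience being exhausted.

```python
def epochs_actually_run(epochs, val_losses, patience=2):
    """Toy loop: stop when `patience` epochs pass with no improvement,
    or when the `epochs` cap is reached, whichever comes first."""
    best = float("inf")
    wait = 0
    for epoch in range(epochs):
        loss = val_losses[epoch]
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch + 1  # early stopping fired
    return epochs  # epochs cap was reached first

losses = [0.9, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
print(epochs_actually_run(3, losses))   # 3: cap hit, early stopping never fires
print(epochs_actually_run(10, losses))  # 5: early stopping fires first
```

With epochs=3 the loss is still improving when the cap is hit, so the callback never gets a chance to act; with a generous epochs=10 the callback stops training at epoch 5, which is why a larger epochs value pairs well with early stopping.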