Python: Loading a trained Keras model and continuing training
Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you use it, you must follow the same CC BY-SA license, credit the original URL and author information, and attribute it to the original author (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/42666046/
Warning: these are provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me): StackOverflow
Loading a trained Keras model and continue training
Asked by Wilmar van Ommeren
I was wondering if it was possible to save a partly trained Keras model and continue the training after loading the model again.
The reason for this is that I will have more training data in the future and I do not want to retrain the whole model again.
The functions which I am using are:
#Partly train model
model.fit(first_training, first_classes, batch_size=32, nb_epoch=20)
#Save partly trained model
model.save('partly_trained.h5')
#Load partly trained model
from keras.models import load_model
model = load_model('partly_trained.h5')
#Continue training
model.fit(second_training, second_classes, batch_size=32, nb_epoch=20)
Edit 1: added fully working example
With the first dataset, after 10 epochs the loss of the last epoch will be 0.0748 and the accuracy 0.9863.
After saving, deleting and reloading the model, the loss and accuracy of the model trained on the second dataset will be 0.1711 and 0.9504 respectively.
Is this caused by the new training data or by a completely re-trained model?
"""
Model by: http://machinelearningmastery.com/
"""
# load (downloaded if needed) the MNIST dataset
import numpy
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from keras.models import load_model
numpy.random.seed(7)
def baseline_model():
    model = Sequential()
    model.add(Dense(num_pixels, input_dim=num_pixels, init='normal', activation='relu'))
    model.add(Dense(num_classes, init='normal', activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

if __name__ == '__main__':
    # load data
    (X_train, y_train), (X_test, y_test) = mnist.load_data()

    # flatten 28*28 images to a 784 vector for each image
    num_pixels = X_train.shape[1] * X_train.shape[2]
    X_train = X_train.reshape(X_train.shape[0], num_pixels).astype('float32')
    X_test = X_test.reshape(X_test.shape[0], num_pixels).astype('float32')

    # normalize inputs from 0-255 to 0-1
    X_train = X_train / 255
    X_test = X_test / 255

    # one hot encode outputs
    y_train = np_utils.to_categorical(y_train)
    y_test = np_utils.to_categorical(y_test)
    num_classes = y_test.shape[1]

    # build the model
    model = baseline_model()

    # partly train model
    dataset1_x = X_train[:3000]
    dataset1_y = y_train[:3000]
    model.fit(dataset1_x, dataset1_y, nb_epoch=10, batch_size=200, verbose=2)

    # final evaluation of the model
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))

    # save partly trained model
    model.save('partly_trained.h5')
    del model

    # reload model
    model = load_model('partly_trained.h5')

    # continue training
    dataset2_x = X_train[3000:]
    dataset2_y = y_train[3000:]
    model.fit(dataset2_x, dataset2_y, nb_epoch=10, batch_size=200, verbose=2)
    scores = model.evaluate(X_test, y_test, verbose=0)
    print("Baseline Error: %.2f%%" % (100 - scores[1] * 100))
Accepted answer by Marcin Możejko
Actually, model.save saves all the information needed for restarting training in your case. The only thing which could be spoiled by reloading the model is your optimizer state. To check that, try to save and reload the model and train it on the training data.
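To make that check concrete, here is a minimal sketch (not the asker's original script; it assumes TensorFlow 2.x / tf.keras and uses illustrative layer sizes and epoch counts): train briefly, save, reload, and compare the last loss before saving with the first loss after reloading. If the optimizer state survived, the two values should be close.

# Minimal sketch of the save/reload check described above.
# Assumes TensorFlow 2.x (tf.keras); layer sizes and epochs are illustrative.
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history_before = model.fit(x_train, y_train, epochs=3, batch_size=200, verbose=2)
model.save('check_optimizer_state.h5')  # saves weights, architecture and optimizer state

restored = tf.keras.models.load_model('check_optimizer_state.h5')
history_after = restored.fit(x_train, y_train, epochs=3, batch_size=200, verbose=2)

# If the optimizer state was restored, the first loss of the second run
# should be close to the last loss of the first run, not to the initial loss.
print(history_before.history['loss'][-1], history_after.history['loss'][0])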
Answered by Wolfgang
The problem might be that you use a different optimizer - or different arguments to your optimizer. I just had the same issue with a custom pretrained model, using
from keras.callbacks import ReduceLROnPlateau

# lr_reduction_factor, patience and min_lr come from the answerer's own configuration
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=lr_reduction_factor,
                              patience=patience, min_lr=min_lr, verbose=1)
for the pretrained model, whereby the original learning rate starts at 0.0003 and during pre-training is reduced to the minimum learning rate (min_lr), which is 0.000003.
I just copied that line over to the script which uses the pre-trained model and got really bad accuracies, until I noticed that the last learning rate of the pretrained model was the min learning rate, i.e. 0.000003. If I start with that learning rate, I get exactly the same accuracies to start with as the output of the pretrained model - which makes sense: starting with a learning rate that is 100 times bigger than the last learning rate used in the pretrained model results in a huge overshoot of gradient descent and hence in heavily decreased accuracies.
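A small sketch of that check, under the assumption that the model was saved together with its optimizer (tf.keras shown here; in older standalone Keras the attribute is optimizer.lr rather than optimizer.learning_rate):

# Sketch: read the learning rate that was stored with the saved model,
# so a follow-up training script can start from the same value.
# Assumes tf.keras and the file name used earlier in this thread.
import tensorflow as tf
from tensorflow.keras import backend as K

model = tf.keras.models.load_model('partly_trained.h5')
saved_lr = K.get_value(model.optimizer.learning_rate)
print('learning rate stored with the model:', saved_lr)

# If you rebuild and recompile the model in a new script instead of loading it,
# pass the same value explicitly rather than the original starting rate.
new_optimizer = tf.keras.optimizers.Adam(learning_rate=saved_lr)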
Answered by Vishnuvardhan Janapati
Most of the above answers covered important points. If you are using a recent TensorFlow (TF 2.1 or above), then the following example will help you. The model part of the code is from the TensorFlow website.
import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
def create_model():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
# Create a basic model instance
model=create_model()
model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)
Please save the model in *.tf format. From my experience, if you have any custom_loss defined, the *.h5 format will not save the optimizer status and hence will not serve your purpose if you want to retrain the model from where we left off.
# saving the model in tensorflow format
model.save('./MyModel_tf',save_format='tf')
# loading the saved model
loaded_model = tf.keras.models.load_model('./MyModel_tf')
# retraining the model
loaded_model.fit(x_train, y_train, epochs = 10, validation_data = (x_test,y_test),verbose=1)
This approach will restart the training where we left off before saving the model. As mentioned by others, if you want to save the weights of the best model, or to save the model's weights every epoch, you need to use the Keras callbacks function (ModelCheckpoint) with options such as save_weights_only=True, save_freq='epoch', and save_best_only.
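A minimal sketch of such a callback, using the tf.keras model from the example above (the file path pattern and monitored metric are illustrative assumptions, not prescribed by the answer):

# Sketch: write the weights at the end of every epoch with ModelCheckpoint.
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(
    filepath='weights_epoch_{epoch:02d}.h5',
    save_weights_only=True,   # only the weights, not the full model
    save_freq='epoch',        # one file per epoch
    save_best_only=False,     # set to True to keep only the best epoch per 'monitor'
    monitor='val_loss',
    verbose=1)

model.fit(x_train, y_train, epochs=10,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint], verbose=1)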
For more details, please check here and another example here.
Answered by flowgrad
All of the above helps. You must resume from the same learning rate (LR) as when the model and weights were saved. Set it directly on the optimizer.
Note that improvement from there is not guaranteed, because the model may have reached a local minimum, which may be global. There is no point in resuming a model in order to search for another local minimum, unless you intend to increase the learning rate in a controlled fashion and nudge the model into a possibly better minimum not far away.
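A short sketch of setting the learning rate directly on the optimizer of a reloaded model (tf.keras assumed; the value is just the example rate mentioned in the earlier answer, not a recommendation):

# Sketch: force the optimizer of a reloaded model back to the learning rate
# that was in effect when it was saved. The value here is purely illustrative.
import tensorflow as tf
from tensorflow.keras import backend as K

model = tf.keras.models.load_model('partly_trained.h5')
K.set_value(model.optimizer.learning_rate, 0.000003)
print(K.get_value(model.optimizer.learning_rate))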
Answered by Gustavo
You might also be hitting concept drift; see Should you retrain a model when new observations are available. There is also the concept of catastrophic forgetting, which a number of academic papers discuss. Here is one with MNIST: Empirical investigation of catastrophic forgetting.