Python 如何判断哪个 Keras 模型更好?
声明:本页面是 StackOverFlow 热门问题的中英对照翻译,遵循 CC BY-SA 4.0 协议。如果您需要使用它,必须同样遵循 CC BY-SA 许可,注明原文地址和作者信息,并将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/34702041/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must attribute it to the original authors (not me):
StackOverFlow
How to tell which Keras model is better?
提问 by pr338
I don't understand which accuracy in the output to use to compare my 2 Keras models to see which one is better.
我不明白使用输出中的哪个精度来比较我的 2 个 Keras 模型以查看哪个更好。
Do I use the "acc" (from the training data?) one or the "val acc" (from the validation data?) one?
我应该使用“acc”(来自训练数据?)还是“val acc”(来自验证数据?)?
There are different accs and val accs for each epoch. How do I know the acc or val acc for my model as a whole? Do I average all of the epochs accs or val accs to find the acc or val acc of the model as a whole?
每个 epoch 都有不同的 acc 和 val acc。我如何知道整个模型的 acc 或 val acc?是否应该对所有 epoch 的 acc 或 val acc 求平均,来得到整个模型的 acc 或 val acc?
Model 1 Output
模型 1 输出
Train on 970 samples, validate on 243 samples
Epoch 1/20
0s - loss: 0.1708 - acc: 0.7990 - val_loss: 0.2143 - val_acc: 0.7325
Epoch 2/20
0s - loss: 0.1633 - acc: 0.8021 - val_loss: 0.2295 - val_acc: 0.7325
Epoch 3/20
0s - loss: 0.1657 - acc: 0.7938 - val_loss: 0.2243 - val_acc: 0.7737
Epoch 4/20
0s - loss: 0.1847 - acc: 0.7969 - val_loss: 0.2253 - val_acc: 0.7490
Epoch 5/20
0s - loss: 0.1771 - acc: 0.8062 - val_loss: 0.2402 - val_acc: 0.7407
Epoch 6/20
0s - loss: 0.1789 - acc: 0.8021 - val_loss: 0.2431 - val_acc: 0.7407
Epoch 7/20
0s - loss: 0.1789 - acc: 0.8031 - val_loss: 0.2227 - val_acc: 0.7778
Epoch 8/20
0s - loss: 0.1810 - acc: 0.8010 - val_loss: 0.2438 - val_acc: 0.7449
Epoch 9/20
0s - loss: 0.1711 - acc: 0.8134 - val_loss: 0.2365 - val_acc: 0.7490
Epoch 10/20
0s - loss: 0.1852 - acc: 0.7959 - val_loss: 0.2423 - val_acc: 0.7449
Epoch 11/20
0s - loss: 0.1889 - acc: 0.7866 - val_loss: 0.2523 - val_acc: 0.7366
Epoch 12/20
0s - loss: 0.1838 - acc: 0.8021 - val_loss: 0.2563 - val_acc: 0.7407
Epoch 13/20
0s - loss: 0.1835 - acc: 0.8041 - val_loss: 0.2560 - val_acc: 0.7325
Epoch 14/20
0s - loss: 0.1868 - acc: 0.8031 - val_loss: 0.2573 - val_acc: 0.7407
Epoch 15/20
0s - loss: 0.1829 - acc: 0.8072 - val_loss: 0.2581 - val_acc: 0.7407
Epoch 16/20
0s - loss: 0.1878 - acc: 0.8062 - val_loss: 0.2589 - val_acc: 0.7407
Epoch 17/20
0s - loss: 0.1833 - acc: 0.8072 - val_loss: 0.2613 - val_acc: 0.7366
Epoch 18/20
0s - loss: 0.1837 - acc: 0.8113 - val_loss: 0.2605 - val_acc: 0.7325
Epoch 19/20
0s - loss: 0.1906 - acc: 0.8010 - val_loss: 0.2555 - val_acc: 0.7407
Epoch 20/20
0s - loss: 0.1884 - acc: 0.8062 - val_loss: 0.2542 - val_acc: 0.7449
Model 2 Output
模型 2 输出
Train on 970 samples, validate on 243 samples
Epoch 1/20
0s - loss: 0.1735 - acc: 0.7876 - val_loss: 0.2386 - val_acc: 0.6667
Epoch 2/20
0s - loss: 0.1733 - acc: 0.7825 - val_loss: 0.1894 - val_acc: 0.7449
Epoch 3/20
0s - loss: 0.1781 - acc: 0.7856 - val_loss: 0.2028 - val_acc: 0.7407
Epoch 4/20
0s - loss: 0.1717 - acc: 0.8021 - val_loss: 0.2545 - val_acc: 0.7119
Epoch 5/20
0s - loss: 0.1757 - acc: 0.8052 - val_loss: 0.2252 - val_acc: 0.7202
Epoch 6/20
0s - loss: 0.1776 - acc: 0.8093 - val_loss: 0.2449 - val_acc: 0.7490
Epoch 7/20
0s - loss: 0.1833 - acc: 0.7897 - val_loss: 0.2272 - val_acc: 0.7572
Epoch 8/20
0s - loss: 0.1827 - acc: 0.7928 - val_loss: 0.2376 - val_acc: 0.7531
Epoch 9/20
0s - loss: 0.1795 - acc: 0.8062 - val_loss: 0.2445 - val_acc: 0.7490
Epoch 10/20
0s - loss: 0.1746 - acc: 0.8103 - val_loss: 0.2491 - val_acc: 0.7449
Epoch 11/20
0s - loss: 0.1831 - acc: 0.8082 - val_loss: 0.2477 - val_acc: 0.7449
Epoch 12/20
0s - loss: 0.1831 - acc: 0.8113 - val_loss: 0.2496 - val_acc: 0.7490
Epoch 13/20
0s - loss: 0.1920 - acc: 0.8000 - val_loss: 0.2459 - val_acc: 0.7449
Epoch 14/20
0s - loss: 0.1945 - acc: 0.7928 - val_loss: 0.2446 - val_acc: 0.7490
Epoch 15/20
0s - loss: 0.1852 - acc: 0.7990 - val_loss: 0.2459 - val_acc: 0.7449
Epoch 16/20
0s - loss: 0.1800 - acc: 0.8062 - val_loss: 0.2495 - val_acc: 0.7449
Epoch 17/20
0s - loss: 0.1891 - acc: 0.8000 - val_loss: 0.2469 - val_acc: 0.7449
Epoch 18/20
0s - loss: 0.1891 - acc: 0.8041 - val_loss: 0.2467 - val_acc: 0.7531
Epoch 19/20
0s - loss: 0.1853 - acc: 0.8072 - val_loss: 0.2511 - val_acc: 0.7449
Epoch 20/20
0s - loss: 0.1905 - acc: 0.8062 - val_loss: 0.2460 - val_acc: 0.7531
采纳答案 by aleju
Do I use the "acc" (from the training data?) one or the "val acc" (from the validation data?) one?
我应该使用“acc”(来自训练数据?)还是“val acc”(来自验证数据?)?
If you want to estimate the ability of your model to generalize to new data (which is probably what you want to do), then you look at the validation accuracy, because the validation split contains only data that the model never sees during training and therefore cannot just memorize.
如果您想估计模型泛化到新数据的能力(这很可能正是您想做的),那么应该看验证准确率,因为验证集只包含模型在训练期间从未见过的数据,因此模型无法靠死记硬背取得好成绩。
If your training data accuracy ("acc") keeps improving while your validation data accuracy ("val_acc") gets worse, you are likely in an overfitting situation, i.e. your model is basically starting to just memorize the data.
如果您的训练数据准确率(“acc”)不断提高,而验证数据准确率(“val_acc”)却在变差,那么您很可能遇到了过拟合,也就是说,您的模型基本上开始只是死记硬背数据了。
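As an illustration, here is a rough pure-Python sketch of that overfitting signal: training accuracy rising while validation accuracy falls. The acc/val_acc lists below are made-up example numbers, not taken from the logs in this question.
作为示意,下面是这种过拟合信号的一个粗略纯 Python 草图:训练准确率上升而验证准确率下降。其中的 acc/val_acc 列表是虚构的示例数据,并非取自本问题的日志。

```python
def looks_overfit(acc, val_acc, window=3):
    """Heuristic: over the last `window` epochs, training accuracy
    went up while validation accuracy went down."""
    return acc[-1] > acc[-window] and val_acc[-1] < val_acc[-window]

# Made-up per-epoch accuracies for illustration.
acc     = [0.70, 0.75, 0.80, 0.85, 0.90]  # training accuracy keeps rising
val_acc = [0.68, 0.72, 0.74, 0.71, 0.69]  # validation accuracy peaks, then falls

print(looks_overfit(acc, val_acc))  # True
```

In practice these lists would come from the History object that Keras' model.fit returns (history.history['acc'] and history.history['val_acc']).
在实践中,这些列表可以从 Keras 的 model.fit 返回的 History 对象中取得(history.history['acc'] 和 history.history['val_acc'])。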
There are different accs and val accs for each epoch. How do I know the acc or val acc for my model as a whole? Do I average all of the epochs accs or val accs to find the acc or val acc of the model as a whole?
每个 epoch 都有不同的 acc 和 val acc。我如何知道整个模型的 acc 或 val acc?是否应该对所有 epoch 的 acc 或 val acc 求平均,来得到整个模型的 acc 或 val acc?
Each epoch is a training run over all of your data. During that run the parameters of your model are adjusted according to your loss function. The result is a set of parameters with a certain ability to generalize to new data. That ability is reflected by the validation accuracy. So think of every epoch as its own model, which can get better or worse if it is trained for another epoch. Whether it got better or worse is judged by the change in validation accuracy (better = validation accuracy increased). Therefore pick the model of the epoch with the highest validation accuracy. Don't average the accuracies over different epochs; that wouldn't make much sense. You can use the Keras callback ModelCheckpoint to automatically save the model with the highest validation accuracy (see the callbacks documentation).
每个 epoch 都是对全部数据的一轮训练。在这一轮中,模型的参数会根据损失函数进行调整,得到的是一组对新数据具有一定泛化能力的参数,这种能力体现在验证准确率上。因此,可以把每个 epoch 看作一个独立的模型,再训练一个 epoch 后它可能变好也可能变差;变好还是变差,由验证准确率的变化来判断(变好 = 验证准确率上升)。所以,应当选取验证准确率最高的那个 epoch 的模型。不要对不同 epoch 的准确率求平均,那没有多大意义。您可以使用 Keras 回调 ModelCheckpoint 自动保存验证准确率最高的模型(请参阅回调文档)。
The highest validation accuracy in model 1 is 0.7778 (at epoch 7) and the highest in model 2 is 0.7572. Therefore you should view model 1 as better, though it is possible that the 0.7778 was just a random outlier.
模型 1 的最高验证准确率是 0.7778(在第 7 个 epoch),模型 2 的最高是 0.7572。因此您应该认为模型 1 更好,尽管 0.7778 也可能只是一个随机的离群值。
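Picking the best epoch can also be done programmatically. A minimal sketch, with the per-epoch val_acc values copied from the two training logs above:
挑选最佳 epoch 也可以用程序完成。下面是一个最小草图,其中每个 epoch 的 val_acc 值复制自上面的两份训练日志:

```python
# Per-epoch validation accuracies copied from the two logs above.
model1_val_acc = [0.7325, 0.7325, 0.7737, 0.7490, 0.7407, 0.7407, 0.7778,
                  0.7449, 0.7490, 0.7449, 0.7366, 0.7407, 0.7325, 0.7407,
                  0.7407, 0.7407, 0.7366, 0.7325, 0.7407, 0.7449]
model2_val_acc = [0.6667, 0.7449, 0.7407, 0.7119, 0.7202, 0.7490, 0.7572,
                  0.7531, 0.7490, 0.7449, 0.7449, 0.7490, 0.7449, 0.7490,
                  0.7449, 0.7449, 0.7449, 0.7531, 0.7449, 0.7531]

def best_epoch(val_accs):
    """Return (1-based epoch number, val_acc) of the best epoch."""
    best = max(range(len(val_accs)), key=lambda i: val_accs[i])
    return best + 1, val_accs[best]

print(best_epoch(model1_val_acc))  # (7, 0.7778)
print(best_epoch(model2_val_acc))  # (7, 0.7572)
```

With a real Keras History object you would apply best_epoch to history.history['val_acc'] instead of hand-copied lists.
对于真实的 Keras History 对象,可以把 best_epoch 应用到 history.history['val_acc'] 上,而不必手抄列表。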
回答 by Erik Aronesty
You need to key on decreasing val_loss or increasing val_acc; ultimately it doesn't matter much which. The differences here are well within random/rounding errors.
您需要关注 val_loss 的下降或 val_acc 的上升;最终用哪一个差别都不大,这里的差异完全在随机/舍入误差范围内。
In practice, the training loss can drop significantly due to over-fitting, which is why you want to look at validation loss.
在实践中,训练损失可能会因为过拟合而显著下降,这正是您要查看验证损失的原因。
In your case, you can see that your training loss is not dropping - which means you are learning nothing after each epoch. It looks like there's nothing to learn in this model, aside from some trivial linear-like fit or cutoff value.
在你的例子中,你可以看到训练损失并没有下降——这意味着每个 epoch 之后模型什么也没学到。看起来这个模型里没有什么可学的东西,除了一些平凡的类线性拟合或阈值。
Also, when learning nothing, or only a trivial linear thing, you should see similar performance on training and validation (trivial learning is always generalizable). You should probably shuffle your data before using the validation_split feature.
此外,当模型什么都没学到,或只学到一个平凡的线性关系时,训练和验证上的表现应该相近(平凡的学习总是可以泛化的)。在使用 validation_split 功能之前,您可能应该先打乱数据。
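A minimal sketch of such a shuffle. X and y are placeholder names for your feature and label arrays; here a toy class-ordered dataset stands in for them. Keras' validation_split slices off the last fraction of the arrays without shuffling, so an ordered dataset would put only one class into the validation split.
下面是这种洗牌的一个最小草图。X 和 y 是特征与标签数组的占位名称,这里用一个按类别排序的玩具数据集代替。Keras 的 validation_split 会不加打乱地切走数组末尾的一部分,因此按类别排序的数据会导致验证集中只有一个类别。

```python
import numpy as np

# Toy ordered dataset: all class-0 samples first, then all class-1 samples.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0] * 5 + [1] * 5)

# Shuffle features and labels with the *same* permutation so pairs stay aligned.
rng = np.random.RandomState(42)
idx = rng.permutation(len(X))
X_shuf, y_shuf = X[idx], y[idx]

# model.fit(X_shuf, y_shuf, validation_split=0.2, ...)  # then train as usual
```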