Transfer learning training accuracy starts from a fixed value

Problem description

I am training a transfer learning model for image classification using the tensorflow.keras library. I am working with 7 classes. I have used several pretrained models trained on "imagenet". For example, I am using Xception with the last 30 layers unfrozen. When I replace the last layer with a softmax layer, training looks reasonable. The model summary looks like this:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
xception (Model)             (None, 5, 5, 2048)        20861480  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 7)                 14343     
=================================================================
Total params: 20,875,823
Trainable params: 8,990,679
Non-trainable params: 11,885,144
_________________________________________________________________
Number of layers in the base model:  132
Number of trainable layers in the full model:  3
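
The setup described above can be sketched roughly as follows (a minimal sketch, not the asker's actual code: the 160x160 input size, optimizer, and loss are assumptions; `weights=None` is used here only to avoid the weight download, whereas the question uses `"imagenet"`):

```python
import tensorflow as tf

# Base model without its classification head. An input of 160x160 yields
# the (None, 5, 5, 2048) feature map shown in the summary (stride 32).
base_model = tf.keras.applications.Xception(
    weights=None, include_top=False, input_shape=(160, 160, 3)
)

# Unfreeze only the last 30 layers, as described in the question.
base_model.trainable = True
for layer in base_model.layers[:-30]:
    layer.trainable = False

# Pooling + a 7-way softmax classifier on top.
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(7, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```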

Training over the first two epochs looks like this (the training accuracy starts from a low value, around 10%):

Epoch 1/12
525/525 [==============================] - 190s 362ms/step - loss: 0.7438 - accuracy: 0.7314 - val_loss: 0.3813 - val_accuracy: 0.8648
Epoch 2/12
257/525 [=============>................] - ETA: 1:28 - loss: 0.4986 - accuracy: 0.8182

The problem

The problem starts when I add one or more new layers before the final softmax layer. The training accuracy starts at exactly 85.71. It doesn't matter how many layers I add or which base model I use; the training accuracy always starts at exactly 85.71, and after the first epoch I get very high validation accuracy (almost 94-95%). Within a few epochs the validation accuracy goes above 98%. But when I test the model on a separate validation set, the performance is actually worse than that of the model described above.

For example, the model summary is:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
xception (Model)             (None, 5, 5, 2048)        20861480  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 1024)              2098176   
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 7)                 7175      
=================================================================
Total params: 22,966,831
Trainable params: 11,081,687
Non-trainable params: 11,885,144
_________________________________________________________________
Number of layers in the base model:  132
Number of trainable layers in the full model:  5
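
The parameter counts in this summary are consistent with the added layers: a Dense layer contributes `in_features * out_features` weights plus `out_features` biases, while Dropout adds nothing. This can be checked directly:

```python
# Dense(1024) on top of the 2048-dim pooled features: weights + biases.
dense_params = 2048 * 1024 + 1024
print(dense_params)  # 2098176, matching the "dense" row

# Dropout has no parameters; the final Dense(7) classifier:
dense_1_params = 1024 * 7 + 7
print(dense_1_params)  # 7175, matching the "dense_1" row
```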

The training epochs look like this:

Epoch 1/12
525/525 [==============================] - 191s 363ms/step - loss: 0.1902 - accuracy: 0.9225 - val_loss: 0.0971 - val_accuracy: 0.9640
Epoch 2/12
 58/525 [==>...........................] - ETA: 2:38 - loss: 0.1351 - accuracy: 0.9470

I have swapped in different base models and added/removed layers before the softmax, but the training accuracy always starts at exactly 85.71.
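
One arithmetic detail worth noting (an observation, not something stated in the question): with 7 classes, 85.71% is exactly 6/7, i.e. the accuracy of being right on 6 classes out of 7:

```python
# 85.71% corresponds exactly to the fraction 6/7.
print(round(6 / 7 * 100, 2))  # 85.71
```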

Tags: tensorflow, keras, transfer-learning

Solution

