I'm trying to fine-tune with Keras, but it isn't working

Problem description

I'm trying to do some fine-tuning on the infamous Diabetic Retinopathy dataset. As far as I know, these are the steps I'm following:

- Train a VGG16 network without the head layers, loading the imagenet weights and freezing all the conv layers, for a few epochs.

- Unfreeze some of the convolutional layers (the last block) and train again.

The thing is that I keep getting the same acc score over and over. When I train the model with all layers frozen and the imagenet weights, I get roughly 0.74. When I unfreeze some layers and train again, I get exactly the same score; the second phase seems to do nothing.

I'm using Tensorflow-Gpu 2.0 and Keras 2.3.0.

Here is my code:

from __future__ import absolute_import, division, print_function, unicode_literals
import os
import datetime
import pkg_resources
import pandas as pd
import tensorflow as tf
import tensorflow.keras as k
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dropout, Flatten, Dense, Input
from tensorflow.keras.applications import vgg16
# Take the callbacks from tensorflow.keras as well; mixing the standalone
# keras package with tensorflow.keras in one model is a common source of bugs.
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras import callbacks


for entry_point in pkg_resources.iter_entry_points('tensorboard_plugins'):
    print(entry_point.dist)
#-----------------------------------------------------------
#Work around the CPU warning
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
#os.environ["PATH"].append("C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v10.0/bin/cudart64_100.dll")
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
#------------------------------------------------------------
trainLabels = pd.read_csv("./trainLabels_cropped.csv", dtype=str)

#The extension has to be appended to each image name in the list
def append_ext(fn):
    return fn+".jpeg"

trainLabels["image"]=trainLabels["image"].apply(append_ext)
#test_data["id_code"]=test_data["id_code"].apply(append_ext)

train_datagen = ImageDataGenerator(
    zoom_range=[-0.5, 0.5],
    width_shift_range=[-5, 5],
    height_shift_range=[-5, 5],
    rotation_range=5,
    shear_range=5,
    #samplewise_center=True,
    #samplewise_std_normalization=True,
    #fill_mode='nearest',
    validation_split=0.25)


train_generator = train_datagen.flow_from_dataframe(
        dataframe=trainLabels,
        directory='resized_train_cropped/resized_train_cropped/',
        x_col="image",
        y_col="level",
        target_size=(224, 224),
        batch_size=10,
        class_mode='categorical',
        color_mode='rgb', #remove or not remove?
        subset='training')

validation_generator = train_datagen.flow_from_dataframe(
        dataframe=trainLabels,
        directory='resized_train_cropped/resized_train_cropped/',
        x_col="image",
        y_col="level",
        target_size=(224, 224),
        batch_size=10,
        class_mode='categorical',
        color_mode='rgb',
        subset='validation')

baseModel=vgg16.VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))
for l in baseModel.layers:
    l.trainable=False
for layer in baseModel.layers:
    print("{}: {}".format(layer, layer.trainable))

headModel = baseModel.output
headModel = Flatten(name="flatten")(headModel)
headModel = Dense(512, activation="relu")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(5, activation="softmax")(headModel)

model = Model(inputs=baseModel.input, outputs=headModel)
optimizer=k.optimizers.Adam(learning_rate=1e-4)
model.compile(loss='categorical_crossentropy',
            optimizer=optimizer,
            metrics=['acc'])

#Model Summary
model.summary()

log_dir="logs\\fit\\" +'Prueba'+ datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

parada=callbacks.EarlyStopping(monitor='acc', mode='max', verbose=1, restore_best_weights=True, patience=2)
learningRate=callbacks.ReduceLROnPlateau(monitor='acc', factor=0.1, verbose=1, mode='max', min_delta=0.0001, cooldown=0, min_lr=0, patience=2)
#checkpoint=callbacks.ModelCheckpoint('\\pesos\\weights', monitor='acc', verbose=0, save_best_only=True, save_weights_only=True, mode='auto', period=1)


model.fit_generator(
        train_generator,
        steps_per_epoch=250,
        epochs=20,
        validation_data=validation_generator,
        validation_steps=250
        #callbacks=[parada]
        )

train_generator.reset()
validation_generator.reset()

for l in baseModel.layers[15:]:
    l.trainable=True
for layer in baseModel.layers:
    print("{}: {}".format(layer, layer.trainable))

optimizer=k.optimizers.Adam(learning_rate=1e-6)
model.compile(loss='categorical_crossentropy',
            optimizer=optimizer,
            metrics=['acc'])

model.fit_generator(
        train_generator,
        steps_per_epoch=250,
        epochs=30,
        validation_data=validation_generator,
        validation_steps=250
        #callbacks=[parada]
        )

Tags: python, keras, deep-learning, neural-network, training-data

Solution


Try the following:

1) Add a couple more Dense layers.

2) Try an activation function other than relu, such as tanh, and see whether it helps. (A ReLU unit can get updated into a state where no later data point ever changes its weights through the gradient again, the "dying ReLU" problem.) Both suggestions are sketched below.
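Here is a minimal sketch of how the head from the question could be rebuilt with both suggestions applied. The extra 256-unit Dense layer and the use of tanh throughout are illustrative assumptions, not tested values:

from tensorflow.keras.applications import vgg16
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Same frozen VGG16 base as in the question.
baseModel = vgg16.VGG16(weights="imagenet", include_top=False,
                        input_tensor=Input(shape=(224, 224, 3)))
for l in baseModel.layers:
    l.trainable = False

# 1) A deeper head: one extra Dense layer (256 units here, an arbitrary choice).
# 2) tanh instead of relu: tanh saturates, but it has no zero-gradient region,
#    so hidden units cannot "die" the way ReLU units can.
headModel = Flatten(name="flatten")(baseModel.output)
headModel = Dense(512, activation="tanh")(headModel)
headModel = Dense(256, activation="tanh")(headModel)
headModel = Dropout(0.5)(headModel)
headModel = Dense(5, activation="softmax")(headModel)

model = Model(inputs=baseModel.input, outputs=headModel)
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(learning_rate=1e-4),
              metrics=['acc'])

The rest of the two-phase routine (fit, unfreeze the last block, recompile with a lower learning rate, fit again) stays exactly as in the question.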

