Why is the model not learning with a pretrained VGG16 in Keras?

Problem description

I am using the pretrained VGG16 model provided by Keras and applying it to the SVHN dataset, which has 10 classes of digits. The network is not learning and stays stuck at an accuracy of 0.17. I am doing something wrong, but I cannot identify it. This is how I run the training:

import math

import tensorflow.keras as keras

## DEFINE THE MODEL ##
vgg16 = keras.applications.vgg16.VGG16()

model = keras.Sequential()
for layer in vgg16.layers:
   model.add(layer)

model.layers.pop()

for layer in model.layers:
   layer.trainable = False

model.add(keras.layers.Dense(10, activation = "softmax"))


## START THE TRAINING ##
train_optimizer_rmsProp = keras.optimizers.RMSprop(lr=0.0001)
model.compile(loss="categorical_crossentropy", optimizer=train_optimizer_rmsProp, metrics=['accuracy'])
batch_size = 128*1

data_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale = 1./255
)


train_generator = data_generator.flow_from_directory(
        'training',
        target_size=(224, 224),
        batch_size=batch_size,
        color_mode='rgb',
        class_mode='categorical'
)

validation_generator = data_generator.flow_from_directory(
        'validate',
        target_size=(224, 224),
        batch_size=batch_size,
        color_mode='rgb',
        class_mode='categorical')

history = model.fit_generator(
    train_generator, 
    validation_data = validation_generator, 
    validation_steps = math.ceil(val_split_length / batch_size),
    epochs = 15, 
    steps_per_epoch = math.ceil(num_train_samples / batch_size), 
    use_multiprocessing = True, 
    workers = 8, 
    callbacks = model_callbacks, 
    verbose = 2
)

What am I doing wrong? Is there something I am missing? I expected a very high accuracy since the model carries the imagenet weights, but it is stuck at 0.17 accuracy from the first epoch.

Tags: python, tensorflow, keras, computer-vision, conv-neural-network

Solution


I assume you are upsampling the 32x32 MNIST-like images to fit the VGG16 input. What you should actually do in this case is remove all the dense layers; that way you can feed in any image size, since the weights of convolutional layers are agnostic to the input size.

You can do this like:

vgg16 = keras.applications.vgg16.VGG16(include_top=False, input_shape=(32, 32, 3))

In my opinion, this should be the default behaviour of the constructor.
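The size-agnostic claim above is easy to check directly: the same convolution layer (its kernel built once) accepts inputs of any spatial size, as long as the channel count matches. A minimal sketch:

```python
import numpy as np
import tensorflow.keras as keras

# The same 3x3 kernel slides over any spatial size, which is why the
# convolutional base of VGG16 does not care whether it sees 32x32 or
# 224x224 inputs. Only the channel dimension (3 here) is fixed.
conv = keras.layers.Conv2D(8, 3, padding="same")

small = conv(np.zeros((1, 32, 32, 3), dtype="float32"))    # -> (1, 32, 32, 8)
large = conv(np.zeros((1, 224, 224, 3), dtype="float32"))  # -> (1, 224, 224, 8)

# The layer owns exactly one kernel and one bias, regardless of input size.
print(len(conv.weights))
```

Only the fully-connected head bakes the spatial size into its weight matrix, which is why `include_top=False` is what removes the size restriction.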

When you upsample the image, in the best case you are basically blurring it. Consider that a single pixel of the original image corresponds to a 7x7 block of pixels in the upsampled one (224/32 = 7), while VGG16's filters are only 3 pixels wide; in other words, you are losing the image's fine features.
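To make that ratio concrete, here is a small NumPy sketch using nearest-neighbour upsampling (the simplest resizing mode; other interpolation modes blur instead of repeating, but the 7x ratio is the same):

```python
import numpy as np

# One channel of a 32x32 image, upsampled to 224x224 by pixel repetition.
# Each source pixel becomes a 7x7 block of identical values, so a 3x3
# VGG16 filter frequently sees a constant patch instead of real structure.
img = np.arange(32 * 32).reshape(32, 32)       # stand-in for one channel
up = img.repeat(7, axis=0).repeat(7, axis=1)   # 32x32 -> 224x224

print(up.shape)          # (224, 224)
print(up[:7, :7])        # a constant 7x7 block: all values equal img[0, 0]
```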

It is not necessary to add three dense layers at the end like the original VGG16; you can try with the same single Dense layer you already have in your code.
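Putting the pieces together, a minimal sketch of the suggested setup (the `Flatten` and the frozen base mirror the question's code; the exact head is a choice, not the only option):

```python
import tensorflow.keras as keras

# VGG16 convolutional base at SVHN's native 32x32 resolution.
# 32 is the minimum input size VGG16 accepts; after its five pooling
# stages the feature map is 1x1x512.
base = keras.applications.vgg16.VGG16(
    include_top=False,        # drop the three ImageNet dense layers
    weights="imagenet",
    input_shape=(32, 32, 3),
)
base.trainable = False        # freeze the pretrained convolutional weights

model = keras.Sequential([
    base,
    keras.layers.Flatten(),   # 1x1x512 feature map -> 512-vector
    keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    loss="categorical_crossentropy",
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
    metrics=["accuracy"],
)
```

The rest of the training loop from the question can stay the same, except that the generators should use `target_size=(32, 32)` so the images are no longer upsampled.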

