tensorflow - Validation accuracy stuck at 50% with the Keras VGG16 model
Problem description
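I am working on a food classification project using the Keras (TensorFlow backend) VGG16 model with the food-101 image dataset. However, I am running into a problem with validation accuracy (I believe the cause is overfitting): it does not increase and stays at around 48-51%. I have 40 classes (40 different foods), with 700 training images and 300 validation images per food. I also evaluated my model on a set of random food images. I have tried:
- Lowering the learning rate
- Changing the Dropout layer to 0.75
- Image augmentation
Although these helped a little, they did not significantly improve validation accuracy. I have heard that some people use the preprocess_input() function to improve validation accuracy, but I am not sure about it.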
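For context on the preprocess_input() remark: the VGG16 ImageNet weights were trained on images that were converted from RGB to BGR and zero-centered per channel, with no scaling to [0, 1]. The sketch below imitates that behavior in plain NumPy so the difference from `rescale=1. / 255` is visible. `vgg16_style_preprocess` is a hypothetical stand-in written for illustration, not the Keras function itself, and whether switching preprocessing would fix the accuracy in this question is an assumption that would need to be tested.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by VGG16's default
# "caffe"-style preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_style_preprocess(rgb_batch):
    """Hypothetical stand-in for keras.applications.vgg16.preprocess_input.

    Expects an array of shape (..., 3) in RGB order with values in [0, 255].
    """
    x = np.asarray(rgb_batch, dtype=np.float32)
    bgr = x[..., ::-1]               # reorder channels: RGB -> BGR
    return bgr - IMAGENET_BGR_MEAN   # zero-center each channel, no scaling


# A dummy batch of two mid-gray 150x150 images, matching the question's size.
batch = np.full((2, 150, 150, 3), 128.0, dtype=np.float32)
processed = vgg16_style_preprocess(batch)
rescaled = batch / 255.0             # what rescale=1. / 255 produces instead

print(processed.shape)
print(processed.min(), processed.max())
print(rescaled.min(), rescaled.max())
```

In Keras itself, one way to apply the real function would be to pass `preprocessing_function=preprocess_input` (imported from `keras.applications.vgg16`) to `ImageDataGenerator` in place of `rescale=1. / 255`, for both the training and the validation generator.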
Here is my code:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
from keras.utils import to_categorical
from keras import optimizers
# Dimensions of images
img_width, img_height = 150, 150
top_model_weights_path = 'test2_classes.h5'
# Raw strings keep the backslashes in the Windows paths from being read as escapes
train_data_dir = r'D:\intallation\dataset\dataset-101/food/train'
validation_data_dir = r'D:\intallation\dataset\dataset-101/food/validation'
nb_train_samples = 28000
nb_validation_samples = 12000
epochs = 80
batch_size = 32
def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_train = model.predict_generator(
        generator, nb_train_samples // batch_size)
    np.save('test_trained.npy', bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    bottleneck_features_validation = model.predict_generator(
        generator, nb_validation_samples // batch_size)
    np.save('test_validation.npy', bottleneck_features_validation)


def train_top_model():
    # The generators in this function are only used to read class labels.
    # No images are drawn from them, so the augmentation settings below
    # never reach the precomputed bottleneck features.
    datagen_top = ImageDataGenerator(rescale=1./255,
                                     width_shift_range=0.05,
                                     height_shift_range=0.05,
                                     shear_range=0.05,
                                     zoom_range=0.05,
                                     fill_mode='nearest',
                                     channel_shift_range=0.2*255)
    datagen_top_val = ImageDataGenerator(rescale=1./255)

    # Class Labels for Training Data
    generator_top = datagen_top_val.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)
    np.save('test_class_indices.npy', generator_top.class_indices)
    num_classes = len(generator_top.class_indices)

    train_data = np.load('test_trained.npy')
    train_labels = generator_top.classes  # Get Class Labels
    train_labels = to_categorical(train_labels, num_classes=num_classes)

    # Class Labels for Validation Data
    generator_top = datagen_top.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    validation_data = np.load('test_validation.npy')
    validation_labels = generator_top.classes
    validation_labels = to_categorical(validation_labels, num_classes=num_classes)

    # Small fully connected classifier trained on top of the VGG16 features
    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))

    sgd = optimizers.SGD(lr=1e-4, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd,
                  loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)


save_bottlebeck_features()
train_top_model()
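
A side note on the step counts used in the code above: both `predict_generator` calls take `nb_samples // batch_size` steps. With 28000 and 12000 images and a batch size of 32 this divides evenly, but with other counts floor division would drop the last partial batch, and the saved features would then no longer line up one-to-one with `generator.classes`. A minimal sketch of a step count that rounds up instead, where `steps_for` is a hypothetical helper name and not part of Keras:

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to cover every sample once (hypothetical helper)."""
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    return math.ceil(num_samples / batch_size)


# Figures from the question: 40 classes with 700 training and 300 validation images each.
print(steps_for(40 * 700, 32))  # 875
print(steps_for(40 * 300, 32))  # 375

# A count that does not divide evenly needs one extra, partial batch.
print(steps_for(1000, 32))      # 32
```

For the dataset sizes in the question, this gives the same values as the floor division already in the code, so it would not by itself change the reported accuracy.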
Solution