ImageDataGenerator does not generate enough samples

Problem Description

I am following F. Chollet's book "Deep Learning with Python" and cannot get one of the examples to work. Specifically, I am running the example from the chapter "Training a convnet from scratch on a small dataset". My training dataset has 2000 samples, and I am trying to expand it with augmentation using ImageDataGenerator. Even though my code is exactly the same as in the book, I get this error:

Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 10000 batches).

from keras import layers
from keras import models
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator

# creating model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

# model compilation
model.compile(loss='binary_crossentropy',
                optimizer=optimizers.RMSprop(lr=1e-4),
                metrics=['acc'])

# model.summary()

# generating trains and test sets with rescaling 0-255 -> 0-1
train_dir = 'c:\\Work\\Code\\Python\\DL\\cats_and_dogs_small\\train\\'
validation_dir = 'c:\\Work\\Code\\Python\\DL\\cats_and_dogs_small\\validation\\'

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)
# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)

Here is the link to the GitHub page with the book's examples, where you can also check the code.

I'm not sure what I'm doing wrong and would appreciate any advice. Thanks.

Tags: python, tensorflow, keras, conv-neural-network

Solution


It seems `batch_size` should be 20 instead of 32.

Since you have `steps_per_epoch = 100`, Keras will call `next()` on the train generator 100 times before moving to the next epoch.

Now, in `train_generator` the `batch_size` is 32, and you have 2000 training samples, so it can only produce about 2000/32 ≈ 62 batches.

So the 63rd call to `next()` on `train_generator` won't return anything, and Keras reports "Your input ran out of data;".

Ideally,

steps_per_epoch = total_training_samples / batch_size
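As a quick sanity check, the batch arithmetic can be sketched like this. The 2000 training samples come from the question; the 1000 validation samples are an assumption based on the book's dataset split:

```python
import math

train_samples = 2000       # from the question
validation_samples = 1000  # assumed from the book's dataset split

# With the original batch_size of 32, the generator yields only
# 2000 // 32 = 62 full batches, far fewer than steps_per_epoch=100,
# so the input runs out mid-epoch.
print(train_samples // 32)  # 62

# With batch_size=20 the numbers line up exactly:
batch_size = 20
steps_per_epoch = train_samples // batch_size        # 100
validation_steps = validation_samples // batch_size  # 50
print(steps_per_epoch, validation_steps)
```

With these values, `steps_per_epoch=100` and `validation_steps=50` match the generator's actual capacity, so the fit call from the question runs without exhausting the data.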
