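tensorflow - Training a CNN model in Google Colab gets stuck at the first epoch
Problem description
I built a model to identify plant diseases, and I want it to recognize 10 diseases. It runs fine in Jupyter Notebook, but slowly because of GPU limitations. I then decided to run the model in Google Colab, but it does not run there: it stays stuck at the first epoch.
The code I used to build the model is as follows: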
import tensorflow as tf
from tensorflow.keras import layers, Sequential

BATCH_SIZE = 64
IMAGE_SIZE = 256
CHANNELS = 3
EPOCHS = 10
dataset = tf.keras.preprocessing.image_dataset_from_directory(
    "/content/drive/MyDrive/google-colab-files/PlantVillage",
    seed=123,
    shuffle=True,
    image_size=(IMAGE_SIZE,IMAGE_SIZE),
    batch_size=BATCH_SIZE
)
def get_dataset_partisions_tf(ds,trains_split=0.8,val_split=0.1,test_split=0.1,shuffle=True,shuffle_size=10000):
    ds_size = len(ds)
    if shuffle:
        ds = ds.shuffle(shuffle_size,seed=12)
    train_size = int(trains_split * ds_size)
    val_size = int(val_split * ds_size)
    train_ds = ds.take(train_size)
    val_ds = ds.skip(train_size).take(val_size)
    test_ds = ds.skip(train_size).skip(val_size)
    return train_ds,val_ds,test_ds
train_ds,val_ds,test_ds = get_dataset_partisions_tf(dataset)
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)
val_ds = val_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)
test_ds = test_ds.cache().shuffle(1000).prefetch(buffer_size = tf.data.AUTOTUNE)
resize_and_rescales = Sequential([
    layers.experimental.preprocessing.Resizing(IMAGE_SIZE,IMAGE_SIZE),
    layers.experimental.preprocessing.Rescaling(1.0/255)
])
data_agmetation = Sequential([
    layers.experimental.preprocessing.RandomFlip('horizontal_and_vertical'),
    layers.experimental.preprocessing.RandomRotation(0.2),
])
input_shape = (BATCH_SIZE,IMAGE_SIZE,IMAGE_SIZE,CHANNELS)
n_classes = 10
model = Sequential([
    resize_and_rescales,
    data_agmetation,
    layers.Conv2D(32,(3,3), activation='relu',input_shape = input_shape),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,kernel_size = (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,kernel_size = (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,(3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,(3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64,(3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(64,activation='relu'),
    layers.Dense(n_classes, activation='softmax'),
])
model.build(input_shape = input_shape)
model.summary()
[Screenshot: model summary]
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)
When I train on the data with the following code:
model.fit(
    train_ds,
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    verbose=2,
    validation_data=val_ds
)
it stays stuck at the first epoch.
Solution
Check whether TensorFlow is actually using the GPU. You can also try reducing the batch size.
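Both suggestions can be tried with a few lines. The sketch below is only an illustration, not part of the original answer: the helper names gpu_report, check_tensorflow_gpu and batches_per_epoch are hypothetical. The one TensorFlow call it relies on is tf.config.list_physical_devices, which lists the devices TensorFlow can see.

```python
import math


def gpu_report(gpu_names):
    """Return a one-line summary for a list of GPU device names."""
    if not gpu_names:
        return "No GPU visible to TensorFlow; training will run on the CPU."
    return "TensorFlow sees {} GPU(s): {}".format(
        len(gpu_names), ", ".join(gpu_names)
    )


def check_tensorflow_gpu():
    """Ask TensorFlow which GPUs it can see.

    TensorFlow is imported here so the other helpers work without it.
    """
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    return gpu_report([gpu.name for gpu in gpus])


def batches_per_epoch(n_images, batch_size):
    """Batches needed for one epoch when the last partial batch is kept."""
    return math.ceil(n_images / batch_size)
```

In a Colab cell, print(check_tensorflow_gpu()) shows whether a GPU is visible. If none is, the notebook is probably on a CPU runtime, and the hardware accelerator can usually be changed in the runtime settings. To reduce the batch size, lower BATCH_SIZE before calling image_dataset_from_directory, because the batches are built there. With a hypothetical 1000 training images, batches_per_epoch(1000, 64) gives 16 steps and batches_per_epoch(1000, 32) gives 32, so each step is smaller but an epoch has more of them.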