Model not training properly and accuracy stays the same when training on an RGB image dataset

Problem description

I am training a model on the Fruits 360 dataset from Kaggle. My Keras model has 0 dense layers and 3 convolutional layers. The input shape is (60, 60, 3), since the images are loaded in RGB. Please help me figure out what is wrong with this model and why it is not training properly. I have tried different combinations of layers, but no matter what I change, the accuracy and loss stay the same.

Here is the model:

import time

# Imports assumed to come from tensorflow.keras (adjust if you use standalone keras)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense
from tensorflow.keras.callbacks import TensorBoard

# Hyperparameter grid (a single combination here: 0 dense layers, 64 filters, 3 conv layers)
dense_layers = [0]
layer_sizes = [64]
conv_layers = [3]

for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME = "{}-conv-{}-nodes-{}-dense-{}".format(
                conv_layer, layer_size, dense_layer, int(time.time()))
            print(NAME)

            model = Sequential()

            # First conv block; images are loaded as 60x60 RGB
            model.add(Conv2D(layer_size, (3, 3), input_shape=(60, 60, 3)))
            model.add(Activation('relu'))
            model.add(MaxPooling2D(pool_size=(2, 2)))

            # Remaining conv blocks
            for l in range(conv_layer - 1):
                model.add(Conv2D(layer_size, (3, 3)))
                model.add(Activation('relu'))
                model.add(MaxPooling2D(pool_size=(2, 2)))

            model.add(Flatten())

            # Optional fully connected layers (none when dense_layer == 0)
            for _ in range(dense_layer):
                model.add(Dense(layer_size))
                model.add(Activation('relu'))

            model.add(Dense(1))
            model.add(Activation('sigmoid'))

            tensorboard = TensorBoard(log_dir="logs/")

            model.compile(loss='sparse_categorical_crossentropy',
                          optimizer='adam',
                          metrics=['accuracy'])

            model.fit(X_norm, y,
                      batch_size=32,
                      epochs=10,
                      validation_data=(X_norm_test, y_test),
                      callbacks=[tensorboard])

But the accuracy stays the same, as shown below:

Epoch 1/10
42798/42798 [==============================] - 27s 641us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 2/10
42798/42798 [==============================] - 27s 638us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 3/10
42798/42798 [==============================] - 27s 637us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 4/10
42798/42798 [==============================] - 27s 635us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 5/10
42798/42798 [==============================] - 27s 635us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 6/10
42798/42798 [==============================] - 27s 631us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 7/10
42798/42798 [==============================] - 27s 631us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 8/10
42798/42798 [==============================] - 27s 631us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 9/10
42798/42798 [==============================] - 27s 635us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114
Epoch 10/10
42798/42798 [==============================] - 27s 626us/step - loss: nan - acc: 0.0115 - val_loss: nan - val_acc: 0.0114

What can I do to train this model properly and improve its accuracy?

Tags: python, tensorflow, machine-learning, keras, google-cloud-ml

Solution


I am not sure sparse_categorical_crossentropy is an appropriate loss for an output with only 1 unit.

Notice that your loss is nan. This means there is a mathematical error somewhere in your model, data, or loss function. It is often caused by division by zero, numerical overflow, and the like.
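One practical safeguard while you debug this: Keras ships a built-in TerminateOnNaN callback that aborts training as soon as a batch produces a nan loss, instead of running all 10 epochs on a broken model. A minimal sketch, reusing the tensorboard callback from your code and assuming the tensorflow.keras API:

from tensorflow.keras.callbacks import TerminateOnNaN

# Abort fit() as soon as any batch yields a nan loss
model.fit(X_norm, y,
          batch_size=32,
          epochs=10,
          validation_data=(X_norm_test, y_test),
          callbacks=[tensorboard, TerminateOnNaN()])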

I suppose you should use 'binary_crossentropy' as the loss function instead.
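With the Dense(1) + sigmoid output you already have, that is a one-line change in compile(), sketched below. Note the assumption: binary_crossentropy expects your labels y to contain only 0/1. Fruits 360 has many fruit classes, so if y actually holds integer class labels, the consistent alternative is a Dense layer with one unit per class and a softmax activation, which is what sparse_categorical_crossentropy expects.

# Loss matched to a single-unit sigmoid output
# (assumes y contains only 0/1 labels)
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])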

Note that you still run the risk of a frozen loss because of the 'relu' activations. If that happens, you can add a BatchNormalization() layer before each Activation('relu') layer.
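A minimal sketch of that change applied to your first conv block (the same pattern would apply to the blocks added in the inner loop):

from tensorflow.keras.layers import BatchNormalization

model.add(Conv2D(layer_size, (3, 3), input_shape=(60, 60, 3)))
model.add(BatchNormalization())  # normalize pre-activations so 'relu' units are less likely to die
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))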


Also take @desertnaut's comment into account: you are creating a new Sequential model in every loop iteration.

