Keras convolutional autoencoder: MSE works for fashion_mnist but not for mnist

Problem description

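I am following this tutorial, https://blog.keras.io/building-autoencoders-in-keras.html, specifically the convolutional example. I don't understand why, when I change the loss function from binary_crossentropy to MSE, it only works for fashion_mnist.

With mnist, the loss drops after the first epoch and then stops changing. After training, the predicted images on the test set are just black images. With fashion_mnist it works perfectly.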

import keras
from keras import layers
import keras.backend as K

input_img = keras.Input(shape=(28, 28, 1))

x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Conv2D(16, (3, 3), activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = keras.Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='mse')  # the tutorial uses 'binary_crossentropy'

from keras.datasets import mnist
from keras.datasets import fashion_mnist
import numpy as np

(x_train, _), (x_test, _) = mnist.load_data()  # or fashion_mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))

history = autoencoder.fit(x_train, x_train,
                epochs=50,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test, x_test))

Tags: python, tensorflow, keras, deep-learning, autoencoder

Solution


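I guess you want to use the autoencoder for some kind of image denoising or reconstruction. For that kind of task, MSE is the right loss to use. You then need a linear activation function in the output layer, so that the elements (pixels) of the reconstructed output image array are compared directly against the pixels of the target image. Pixel values are usually not normalized and typically lie between 0 and 255. A sigmoid activation squashes the outputs into the range 0 to 1, which is best suited to classification, because it gives you the class probability for 2 classes (softmax is used for more classes). That activation is then paired with a cross-entropy loss for classification tasks.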
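The range argument above can be checked numerically without training anything. The following is only a minimal NumPy sketch, not the answerer's code: the 2x2 "image", the helper names `sigmoid` and `mse`, and the pre-activation value 50.0 are all made up for illustration. It compares the MSE a linear output and a saturated sigmoid output can reach against raw 0 to 255 targets, and then against targets rescaled to [0, 1] as in the question's code.

```python
import numpy as np


def sigmoid(z):
    """Logistic function; maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def mse(y_true, y_pred):
    """Mean squared error averaged over all pixels."""
    return float(np.mean((y_true - y_pred) ** 2))


# Hypothetical 2x2 "image" with raw, unnormalized pixel values (0 to 255).
raw_target = np.array([[0.0, 64.0], [128.0, 255.0]])
# Same rescaling as in the question's code (divide by 255).
scaled_target = raw_target / 255.0

# A linear output layer can emit the raw target values directly.
linear_out = raw_target.copy()
# A sigmoid output saturates near 1 even for a large pre-activation (50.0 here).
sigmoid_out = sigmoid(np.full_like(raw_target, 50.0))

print("linear  vs raw targets:", mse(raw_target, linear_out))
print("sigmoid vs raw targets:", mse(raw_target, sigmoid_out))

# With targets scaled to [0, 1], a sigmoid output can match them closely.
# Clipping keeps the logit finite for pixels that are exactly 0 or 1.
eps = 1e-6
p = np.clip(scaled_target, eps, 1.0 - eps)
logits = np.log(p / (1.0 - p))
print("sigmoid vs scaled targets:", mse(scaled_target, sigmoid(logits)))
```

In the question's model, the corresponding change would be to pass activation='linear' instead of activation='sigmoid' to the final Conv2D layer. Whether that alone resolves the mnist behaviour is not something this sketch demonstrates, since the question's inputs are already scaled to [0, 1].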

