Variational autoencoder - WARNING:tensorflow: Gradients do not exist for variables [] when minimizing the loss

Problem description

I am trying to implement a variational autoencoder following the last section of the official Keras page; the input is the normalized and flattened MNIST dataset:

import numpy as np
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

The model:

import keras
from keras import layers

original_dim = 28 * 28
intermediate_dim = 64
latent_dim = 2

inputs = keras.Input(shape=(original_dim,))
h = layers.Dense(intermediate_dim, activation='relu')(inputs)
z_mean = layers.Dense(latent_dim)(h)
z_log_sigma = layers.Dense(latent_dim)(h)

from keras import backend as K

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0., stddev=0.1)
    return z_mean + K.exp(z_log_sigma) * epsilon

z = layers.Lambda(sampling)([z_mean, z_log_sigma])

# Create encoder
encoder = keras.Model(inputs, [z_mean, z_log_sigma, z], name='encoder')

# Create decoder
latent_inputs = keras.Input(shape=(latent_dim,), name='z_sampling')
x = layers.Dense(intermediate_dim, activation='relu')(latent_inputs)
outputs = layers.Dense(original_dim, activation='sigmoid')(x)
decoder = keras.Model(latent_inputs, outputs, name='decoder')

# instantiate VAE model
outputs = decoder(encoder(inputs)[2])
vae = keras.Model(inputs, outputs, name='vae_mlp')


reconstruction_loss = keras.losses.binary_crossentropy(inputs, outputs)
reconstruction_loss *= original_dim
kl_loss = 1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma)
kl_loss = K.sum(kl_loss, axis=-1)
kl_loss *= -0.5
vae_loss = K.mean(reconstruction_loss + kl_loss)
vae.add_loss(vae_loss)
vae.compile(optimizer='adam')

vae.fit(x_train, x_train,
        epochs=100,
        batch_size=32,
        validation_data=(x_test, x_test))

But when I train the model, I get this warning four times during the first epoch, after which training apparently proceeds without problems.

Epoch 1/100
WARNING:tensorflow:Gradients do not exist for variables ['dense_4/kernel:0', 'dense_4/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dense_4/kernel:0', 'dense_4/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dense_4/kernel:0', 'dense_4/bias:0'] when minimizing the loss.
WARNING:tensorflow:Gradients do not exist for variables ['dense_4/kernel:0', 'dense_4/bias:0'] when minimizing the loss.
2021-02-03 00:03:40.37: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
1875/1875 [==============================] - 2s 1ms/step - loss: 536.7903 - val_loss: 534.6534
Epoch 2/100
1875/1875 [==============================] - 2s 1ms/step - loss: 533.1923 - val_loss: 532.5651
Epoch 3/100
....

My question is: why do these warnings appear, is this a problem, and how can I fix it? I would think the model needs those gradients in order to improve.

I tried to reproduce this on Colab and the warnings did not appear; I don't know whether they are simply hidden by some setting (perhaps cause_error = False).

Thanks in advance

Tags: python, tensorflow, keras, deep-learning, autoencoder

Solution


The fact that no gradients are found for the decoder layers (dense_4, etc.) suggests the problem lies with the reconstruction loss. In other words, the encoder's gradients are still picked up through the KL loss, which is why training runs at all. I would use the vae model itself to obtain the outputs, for example:

vae = keras.Model(inputs, [z_mean, z_log_sigma, outputs], name='vae_mlp')
z_mean_pred, z_log_sigma_pred, reconstructions = vae(x_train)
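A minimal, self-contained sketch of this multi-output idea (the toy layer sizes here are illustrative assumptions, not from the original post): attaching z_mean and z_log_sigma as model outputs keeps every sub-network connected to the training graph, so Keras can route gradients through all of them.

```python
# Toy multi-output model sketch (TF 2.x / tf.keras assumed).
# Sampling is omitted to keep the sketch minimal; the point is only that
# a Model with several outputs returns all of them when called.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

inp = keras.Input(shape=(4,))
h = layers.Dense(3, activation='relu')(inp)
z_mean = layers.Dense(2)(h)          # latent mean head
z_log_sigma = layers.Dense(2)(h)     # latent log-variance head
dec_h = layers.Dense(3, activation='relu')(z_mean)
recon = layers.Dense(4, activation='sigmoid')(dec_h)

# All three tensors are model outputs, so no sub-network is orphaned.
vae = keras.Model(inp, [z_mean, z_log_sigma, recon], name='vae_mlp')

x = np.random.rand(5, 4).astype('float32')
z_mean_out, z_log_sigma_out, recon_out = vae(x)
```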

Looking at this example, the way the reconstruction loss is added also looks odd to me. I would try replacing

vae_loss = K.mean(reconstruction_loss + kl_loss)
vae.add_loss(vae_loss)
vae.compile(optimizer='adam')

with

vae.add_loss(K.mean(kl_loss))
vae.compile(loss="binary_crossentropy", optimizer="adam")
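Putting this suggestion into a complete, runnable sketch (layer sizes and names follow the question's code; tf.keras is assumed in place of the standalone keras import):

```python
# Hedged sketch of the suggested fix: the KL term goes through add_loss,
# while the reconstruction term is handled by compile(loss=...), so the
# decoder's weights are tied to the compiled loss and receive gradients.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import backend as K

original_dim, intermediate_dim, latent_dim = 28 * 28, 64, 2

inputs = keras.Input(shape=(original_dim,))
h = layers.Dense(intermediate_dim, activation='relu')(inputs)
z_mean = layers.Dense(latent_dim)(h)
z_log_sigma = layers.Dense(latent_dim)(h)

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0., stddev=0.1)
    return z_mean + K.exp(z_log_sigma) * epsilon

z = layers.Lambda(sampling)([z_mean, z_log_sigma])

latent_inputs = keras.Input(shape=(latent_dim,), name='z_sampling')
x = layers.Dense(intermediate_dim, activation='relu')(latent_inputs)
decoded = layers.Dense(original_dim, activation='sigmoid')(x)
decoder = keras.Model(latent_inputs, decoded, name='decoder')

outputs = decoder(z)
vae = keras.Model(inputs, outputs, name='vae_mlp')

# KL divergence added as a symbolic model loss.
kl_loss = -0.5 * K.sum(
    1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma), axis=-1)
vae.add_loss(K.mean(kl_loss))
vae.compile(loss='binary_crossentropy', optimizer='adam')

# Quick smoke run on random data standing in for flattened MNIST.
x_dummy = np.random.rand(8, original_dim).astype('float32')
history = vae.fit(x_dummy, x_dummy, epochs=1, batch_size=8, verbose=0)
```

One caveat worth noting: compile's binary_crossentropy averages over pixels, whereas the original code multiplied the reconstruction term by original_dim (a per-image sum). That changes the relative weighting of the reconstruction and KL terms, so the KL term may need rescaling to match the original objective.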
