Problem with the initial_state argument of LSTM __call__

Problem description

I have this code:

s = Input(shape=(self.s_length,), name="s")
z = Input(shape=(self.z_length,), name="z")
decoder_inputs = [s, z]

# latent.shape: (None, self.s_length + self.z_length)
latent = Concatenate(name="latent_concat")([s, z])

# get initial state of high decoder
# init_state.shape: (None, X_high_size)
init_state = Dense(X_high_size, activation="tanh", name="hidden_state_init")(latent)

# high decoder produces embeddings
# h_X.shape: (None, n_embeddings, self.s_length + self.z_length)
h_X = RepeatVector(n_embeddings, name="latent_repeat")(latent)
for l in range(X_high_depth):
    h_X = LSTM(
        X_high_size,
        return_sequences=True,
        activation="relu",
        name=f"high_encoder_{l}"
    )(h_X, initial_state=[init_state, init_state])

When I run it, I get this error during training:

File "/home/xxx/anaconda3/envs/mygan/lib/python3.6/site-packages/keras/engine/training.py", line 1215, in train_on_batch
    outputs = self.train_function(ins)
  File "/home/xxx/anaconda3/envs/mygan/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2666, in __call__
    return self._call(inputs)
  File "/home/xxx/anaconda3/envs/mygan/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2636, in _call
    fetched = self._callable_fn(*array_vals)
  File "/home/xxx/anaconda3/envs/mygan/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1382, in __call__
    run_metadata_ptr)
  File "/home/xxx/anaconda3/envs/mygan/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 's' with dtype float and shape [?,16]
     [[Node: s = Placeholder[dtype=DT_FLOAT, shape=[?,16], _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
     [[Node: loss_2/z_discriminator_loss/Mean_3/_1073 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_41285_loss_2/z_discriminator_loss/Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

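(By the way, 16 is the dimension of the input s.) However, I noticed that if I simply omit the initial_state=[init_state, init_state] argument from the LSTM layer call, everything runs fine. The code looks correct to me, and I don't know what I'm doing wrong... could it be a Keras bug?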

Tags: python, tensorflow, machine-learning, keras, lstm

Solution


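A placeholder is an empty input tensor that expects to receive data at fit or predict time. If you specify an "initial state", the LSTM needs actual data (real numbers) for it, but when the layer is called the placeholders have not been filled yet.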
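One way to check this is with a minimal sketch in which both placeholders are declared as inputs of the model being trained and are fed real arrays on every batch. This is only an illustration under assumptions: it uses tensorflow.keras rather than the standalone keras package shown in the traceback, the sizes below are made up (only 16 for s comes from the error message), and the rest of the original model is not known, so this may not be the actual cause in the asker's setup.

```python
import numpy as np
from tensorflow.keras.layers import Input, Concatenate, Dense, RepeatVector, LSTM
from tensorflow.keras.models import Model

# Assumed sizes for illustration; only s_length = 16 appears in the question.
s_length, z_length = 16, 8
X_high_size, n_embeddings = 32, 5

s = Input(shape=(s_length,), name="s")
z = Input(shape=(z_length,), name="z")
latent = Concatenate(name="latent_concat")([s, z])

# The initial state is computed from the inputs, so it depends on s and z.
init_state = Dense(X_high_size, activation="tanh", name="hidden_state_init")(latent)

h_X = RepeatVector(n_embeddings, name="latent_repeat")(latent)
h_X = LSTM(X_high_size, return_sequences=True, name="high_decoder_0")(
    h_X, initial_state=[init_state, init_state]
)

# Both s and z are listed as model inputs, so both must be fed with data.
model = Model(inputs=[s, z], outputs=h_X)
model.compile(optimizer="adam", loss="mse")

batch = 4
s_data = np.random.rand(batch, s_length).astype("float32")
z_data = np.random.rand(batch, z_length).astype("float32")
target = np.zeros((batch, n_embeddings, X_high_size), dtype="float32")

# Every placeholder the graph depends on receives real values here.
loss = model.train_on_batch([s_data, z_data], target)
pred = model.predict([s_data, z_data], verbose=0)
```

If a larger model (for example a combined GAN model) reuses h_X but is built from different Input tensors, the initial state would still depend on the original s placeholder, which could explain why the error disappears once initial_state is omitted.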

