python - Keras Bidirectional LSTM: an `initial_state` was passed that is not compatible with `cell.state_size`
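Question
I'm trying to build a stacked bidirectional LSTM seq2seq model in Keras, but I run into a problem when passing the encoder's output states to the decoder as its initial states. According to this pull request, this looks like it should be possible. Ultimately, I want to keep the encoder_output vector for other downstream tasks.
Error message: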
ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=[InputSpec(shape=(None, 100), ndim=2)]; however `cell.state_size` is (100, 100)
My model:
MAX_SEQUENCE_LENGTH = 50
EMBEDDING_DIM = 250
latent_size_1 = 100
latent_size_2 = 50
latent_size_3 = 250
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False,
                            mask_zero=True)
encoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="encoder_input")
encoder_emb = embedding_layer(encoder_inputs)
encoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="encoder_lstm_1")(encoder_emb)
encoder_outputs, forward_h, forward_c, backward_h, backward_c = Bidirectional(
    LSTM(latent_size_2, return_state=True),
    merge_mode="concat",
    name="encoder_lstm_2")(encoder_lstm_1)
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]
decoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="decoder_input")
decoder_emb = embedding_layer(decoder_inputs)
decoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_1")(decoder_emb, initial_state=encoder_states)
decoder_lstm_2 = Bidirectional(LSTM(latent_size_3, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_2")(decoder_lstm_1)
decoder_outputs = Dense(num_words, activation='softmax', name="Dense_layer")(decoder_lstm_2)
seq2seq_Model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
Any help, advice, or direction is greatly appreciated!
Answer
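There are two issues in your code:
1. As @Daniel pointed out, you shouldn't concatenate the states in `encoder_states`. A bidirectional LSTM takes four initial-state tensors (h and c for each direction), so pass them separately: `encoder_states = [forward_h, forward_c, backward_h, backward_c]`.
2. The states returned by your encoder have size `latent_size_2` (not `latent_size_1`). So if you want to use them as your decoder's initial state, the first decoder LSTM should also use `latent_size_2`.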
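To see why the error message reports a single `InputSpec` of width 100 against a `cell.state_size` of (100, 100), here is a minimal pure-Python sketch of the compatibility check. It assumes the wrapper hands the first half of the `initial_state` list to the forward LSTM and the second half to the backward LSTM, which is consistent with the error above. The helper names `split_for_directions`, `is_compatible`, and `check_initial_state` are hypothetical and are not part of Keras.

```python
def split_for_directions(state_dims):
    """Split a flat list of state widths into (forward, backward) halves."""
    half = len(state_dims) // 2
    return state_dims[:half], state_dims[half:]


def is_compatible(state_dims, units):
    """An LSTM cell with `units` units expects two states (h and c), each `units` wide."""
    return list(state_dims) == [units, units]


def check_initial_state(state_dims, units):
    """Return True if both directions receive a compatible [h, c] pair."""
    forward, backward = split_for_directions(state_dims)
    return is_compatible(forward, units) and is_compatible(backward, units)


# Question: two concatenated states of width 2 * 50 = 100, decoder units = 100.
# Each direction receives only one state, so the check fails.
print(check_initial_state([100, 100], units=100))

# Answer: four separate states of width 50, decoder units = 50.
print(check_initial_state([50, 50, 50, 50], units=50))
```

Under these assumptions, the first call fails because each direction gets one tensor instead of an (h, c) pair, and the second call passes because both the number of states and their width match.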
You can find the code with these corrections below.
from tensorflow.keras.layers import Embedding, Input, Bidirectional, LSTM, Dense, Concatenate
from tensorflow.keras.initializers import Constant
from tensorflow.keras.models import Model
MAX_SEQUENCE_LENGTH = 50
EMBEDDING_DIM = 250
latent_size_1 = 100
latent_size_2 = 50
latent_size_3 = 250
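num_words = 5000  # placeholder vocabulary size so the snippet runs on its own
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            # Constant(1.0) stands in for the question's Constant(embedding_matrix)
                            embeddings_initializer=Constant(1.0),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False,
                            mask_zero=True)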
encoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="encoder_input")
encoder_emb = embedding_layer(encoder_inputs)
encoder_lstm_1 = Bidirectional(LSTM(latent_size_1, return_sequences=True),
                               merge_mode="concat",
                               name="encoder_lstm_1")(encoder_emb)
encoder_outputs, forward_h, forward_c, backward_h, backward_c = Bidirectional(
    LSTM(latent_size_2, return_state=True),
    merge_mode="concat", name="encoder_lstm_2")(encoder_lstm_1)
# Fix 1: keep the four states separate instead of concatenating them
encoder_states = [forward_h, forward_c, backward_h, backward_c]
decoder_inputs = Input(shape=(MAX_SEQUENCE_LENGTH,), name="decoder_input")
decoder_emb = embedding_layer(decoder_inputs)
# Fix 2: the first decoder LSTM uses latent_size_2 so it matches the encoder states
decoder_lstm_1 = Bidirectional(
    LSTM(latent_size_2, return_sequences=True),
    merge_mode="concat", name="decoder_lstm_1")(decoder_emb, initial_state=encoder_states)
decoder_lstm_2 = Bidirectional(LSTM(latent_size_3, return_sequences=True),
                               merge_mode="concat",
                               name="decoder_lstm_2")(decoder_lstm_1)
decoder_outputs = Dense(num_words, activation='softmax', name="Dense_layer")(decoder_lstm_2)
seq2seq_Model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
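As a sanity check on the corrected model, the tensor shapes can be traced by hand. The sketch below is plain Python, not Keras. It assumes that `merge_mode="concat"` doubles the feature width of a Bidirectional layer and that state tensors keep the per-direction width. The names `concat_width` and `trace_shapes` are hypothetical helpers introduced only for this illustration.

```python
MAX_SEQUENCE_LENGTH = 50
latent_size_1, latent_size_2, latent_size_3 = 100, 50, 250
num_words = 5000  # placeholder value taken from the corrected snippet


def concat_width(units):
    """Feature width of a Bidirectional layer with merge_mode="concat"."""
    return 2 * units


def trace_shapes():
    """Expected output shape of each layer; None is the batch dimension."""
    steps = MAX_SEQUENCE_LENGTH
    return {
        "encoder_lstm_1": (None, steps, concat_width(latent_size_1)),
        # return_sequences is off here, so only the last step is returned
        "encoder_outputs": (None, concat_width(latent_size_2)),
        # each of forward_h, forward_c, backward_h, backward_c
        "encoder_state": (None, latent_size_2),
        "decoder_lstm_1": (None, steps, concat_width(latent_size_2)),
        "decoder_lstm_2": (None, steps, concat_width(latent_size_3)),
        "decoder_outputs": (None, steps, num_words),
    }


for name, shape in trace_shapes().items():
    print(name, shape)
```

Under these assumptions, `encoder_outputs` is a single vector of width 100 per example, which is the tensor to keep for downstream tasks, while each state passed to the decoder has width 50.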