python - tensorflow / keras.layers.RNN / An "initial_state" was passed that is not compatible with "cell.state_size"
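Problem description
If you are around, please take a look at my question. (I am using a translator, so the sentences may read awkwardly.) I am trying to build an encoder-decoder model with attention. I have seen solutions to similar questions, but I could not resolve this one. My code is below.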
import tensorflow as tf
from tensorflow import keras

# BahdanauAttention, input_vocab_size and output_vocab_size are defined elsewhere.

class Encoder(keras.layers.Layer):
    def __init__(self, units, vocab_size, embedding_dim, mask_zero=False, **kwargs):
        super(Encoder, self).__init__(**kwargs)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim, mask_zero=mask_zero)
        self.gru = keras.layers.GRU(units, return_sequences=True, return_state=True)

    def call(self, x):
        x = self.embedding(x)
        return self.gru(x)  # (sequence of outputs, final state)

class DecoderCell(keras.layers.Layer):
    def __init__(self, units, vocab_size, embedding_dim, mask_zero=False, **kwargs):
        super(DecoderCell, self).__init__(**kwargs)
        self.attention = BahdanauAttention(units)
        self.embedding = keras.layers.Embedding(vocab_size, embedding_dim, mask_zero=mask_zero)
        self.gru_cell = keras.layers.GRUCell(units)
        self.state_size = [(None, 512), (None, None, 512)]  # <-- This part seems to be the problem
        self.output_size = vocab_size
        self.fc = keras.layers.Dense(vocab_size, activation='softmax')

    def call(self, x, states):
        # states = [decoder hidden state, encoder outputs]
        context_vector, _ = self.attention(*states)  # context_vector.shape = (batch, units)
        x = self.embedding(x)
        x = tf.concat([context_vector, x], axis=-1)
        x, state = self.gru_cell(x, states[0])
        x = self.fc(x)
        return x, [state, states[1]]

units = 512
embedding_dim = 512
max_input_len = 100  # temporary
max_output_len = 100  # temporary

t_enc_input = keras.layers.Input(shape=[None])
encoder = Encoder(units, input_vocab_size, embedding_dim)
t_enc_output, t_enc_state = encoder(t_enc_input)
t_enc_outputs = [t_enc_state, t_enc_output]

t_dec_input = keras.layers.Input(shape=[None, 1])
decoder_cell = DecoderCell(units, output_vocab_size, embedding_dim)
decoder = keras.layers.RNN(decoder_cell, return_sequences=True)
t_dec_output = decoder(t_dec_input, t_enc_outputs)  # <-- I get an error here
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-184-6097053bb9af> in <module>()
2 decoder_cell = DecoderCell(units, output_vocab_size, embedding_dim)
3 decoder = keras.layers.RNN(decoder_cell, return_sequences=True)
----> 4 t_dec_output = decoder(t_dec_input, t_enc_outputs)
ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=ListWrapper([InputSpec(shape=(None, 512), ndim=2), InputSpec(shape=(None, None, 512), ndim=3)]); however `cell.state_size` is [(None, 512), (None, None, 512)]
Solution
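One likely cause, offered as an assumption about tf.keras 2.x rather than a confirmed diagnosis: the RNN layer appears to flatten cell.state_size before comparing it with the initial state specs, and it expects each entry to describe a single state without the batch axis. Plain tuples such as (None, 512) are themselves flattened, so the two states declared in the question turn into five leaves, which cannot be matched against the two tensors passed as initial_state. The sketch below is a simplified pure-Python stand-in for that check, not Keras source code; flatten, Shape and state_count_matches are hypothetical names introduced only for illustration.

```python
# Simplified model of the state-count check; NOT the actual Keras implementation.
# flatten, Shape and state_count_matches are hypothetical helper names.

def flatten(structure):
    """Recursively flatten nested lists/tuples into a flat list of leaves."""
    if isinstance(structure, (list, tuple)):
        leaves = []
        for item in structure:
            leaves.extend(flatten(item))
        return leaves
    return [structure]


class Shape:
    """Opaque shape leaf, playing the role of tf.TensorShape (never flattened)."""

    def __init__(self, dims):
        self.dims = list(dims)


def state_count_matches(state_size, initial_states):
    """True when the flattened state_size has exactly one leaf per initial state."""
    return len(flatten(state_size)) == len(initial_states)


units = 512
initial_states = ["encoder_state", "encoder_outputs"]  # two tensors are passed in

# Plain tuples that include the batch axis, as written in the question.
as_tuples = [(None, units), (None, None, units)]
# Opaque shapes without the batch axis.
as_shapes = [Shape([units]), Shape([None, units])]

print(len(flatten(as_tuples)), state_count_matches(as_tuples, initial_states))  # 5 False
print(len(flatten(as_shapes)), state_count_matches(as_shapes, initial_states))  # 2 True
```

If that assumption holds, declaring the states as self.state_size = [tf.TensorShape([units]), tf.TensorShape([None, units])] in DecoderCell.__init__ (batch axis dropped, units taken from the constructor argument instead of the hard-coded 512) should satisfy the check. I have not run this against the model above, so treat it as a starting point.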