tensorflow / keras.layers.RNN: an "initial_state" was passed that is not compatible with "cell.state_size"

Problem description

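If you are around, please take a look at my question. (I am using a translator, so some sentences may read awkwardly.) I am trying to build an encoder-decoder model with attention. I have seen solutions to similar questions, but I could not get them to work for this case. My code is below.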

import tensorflow as tf
from tensorflow import keras


class Encoder(keras.layers.Layer):
  def __init__(self, units, vocab_size, embedding_dim, mask_zero=False, **kwargs):
    super(Encoder, self).__init__(**kwargs)

    self.embedding = keras.layers.Embedding(vocab_size, embedding_dim, mask_zero=mask_zero)
    # Return both the full output sequence and the final hidden state.
    self.gru = keras.layers.GRU(units, return_sequences=True, return_state=True)

  def call(self, x):
    x = self.embedding(x)
    return self.gru(x)  # (output sequence, final state)


class DecoderCell(keras.layers.Layer):
  def __init__(self, units, vocab_size, embedding_dim, mask_zero=False, **kwargs):
    super(DecoderCell, self).__init__(**kwargs)

    self.attention = BahdanauAttention(units)
    self.embedding = keras.layers.Embedding(vocab_size, embedding_dim, mask_zero=mask_zero)
    self.gru_cell = keras.layers.GRUCell(units)

    self.state_size = [(None, 512), (None, None, 512)]  # <-- this part seems to be the problem
    self.output_size = vocab_size

    self.fc = keras.layers.Dense(vocab_size, activation='softmax')

  def call(self, x, states):
    context_vector, _ = self.attention(*states) # context_vector.shape = (batch, units)

    x = self.embedding(x)

    x = tf.concat([context_vector, x], axis=-1) 
    x, state = self.gru_cell(x, states[0])
    x = self.fc(x)

    return x, [state, states[1]]


units = 512
embedding_dim = 512
max_input_len = 100 # temporary
max_output_len = 100 # temporary

# input_vocab_size and output_vocab_size are defined elsewhere (not shown).
t_enc_input = keras.layers.Input(shape=[None])
encoder = Encoder(units, input_vocab_size, embedding_dim)
t_enc_output, t_enc_state = encoder(t_enc_input)
t_enc_outputs = [t_enc_state, t_enc_output]  # [final state, output sequence]

t_dec_input = keras.layers.Input(shape=[None, 1])
decoder_cell = DecoderCell(units, output_vocab_size, embedding_dim)
decoder = keras.layers.RNN(decoder_cell, return_sequences=True)
t_dec_output = decoder(t_dec_input, t_enc_outputs)  # <-- I get an error here
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-184-6097053bb9af> in <module>()
      2 decoder_cell = DecoderCell(units, output_vocab_size, embedding_dim)
      3 decoder = keras.layers.RNN(decoder_cell, return_sequences=True)
----> 4 t_dec_output = decoder(t_dec_input, t_enc_outputs)

ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=ListWrapper([InputSpec(shape=(None, 512), ndim=2), InputSpec(shape=(None, None, 512), ndim=3)]); however `cell.state_size` is [(None, 512), (None, None, 512)]

Tags: python, tensorflow, keras

Solution
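One possible reading of the error, offered as a sketch rather than a confirmed answer: the Keras documentation describes `cell.state_size` as an integer, a `TensorShape`, or a list/tuple of those, one entry per state and without the batch axis. Assuming Keras flattens `state_size` as a nested structure before comparing it with the initial states (an assumption about its internals), the plain tuples in the question would be read as five separate sizes rather than two shapes. The pure-Python snippet below only imitates that flattening; `flatten_sizes` is a hypothetical helper written for illustration, not a Keras function.

```python
def flatten_sizes(structure):
    """Hypothetical helper: recursively flatten nested lists/tuples into leaves."""
    if isinstance(structure, (list, tuple)):
        leaves = []
        for item in structure:
            leaves.extend(flatten_sizes(item))
        return leaves
    return [structure]


# state_size exactly as written in the question: plain tuples, batch axis included.
state_size_in_question = [(None, 512), (None, None, 512)]

# Shapes of the two tensors passed as initial_state, taken from the error message.
initial_state_shapes = [(None, 512), (None, None, 512)]

# Under the flattening assumption, the cell appears to declare five states...
n_sizes = len(flatten_sizes(state_size_in_question))
# ...while only two initial-state tensors are supplied.
n_states = len(initial_state_shapes)

print(n_sizes, n_states)  # prints "5 2"
```

If this reading is right, a declaration in the documented form, such as `[units, tf.TensorShape([None, units])]` (no batch axis, with a `TensorShape` for the multi-dimensional encoder output), should at least pass the compatibility check. Whether the rest of the model then runs has not been verified here.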

