How to connect a multi-layered bidirectional LSTM encoder to a decoder?

Problem description

I am building a seq2seq model that uses a Bi-LSTM encoder and an attention mechanism in the decoder. The model works fine with a single-layer LSTM. My encoder looks like this.

Encoder:

def encoding_layer(self, rnn_inputs, rnn_size, num_layers, keep_prob, 
                   source_vocab_size, 
                   encoding_embedding_size,
                   source_sequence_length,
                   emb_matrix):

    embed = tf.nn.embedding_lookup(emb_matrix, rnn_inputs)
    stacked_cells = tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)

    # The same wrapped cell is used for both directions.
    outputs, state = tf.nn.bidirectional_dynamic_rnn(cell_fw=stacked_cells,
                                                     cell_bw=stacked_cells,
                                                     inputs=embed,
                                                     sequence_length=source_sequence_length,
                                                     dtype=tf.float32)

    # state is a pair (forward LSTMStateTuple, backward LSTMStateTuple);
    # concatenate the two directions into one state of width 2 * rnn_size.
    concat_outputs = tf.concat(outputs, 2)
    cell_state_fw, cell_state_bw = state
    cell_state_final = tf.concat([cell_state_fw.c, cell_state_bw.c], 1)
    hidden_state_final = tf.concat([cell_state_fw.h, cell_state_bw.h], 1)
    encoder_final_state = tf.nn.rnn_cell.LSTMStateTuple(c=cell_state_final, h=hidden_state_final)

    return concat_outputs, encoder_final_state
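
For reference, with single (non-stacked) cells, bidirectional_dynamic_rnn returns the final state as a pair of LSTMStateTuples, one per direction, which is why the .c/.h accesses above work. A minimal sketch of the structure (an illustration, not code from the repo; shapes assume rnn_size = 128):

state_fw, state_bw = state      # one LSTMStateTuple per direction
# state_fw.c and state_fw.h each have shape [batch_size, 128], so the
# concatenated encoder_final_state has c/h of shape [batch_size, 256].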

Decoder:

def decoding_layer_train(self, encoder_outputs, encoder_state, dec_cell, dec_embed_input,
                         target_sequence_length, max_summary_length,
                         output_layer, keep_prob, rnn_size, batch_size):

    # The decoder cell is twice as wide as the encoder cells, because the
    # encoder's forward and backward final states are concatenated.
    rnn_size = 2 * rnn_size
    dec_cell = tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)

    train_helper = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_sequence_length)

    attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(rnn_size, encoder_outputs,
                                                               memory_sequence_length=target_sequence_length)

    attention_cell = tf.contrib.seq2seq.AttentionWrapper(dec_cell, attention_mechanism,
                                                         attention_layer_size=rnn_size/2)

    # Initialize the attention wrapper's state from the encoder's final state.
    state = attention_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
    state = state.clone(cell_state=encoder_state)

    decoder = tf.contrib.seq2seq.BasicDecoder(cell=attention_cell, helper=train_helper,
                                              initial_state=state,
                                              output_layer=output_layer)
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder, impute_finished=True,
                                                      maximum_iterations=max_summary_length)

    return outputs
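
One detail worth noting (my reading of the contrib API): the state handed to state.clone(cell_state=...) has to match the structure the wrapped decoder cell expects. The single-layer setup satisfies this, as the following sketch (an illustration, not code from the repo) spells out:

# A single LSTMCell of width 2 * rnn_size keeps one LSTMStateTuple, so for
# rnn_size = 128: dec_cell.state_size == LSTMStateTuple(c=256, h=256).
# encoder_final_state above is exactly one LSTMStateTuple of width 256, so
# the clone() call lines up. A stacked decoder cell would instead expect a
# tuple of LSTMStateTuples, one per layer.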

With the single-layer Bi-LSTM configuration above, my model runs fine. Now, however, I want a multi-layer Bi-LSTM in both the encoder and the decoder. So, in the encoder and decoder, if I change the cell to:

stacked_cells = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)
     for _ in range(num_layers)])

After changing the cells, I get this error:

AttributeError: 'tuple' object has no attribute 'c'

Here, num_layers = 2, rnn_size = 128, and the embedding size is 50.

So I would like to know what exactly is returned in state in this second case, and how to pass that state to the decoder.
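
For context on what changes: with MultiRNNCells, the final state returned by bidirectional_dynamic_rnn is nested one level deeper, a pair of per-layer tuples rather than a pair of LSTMStateTuples. A sketch of the structure, assuming num_layers = 2 (an illustration, not code from the repo):

# state == (state_fw, state_bw); each direction is now a plain tuple of
# num_layers LSTMStateTuples, one per layer.
state_fw, state_bw = state
layer0_fw = state_fw[0]     # LSTMStateTuple for layer 0, forward direction
c0 = layer0_fw.c            # tensor of shape [batch_size, rnn_size]
# state_fw itself is a plain tuple, so state_fw.c raises
# AttributeError: 'tuple' object has no attribute 'c'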

Full code: https://github.com/sainimohit23/Text-Summarization

Tags: tensorflow, nlp, seq2seq

Solution
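
A sketch of one common fix, assuming the variables from the question (embed, keep_prob, source_sequence_length, batch_size) are in scope; this is not the linked repo's code, just one way to wire it up. Merge the forward and backward states layer by layer into a tuple of LSTMStateTuples, and make the decoder a matching MultiRNNCell whose layers are 2 * rnn_size wide:

import tensorflow as tf

num_layers = 2
rnn_size = 128

# --- Encoder: separate stacks for the forward and backward directions ---
cells_fw = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)
     for _ in range(num_layers)])
cells_bw = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(rnn_size), keep_prob)
     for _ in range(num_layers)])

outputs, (state_fw, state_bw) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw=cells_fw, cell_bw=cells_bw, inputs=embed,
    sequence_length=source_sequence_length, dtype=tf.float32)

concat_outputs = tf.concat(outputs, 2)

# Merge the two directions layer by layer: the result is a tuple of
# num_layers LSTMStateTuples, each 2 * rnn_size wide.
encoder_final_state = tuple(
    tf.nn.rnn_cell.LSTMStateTuple(
        c=tf.concat([state_fw[layer].c, state_bw[layer].c], 1),
        h=tf.concat([state_fw[layer].h, state_bw[layer].h], 1))
    for layer in range(num_layers))

# --- Decoder: a matching stack, each layer 2 * rnn_size wide ---
dec_cell = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.LSTMCell(2 * rnn_size), keep_prob)
     for _ in range(num_layers)])

attention_mechanism = tf.contrib.seq2seq.BahdanauAttention(
    2 * rnn_size, concat_outputs,
    memory_sequence_length=source_sequence_length)

attention_cell = tf.contrib.seq2seq.AttentionWrapper(
    dec_cell, attention_mechanism, attention_layer_size=rnn_size)

# zero_state now carries the nested per-layer structure, and clone() accepts
# the matching tuple of LSTMStateTuples built above.
state = attention_cell.zero_state(dtype=tf.float32, batch_size=batch_size)
state = state.clone(cell_state=encoder_final_state)

Two side notes on the sketch: memory_sequence_length here is the encoder-side source_sequence_length, since the memory being attended over is the encoder output (the question's code passes target_sequence_length there), and building two separate cell stacks avoids sharing one cell object between the forward and backward directions.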

