Why does the Keras SimpleRNN layer implement output-to-output rather than hidden-to-hidden recurrence?

Problem description

I have been going through the literature, and the most common form of recurrence is hidden-to-hidden, i.e.:

h(t) = tanh(W_xh*x(t) + b_h + W_hh*h(t-1))
output(t) = tanh(W_ho*h(t) + b_o)
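
For concreteness, a minimal sketch of one step of this textbook formulation could look like the following (the names W_xh, W_hh, W_ho, b_h, b_o are illustrative placeholders only, not weights taken from any Keras layer):

import tensorflow as tf

def textbook_rnn_step(xt, h_prev, W_xh, W_hh, b_h, W_ho, b_o):
    # hidden-to-hidden recurrence: the new hidden state depends on the
    # current input and on the previous *hidden* state
    ht = tf.math.tanh(tf.matmul(xt, W_xh) + tf.matmul(h_prev, W_hh) + b_h)
    # separate read-out: the output is a projection of the hidden state
    ot = tf.math.tanh(tf.matmul(ht, W_ho) + b_o)
    return ht, ot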

However, the tf.keras.layers.SimpleRNN layer implements output-to-output recurrence:

h(t) = W_xh*x(t) + b_h
output(t) = tanh(h(t) + W_oo*output(t-1))

This can be demonstrated with the following code, taken from the book Python Machine Learning, 3rd Ed.:

import tensorflow as tf

rnn_layer = tf.keras.layers.SimpleRNN(
    units=2, use_bias=True,
    return_sequences=True)
rnn_layer.build(input_shape=(None, None, 5))

## the layer's weights: input kernel, recurrent kernel, bias
w_xh, w_oo, b_h = rnn_layer.weights

x_seq = tf.convert_to_tensor(
    [[1.0]*5, [2.0]*5, [3.0]*5],
    dtype=tf.float32)

## output of SimpleRNN:
output = rnn_layer(tf.reshape(x_seq, shape=(1, 3, 5)))

## manually computing the output:
out_man = []
for t in range(len(x_seq)):
    xt = tf.reshape(x_seq[t], (1, 5))
    print('Time step {} =>'.format(t))
    print('   Input           :', xt.numpy())

    ht = tf.matmul(xt, w_xh) + b_h
    print('   Hidden          :', ht.numpy())

    ## at t=0 there is no previous output, so use zeros
    if t > 0:
        prev_o = out_man[t-1]
    else:
        prev_o = tf.zeros(shape=ht.shape)

    ot = ht + tf.matmul(prev_o, w_oo)
    ot = tf.math.tanh(ot)
    out_man.append(ot)

    print('   Output (manual) :', ot.numpy())
    print('   SimpleRNN output:', output[0][t].numpy())
    print()
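
When run, the manually computed values and the layer's own outputs should agree at every time step, which is what the book uses to justify the second set of equations above.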

Why is there such a discrepancy between the implementation and the literature? Is output-to-output recurrence better than hidden-to-hidden in practice?

Tags: tensorflow, keras, deep-learning, recurrent-neural-network

Solution

