Input 0 of layer lstm_24 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [64, 8]

Problem Description

I have a Dueling Deep Q-Network model that works with two Dense layers, and I am trying to convert it to two LSTM layers because my model deals with time series. When I swap out the Dense layers in the code, I get this error and cannot get past it. I know this question has been answered here many times, but none of those solutions work for me.

The code with the two Dense layers is written as follows:

import numpy as np
import tensorflow as tf
from tensorflow import keras

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.Dense(fc1_dims, activation='relu')
        self.dense2 = keras.layers.Dense(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)
        
    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)

        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))

        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)

        return A

It works fine, but when I convert the first two Dense layers to LSTM, like this:

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu')
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)

this error appears:

Input 0 of layer lstm_24 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [64, 8]
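The error comes from the fact that Keras LSTM layers expect 3-D input of shape (batch, timesteps, features), while Dense layers accept the 2-D (batch, features) arrays produced here. A minimal NumPy sketch of the shape mismatch, using the 64 and 8 from the error message:

```python
import numpy as np

# A batch as sampled from the replay memory: 64 states with 8 features each.
batch = np.zeros((64, 8))
print(batch.ndim)  # 2: fine for Dense, rejected by LSTM (which expects ndim=3)

# LSTM input must be (batch, timesteps, features); with a single timestep:
lstm_batch = np.expand_dims(batch, axis=1)
print(lstm_batch.shape)  # (64, 1, 8)
```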

Following the question "expected ndim=3, found ndim=2", I have already tried setting the input shape with "state = state.reshape(64, 1, 8)" and then running the neural network, as follows:

    def choose_action(self, observation):
        if np.random.random() < self.epsilon:
            action = np.random.choice(self.action_space)
        else:
            state = np.array([observation])
            state = state.reshape(64, 1, 8) #<--------
            actions = self.q_eval.advantage(state)
            action = tf.math.argmax(actions, axis=1).numpy()[0,0]

        return action

But I get exactly the same error. I have also tried adding the argument "return_sequences=True" to both layers, but that did not work either.
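One reason the reshape attempt fails on its own: in choose_action, state = np.array([observation]) holds a single observation, so its shape is (1, 8), and NumPy cannot reshape 8 values into (64, 1, 8) (the 64 in the error message is the training batch size, not the size of this array). A sketch of a reshape that at least produces the 3-D shape an LSTM expects, assuming one 8-feature observation:

```python
import numpy as np

observation = np.zeros(8)        # one observation with 8 features
state = np.array([observation])  # shape (1, 8)

# state.reshape(64, 1, 8) would raise: 8 values cannot fill 64*1*8 slots.
# A single observation becomes a batch of one with one timestep instead:
state = state.reshape(1, 1, 8)   # (batch=1, timesteps=1, features=8)
print(state.shape)  # (1, 1, 8)
```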

I don't know what else to do, and I have to hand this in within a week. Can anyone help?

Edit

I am using fc1_dims = 64, fc2_dims = 32 and n_actions = 2. The model uses 8 variables and a batch size of 64. I have uploaded the code to GitHub so you can run it if you want. The project is not finished yet, so there is no proper README for now.

[GitHub with the code][2]

Tags: python, tensorflow, keras, lstm, keras-layer

Solution


The following code works for me without any problems.

import tensorflow as tf
from tensorflow import keras

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu', return_sequences=True)
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)
        
    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)

        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))

        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)

        return A
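
The key change relative to the question's attempt is return_sequences=True on the first LSTM only: the second LSTM itself requires 3-D input, so the first must emit its output at every timestep rather than only the last. A small shape-tracing sketch of this rule (pure Python, no TensorFlow; the helper name is made up for illustration):

```python
def lstm_output_shape(input_shape, units, return_sequences):
    """Shape an LSTM layer produces for a (batch, timesteps, features) input."""
    batch, timesteps, _ = input_shape
    return (batch, timesteps, units) if return_sequences else (batch, units)

x = (64, 1, 8)                                        # (batch, timesteps, features)
x = lstm_output_shape(x, 64, return_sequences=True)   # (64, 1, 64): still 3-D
x = lstm_output_shape(x, 32, return_sequences=False)  # (64, 32): ready for Dense
print(x)  # (64, 32)
```

With return_sequences left at its default (False) on the first layer, its output would be 2-D and the second LSTM would raise the same ndim=3 error as in the question.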

Then build the model like this:

LSTMModel = DuelingDeepQNetwork(2, 64, 32)
LSTMModel.build(input_shape=(None,1,8))
LSTMModel.summary()

The result is as follows:

Model: "dueling_deep_q_network_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_12 (LSTM)               multiple                  18688     
_________________________________________________________________
lstm_13 (LSTM)               multiple                  12416     
_________________________________________________________________
dense_16 (Dense)             multiple                  33        
_________________________________________________________________
dense_17 (Dense)             multiple                  66        
=================================================================
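
The parameter counts in the summary line up with the standard LSTM formula 4 × units × (input_dim + units + 1): four gates, each with an input kernel, a recurrent kernel and a bias. A quick check against the numbers above:

```python
def lstm_params(input_dim, units):
    # Four gates, each with an input kernel, a recurrent kernel and a bias.
    return 4 * units * (input_dim + units + 1)

def dense_params(input_dim, units):
    return input_dim * units + units  # weights + bias

print(lstm_params(8, 64))   # 18688 (lstm_12: 8 features in, 64 units)
print(lstm_params(64, 32))  # 12416 (lstm_13: 64 in, 32 units)
print(dense_params(32, 1))  # 33    (dense_16, the V head)
print(dense_params(32, 2))  # 66    (dense_17, the A head)
```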
