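python - Training a custom many-to-one RNN in TensorFlow 2
Question
I am implementing a custom RNN in TensorFlow 2. For this I wrote a model that accepts an arbitrary number of time steps, takes the outputs of the last hidden layer for all time steps, and applies some Dense layers to them.
My dataset consists of training examples with shape [28207, 8, 2]
(28207 training examples, 8 time steps, 2 features), and my labels are a matrix with shape [28207, 2]
(28207 training examples, 2 features). However, training the model fails with the following error: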
Data cardinality is ambiguous:
x sizes: (then a lot of 8's)
y sizes: (then a lot of 2's)
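I have already tried expanding the label set to shape [28207, 1, 2],
without success, and Google has not been much help either.
Is it even possible to implement this kind of many-to-one model in TF2?
I am using Anaconda with Python 3.6.12 on Windows 10, with TensorFlow 2.4.0. The cell, the model, and the training code are as follows: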
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import RNN

# customL2Regularizer is defined elsewhere (not shown in the question).

class RNNCell(keras.layers.Layer):
    def __init__(self, units, **kwargs):
        self.units = units
        self.state_size = units
        # The class name here must match the enclosing class.
        super(RNNCell, self).__init__(**kwargs)

    def build(self, input_shape):
        # i computation
        self.Wxi = self.add_weight(name='Wxi', shape=(input_shape[0][-1], self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Whi = self.add_weight(name='Whi', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Wci = self.add_weight(name='Wci', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.bi = self.add_weight(name='bi', shape=(self.units, ), initializer="zeros", regularizer=customL2Regularizer)
        # f computation
        self.Wxf = self.add_weight(name='Wxf', shape=(input_shape[0][-1], self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Whf = self.add_weight(name='Whf', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Wcf = self.add_weight(name='Wcf', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.bf = self.add_weight(name='bf', shape=(self.units, ), initializer="zeros", regularizer=customL2Regularizer)
        # c computation
        self.Wxc = self.add_weight(name='Wxc', shape=(input_shape[0][-1], self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Whc = self.add_weight(name='Whc', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.bc = self.add_weight(name='bc', shape=(self.units, ), initializer="zeros", regularizer=customL2Regularizer)
        # o computation
        self.Wxo = self.add_weight(name='Wxo', shape=(input_shape[0][-1], self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Who = self.add_weight(name='Who', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.Wco = self.add_weight(name='Wco', shape=(self.units, self.units), initializer="random_normal", regularizer=customL2Regularizer)
        self.bo = self.add_weight(name='bo', shape=(self.units, ), initializer="zeros", regularizer=customL2Regularizer)

    def call(self, inputs, states):
        # It expects two inputs: the X and the previous h
        i = tf.math.sigmoid(K.dot(inputs[0], self.Wxi) + K.dot(inputs[1], self.Whi) + K.dot(states[0], self.Wci) + self.bi)
        f = tf.math.sigmoid(K.dot(inputs[0], self.Wxf) + K.dot(inputs[1], self.Whf) + K.dot(states[0], self.Wcf) + self.bf)
        c = f * states[0] + i * tf.math.tanh(K.dot(inputs[0], self.Wxc) + K.dot(inputs[1], self.Whc) + self.bc)
        o = tf.math.sigmoid(K.dot(inputs[0], self.Wxo) + K.dot(inputs[1], self.Who) + K.dot(c, self.Wco) + self.bo)
        return o * tf.tanh(c), c
The network:
rnn_hidden_units = 128
rnn_hidden_layers = 2
lstm_outputs = []
# Inputs: [None, time_steps, 2]
inputs = keras.Input(shape=(time_steps, 2), name='inputs')
# First hidden layer previous h: [None, time_steps, rnn_hidden_units]
zeros_placeholder = tf.fill(tf.stack([tf.shape(inputs)[0], time_steps, rnn_hidden_units]), 0.0, name='zeros_placeholder')
# First hidden layer: inputs, zeros_placeholder => [None, time_steps, rnn_hidden_units]
last_hidden_output = RNN(RNNCell(rnn_hidden_units), return_sequences=True, name='first_rnn_layer')((inputs, zeros_placeholder))
# Append last output to a list
lstm_outputs.append(last_hidden_output[:, -1, :])
# The rest of the hidden layers
for l in range(rnn_hidden_layers - 1):
    last_hidden_output = RNN(RNNCell(rnn_hidden_units), return_sequences=True, name='{}_rnn_layer'.format(l+1))((inputs, last_hidden_output))
    lstm_outputs.append(last_hidden_output[:, -1, :])
# Compute p_t+1 (assuming Y is the sigmoid function): [None, 5]
p = tf.sigmoid(OutputLayer(rnn_hidden_units)(tf.stack(lstm_outputs)))
# Compute (mu, sigma, rho): [None, 5]
output = OutputLayer(5, include_bias=False)(p)
# Define the model
model = keras.models.Model(inputs=inputs, outputs=output)
The code that fails:
import datetime

model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001, rho=0.95), loss=bivariate_loss_function, metrics=['val_loss'])
# Define the Keras TensorBoard callback.
logdir = "./logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)
# Train the model.
model.fit(training_examples,
          training_labels,
          batch_size=64,
          epochs=5,
          callbacks=[tensorboard_callback])
Solution
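It turned out to be a problem with the input: the data was a Python list rather than a NumPy array. This is consistent with the error message. Keras treats a list as a collection of separate inputs, so each example (leading dimension 8) and each label (leading dimension 2) was counted on its own instead of as one row of a single batch. Converting the data to NumPy arrays before calling model.fit resolves it.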
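The conversion can be sketched as follows. This is a minimal illustration rather than the asker's actual loading code: the names examples_as_list and labels_as_list are hypothetical, and 5 examples stand in for the real 28207. It shows where the 8's and 2's in the error plausibly come from, and what the shapes look like after conversion.

```python
import numpy as np

# Hypothetical stand-in for the real data: 5 examples instead of 28207.
n_examples, time_steps, n_features = 5, 8, 2

# Data held as Python lists, one array per training example (assumed layout).
examples_as_list = [np.zeros((time_steps, n_features)) for _ in range(n_examples)]
labels_as_list = [np.zeros(n_features) for _ in range(n_examples)]

# If every list element is read as a separate input, these are the leading
# sizes that get compared: all 8's for x and all 2's for y.
x_sizes = [a.shape[0] for a in examples_as_list]
y_sizes = [a.shape[0] for a in labels_as_list]

# Converting to single arrays gives one tensor per argument, with the
# number of examples as the shared leading (batch) dimension.
training_examples = np.asarray(examples_as_list, dtype=np.float32)
training_labels = np.asarray(labels_as_list, dtype=np.float32)

print(x_sizes, y_sizes)
print(training_examples.shape, training_labels.shape)
```

With the real data, the two arrays would have shapes (28207, 8, 2) and (28207, 2), which is what model.fit expects for a single-input, single-output model.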