python - Dimensionality of a stacked LSTM network in TensorFlow
Problem Description
In reviewing the many similar questions about multidimensional inputs and stacked LSTM RNNs, I have not found an example that lays out the dimensionality of the initial_state placeholder, as in rnn_tuple_state below. The attempted shape [lstm_num_layers, 2, None, lstm_num_cells, 2] is an extension of the code in these examples (http://monik.in/a-noobs-guide-to-implementing-rnn-lstm-using-tensorflow/, https://medium.com/@erikhallstrm/using-the-tensorflow-multilayered-lstm-api-f6e7da7bbe40), with an extra dimension of feature_dim added at the end for the multiple values at each time step of the features (this does not work, and instead raises a ValueError in the tensorflow.nn.dynamic_rnn call).
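As background for what follows: in the TensorFlow 1.x API, the state of a single LSTM cell is an LSTMStateTuple (c, h), with each tensor of shape [batch_size, num_units]; the input feature dimension never appears in the state. A minimal sketch (batch_size here is a hypothetical value, used only to inspect the shapes):

import tensorflow

batch_size = 32  # hypothetical value, used only to inspect the state shapes
lstm_num_cells = 100

cell = tensorflow.contrib.rnn.LayerNormBasicLSTMCell(lstm_num_cells)
zero_state = cell.zero_state(batch_size, tensorflow.float32)
# both members of the LSTMStateTuple are [batch_size, lstm_num_cells]
print(zero_state.c.get_shape())  # (32, 100)
print(zero_state.h.get_shape())  # (32, 100)

With that in mind, here is the attempt in question: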
import tensorflow

time_steps = 10
feature_dim = 2
label_dim = 4
lstm_num_layers = 3
lstm_num_cells = 100
dropout_rate = 0.8

# None is to allow for variable size batches
features = tensorflow.placeholder(tensorflow.float32,
                                  [None, time_steps, feature_dim])
labels = tensorflow.placeholder(tensorflow.float32, [None, label_dim])

cell = tensorflow.contrib.rnn.MultiRNNCell(
    [tensorflow.contrib.rnn.LayerNormBasicLSTMCell(
        lstm_num_cells,
        dropout_keep_prob=dropout_rate)] * lstm_num_layers,
    state_is_tuple=True)

# not sure of the dimensionality for the initial state
initial_state = tensorflow.placeholder(
    tensorflow.float32,
    [lstm_num_layers, 2, None, lstm_num_cells, feature_dim])

# which impacts these two lines as well
state_per_layer_list = tensorflow.unstack(initial_state, axis=0)
rnn_tuple_state = tuple(
    [tensorflow.contrib.rnn.LSTMStateTuple(
        state_per_layer_list[i][0],
        state_per_layer_list[i][1]) for i in range(lstm_num_layers)])

# also not sure if expanding the feature dimensions is correct here
outputs, state = tensorflow.nn.dynamic_rnn(
    cell, tensorflow.expand_dims(features, -1),
    initial_state=rnn_tuple_state)
What would be most helpful is an explanation of the general case, where:
- there are N values per time step
- there are S steps per time series
- there are B sequences per batch
- there are R values per output
- there are L hidden LSTM layers in the network
- there are M nodes per layer
So the pseudocode version of this is:
# B, S, N, and R are undefined values for the purpose of this question
features = tensorflow.placeholder(tensorflow.float32, [B, S, N])
labels = tensorflow.placeholder(tensorflow.float32, [B, R])
...
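To make the ask concrete, here is a hedged sketch of how those names might map onto the placeholder shapes (B is left as None for variable batch sizes; this anticipates the solution below):

# features: B sequences, S time steps, N values per step
features = tensorflow.placeholder(tensorflow.float32, [None, S, N])
# labels: R output values per sequence
labels = tensorflow.placeholder(tensorflow.float32, [None, R])
# initial_state: L layers, a (c, h) pair per layer, M cells per layer
initial_state = tensorflow.placeholder(tensorflow.float32, [L, 2, None, M])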
If I could complete it, I would not be asking here in the first place. Thanks in advance. Any comments on relevant best practices are welcome.
Solution
After much trial and error, the following produces a stacked LSTM with dynamic_rnn regardless of the dimensionality of the features:
import tensorflow

time_steps = 10
feature_dim = 2
label_dim = 4
lstm_num_layers = 3
lstm_num_cells = 100
dropout_rate = 0.8
learning_rate = 0.001

features = tensorflow.placeholder(
    tensorflow.float32, [None, time_steps, feature_dim])
labels = tensorflow.placeholder(
    tensorflow.float32, [None, label_dim])

# one cell object per layer, so the layers do not share weights
cell_list = []
for _ in range(lstm_num_layers):
    cell_list.append(
        tensorflow.contrib.rnn.LayerNormBasicLSTMCell(
            lstm_num_cells, dropout_keep_prob=dropout_rate))
cell = tensorflow.contrib.rnn.MultiRNNCell(cell_list, state_is_tuple=True)

# [num_layers, 2 (c and h), batch, num_cells]; no feature dimension here
initial_state = tensorflow.placeholder(
    tensorflow.float32, [lstm_num_layers, 2, None, lstm_num_cells])
state_per_layer_list = tensorflow.unstack(initial_state, axis=0)
rnn_tuple_state = tuple(
    [tensorflow.contrib.rnn.LSTMStateTuple(
        state_per_layer_list[i][0],
        state_per_layer_list[i][1]) for i in range(lstm_num_layers)])

state_series, last_state = tensorflow.nn.dynamic_rnn(
    cell=cell, inputs=features, initial_state=rnn_tuple_state)

# regression head: use the output at the last time step
hidden_layer_output = tensorflow.transpose(state_series, [1, 0, 2])
last_output = tensorflow.gather(
    hidden_layer_output, int(hidden_layer_output.get_shape()[0]) - 1)
weights = tensorflow.Variable(tensorflow.random_normal(
    [lstm_num_cells, int(labels.get_shape()[1])]))
biases = tensorflow.Variable(tensorflow.constant(
    0.0, shape=[labels.get_shape()[1]]))
predictions = tensorflow.matmul(last_output, weights) + biases

mean_squared_error = tensorflow.reduce_mean(
    tensorflow.square(predictions - labels))
minimize_error = tensorflow.train.RMSPropOptimizer(
    learning_rate).minimize(mean_squared_error)
Part of what started this journey down one of the many well-known rabbit holes is that the previously referenced examples reshape the output to fit a classifier rather than a regressor (which is what I was trying to build). Since this is independent of the feature dimensionality, it serves as a generic template for this use case.
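As a usage note (an assumption on top of the answer, not spelled out in it): because initial_state is a placeholder, a state has to be fed on every run; for a fresh state per batch, an all-zeros array whose batch dimension matches the fed features works:

import numpy

batch_size = 32  # hypothetical; the placeholder's None accepts any batch size
zero_state = numpy.zeros(
    (lstm_num_layers, 2, batch_size, lstm_num_cells), dtype=numpy.float32)
batch_features = numpy.zeros(
    (batch_size, time_steps, feature_dim), dtype=numpy.float32)  # stand-in data
batch_labels = numpy.zeros(
    (batch_size, label_dim), dtype=numpy.float32)  # stand-in data

with tensorflow.Session() as session:
    session.run(tensorflow.global_variables_initializer())
    _, loss = session.run(
        [minimize_error, mean_squared_error],
        feed_dict={features: batch_features,
                   labels: batch_labels,
                   initial_state: zero_state})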