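python-3.x - Bidirectional LSTM with an attention layer in tf.keras
Question
I am trying to add an attention mechanism to the model below. Does an attention model really need CTC loss?
How can I implement a BLSTM with an attention mechanism for an image OCR problem?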
```python
import tensorflow as tf
from tensorflow.keras.layers import (
    Activation, BatchNormalization, Bidirectional, Conv2D, Dense,
    Input, Lambda, LSTM, MaxPooling2D, Reshape,
)
from tensorflow.keras.models import Model

# Assumed values, inferred from the shape comments below; adjust to your data.
img_w, img_h = 128, 64
max_text_len = 8
num_classes = 63


def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return tf.keras.backend.ctc_batch_cost(labels, y_pred, input_length, label_length)


def get_Model(training):
    input_shape = (img_w, img_h, 1)  # (128, 64, 1)

    labels = Input(name='the_labels', shape=[max_text_len], dtype='float32')  # (None, 8)
    input_length = Input(name='input_length', shape=[1], dtype='int64')  # (None, 1)
    label_length = Input(name='label_length', shape=[1], dtype='int64')  # (None, 1)

    # Build the network
    inputs = Input(name='the_input', shape=input_shape, dtype='float32')  # (None, 128, 64, 1)

    # Convolution layers (VGG-style)
    inner = Conv2D(64, (3, 3), padding='same', name='conv1', kernel_initializer='he_normal')(inputs)  # (None, 128, 64, 64)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(2, 2), name='max1')(inner)  # (None, 64, 32, 64)

    inner = Conv2D(128, (3, 3), padding='same', name='conv2', kernel_initializer='he_normal')(inner)  # (None, 64, 32, 128)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(2, 2), name='max2')(inner)  # (None, 32, 16, 128)

    inner = Conv2D(256, (3, 3), padding='same', name='conv3', kernel_initializer='he_normal')(inner)  # (None, 32, 16, 256)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = Conv2D(256, (3, 3), padding='same', name='conv4', kernel_initializer='he_normal')(inner)  # (None, 32, 16, 256)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(1, 2), name='max3')(inner)  # (None, 32, 8, 256)

    inner = Conv2D(512, (3, 3), padding='same', name='conv5', kernel_initializer='he_normal')(inner)  # (None, 32, 8, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = Conv2D(512, (3, 3), padding='same', name='conv6')(inner)  # (None, 32, 8, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)
    inner = MaxPooling2D(pool_size=(1, 2), name='max4')(inner)  # (None, 32, 4, 512)

    inner = Conv2D(512, (2, 2), padding='same', kernel_initializer='he_normal', name='con7')(inner)  # (None, 32, 4, 512)
    inner = BatchNormalization()(inner)
    inner = Activation('relu')(inner)

    # CNN to RNN: flatten the height and channel axes (4 * 512 = 2048)
    inner = Reshape(target_shape=(32, 2048), name='reshape')(inner)  # (None, 32, 2048)
    inner = Dense(64, activation='relu', kernel_initializer='he_normal', name='dense1')(inner)  # (None, 32, 64)

    # RNN layers
    lstm1 = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer='he_normal'), name='biLSTM1')(inner)
    lstm1_norm = BatchNormalization()(lstm1)
    lstm2 = Bidirectional(LSTM(512, return_sequences=True, kernel_initializer='he_normal'), name='biLSTM2')(lstm1_norm)
    lstm2_norm = BatchNormalization()(lstm2)

    # transforms RNN output to character activations:
    inner = Dense(num_classes, kernel_initializer='he_normal', name='dense2')(lstm2_norm)  # (None, 32, 63)
    y_pred = Activation('softmax', name='softmax')(inner)

    # Keras doesn't currently support loss funcs with extra parameters
    # so CTC loss is implemented in a lambda layer
    loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
        [y_pred, labels, input_length, label_length])  # (None, 1)

    if training:
        return Model(inputs=[inputs, labels, input_length, label_length], outputs=loss_out)
    else:
        return Model(inputs=[inputs], outputs=y_pred)
```
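With `training=False` the model returns per-frame softmax probabilities, which still have to be collapsed into a label sequence. The following is a minimal sketch of greedy CTC decoding, not part of the original code. It assumes the blank symbol is the last class index (the convention `ctc_batch_cost` follows); the helper name `greedy_ctc_decode` and the toy 4-class input are hypothetical.

```python
import numpy as np
import tensorflow as tf


def greedy_ctc_decode(y_pred):
    """Greedy CTC decoding of softmax output shaped (batch, time, num_classes).

    Assumes the CTC blank is the last class index.
    Returns a dense int array padded with -1.
    """
    y_pred = tf.cast(y_pred, tf.float32)
    batch = tf.shape(y_pred)[0]
    time_steps = tf.shape(y_pred)[1]
    # tf.nn.ctc_greedy_decoder expects time-major inputs: (time, batch, classes).
    log_probs = tf.math.log(tf.transpose(y_pred, [1, 0, 2]) + 1e-8)
    # Every sample uses the full sequence length here.
    seq_len = tf.fill([batch], time_steps)
    decoded, _ = tf.nn.ctc_greedy_decoder(log_probs, seq_len)
    return tf.sparse.to_dense(decoded[0], default_value=-1).numpy()


# Toy input: 4 classes (index 3 is the blank), best path 0, 0, blank, 1, 1.
toy_probs = np.eye(4)[[0, 0, 3, 1, 1]][None] * 0.97 + 0.01
print(greedy_ctc_decode(toy_probs).tolist())  # [[0, 1]]
```

To mirror the loss function above, which drops the first two time steps, a caller would pass `y_pred[:, 2:, :]` rather than the full prediction.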
Is lstm1 the encoder and lstm2 the decoder?
I didn't find any attention implementation using the Keras functional API, and it also seems there is no Keras attention layer.
Solution
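One possible approach, offered as a sketch rather than a definitive answer: TensorFlow 2.x ships `tf.keras.layers.Attention` (dot-product) and `tf.keras.layers.AdditiveAttention`, and both work with the functional API. In the question's model the two stacked BiLSTMs are not an encoder and a decoder; they are simply two recurrent layers over the same 32 time steps. The sketch below adds self-attention on top of a BiLSTM and keeps one prediction per time step, so CTC loss would still apply. A true encoder-decoder with attention, where the decoder emits one character per step, is usually trained with cross-entropy instead of CTC. The function name `build_attention_head`, the layer names, and the reduced sizes are assumptions chosen for illustration; the input stands in for the `dense1` output of shape (None, 32, 64).

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


def build_attention_head(time_steps=32, feat_dim=64, num_classes=63, units=32):
    """BiLSTM + self-attention head over a CNN feature sequence (illustrative)."""
    features = layers.Input(shape=(time_steps, feat_dim), name='cnn_features')
    seq = layers.Bidirectional(
        layers.LSTM(units, return_sequences=True), name='biLSTM')(features)
    # Self-attention: the BiLSTM sequence serves as both query and value,
    # so the output keeps one context vector per time step.
    context = layers.Attention(name='self_attention')([seq, seq])
    merged = layers.Concatenate(name='concat')([seq, context])
    y_pred = layers.Dense(num_classes, activation='softmax', name='char_probs')(merged)
    return tf.keras.Model(features, y_pred)


model = build_attention_head()
probs = model(np.zeros((2, 32, 64), dtype='float32'))
print(tuple(probs.shape))  # (2, 32, 63)
```

Because the output is still shaped (batch, 32, num_classes), it could replace the `lstm1`/`lstm2`/`dense2` stack and feed the same CTC lambda layer unchanged.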