tensorflow - Serializing a Keras model with an Embedding layer
Problem description
I have trained a model with pre-trained word embeddings like this:
embedding_matrix = np.zeros((vocab_size, 100))
for word, i in text_tokenizer.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
embedding_layer = Embedding(vocab_size,
                            100,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=50,
                            trainable=False)
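For anyone who wants to run the matrix-building step in isolation, here is a minimal sketch using NumPy only. The `word_index` and `embeddings_index` dictionaries and the 4-dimensional vectors are hypothetical stand-ins for `text_tokenizer.word_index` and the loaded pre-trained vectors. It also assumes the tokenizer is a Keras `Tokenizer`, whose word indices start at 1, so the matrix needs one extra row for index 0.

```python
import numpy as np

# Hypothetical toy data standing in for the objects in the question.
word_index = {"the": 1, "cat": 2, "sat": 3}   # plays the role of text_tokenizer.word_index
embeddings_index = {                          # plays the role of the pre-trained lookup
    "the": np.array([0.1, 0.2, 0.3, 0.4]),
    "cat": np.array([0.5, 0.6, 0.7, 0.8]),
}
embedding_dim = 4

# Assumption: indices start at 1, so row 0 stays all-zero (padding).
vocab_size = len(word_index) + 1

embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # Words without a pre-trained vector keep their all-zero row.
        embedding_matrix[i] = embedding_vector

print(embedding_matrix.shape)
```

Rows for words missing from the pre-trained lookup ("sat" in this toy data) simply stay at zero.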
The architecture looks like this:
sequence_input = Input(shape=(50,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
text_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(embedded_sequences)
text_lstm = LSTM(500, return_sequences=True)(embedded_sequences)
char_in = Input(shape=(50, 18, ))
char_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(char_in)
char_cnn = GaussianNoise(0.40)(char_cnn)
char_lstm = LSTM(500, return_sequences=True)(char_in)
merged = concatenate([char_lstm, text_lstm])
merged_d1 = Dense(800, activation='relu')(merged)
merged_d1 = Dropout(0.5)(merged_d1)
text_class = Dense(len(y_unique), activation='softmax')(merged_d1)
model = Model([sequence_input,char_in], text_class)
When I convert the model to JSON, I get this error:
ValueError: can only convert an array of size 1 to a Python scalar
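Similarly, if I use the model.save() function, it appears to save correctly, but when I try to load it I get Type Error: Expected Float32.
My question is: am I missing something when trying to serialize this model? Do I need some kind of Lambda layer or something similar?
Any help would be greatly appreciated!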
Solution
You can use the weights argument of the Embedding layer to provide the initial weights.
embedding_layer = Embedding(vocab_size,
                            100,
                            weights=[embedding_matrix],
                            input_length=50,
                            trainable=False)
After the model is saved and loaded, the weights should remain non-trainable:
model.save('1.h5')
m = load_model('1.h5')
m.summary()
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_3 (InputLayer)            (None, 50)           0
__________________________________________________________________________________________________
input_4 (InputLayer)            (None, 50, 18)       0
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 50, 100)      1000000     input_3[0][0]
__________________________________________________________________________________________________
lstm_4 (LSTM)                   (None, 50, 500)      1038000     input_4[0][0]
__________________________________________________________________________________________________
lstm_3 (LSTM)                   (None, 50, 500)      1202000     embedding_1[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 50, 1000)     0           lstm_4[0][0]
                                                                 lstm_3[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 50, 800)      800800      concatenate_2[0][0]
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 50, 800)      0           dense_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 50, 15)       12015       dropout_2[0][0]
==================================================================================================
Total params: 4,052,815
Trainable params: 3,052,815
Non-trainable params: 1,000,000
__________________________________________________________________________________________________
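As background on the error in the question, here is a small sketch of one plausible mechanism. This is a guess, not something the answer above confirms. If the full matrix ends up inside the initializer's config and the JSON fallback treats NumPy objects as scalars by calling `.item()`, NumPy raises exactly the message quoted in the question. The `config` dict and the `to_scalar` helper below are hypothetical illustrations, not Keras internals.

```python
import json
import numpy as np

# Toy stand-in for the real embedding matrix.
embedding_matrix = np.zeros((3, 2))

# Hypothetical config shaped roughly like what an initializer might report.
config = {"class_name": "Constant", "config": {"value": embedding_matrix}}

def to_scalar(obj):
    # Assumed fallback: treat any NumPy object as a scalar.
    # ndarray.item() only works for arrays holding exactly one element.
    return obj.item()

try:
    json.dumps(config, default=to_scalar)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Passing the matrix through `weights=[embedding_matrix]` would avoid this path if, as assumed here, the values are then stored with the layer's weights rather than in its JSON config.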