python-3.x - Transformer model using the functional API
Question
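I started learning NLP a few months ago. I am now trying to implement a Transformer model with the Keras functional API, and I want to train it with model.fit. The encoder and decoder parts work fine when I build them on their own, for example 'dec = decoder(8000, 2, 512, 256, 8, 0.1)', and plot the model with tf.keras.utils.plot_model.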
import tensorflow as tf

# encoder, decoder, create_padding_mask and create_look_ahead_mask
# are defined elsewhere (not shown in the question).
def transformer(vocab_size,
                num_layers,
                units,
                d_model,
                num_heads,
                dropout,
                name="transformer"):
    inputs = tf.keras.Input(shape=(None,), name="inputs")
    dec_inputs = tf.keras.Input(shape=(None,), name="dec_inputs")

    # Padding mask for the encoder self-attention.
    enc_padding_mask = tf.keras.layers.Lambda(
        create_padding_mask, output_shape=(1, 1, None),
        name='enc_padding_mask')(inputs)
    # Look-ahead mask for the decoder's first attention block.
    look_ahead_mask = tf.keras.layers.Lambda(
        create_look_ahead_mask,
        output_shape=(1, None, None),
        name='look_ahead_mask')(dec_inputs)
    # Padding mask over the encoder outputs for the decoder's second attention block.
    dec_padding_mask = tf.keras.layers.Lambda(
        create_padding_mask, output_shape=(1, 1, None),
        name='dec_padding_mask')(inputs)

    enc_outputs = encoder(
        vocab_size=vocab_size,
        num_layers=num_layers,
        units=units,
        d_model=d_model,
        num_heads=num_heads,
        dropout=dropout,
    )(inputs=[inputs, enc_padding_mask])

    dec_outputs = decoder(
        vocab_size=vocab_size,
        num_layers=num_layers,
        units=units,
        d_model=d_model,
        num_heads=num_heads,
        dropout=dropout,
    )(inputs=[dec_inputs, enc_outputs, look_ahead_mask, dec_padding_mask])

    outputs = tf.keras.layers.Dense(units=vocab_size, name="outputs")(dec_outputs)
    return tf.keras.Model(inputs=[inputs, dec_inputs], outputs=outputs, name=name)
But every time I try to build the full Transformer model like this, I get the error shown below.
NUM_LAYERS = 2
D_MODEL = 256
NUM_HEADS = 8
UNITS = 512
DROPOUT = 0.1
model = transformer(
    vocab_size=8000,
    num_layers=NUM_LAYERS,
    units=UNITS,
    d_model=D_MODEL,
    num_heads=NUM_HEADS,
    dropout=DROPOUT)
The error is as follows:
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in
_create_c_op(graph, node_def, inputs, control_inputs, op_def)
1879 try:
-> 1880 c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
1881 except errors.InvalidArgumentError as e:
InvalidArgumentError: Shape must be rank 1 but is rank 3 for '{{node look_ahead_mask/ones}} =
Fill[T=DT_FLOAT, index_type=DT_INT32](look_ahead_mask/ones/packed,
look_ahead_mask/ones/Const)' with input shapes: [2,?,?], [].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
17 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in
_create_c_op(graph, node_def, inputs, control_inputs, op_def)
1881 except errors.InvalidArgumentError as e:
1882 # Convert to ValueError for backwards compatibility.
-> 1883 raise ValueError(str(e))
1884
1885 return c_op
ValueError: Shape must be rank 1 but is rank 3 for '{{node look_ahead_mask/ones}} =
Fill[T=DT_FLOAT, index_type=DT_INT32](look_ahead_mask/ones/packed,
look_ahead_mask/ones/Const)' with input shapes: [2,?,?], [].
Solution
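The question does not show create_look_ahead_mask, so what follows is a guess rather than a confirmed diagnosis. The failing node is look_ahead_mask/ones, and its shape argument (ones/packed) has shape [2,?,?]. That pattern would appear if the function is written as create_look_ahead_mask(size) and calls tf.ones((size, size)), because the Lambda layer passes the whole dec_inputs tensor of shape (batch, seq_len) as size. A minimal sketch of one possible fix, assuming token id 0 is padding, is to read the sequence length from the tensor's shape inside the function. The bodies of both mask functions below are assumptions, as are the names tokens and graph_mask.

```python
import tensorflow as tf


def create_padding_mask(x):
    # Assumption: token id 0 marks padding. 1.0 at padded positions, else 0.0.
    mask = tf.cast(tf.math.equal(x, 0), tf.float32)
    # Shape (batch_size, 1, 1, seq_len), broadcastable over heads and queries.
    return mask[:, tf.newaxis, tf.newaxis, :]


def create_look_ahead_mask(x):
    # Use the dynamic sequence length (a scalar tensor) as the size,
    # instead of handing the rank-2 input tensor itself to tf.ones.
    seq_len = tf.shape(x)[1]
    # Strictly upper-triangular ones: position i must not attend to j > i.
    look_ahead = 1 - tf.linalg.band_part(tf.ones((seq_len, seq_len)), -1, 0)
    # Also hide padded positions; result shape (batch_size, 1, seq_len, seq_len).
    return tf.maximum(look_ahead, create_padding_mask(x))


# Eager check on one sequence of length 3 whose last token is padding.
tokens = tf.constant([[5.0, 7.0, 0.0]])
mask = create_look_ahead_mask(tokens)


# Graph-mode check with unknown batch size and sequence length, similar to
# what a Keras Input with shape=(None,) provides while the model is built.
@tf.function(input_signature=[tf.TensorSpec([None, None], tf.float32)])
def graph_mask(x):
    return create_look_ahead_mask(x)


graph_result = graph_mask(tokens)
```

With this version the Lambda layer in transformer() can stay as written, since the function now accepts the decoder input tensor directly. If your create_look_ahead_mask already takes the tensor and computes the length itself, the cause is something else and the function body would need to be inspected.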