Transformer model using the functional API

Problem description

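I started learning NLP a few months ago. I am now trying to implement a Transformer model with the Keras functional API, and I want to train it with model.fit. The encoder and decoder parts work fine on their own: I can build them with a call such as 'dec = decoder(8000, 2, 512, 256, 8, 0.1)' and plot the result with tf.keras.utils.plot_model. The full model is defined as follows: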

def transformer(vocab_size,
                num_layers,
                units,
                d_model,
                num_heads,
                dropout,
                name="transformer"):
    inputs = tf.keras.Input(shape=(None,), name="inputs")
    dec_inputs = tf.keras.Input(shape=(None,), name="dec_inputs")

    enc_padding_mask = tf.keras.layers.Lambda(
        create_padding_mask, output_shape=(1, 1, None),
        name='enc_padding_mask')(inputs)

    look_ahead_mask = tf.keras.layers.Lambda(
        create_look_ahead_mask,
        output_shape=(1, None, None),
        name='look_ahead_mask')(dec_inputs)

    dec_padding_mask = tf.keras.layers.Lambda(
        create_padding_mask, output_shape=(1, 1, None),
        name='dec_padding_mask')(inputs)

    enc_outputs = encoder(
        vocab_size=vocab_size,
        num_layers=num_layers,
        units=units,
        d_model=d_model,
        num_heads=num_heads,
        dropout=dropout,
    )(inputs=[inputs, enc_padding_mask])

    dec_outputs = decoder(
        vocab_size=vocab_size,
        num_layers=num_layers,
        units=units,
        d_model=d_model,
        num_heads=num_heads,
        dropout=dropout,
    )(inputs=[dec_inputs, enc_outputs, look_ahead_mask, dec_padding_mask])

    outputs = tf.keras.layers.Dense(units=vocab_size, name="outputs")(dec_outputs)

    return tf.keras.Model(inputs=[inputs, dec_inputs], outputs=outputs, name=name)

But every time I try to build the Transformer model like this, I get the error shown below.

NUM_LAYERS = 2
D_MODEL = 256
NUM_HEADS = 8
UNITS = 512
DROPOUT = 0.1


model = transformer(
    vocab_size=8000,
    num_layers=NUM_LAYERS,
    units=UNITS,
    d_model=D_MODEL,
    num_heads=NUM_HEADS,
    dropout=0.1)

The error is as follows:

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in 
_create_c_op(graph, node_def, inputs, control_inputs, op_def)
1879   try:
-> 1880     c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
1881   except errors.InvalidArgumentError as e:

InvalidArgumentError: Shape must be rank 1 but is rank 3 for '{{node look_ahead_mask/ones}} = 
Fill[T=DT_FLOAT, index_type=DT_INT32](look_ahead_mask/ones/packed, 
look_ahead_mask/ones/Const)' with input shapes: [2,?,?], [].

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
17 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in 
_create_c_op(graph, node_def, inputs, control_inputs, op_def)
1881   except errors.InvalidArgumentError as e:
1882     # Convert to ValueError for backwards compatibility.
-> 1883     raise ValueError(str(e))
1884 
1885   return c_op

ValueError: Shape must be rank 1 but is rank 3 for '{{node look_ahead_mask/ones}} = 
Fill[T=DT_FLOAT, index_type=DT_INT32](look_ahead_mask/ones/packed, 
look_ahead_mask/ones/Const)' with input shapes: [2,?,?], [].

Tags: python-3.x, nlp, tensorflow2.0, tf.keras, transformer

Solution
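The question does not show create_look_ahead_mask or create_padding_mask, so what follows is only a plausible reading of the error, not a confirmed diagnosis. The failing node is look_ahead_mask/ones, and its shape argument has shape [2,?,?]. That is what you would get if the function were written as create_look_ahead_mask(size) and called tf.ones((size, size)): the Lambda layer passes the whole (batch, seq_len) dec_inputs tensor as size, so the "shape" handed to tf.ones is two rank-2 tensors stacked together instead of two scalars. A minimal sketch of mask functions that accept the token tensor itself and read the sequence length from it could look like this. Both function bodies are assumptions made for illustration, including the convention that token id 0 is padding.

```python
import tensorflow as tf


def create_padding_mask(x):
    # Assumed convention: token id 0 is padding. 1.0 marks padded positions.
    mask = tf.cast(tf.math.equal(x, 0), tf.float32)
    # Shape (batch_size, 1, 1, seq_len), so it broadcasts over heads and queries.
    return mask[:, tf.newaxis, tf.newaxis, :]


def create_look_ahead_mask(x):
    # x is the (batch_size, seq_len) tensor of token ids, not an integer size.
    seq_len = tf.shape(x)[1]
    # seq_len is a scalar, so tf.ones receives a valid rank-1 shape.
    ones = tf.ones((seq_len, seq_len))
    # 1.0 strictly above the diagonal marks future positions to hide.
    look_ahead = 1 - tf.linalg.band_part(ones, -1, 0)
    # Combine with the padding mask; result is (batch_size, 1, seq_len, seq_len).
    return tf.maximum(look_ahead, create_padding_mask(x))


# Hypothetical decoder input: one sequence of three tokens, the last one padding.
dec_tokens = tf.constant([[5, 7, 0]])
mask = create_look_ahead_mask(dec_tokens)
print(mask.shape)
```

With functions of this shape, the Lambda layers in the question can pass inputs and dec_inputs directly, and the declared output_shape values (1, 1, None) and (1, None, None) match the masks once the batch dimension is added. If your create_look_ahead_mask really does need an integer size, the alternative is to compute tf.shape(dec_inputs)[1] inside the Lambda before calling it.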

