Why can't the BERT layer find an option matching my input positional arguments?

Problem description

While working on an NLP exercise, I am trying to use the BERT architecture to obtain a well-trained model. I defined a function that builds and compiles the model with BERT as a layer. However, when I call the function to actually build the model, I get an error saying the BERT layer cannot find an option matching my input positional arguments.

The shape of my positional arguments is [None, 160], but the BERT layer seems to expect [None, None]. How can I resolve this?

To reproduce my problem:

These are the libraries I imported:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow_hub as hub

Next, I defined a function to build the model, as follows:

# Build and compile the model

def build_model(bert_layer, max_len = 512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
    clf_output = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(clf_output)
    
    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    
    return model

Next, I downloaded the BERT architecture and instantiated bert_layer as follows:

module_url = "https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4"
bert_layer = hub.KerasLayer(module_url, trainable=True)

Finally, I tried to build the model by calling build_model with bert_layer, as shown below:

model = build_model(bert_layer, max_len=160)
model.summary()

But this returns an error, which I take to mean that the shape of my inputs differs from what the layer expects. The error looks like this:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-516b88804394> in <module>
----> 1 model = build_model(bert_layer, max_len=160)
      2 model.summary()

<ipython-input-41-713013238e2f> in build_model(bert_layer, max_len)
      6     segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
      7 
----> 8     pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
      9     clf_output = sequence_output[:, 0, :]
     10     out = Dense(1, activation='sigmoid')(clf_output)

~\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
    840                     not base_layer_utils.is_in_eager_or_tf_function()):
    841                   with auto_control_deps.AutomaticControlDependencies() as acd:
--> 842                     outputs = call_fn(cast_inputs, *args, **kwargs)
    843                     # Wrap Tensors in `outputs` in `tf.identity` to avoid
    844                     # circular dependencies.

~\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py in wrapper(*args, **kwargs)
    235       except Exception as e:  # pylint:disable=broad-except
    236         if hasattr(e, 'ag_error_metadata'):
--> 237           raise e.ag_error_metadata.to_exception(e)
    238         else:
    239           raise

ValueError: in converted code:
    relative to C:\Users\Wolemercy\Anaconda3\lib\site-packages:

    tensorflow_hub\keras_layer.py:237 call  *
        result = smart_cond.smart_cond(training,
    tensorflow_core\python\framework\smart_cond.py:59 smart_cond
        name=name)
    tensorflow_core\python\saved_model\load.py:436 _call_attribute
        return instance.__call__(*args, **kwargs)
    tensorflow_core\python\eager\def_function.py:457 __call__
        result = self._call(*args, **kwds)
    tensorflow_core\python\eager\def_function.py:494 _call
        results = self._stateful_fn(*args, **kwds)
    tensorflow_core\python\eager\function.py:1822 __call__
        graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2150 _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2041 _create_graph_function
        capture_by_value=self._capture_by_value),
    tensorflow_core\python\framework\func_graph.py:915 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    tensorflow_core\python\eager\def_function.py:358 wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
    tensorflow_core\python\saved_model\function_deserialization.py:262 restored_function_body
        "\n\n".join(signature_descriptions)))

    ValueError: Could not find matching function to call loaded from the SavedModel. Got:
      Positional arguments (3 total):
        * [<tf.Tensor 'inputs:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_1:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_2:0' shape=(None, 160) dtype=int32>]
        * True
        * None
      Keyword arguments: {}
    
    Expected these arguments to match one of the following 4 option(s):
    
    Option 1:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
        * False
        * None
      Keyword arguments: {}
    
    Option 2:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
        * False
        * None
      Keyword arguments: {}
    
    Option 3:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
        * True
        * None
      Keyword arguments: {}
    
    Option 4:
      Positional arguments (3 total):
        * {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
        * True
        * None
      Keyword arguments: {}

I expected the model to compile successfully. Instead, I got this error.

Tags: python, tensorflow, keras, nlp, bert-language-model

Solution
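
The error message itself shows what the loaded SavedModel expects: all four listed options take a single dict with the keys input_word_ids, input_mask and input_type_ids. A shape of (None, None) is compatible with (None, 160), so the real mismatch is the structure of the call, i.e. a list of three tensors is being passed where a dict is expected. Below is a minimal sketch of build_model rewritten for that dict-based calling convention; it assumes (per the TF Hub documentation for the /3 and /4 releases of this model) that the layer also returns a dict with "pooled_output" and "sequence_output" keys, and it reuses the imports from the question.

# Sketch: call the BERT layer with a dict of inputs, matching the signatures
# listed under "Expected these arguments" in the error message above.
def build_model(bert_layer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    input_type_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_type_ids")

    # Pass the three tensors as a dict keyed the way the SavedModel expects
    outputs = bert_layer({
        "input_word_ids": input_word_ids,
        "input_mask": input_mask,
        "input_type_ids": input_type_ids,
    })

    # Use the hidden state of the [CLS] token for binary classification
    # (assumes the dict-style outputs of the /3 and /4 model versions)
    clf_output = outputs["sequence_output"][:, 0, :]
    out = Dense(1, activation="sigmoid")(clf_output)

    model = Model(inputs=[input_word_ids, input_mask, input_type_ids], outputs=out)
    model.compile(Adam(learning_rate=1e-5), loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_model(bert_layer, max_len=160)
model.summary()

Alternatively, the original list-based call (pooled_output, sequence_output = bert_layer([...])) matches the calling convention of the earlier releases of this module, so pinning module_url to an older version such as https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/1 should also let the existing code run unchanged.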

