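python - Why can't the BERT model find an option matching my input positional arguments?
Problem description
While working through an NLP exercise, I tried to use the BERT architecture to get a well-trained model. I defined a function that builds and compiles a model with BERT as a layer. However, when I call the function to actually build the model, I get an error saying the BERT layer cannot find an option matching my input positional arguments.
The shape of my positional arguments is [None, 160], but the BERT layer seems to expect [None, None]. How can I fix this?
To reproduce my problem:
These are the libraries I imported: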
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow_hub as hub
Next, I defined a function for the model, as follows:
# Build and compile the model
def build_model(bert_layer, max_len=512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
    clf_output = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(clf_output)
    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    return model
Next, I downloaded the BERT architecture and instantiated bert_layer as follows:
module_url = "https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4"
bert_layer = hub.KerasLayer(module_url, trainable=True)
Finally, I tried to build the model with the build_model function and bert_layer, as shown below:
model = build_model(bert_layer, max_len=160)
model.summary()
But this returns an error, which I take to mean that the shape of my inputs differs from the expected shape. The error is shown below:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-42-516b88804394> in <module>
----> 1 model = build_model(bert_layer, max_len=160)
2 model.summary()
<ipython-input-41-713013238e2f> in build_model(bert_layer, max_len)
6 segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
7
----> 8 pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
9 clf_output = sequence_output[:, 0, :]
10 out = Dense(1, activation='sigmoid')(clf_output)
~\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
840 not base_layer_utils.is_in_eager_or_tf_function()):
841 with auto_control_deps.AutomaticControlDependencies() as acd:
--> 842 outputs = call_fn(cast_inputs, *args, **kwargs)
843 # Wrap Tensors in `outputs` in `tf.identity` to avoid
844 # circular dependencies.
~\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py in wrapper(*args, **kwargs)
235 except Exception as e: # pylint:disable=broad-except
236 if hasattr(e, 'ag_error_metadata'):
--> 237 raise e.ag_error_metadata.to_exception(e)
238 else:
239 raise
ValueError: in converted code:
relative to C:\Users\Wolemercy\Anaconda3\lib\site-packages:
tensorflow_hub\keras_layer.py:237 call *
result = smart_cond.smart_cond(training,
tensorflow_core\python\framework\smart_cond.py:59 smart_cond
name=name)
tensorflow_core\python\saved_model\load.py:436 _call_attribute
return instance.__call__(*args, **kwargs)
tensorflow_core\python\eager\def_function.py:457 __call__
result = self._call(*args, **kwds)
tensorflow_core\python\eager\def_function.py:494 _call
results = self._stateful_fn(*args, **kwds)
tensorflow_core\python\eager\function.py:1822 __call__
graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
tensorflow_core\python\eager\function.py:2150 _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
tensorflow_core\python\eager\function.py:2041 _create_graph_function
capture_by_value=self._capture_by_value),
tensorflow_core\python\framework\func_graph.py:915 func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
tensorflow_core\python\eager\def_function.py:358 wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
tensorflow_core\python\saved_model\function_deserialization.py:262 restored_function_body
"\n\n".join(signature_descriptions)))
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (3 total):
* [<tf.Tensor 'inputs:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_1:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_2:0' shape=(None, 160) dtype=int32>]
* True
* None
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* False
* None
Keyword arguments: {}
Option 2:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* False
* None
Keyword arguments: {}
Option 3:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* True
* None
Keyword arguments: {}
Option 4:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* True
* None
Keyword arguments: {}
I expected the model to compile successfully. Instead, I got this error.
Solution
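The shapes are probably not the problem: a TensorSpec with shape (None, None) accepts any batch size and any sequence length, so (None, 160) matches it. What differs is the structure of the first argument. The call passes a list of three tensors, while all four options in the error message expect a single dict keyed by input_word_ids, input_mask and input_type_ids. Version 4 of this TF Hub module also returns a dict rather than a (pooled_output, sequence_output) tuple. Below is a minimal sketch of build_model adapted to that interface. The helper name pack_bert_inputs is made up for illustration, the code assumes the returned dict exposes a "sequence_output" key as the module's TF Hub documentation describes, and I have not checked which minimum TensorFlow version this module version needs, so treat it as a starting point rather than a verified fix.

```python
def pack_bert_inputs(word_ids, mask, type_ids):
    """Bundle the three inputs under the key names listed in the error message.

    Hypothetical helper, not part of tensorflow_hub.
    """
    return {
        "input_word_ids": word_ids,
        "input_mask": mask,
        "input_type_ids": type_ids,
    }


def build_model(bert_layer, max_len=512):
    # Imported inside the function so pack_bert_inputs stays usable
    # even where TensorFlow is not installed.
    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Input
    from tensorflow.keras.models import Model
    from tensorflow.keras.optimizers import Adam

    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

    # Pass one dict instead of a list of three tensors.
    bert_inputs = pack_bert_inputs(input_word_ids, input_mask, segment_ids)
    bert_outputs = bert_layer(bert_inputs)

    # Assumption: the layer returns a dict with a "sequence_output" entry
    # of shape (batch, max_len, hidden_size).
    sequence_output = bert_outputs["sequence_output"]

    clf_output = sequence_output[:, 0, :]  # hidden state at the [CLS] position
    out = Dense(1, activation="sigmoid")(clf_output)

    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(
        Adam(learning_rate=1e-5),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

The call site stays the same: model = build_model(bert_layer, max_len=160). Alternatively, as far as I know, earlier versions of the same module (for example the URL ending in /2 instead of /4) accepted the three-element list and returned the tuple, so pinning the version may let the original code run as written; that is worth checking against the module page.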