tensorflow - XLNet: Custom training for named entity recognition - Huggingface Transformers - Tensorflow
Problem description
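Question: How can I successfully train XLNet on a "NER-like" task?
I am trying to train Huggingface's TFXLNetForTokenClassification model on a task similar to named entity recognition (NER). Essentially, I am trying to tag which parts of a sentence are actions.
My input sentence is: text = "Please click on this, and then press the save button"
After running tokenizer(text, return_tensors="tf"), I get: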
inputs = {
    'input_ids': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
        array([[1431, 1962, 31, 52, 19, 21, 137, 1320, 18, 1537, 3167, 4, 3]], dtype=int32)>,
    'token_type_ids': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
        array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]], dtype=int32)>,
    'attention_mask': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
        array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int32)>
}
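The labels are: labels = tf.reshape(tf.constant([1,2,2,2,0,0,1,2,2,2,2,0,0]), (1, tf.size(input_ids))), where 1 marks the first token of an action, 2 marks a later token of the same action, and 0 marks a token that is not part of any action.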
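To make the tagging scheme concrete, here is a minimal sketch that recovers the action spans from the label sequence above. The helper name action_spans is hypothetical (it is not part of the question or of any library), and the sketch assumes that a 1 always opens a new action and a 2 only continues the current one.

```python
def action_spans(labels):
    """Return (start, end) token index pairs, end exclusive, one per tagged action."""
    spans = []
    start = None  # index where the current action began, or None if outside an action
    for i, tag in enumerate(labels):
        if tag == 1:
            # A new action begins; close the previous one if it is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == 0:
            # Outside any action; close the open action, if any.
            if start is not None:
                spans.append((start, i))
                start = None
        # tag == 2 continues the current action, so nothing changes.
    if start is not None:
        spans.append((start, len(labels)))
    return spans


labels = [1, 2, 2, 2, 0, 0, 1, 2, 2, 2, 2, 0, 0]
print(action_spans(labels))  # [(0, 4), (6, 11)]
```

With these labels, tokens 0 to 3 form the first action and tokens 6 to 10 form the second; the remaining positions are tagged 0.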
I create a tf.data.Dataset to pass to my model:
train_dataset = tf.data.Dataset.from_tensor_slices((
    dict(inputs),
    labels))
Then I start training:
action_model.compile(optimizer = 'adam')
action_model.fit(train_dataset, epochs=1)
This raises an error:
...
tf.transpose(inputs["input_ids"], perm=(1, 0))
...
ValueError: Dimension must be 1 but is 2 for '{{node action_model_6/transformer/transpose}} = Transpose[T=DT_INT32, Tperm=DT_INT32](data_1, action_model_6/transformer/transpose/perm)' with input shapes: [13], [2].
However, when I run tf.transpose(inputs["input_ids"], perm=(1, 0)) on my sentence by itself, it works perfectly.
Also, when I call the model directly:
x = dict(inputs)
action_model(x)
it returns the expected TFXLNetForTokenClassificationOutput object.
Appendix: full source code
from transformers import XLNetTokenizer, TFXLNetForTokenClassification
import tensorflow as tf


class ActionModel(TFXLNetForTokenClassification):
    def __init__(self, *args, log_dir=None, cache_dir=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.loss_tracker = tf.keras.metrics.Mean(name='loss')

    @tf.function
    def train_step(self, data):
        x = data[0]
        y_true = data[1]
        with tf.GradientTape() as tape:
            outputs = self(x, training=True)  # <------ It fails here
            logits = outputs['logits']
            loss = tf.reduce_mean(outputs['loss'])
        grads = tape.gradient(loss, self.trainable_variables)
        y_true = tf.reshape(y_true, [-1, 1])
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.loss_tracker.update_state(loss)
        self.compiled_metrics.update_state(y_true, logits)
        metrics = {m.name: m.result() for m in self.metrics}
        lr = self.optimizer._decayed_lr(tf.float32)
        metrics.update({'lr': lr})
        return metrics


tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
action_model = ActionModel.from_pretrained('xlnet-base-cased')

text = "Please click on this, and then press the save button"
inputs = tokenizer(text, return_tensors="tf")
input_ids = inputs["input_ids"]
labels = tf.reshape(tf.constant([1,2,2,2,0,0,1,2,2,2,2,0,0]), (-1, tf.size(input_ids)))  # Batch size 1

train_dataset = tf.data.Dataset.from_tensor_slices((
    dict(inputs),
    labels))

action_model.compile(optimizer='adam')
action_model.fit(train_dataset, epochs=1)
Full error message
ValueError: in user code:

    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:855 train_function *
        return step_function(self, iterator)
    <ipython-input-47-6adc35b145c8>:16 train_step *
        outputs = self(x, training=True)
    /usr/local/lib/python3.7/dist-packages/transformers/models/xlnet/modeling_tf_xlnet.py:1757 call *
        transformer_outputs = self.transformer(
    /usr/local/lib/python3.7/dist-packages/transformers/models/xlnet/modeling_tf_xlnet.py:630 call *
        inputs["input_ids"] = tf.transpose(inputs["input_ids"], perm=(1, 0))
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper **
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:2227 transpose_v2
        return transpose(a=a, perm=perm, name=name, conjugate=conjugate)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:2308 transpose
        return transpose_fn(a, perm, name=name)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py:11653 transpose
        "Transpose", x=x, perm=perm, name=name)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py:601 _create_op_internal
        compute_device)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:3565 _create_op_internal
        op_def=op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:2042 __init__
        control_input_ops, op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:1883 _create_c_op
        raise ValueError(str(e))

    ValueError: Dimension must be 1 but is 2 for '{{node action_model_6/transformer/transpose}} = Transpose[T=DT_INT32, Tperm=DT_INT32](data_1, action_model_6/transformer/transpose/perm)' with input shapes: [13], [2].
Solution
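The error reports input shapes [13], [2]: the transpose inside the model receives a rank-1 tensor, whereas calling the model directly passes the (1, 13) tensor from the tokenizer. A likely cause is that tf.data.Dataset.from_tensor_slices slices along the first axis, so every dataset element has shape (13,), and fit is given a dataset that was never batched. The sketch below reproduces only the shape behaviour, using the token ids from the question as stand-in data; it does not load XLNet, and calling .batch(1) is a suggested fix rather than something confirmed by the original post.

```python
import tensorflow as tf

# Stand-in for the tokenizer output in the question: 1 sentence, 13 tokens.
input_ids = tf.constant(
    [[1431, 1962, 31, 52, 19, 21, 137, 1320, 18, 1537, 3167, 4, 3]], dtype=tf.int32
)
labels = tf.constant([[1, 2, 2, 2, 0, 0, 1, 2, 2, 2, 2, 0, 0]], dtype=tf.int32)

# from_tensor_slices removes the leading axis: each element is a single sentence.
unbatched = tf.data.Dataset.from_tensor_slices(({"input_ids": input_ids}, labels))
unbatched_shape = unbatched.element_spec[0]["input_ids"].shape  # rank 1: [13]

# Batching restores the leading batch axis that the model expects.
batched = unbatched.batch(1)
batched_shape = batched.element_spec[0]["input_ids"].shape  # rank 2: [None, 13]

# With a rank-2 input, the transpose from the traceback succeeds.
for features, _ in batched:
    transposed = tf.transpose(features["input_ids"], perm=(1, 0))

print(unbatched_shape.rank, batched_shape.rank, transposed.shape)
```

Applied to the question's code, this would mean passing train_dataset.batch(1) to action_model.fit. That should only address the transpose error. The custom train_step reads outputs['loss'], and the model computes a loss only when labels are passed to it, so further changes may still be needed.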