XLNet: Custom training for named entity recognition - Huggingface Transformers - Tensorflow

Problem description

Question: How do I successfully train XLNet on an "NER-like" task?

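I am trying to train Huggingface's TFXLNetForTokenClassification model on a task similar to named entity recognition (NER). Essentially, I want to tag which tokens in a sentence describe an action.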

My input sentence is: text = "Please click on this, and then press the save button"

After running tokenizer(text, return_tensors="tf"):

inputs = {'input_ids': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
array([[1431, 1962,   31,   52,   19,   21,  137, 1320,   18, 1537, 3167,
           4,    3]], dtype=int32)>,
 'token_type_ids': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2]], dtype=int32)>,
 'attention_mask': <tf.Tensor: shape=(1, 13), dtype=int32, numpy=
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int32)>}

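The labels are: label = tf.reshape(tf.constant([1,2,2,2,0,0,1,2,2,2,2,0,0]), (1, tf.size(input_ids))), where 1 marks the first token of an action, 2 marks a token that continues the same action, and 0 marks a token that is not part of any action.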
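
To make the tagging scheme concrete, here is a minimal sketch that turns the label sequence back into action spans. The helper name extract_action_spans is hypothetical and not part of the original code; it only assumes the 0/1/2 convention described above.

```python
from typing import List, Tuple


def extract_action_spans(labels: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs, end exclusive, one per action.

    Hypothetical helper: 1 starts an action, 2 continues it, 0 is outside.
    """
    spans = []
    start = None
    for i, tag in enumerate(labels):
        if tag == 1:
            # A new action begins; close any action that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == 0:
            # Outside any action; close the open action, if there is one.
            if start is not None:
                spans.append((start, i))
                start = None
        # tag == 2 simply extends the current action.
    if start is not None:
        spans.append((start, len(labels)))
    return spans


labels = [1, 2, 2, 2, 0, 0, 1, 2, 2, 2, 2, 0, 0]
spans = extract_action_spans(labels)
print(spans)  # [(0, 4), (6, 11)]
```

The two spans correspond to the two actions in the sentence, counted in token positions rather than words.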

I create a tf.data.Dataset to pass to my model:

train_dataset = tf.data.Dataset.from_tensor_slices((
    dict(inputs),
    labels))

Then I start training:

action_model.compile(optimizer = 'adam')
action_model.fit(train_dataset, epochs=1)

It gives me an error:

...
tf.transpose(inputs["input_ids"], perm=(1, 0))
...
ValueError: Dimension must be 1 but is 2 for '{{node action_model_6/transformer/transpose}} = Transpose[T=DT_INT32, Tperm=DT_INT32](data_1, action_model_6/transformer/transpose/perm)' with input shapes: [13], [2].

However, when I run tf.transpose(inputs["input_ids"], perm=(1, 0)) on my sentence by itself, it works perfectly.

Also, when I call the model directly:

x = dict(inputs)
action_model(x)

it returns the expected TFXLNetForTokenClassificationOutput object.

Appendix: full source code

from transformers import XLNetTokenizer, TFXLNetForTokenClassification
import tensorflow as tf

class ActionModel(TFXLNetForTokenClassification):
    def __init__(self, *args, log_dir=None, cache_dir= None, **kwargs):
        super().__init__(*args, **kwargs)
        self.loss_tracker= tf.keras.metrics.Mean(name='loss')
        
    @tf.function
    def train_step(self, data):
        x = data[0]
        y_true = data[1]
        with tf.GradientTape() as tape:
            outputs = self(x, training=True) # <------ It fails here
            logits = outputs['logits']
            loss = tf.reduce_mean(outputs['loss'])

            grads = tape.gradient(loss, self.trainable_variables)

        y_true = tf.reshape(y_true, [-1, 1])

        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.loss_tracker.update_state(loss)       
        self.compiled_metrics.update_state(y_true, logits)
        metrics = {m.name: m.result() for m in self.metrics}
        lr = self.optimizer._decayed_lr(tf.float32)
        metrics.update({'lr': lr})
        
        return metrics

tokenizer = XLNetTokenizer.from_pretrained('xlnet-base-cased')
action_model = ActionModel.from_pretrained('xlnet-base-cased')
text = "Please click on this, and then press the save button"
inputs = tokenizer(text, return_tensors="tf")
input_ids = inputs["input_ids"]
labels = tf.reshape(tf.constant([1,2,2,2,0,0,1,2,2,2,2,0,0]), (-1, tf.size(input_ids))) # Batch size 1

train_dataset = tf.data.Dataset.from_tensor_slices((
    dict(inputs),
    labels))

action_model.compile(optimizer = 'adam')
action_model.fit(train_dataset, epochs=1)

Full error message

ValueError: in user code:

    /usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py:855 train_function  *
        return step_function(self, iterator)
    <ipython-input-47-6adc35b145c8>:16 train_step  *
        outputs = self(x, training=True)
    /usr/local/lib/python3.7/dist-packages/transformers/models/xlnet/modeling_tf_xlnet.py:1757 call  *
        transformer_outputs = self.transformer(
    /usr/local/lib/python3.7/dist-packages/transformers/models/xlnet/modeling_tf_xlnet.py:630 call  *
        inputs["input_ids"] = tf.transpose(inputs["input_ids"], perm=(1, 0))
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:2227 transpose_v2
        return transpose(a=a, perm=perm, name=name, conjugate=conjugate)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py:206 wrapper
        return target(*args, **kwargs)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py:2308 transpose
        return transpose_fn(a, perm, name=name)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py:11653 transpose
        "Transpose", x=x, perm=perm, name=name)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py:601 _create_op_internal
        compute_device)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:3565 _create_op_internal
        op_def=op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:2042 __init__
        control_input_ops, op_def)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py:1883 _create_c_op
        raise ValueError(str(e))

    ValueError: Dimension must be 1 but is 2 for '{{node action_model_6/transformer/transpose}} = Transpose[T=DT_INT32, Tperm=DT_INT32](data_1, action_model_6/transformer/transpose/perm)' with input shapes: [13], [2].

Tags: tensorflow, keras, huggingface-transformers

Solution
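
A likely cause, judging by the input shapes [13], [2] in the error, is that the dataset is never batched. tf.data.Dataset.from_tensor_slices slices its input along the first axis, so each element reaching train_step has shape (13,) rather than (1, 13), and the model's transpose with perm=(1, 0) fails on a 1-D tensor. Calling the model directly works because the tokenizer output still carries its batch dimension. The sketch below uses stand-in tensors with the shapes from the question (no model or tokenizer is loaded) to show the element shapes before and after batching.

```python
import tensorflow as tf

# Stand-ins for the tokenizer output in the question: batch size 1, 13 tokens.
inputs = {
    "input_ids": tf.constant(
        [[1431, 1962, 31, 52, 19, 21, 137, 1320, 18, 1537, 3167, 4, 3]],
        dtype=tf.int32,
    ),
    "token_type_ids": tf.constant([[0] * 12 + [2]], dtype=tf.int32),
    "attention_mask": tf.ones((1, 13), dtype=tf.int32),
}
labels = tf.constant([[1, 2, 2, 2, 0, 0, 1, 2, 2, 2, 2, 0, 0]], dtype=tf.int32)

# from_tensor_slices removes the leading axis: each element is one sentence.
unbatched = tf.data.Dataset.from_tensor_slices((inputs, labels))
unbatched_shape = unbatched.element_spec[0]["input_ids"].shape.as_list()

# Batching restores the (batch, sequence) layout that the model expects.
batched = unbatched.batch(1, drop_remainder=True)
batched_shape = batched.element_spec[0]["input_ids"].shape.as_list()

print(unbatched_shape)  # [13]
print(batched_shape)    # [1, 13]
```

Under that assumption, passing train_dataset.batch(1) (or a larger batch size once there are more sentences) to fit should get past the transpose error. Two further points may need attention afterwards and are worth checking against your installed transformers version: outputs['loss'] is only populated when labels are passed to the model, so train_step would probably need to call self(x, labels=y_true, training=True); and since the labels use three classes (0, 1, 2), the model most likely has to be loaded with num_labels=3, because the default configuration uses two labels.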

