Fine-tune BERT with my own entities/labels

Problem description

I want to fine-tune a BERT model with my own labels, such as [COLOR, MATERIAL], instead of the usual "NAME" and "ORG".

I am following this Colab notebook: https://colab.research.google.com/drive/14rYdqGAXJhwVzslXT4XIwNFBwkmBWdVV

I prepared train.txt, eval.txt, and test.txt like this:

-DOCSTART- -X- -X- O

blue B-COLOR
motorcicle B-CATEGORY
steel B-MATERIAL
etc.

But when I run this command

!python run_ner.py --data_dir=data/ --bert_model=bert-base-multilingual-cased --task_name=ner --output_dir=out_ner --max_seq_length=128 --do_train --num_train_epochs 5 --do_eval --warmup_proportion=0.1

I get this error

06/08/2020 13:30:27 - INFO - pytorch_transformers.modeling_utils -   loading weights file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-multilingual-cased-pytorch_model.bin from cache at /root/.cache/torch/pytorch_transformers/5b5b80054cd2c95a946a8e0ce0b93f56326dff9fbda6a6c3e02de3c91c918342.7131dcb754361639a7d5526985f880879c9bfd144b65a0bf50590bddb7de9059
06/08/2020 13:30:33 - INFO - pytorch_transformers.modeling_utils -   Weights of Ner not initialized from pretrained model: ['classifier.weight', 'classifier.bias']
06/08/2020 13:30:33 - INFO - pytorch_transformers.modeling_utils -   Weights from pretrained model not used in Ner: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']


File "run_ner.py", line 594, in <module>
    main()
File "run_ner.py", line 464, in main
    train_examples, label_list, args.max_seq_length, tokenizer)
File "run_ner.py", line 210, in convert_examples_to_features
    label_ids.append(label_map[labels[i]])
KeyError: 'B-COLOR'

Did I create the train.txt file incorrectly?

Tags: neural-network, transformer, bert-language-model

Solution


Add your labels to the get_labels() method in run_ner.py and it will work. The script builds its label map from the list that get_labels() returns, so every tag that appears in your train/eval/test files (B-COLOR, B-CATEGORY, B-MATERIAL, ...) must be in that list; any tag missing from it raises exactly the KeyError you are seeing.
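A minimal sketch of the edit, assuming a typical get_labels() in this kind of run_ner.py script (the default list in your copy may differ; the custom tag names below are taken from the question's data, and the I- variants are hypothetical):

```python
def get_labels():
    # Custom tag set replacing the default CoNLL-2003 NER tags.
    # "[CLS]" and "[SEP]" are kept because scripts of this style
    # also assign labels to BERT's special tokens.
    return ["O",
            "B-COLOR", "I-COLOR",
            "B-CATEGORY", "I-CATEGORY",
            "B-MATERIAL", "I-MATERIAL",
            "[CLS]", "[SEP]"]

# The script then builds a label-to-id map from this list
# (here indexing from 1, as such scripts commonly do), so a lookup
# of 'B-COLOR' now succeeds instead of raising KeyError:
label_map = {label: i for i, label in enumerate(get_labels(), 1)}
```

With the custom tags present in the returned list, label_map["B-COLOR"] resolves to a valid id and convert_examples_to_features no longer fails.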

