python - Reproducible results in NER

Problem description

I am currently using the simpletransformers library to perform an NER task.

from simpletransformers.ner import NERModel

DEFAULT_MODEL_PARAMS = {
    "save_eval_checkpoints": False,
    "save_steps": -1,
    "overwrite_output_dir": True,
    "save_model_every_epoch": False,
    "reprocess_input_data": True,
    "train_batch_size": 8,
    "num_train_epochs": 10,
    "max_seq_length": 50,
    "gradient_accumulation_steps": 1,
    "use_multiprocessing": True,
    "manual_seed": 42,  # fixed seed, intended to make runs repeatable
    "dataloader_num_workers": 0,
}

# unique_labels, use_cuda and cuda_device are defined elsewhere in my script
model = NERModel(
    "camembert",
    "camembert-base",
    labels=unique_labels,
    use_cuda=use_cuda,
    cuda_device=cuda_device,
    args=DEFAULT_MODEL_PARAMS,
)

But even though I set the manual seed in the arguments, I get different results from run to run. I also noticed that I receive this warning:

Some weights of CamembertForTokenClassification were not initialized from the model checkpoint at camembert-base and are newly initialized: ['classifier.weight', 'classifier.bias']

So how can I get reproducible results?
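For reference, this is a minimal sketch of what "fixing the seed" usually has to cover in a PyTorch-based setup. The helper name `seed_everything` is hypothetical and not part of simpletransformers; it assumes that `random`, NumPy and PyTorch are the only sources of randomness and that slower but deterministic cuDNN kernels are acceptable.

```python
import os
import random


def seed_everything(seed: int = 42) -> None:
    """Seed every random number generator that is available (hypothetical helper)."""
    # Only affects subprocesses started later; hash randomization of the
    # current interpreter is already fixed at startup.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
    except ImportError:
        return  # PyTorch not installed; nothing more to seed

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    # Assumption: trading speed for determinism in cuDNN is acceptable.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```

If called as `seed_everything(42)` before the model is constructed, the randomly initialized `classifier.weight` and `classifier.bias` named in the warning should start from the same values on each run. Some GPU operations and multiprocessing can still introduce variation, so this may narrow the differences rather than remove them entirely.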

Tags: python, nlp, named-entity-recognition
