How to train your own model and test it with spaCy

Problem description

I am using the code below to train an existing spaCy NER model. However, I am not getting the correct results when I test it:

What am I missing?

import spacy
import random
from spacy.gold import GoldParse
from spacy.language import EntityRecognizer

train_data = [
    ('Who is Rocky babu?', [(7, 16, 'PERSON')]),
    ('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')])
]

nlp = spacy.load('en', entity=False, parser=False)
ner = EntityRecognizer(nlp.vocab, entity_types=['PERSON', 'LOC'])

for itn in range(5):
    random.shuffle(train_data)
    for raw_text, entity_offsets in train_data:
        doc = nlp.make_doc(raw_text)
        gold = GoldParse(doc, entities=entity_offsets)

        nlp.tagger(doc)
        nlp.entity.update([doc], [gold])

Now, when I try to test the above model using the code below, I don't get the expected output.

text = ['Who is Rocky babu?']

for a in text:
    doc = nlp(a)
    print("Entities", [(ent.text, ent.label_) for ent in doc.ents])

My output is as follows:

Entities []

whereas my expected output is as follows:

Entities [('Rocky babu', 'PERSON')]

Can someone please tell me what I'm missing?

Tags: nltk, spacy, named-entity-recognition

Solution

Could you retry with the following?

nlp = spacy.load('en_core_web_sm', entity=False, parser=False)

If that raises an error because the model is not installed, first run

python -m spacy download en_core_web_sm

on the command line.

Of course, keep in mind that to train the model properly you will need many more examples before it can generalize!
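
Independently of how much data you have, it may also be worth checking that every character offset in train_data lines up with whole words, since an entity span that cuts through a word generally cannot be mapped onto tokens. The sketch below is only a plain-Python approximation: find_misaligned_spans is a hypothetical helper (not part of spaCy), and it treats a run of word characters as a token, which is not exactly how spaCy's tokenizer behaves.

```python
import re

# Training examples copied from the question: (text, [(start, end, label), ...]).
train_data = [
    ('Who is Rocky babu?', [(7, 16, 'PERSON')]),
    ('I like London and Berlin.', [(7, 13, 'LOC'), (18, 24, 'LOC')]),
]

# Two adjacent word characters mean the boundary between them is inside a word.
INSIDE_WORD = re.compile(r'\w\w')


def find_misaligned_spans(examples):
    """Return (text, span, covered_text) for spans that cut through a word.

    Hypothetical helper for illustration; it approximates tokens as runs of
    word characters rather than calling spaCy's tokenizer.
    """
    problems = []
    for text, spans in examples:
        for start, end, label in spans:
            # Does the span begin in the middle of a word?
            bad_start = 0 < start < len(text) and INSIDE_WORD.match(text[start - 1:start + 1])
            # Does the span stop in the middle of a word?
            bad_end = 0 < end < len(text) and INSIDE_WORD.match(text[end - 1:end + 1])
            if bad_start or bad_end:
                problems.append((text, (start, end, label), text[start:end]))
    return problems


for text, span, covered in find_misaligned_spans(train_data):
    print(repr(text), span, 'covers', repr(covered))
```

Run on the question's data, this flags the first example: the slice [7:16] of 'Who is Rocky babu?' is 'Rocky bab', whereas [7:17] gives 'Rocky babu'. The two LOC spans in the second example pass the check.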

