Extracting the sentences related to entities recognized by TextRazor

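Problem description

I am using TextRazor and want to find the sentences from which each keyword was recognized, but I have not been able to. The documentation says very little about this, and I could not find anything elsewhere on the internet.

How can I extract the sentences associated with the recognized keywords?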

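import textrazor

key = "key"
textrazor.api_key = key

client = textrazor.TextRazor(extractors=["word", "entities", "topics", "sentence", "words"])
# The original post does not show this call; `text` stands for the document being analyzed
response = client.analyze(text)

for entity, sentence in zip(response.entities(), response.sentences()):
    print(sentence.words)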

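The print statement does produce the words of the sentence, but as TextRazor Word objects rather than plain Python strings, so they cannot be used directly as text.

The output is as follows: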

[TextRazor Word:"b'If'" at position 196, TextRazor Word:"b'aggression'" at position 197, TextRazor Word:"b'helps'" at position 198, TextRazor Word:"b'in'" at position 199, TextRazor Word:"b'the'" at position 200, TextRazor Word:"b'survival'" at position 201, TextRazor Word:"b'of'" at position 202, TextRazor Word:"b'our'" at position 203, TextRazor Word:"b'genes'" at position 204, TextRazor Word:"b','" at position 205, TextRazor Word:"b'then'" at position 206, TextRazor Word:"b'the'" at position 207, TextRazor Word:"b'process'" at position 208, TextRazor Word:"b'of'" at position 209, TextRazor Word:"b'natural'" at position 210, TextRazor Word:"b'selection'" at position 211, TextRazor Word:"b'may'" at position 212, TextRazor Word:"b'well'" at position 213, TextRazor Word:"b'have'" at position 214, TextRazor Word:"b'caused'" at position 215, TextRazor Word:"b'humans'" at position 216, TextRazor Word:"b','" at position 217, TextRazor Word:"b'as'" at position 218, TextRazor Word:"b'it'" at position 219, TextRazor Word:"b'would'" at position 220, TextRazor Word:"b'any'" at position 221, TextRazor Word:"b'other'" at position 222, TextRazor Word:"b'animal'" at position 223, TextRazor Word:"b','" at position 224, TextRazor Word:"b'to'" at position 225, TextRazor Word:"b'be'" at position 226, TextRazor Word:"b'aggressive'" at position 227, TextRazor Word:"b'-LRB-'" at position 228, TextRazor Word:"b'Buss'" at position 229, TextRazor Word:"b'&'" at position 230, TextRazor Word:"b'Duntley'" at position 231, TextRazor Word:"b','" at position 232, TextRazor Word:"b'2006'" at position 233, TextRazor Word:"b'-RRB-'" at position 234, TextRazor Word:"b'.'" at position 235]

Tags: python, nlp

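Solution

This can be done with the following code, which checks, for each entity, which sentence contains the entity's first matched word.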

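for entity in response.entities():
    entity_not_found = True
    for sentence in response.sentences():
        # The entity belongs to this sentence if its first matched word is one of the sentence's words
        if entity.matched_words[0] in sentence.words:
            # Form the sentence from its word tokens
            sent = ' '.join(w.token for w in sentence.words)
            # The original snippet ends after building `sent`; printing it here is one way to use it
            print(entity.id, '->', sent)
            entity_not_found = False
            break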

Credit goes to the author of TextRazor for their response on this.
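As a follow-up, here is a minimal sketch of the same lookup that runs offline and also tidies the joined text. It is an illustration, not part of the TextRazor author's reply. The `Word`, `Sentence` and `Entity` namedtuples are hypothetical stand-ins that model only the attributes used in this post (`position`, `token`, `words`, `id`, `matched_words`), and the helper names `sentence_text` and `sentences_by_entity` are made up for this example. It assumes word positions are unique across the document, as the output in the question suggests, and it converts Penn Treebank bracket escapes such as `-LRB-` and `-RRB-` back into brackets.

```python
from collections import namedtuple

# Hypothetical stand-ins for the objects TextRazor returns, so the sketch
# runs without an API key. Only the attributes used below are modelled.
Word = namedtuple("Word", "position token")
Sentence = namedtuple("Sentence", "words")
Entity = namedtuple("Entity", "id matched_words")

# Penn Treebank escapes, like the -LRB-/-RRB- tokens in the question's output.
PTB_BRACKETS = {"-LRB-": "(", "-RRB-": ")", "-LSB-": "[",
                "-RSB-": "]", "-LCB-": "{", "-RCB-": "}"}
NO_SPACE_BEFORE = {",", ".", ";", ":", "!", "?", ")", "]", "}"}
NO_SPACE_AFTER = {"(", "[", "{"}


def sentence_text(sentence):
    """Join a sentence's tokens into a readable string."""
    parts = []
    for word in sentence.words:
        token = PTB_BRACKETS.get(word.token, word.token)
        # Add a space unless punctuation or brackets should hug their neighbour
        if parts and token not in NO_SPACE_BEFORE and parts[-1] not in NO_SPACE_AFTER:
            parts.append(" ")
        parts.append(token)
    return "".join(parts)


def sentences_by_entity(entities, sentences):
    """Map each entity id to the texts of the sentences it was found in."""
    # Index every word position once, so each entity needs a single lookup
    position_to_sentence = {}
    for index, sentence in enumerate(sentences):
        for word in sentence.words:
            position_to_sentence[word.position] = index

    result = {}
    for entity in entities:
        if not entity.matched_words:
            continue  # nothing to locate
        index = position_to_sentence.get(entity.matched_words[0].position)
        if index is None:
            continue  # matched word is not part of any sentence given
        text = sentence_text(sentences[index])
        texts = result.setdefault(entity.id, [])
        if text not in texts:  # the same entity may be mentioned repeatedly
            texts.append(text)
    return result


# Made-up sample data for illustration only
s1 = Sentence([Word(0, "Darwin"), Word(1, "studied"), Word(2, "finches"), Word(3, ".")])
s2 = Sentence([Word(4, "Aggression"), Word(5, "-LRB-"), Word(6, "Buss"),
               Word(7, ","), Word(8, "2006"), Word(9, "-RRB-"), Word(10, ".")])
entities = [Entity("Charles Darwin", [s1.words[0]]),
            Entity("Aggression", [s2.words[0]])]

mapping = sentences_by_entity(entities, [s1, s2])
for entity_id, texts in mapping.items():
    print(entity_id, "->", texts)
```

With a real response, the call would presumably be `sentences_by_entity(response.entities(), response.sentences())`. Building the position index once avoids rescanning every sentence for every entity, which matters mainly for long documents.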

