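python - Extracting the sentences that contain entities recognized by TextRazor
Problem description
I am using TextRazor and want to find the sentences from which its keywords were recognized, but I have not been able to. The documentation says little about this, and I could not find anything elsewhere on the internet.
How can I extract the sentences associated with the recognized keywords?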
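import textrazor
key = "key"
textrazor.api_key = key
client = textrazor.TextRazor(extractors=["word", "entities", "topics", "sentence", "words"])
# The call that produces `response` was missing from the post; `text` holds the document to analyze.
response = client.analyze(text)
for entity, sentence in zip(response.entities(), response.sentences()):
    print(sentence.words)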
The print statement does produce the words of the sentence, but as TextRazor word objects rather than plain Python strings.
The output looks like this:
[TextRazor Word:"b'If'" at position 196, TextRazor Word:"b'aggression'" at position 197, TextRazor Word:"b'helps'" at position 198, TextRazor Word:"b'in'" at position 199, TextRazor Word:"b'the'" at position 200, TextRazor Word:"b'survival'" at position 201, TextRazor Word:"b'of'" at position 202, TextRazor Word:"b'our'" at position 203, TextRazor Word:"b'genes'" at position 204, TextRazor Word:"b','" at position 205, TextRazor Word:"b'then'" at position 206, TextRazor Word:"b'the'" at position 207, TextRazor Word:"b'process'" at position 208, TextRazor Word:"b'of'" at position 209, TextRazor Word:"b'natural'" at position 210, TextRazor Word:"b'selection'" at position 211, TextRazor Word:"b'may'" at position 212, TextRazor Word:"b'well'" at position 213, TextRazor Word:"b'have'" at position 214, TextRazor Word:"b'caused'" at position 215, TextRazor Word:"b'humans'" at position 216, TextRazor Word:"b','" at position 217, TextRazor Word:"b'as'" at position 218, TextRazor Word:"b'it'" at position 219, TextRazor Word:"b'would'" at position 220, TextRazor Word:"b'any'" at position 221, TextRazor Word:"b'other'" at position 222, TextRazor Word:"b'animal'" at position 223, TextRazor Word:"b','" at position 224, TextRazor Word:"b'to'" at position 225, TextRazor Word:"b'be'" at position 226, TextRazor Word:"b'aggressive'" at position 227, TextRazor Word:"b'-LRB-'" at position 228, TextRazor Word:"b'Buss'" at position 229, TextRazor Word:"b'&'" at position 230, TextRazor Word:"b'Duntley'" at position 231, TextRazor Word:"b','" at position 232, TextRazor Word:"b'2006'" at position 233, TextRazor Word:"b'-RRB-'" at position 234, TextRazor Word:"b'.'" at position 235]
Solution
This can be done with the following code.
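for entity in response.entities():
    entity_not_found = True
    for sentence in response.sentences():
        # matched_words[0] is the first word of the entity mention
        if entity.matched_words[0] in sentence.words:
            # Form a sentence string from the word tokens
            sent = ' '.join(w.token for w in sentence.words)
            print(entity.id, '->', sent)
            entity_not_found = False
            break
    if entity_not_found:
        print(entity.id, '-> no sentence found')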
Credit goes to the author of TextRazor for the response on this.
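The nested loop above rescans every sentence for every entity. A possible variation, sketched here without calling the API, indexes each word position to its sentence once and then looks entities up by the position of their first matched word. The `Word`, `Sentence`, and `Entity` classes below are hypothetical stand-ins, not TextRazor's own classes; they assume the real objects expose `token`, `position`, `words`, `matched_words`, and `id`, as the output and code in this thread suggest. The helper names `sentence_text`, `sentences_by_entity`, and `make_sentence` are also made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical stand-ins for TextRazor's word, sentence, and entity objects,
# reduced to the attributes used in this thread.
@dataclass
class Word:
    token: str
    position: int


@dataclass
class Sentence:
    words: List[Word]


@dataclass
class Entity:
    id: str
    matched_words: List[Word]


# Penn Treebank style bracket tokens, as seen in the printed output above.
BRACKETS = {"-LRB-": "(", "-RRB-": ")"}


def sentence_text(sentence: Sentence) -> str:
    """Join a sentence's tokens into one string, restoring brackets."""
    return " ".join(BRACKETS.get(w.token, w.token) for w in sentence.words)


def sentences_by_entity(entities: List[Entity],
                        sentences: List[Sentence]) -> Dict[str, List[str]]:
    """Map each entity id to the texts of the sentences that mention it."""
    # Index every word position to its sentence once.
    by_position = {w.position: s for s in sentences for w in s.words}
    result: Dict[str, List[str]] = {}
    for entity in entities:
        if not entity.matched_words:
            continue  # nothing to look up for this entity
        sentence = by_position.get(entity.matched_words[0].position)
        if sentence is not None:
            text = sentence_text(sentence)
            found = result.setdefault(entity.id, [])
            if text not in found:  # skip repeated mentions in one sentence
                found.append(text)
    return result


def make_sentence(tokens: List[str], start: int) -> Sentence:
    """Build a stand-in sentence whose words are numbered from `start`."""
    return Sentence([Word(t, start + i) for i, t in enumerate(tokens)])


first = make_sentence(["Genes", "matter", "."], 0)
second = make_sentence(["-LRB-", "Buss", "2006", "-RRB-"], 3)
entities = [Entity("Gene", [first.words[0]]),
            Entity("David Buss", [second.words[1]])]
print(sentences_by_entity(entities, [first, second]))
```

With a real response, the same two functions would presumably be called as `sentences_by_entity(response.entities(), response.sentences())`, provided the assumed attributes hold.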