python - How to get entities with direction in relation extraction?
Problem description
I have been working on relation extraction for a week. What I need is the direction of the relation between two entities. For example, given "Company_x got bought by Company_y", the model should predict Company_y -> bought -> Company_x. Are there any models you think would be helpful for this?
Solution
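Passive voice is usually a good indicator of the direction of a relation.
You can extract patterns that start with a verb from the context between the two entities, and then detect whether passive voice is present.
Some simple proof-of-concept code (using RegexpParser from NLTK might actually be simpler):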
# Requires NLTK's tokenizer, POS tagger, and WordNet data (installed via nltk.download).
from nltk import pos_tag
from nltk import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer

lmtzr = WordNetLemmatizer()
aux_verbs = ['be']

def detect_passive_voice(pattern):
    # pattern is a list of (word, POS tag) tuples, as returned by pos_tag
    passive_voice = False

    if len(pattern) >= 3:
        if pattern[0][1].startswith('V'):
            verb = lmtzr.lemmatize(pattern[0][0], 'v')
            if verb in aux_verbs:
                # auxiliary + past verb + ... + by
                if (pattern[1][1] == 'VBN' or pattern[1][1] == 'VBD') and pattern[-1][0] == 'by':
                    passive_voice = True

            # past verb + by
            elif (pattern[-2][1] == 'VBN' or pattern[-2][1] == 'VBD') and pattern[-1][0] == 'by':
                passive_voice = True

        # past verb + by
        elif (pattern[-2][1] == 'VBN' or pattern[-2][1] == 'VBD') and pattern[-1][0] == 'by':
            passive_voice = True

    # past verb + by
    elif len(pattern) >= 2:
        if (pattern[-2][1] == 'VBN' or pattern[-2][1] == 'VBD') and pattern[-1][0] == 'by':
            passive_voice = True

    return passive_voice
Running some examples:
In [4]: tokens = word_tokenize("was bought by")
...: tags = pos_tag(tokens)
...: detect_passive_voice(tags)
Out[4]: True
In [5]: tokens = word_tokenize("mailed the letter")
...: tags = pos_tag(tokens)
...: detect_passive_voice(tags)
Out[5]: False
In [7]: tokens = word_tokenize("was mailed by")
...: tags = pos_tag(tokens)
...: detect_passive_voice(tags)
Out[7]: True
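Once passive voice can be detected, the direction of the relation follows by swapping the two entities whenever the pattern between them is passive. The sketch below is one possible way to do that, not part of the original answer: `orient_relation` and `looks_passive` are hypothetical names, `looks_passive` is a simplified stand-in for the detector above, and the patterns are tagged by hand so the example runs without any NLTK models.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag), the shape returned by nltk.pos_tag


def looks_passive(pattern: List[TaggedToken]) -> bool:
    """Simplified stand-in for a passive detector: past verb followed by 'by'."""
    if len(pattern) < 2:
        return False
    return pattern[-2][1] in ("VBN", "VBD") and pattern[-1][0].lower() == "by"


def orient_relation(left: str, pattern: List[TaggedToken], right: str) -> Tuple[str, str, str]:
    """Return (agent, verb, patient) for the text `left <pattern> right`."""
    # Use the last verb in the pattern as the relation label (an assumption).
    verbs = [word for word, tag in pattern if tag.startswith("V")]
    verb = verbs[-1] if verbs else ""
    if looks_passive(pattern):
        # Passive: the entity after "by" is the agent, so swap the arguments.
        return (right, verb, left)
    return (left, verb, right)


# Hand-tagged patterns (assumed tags, not actual pos_tag output).
passive = [("was", "VBD"), ("bought", "VBN"), ("by", "IN")]
active = [("bought", "VBD")]

print(orient_relation("Company_x", passive, "Company_y"))
print(orient_relation("Company_y", active, "Company_x"))
```

Both calls should yield the same triple with Company_y as the agent, which is the Company_y -> bought -> Company_x direction asked for in the question.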
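To extend detect_passive_voice, you can add more auxiliary verbs to aux_verbs, and you can also allow adverbs or adjectives between the auxiliary and the past verb.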
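A minimal sketch of such an extension is shown here, under a few assumptions: `detect_passive_voice_extended`, `AUX_FORMS`, and `FILLER_TAGS` are hypothetical names, auxiliaries are matched by surface form instead of lemmatizing, and the input is a list of (word, tag) tuples with Penn Treebank tags. Unlike the proof of concept, this version always requires a leading auxiliary, so a bare "acquired by" is not matched.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag)

# Surface forms of accepted auxiliaries; the "get" forms are an addition
# so that patterns like "got bought by" are covered.
AUX_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "get", "gets", "got", "gotten", "getting",
}
# Tag prefixes allowed between the auxiliary and the verb (adverbs, adjectives).
FILLER_TAGS = ("RB", "JJ")


def detect_passive_voice_extended(pattern: List[TaggedToken]) -> bool:
    """Match: auxiliary, optional adverbs/adjectives, VBN or VBD, then 'by' as last token."""
    if len(pattern) < 3 or pattern[-1][0].lower() != "by":
        return False
    if pattern[0][0].lower() not in AUX_FORMS:
        return False
    if pattern[-2][1] not in ("VBN", "VBD"):
        return False
    # Everything between the auxiliary and the verb must be an adverb or adjective.
    return all(tag.startswith(FILLER_TAGS) for _, tag in pattern[1:-2])


# Hand-tagged examples (assumed tags; real pos_tag output may differ).
print(detect_passive_voice_extended([("got", "VBD"), ("bought", "VBN"), ("by", "IN")]))
print(detect_passive_voice_extended(
    [("was", "VBD"), ("recently", "RB"), ("acquired", "VBN"), ("by", "IN")]
))
print(detect_passive_voice_extended([("mailed", "VBD"), ("the", "DT"), ("letter", "NN")]))
```

Keeping the auxiliaries and filler tags in module-level collections means new cases can be covered by editing data instead of adding more branches.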