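python - How can we use NLTK to find adjective, verb, and adverb phrases in a sentence?
Problem description
The code below is from StackOverflow. (It works for noun phrases, but I need verb, adverb, and adjective phrases.) I am new to NLP and do not know much about grammar rules. I tried to get help from blogs and the documentation but could not figure it out.
What changes are needed in this code block?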
Rules for the NP chunk and VB chunk
NBAR:
    {<NN.*|JJ>*<NN.*>}  # Nouns and Adjectives, terminated with Nouns
    {<RB.?>*<VB.?>*<JJ>*<VB.?>+<VB>?}  # Verbs and Verb Phrases
NP:
    {<NBAR>}
    {<NBAR><IN><NBAR>}  # Above, connected with in/of/etc...
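One thing worth checking in these rules is that the verb pattern sits inside the NBAR stage, so any verb chunk it finds is labelled NBAR and then wrapped as NP. The sketch below illustrates this with a hand-tagged copy of the example sentence. The tags are an assumption about what `nltk.pos_tag` would return, written out by hand so the check runs without tagger models, and `np_phrases` is an illustrative name.

```python
import nltk

# Hand-tagged version of the example sentence (assumed Penn Treebank tags);
# in the full script these pairs come from nltk.pos_tag.
tagged = [
    ("The", "DT"), ("little", "JJ"), ("brown", "JJ"), ("dog", "NN"),
    ("barked", "VBD"), ("at", "IN"),
    ("the", "DT"), ("black", "JJ"), ("cat", "NN"),
]

# Same rules as in the question, unchanged.
grammar = r"""
    NBAR:
        {<NN.*|JJ>*<NN.*>}  # Nouns and Adjectives, terminated with Nouns
        {<RB.?>*<VB.?>*<JJ>*<VB.?>+<VB>?}  # Verbs and Verb Phrases
    NP:
        {<NBAR>}
        {<NBAR><IN><NBAR>}  # Above, connected with in/of/etc...
"""

tree = nltk.RegexpParser(grammar).parse(tagged)

# Collect the words of every subtree labelled NP.
np_phrases = [
    " ".join(word for word, tag in subtree.leaves())
    for subtree in tree.subtrees(filter=lambda t: t.label() == "NP")
]
print(np_phrases)
```

With these tags the verb "barked" is returned alongside the two noun phrases, which suggests that verb, adjective, and adverb phrases need chunk labels of their own rather than an extra rule under NBAR.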
Full code (it works for noun phrases)
import nltk
from nltk import word_tokenize, pos_tag
from nltk.corpus import wordnet

lemmatizer = nltk.WordNetLemmatizer()

# Word tokenizing and part-of-speech tagging
document = 'The little brown dog barked at the black cat'
tokens = [nltk.word_tokenize(sent) for sent in [document]]
postag = [nltk.pos_tag(sent) for sent in tokens][0]

# Rule for NP chunk and VB chunk
grammar = r"""
    NBAR:
        {<NN.*|JJ>*<NN.*>}  # Nouns and Adjectives, terminated with Nouns
        {<RB.?>*<VB.?>*<JJ>*<VB.?>+<VB>?}  # Verbs and Verb Phrases

    NP:
        {<NBAR>}
        {<NBAR><IN><NBAR>}  # Above, connected with in/of/etc...
"""

# Chunking
cp = nltk.RegexpParser(grammar)

# The result is a tree
tree = cp.parse(postag)

def leaves(tree):
    """Finds NP (noun phrase) leaf nodes of a chunk tree."""
    for subtree in tree.subtrees(filter=lambda t: t.label() == 'NP'):
        yield subtree.leaves()

def get_word_postag(word):
    """Maps the Penn Treebank tag of a single word to a WordNet POS constant."""
    tag = pos_tag([word])[0][1]
    if tag.startswith('J'):
        return wordnet.ADJ
    if tag.startswith('V'):
        return wordnet.VERB
    # Nouns and every other tag fall back to NOUN
    return wordnet.NOUN

def normalise(word):
    """Normalises a word to lowercase and lemmatizes it."""
    word = word.lower()
    postag = get_word_postag(word)
    word = lemmatizer.lemmatize(word, postag)
    return word

def get_terms(tree):
    """Yields the normalised words of each NP chunk."""
    for leaf in leaves(tree):
        terms = [normalise(w) for w, t in leaf]
        yield terms

terms = get_terms(tree)

# Join the words of each chunk into one space-separated phrase
features = []
for term in terms:
    features.append(' '.join(term))

print(features)
Solution
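One possible approach is to give each phrase type its own stage in the `nltk.RegexpParser` grammar, so that verb, adjective, and adverb chunks carry their own labels instead of ending up under NP. The following is a minimal sketch and not a complete grammar of English. The labels ADJP, VP, and ADVP, the tag patterns, and the helper `phrases` are illustrative choices, and the sentence is hand-tagged with assumed Penn Treebank tags so that it runs without downloading tagger models.

```python
import nltk

# Hand-tagged tokens (assumed Penn Treebank tags); with the tagger models
# installed, nltk.pos_tag(nltk.word_tokenize(text)) would supply these pairs.
tagged = [
    ("The", "DT"), ("dog", "NN"), ("is", "VBZ"),
    ("very", "RB"), ("happy", "JJ"), ("and", "CC"),
    ("runs", "VBZ"), ("quite", "RB"), ("quickly", "RB"),
]

# Stages run top to bottom, and tokens chunked by an earlier stage are no
# longer visible to later ones, so the order decides which stage wins a token.
grammar = r"""
    NP:   {<DT>?<JJ.*>*<NN.*>+}  # optional determiner, adjectives, then nouns
    ADJP: {<RB.*>*<JJ.*>+}       # adjectives not absorbed by a noun phrase
    VP:   {<MD>?<VB.*>+}         # optional modal followed by verbs
    ADVP: {<RB.*>+}              # adverbs left over after the ADJP stage
"""

tree = nltk.RegexpParser(grammar).parse(tagged)

def phrases(tree, label):
    """Return the text of every chunk in the tree that carries the given label."""
    return [
        " ".join(word for word, tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == label)
    ]

for label in ("NP", "ADJP", "VP", "ADVP"):
    print(label, phrases(tree, label))
```

To reuse this with the code from the question, the `leaves` function could take the label as a parameter instead of hard-coding 'NP'. Because the stage order matters, placing ADVP before ADJP would let the adverb stage claim "very" before the adjective stage sees it.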