How to remove a group of POS tags (a chunk) from text

Problem description

I want to remove all questions from a text, so I wrote some chunk rules to detect the questions in it:

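import nltk
from nltk import RegexpParser, pos_tag
from nltk.tokenize import sent_tokenize, word_tokenize

sample_text = """
    where did you go ?
    is there anybody out there?
    can you tell me where i can find you ? please.
    do you know me?
    """

# One chunk rule per question pattern; the parser is built once, outside the loop.
chunker = RegexpParser(r"""
    normalQuestion: {<WRB|WP.?><.*>*}
    Question: {<VBP|VBZ.?>+<.*>*}
    canQuestion: {<MD><PRP><VB><.*>*}
    doQuestion: {<VB><PRP><VB><.*>*}
    """)

sentences = sent_tokenize(sample_text)
for s in sentences:
    tagged = pos_tag(word_tokenize(s))
    output = chunker.parse(tagged)
    # Walk the top-level nodes and display every chunk labelled "Question".
    for a in output:
        if isinstance(a, nltk.tree.Tree):
            if a.label() == "Question":
                a.draw()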

Now I want to remove these chunks from the original text.

Tags: python, nltk

Solution
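
One possible approach, as a minimal sketch rather than a definitive answer: walk the top-level nodes of the tree returned by chunker.parse(), skip every chunk whose label should be removed, and rebuild the sentence from the remaining words. The helper name remove_chunks is hypothetical (it is not part of NLTK), and it assumes the flat tree shape that RegexpParser.parse() produces for tagged tokens: children are either (word, tag) tuples or chunk subtrees that expose label() and leaves().

```python
def remove_chunks(parsed, labels):
    """Return the words of `parsed` that are not inside a chunk labelled in `labels`.

    Hypothetical helper, not an NLTK function. `parsed` is assumed to be a
    flat chunk tree: each child is either a (word, tag) tuple or a chunk
    subtree with label() and leaves(), as nltk.tree.Tree provides.
    """
    kept = []
    for node in parsed:
        if hasattr(node, "label"):
            # A chunk subtree: drop it entirely if its label is unwanted.
            if node.label() in labels:
                continue
            # Otherwise keep its words; leaves are (word, tag) pairs.
            kept.extend(word for word, _tag in node.leaves())
        else:
            # A plain (word, tag) token outside any chunk.
            kept.append(node[0])
    # Joining with single spaces does not restore the original spacing.
    return " ".join(kept)
```

Inside the loop from the question, this could be called as remove_chunks(output, {"Question"}). To drop every kind of question the grammar detects, the set would need to contain all of the grammar's chunk labels, not only "Question". Because the result is rebuilt from tokens, the spacing around punctuation will differ from the original text.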

