Extracting sentences that contain specific words using count in Python

Problem description

I have a long text. I only want to extract the sentences that contain at least one word from a list.

list1 = ["apple", "orange", "tomato"...]
text = "I would love an apple. It is a nice day. How are you? Tasty orange..."

I thought about doing something like this:

sentences_with_word = []
for sentence in split_into_sentences(text):
    if sentence.count(list1) > 0:
        sentences_with_word.append(sentence)

I get the following error:

TypeError: must be str, not list

Any ideas on how to fix this, or on a better way to get the same result?

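Tags: python-3.x, nlp

Solution

str.count accepts a single substring, not a list, which is why the call fails. Instead, you can use the word and sentence tokenizers from the NLTK library and check each sentence's words against the list.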

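from nltk.tokenize import word_tokenize, sent_tokenize
# Needs NLTK's Punkt tokenizer data, downloaded once via nltk.download().
list1 = ["apple", "orange", "tomato"]
text = "I would love an apple. It is a nice day. How are you? Tasty orange..."
wanted = set(list1)
sentences_with_word = []
for sen in sent_tokenize(text):
    # Tokenize into words so that only whole words are compared.
    tokens = set(word_tokenize(sen))
    if tokens & wanted:  # non-empty intersection: at least one listed word
        sentences_with_word.append(sen)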
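
If installing NLTK is not an option, roughly the same result can be had with the standard library alone. The following is only a minimal sketch: split_into_sentences here is a hypothetical, naive stand-in for the helper mentioned in the question (it splits after ".", "!" or "?" followed by whitespace, so abbreviations such as "Dr." will trip it up), and sentences_containing is a name made up for illustration. Unlike the set comparison above, it matches case-insensitively.

```python
import re

def split_into_sentences(text):
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentences_containing(text, words):
    # One pattern that matches any of the words as a whole word, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, words)) + r")\b",
        re.IGNORECASE,
    )
    return [s for s in split_into_sentences(text) if pattern.search(s)]

list1 = ["apple", "orange", "tomato"]
text = "I would love an apple. It is a nice day. How are you? Tasty orange..."
sentences_with_word = sentences_containing(text, list1)
print(sentences_with_word)
# ['I would love an apple.', 'Tasty orange...']
```

The word boundaries (\b) keep "apple" from matching inside "pineapple". If plain substring matching is acceptable, the original loop can also be fixed by replacing the count call with any(word in sentence for word in list1).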
