python - Selecting sub-lists from a list of lists after using a sentence tokenizer
Question
I have some sentences in a list, for example:
some_list = ['Joe is travelling via train.',
             'Joe waited for the train, but the train was late.',
             'Even after an hour, there was no sign of the train. '
             "Joe then went to talk to station master about the train's situation."]
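I then ran nltk's sentence tokenizer over it, because I want to analyze each sentence within an entry individually. The output (O/P) is now a list of lists that looks like this: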
sent_tokenize_list = [['Joe is travelling via train.'],
                      ['Joe waited for the train,',
                       'but the train was late.'],
                      ['Even after an hour,',
                       'there was no sign of the train.',
                       "Joe then went to talk to station master about the train's situation."]]
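Now, from this list of lists, how do I select only the lists that contain more than one sentence (the 2nd and 3rd lists in my example) and get each of them back as a separate list?
That is, the O/P should be: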
['Joe waited for the train,', 'but the train was late.']
['Even after an hour,', 'there was no sign of the train.',
 "Joe then went to talk to station master about the train's situation."]
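Answer
You can use len to check the number of sentences in each sub-list and keep only the sub-lists that have two or more.
Example: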
sent_tokenize_list = [['Joe is travelling via train.'],
['Joe waited for the train,',
'but the train was late.'],
['Even after an hour,','there was no sign of the train.',"Joe then went to talk to station master about the train's situation."]]
print([i for i in sent_tokenize_list if len(i) >= 2])
Output:
[['Joe waited for the train,', 'but the train was late.'], ['Even after an hour,', 'there was no sign of the train.', "Joe then went to talk to station master about the train's situation."]]
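The comprehension above returns one list of lists, whereas the question asks for each matching group as its own list. One possible way to get that, sketched here with a hypothetical helper name (multi_sentence_groups) and a hypothetical min_sentences parameter that are not part of the original answer, is to loop over the filtered result:

```python
def multi_sentence_groups(groups, min_sentences=2):
    """Return only the sub-lists holding at least `min_sentences` items.

    Hypothetical helper for illustration; it wraps the same len() check
    used in the answer's list comprehension.
    """
    return [group for group in groups if len(group) >= min_sentences]


sent_tokenize_list = [
    ["Joe is travelling via train."],
    ["Joe waited for the train,", "but the train was late."],
    ["Even after an hour,", "there was no sign of the train.",
     "Joe then went to talk to station master about the train's situation."],
]

selected = multi_sentence_groups(sent_tokenize_list)

# Each matching group is handled (here: printed) as a separate list.
for group in selected:
    print(group)
```

If the position of each group in the original list matters, wrapping the input in enumerate() inside the comprehension would keep the index alongside each selected sub-list.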