Counting the words in split sentences with Python?

Problem description

I'm not sure how to remove the "\n" at the end of the output.

Basically, I have this txt file, which contains the following sentences:

"What does Bessie say I have done?" I asked.

"Jane, I don't like cavillers or questioners; besides, there is something truly forbidding in a child 
 taking up her elders in that manner.
 
Be seated somewhere; and until you can speak pleasantly, remain silent."

I managed to split the sentences on semicolons with this code:

import re
with open("testing.txt") as file:
    read_file = file.readlines()
for i, word in enumerate(read_file):
    low = word.lower()
    print(re.split(';', low))

But I'm not sure how to count the words in each split sentence, since len() doesn't work. Output for the sentences:

['"what does bessie say i have done?" i asked.\n']
['"jane, i don\'t like cavillers or questioners', ' besides, there is something truly forbidding in a 
child taking up her elders in that manner.\n']
['be seated somewhere', ' and until you can speak pleasantly, remain silent."\n']

For example, for the third sentence, I want to count 3 words on the left and 8 words on the right.

Thanks for reading!

Tags: python, python-3.x, nlp

Solution



import re

sentences = []  # collects every semicolon-separated part
with open('testing.txt') as fileObj:
    # strip() removes the trailing '\n'; blank lines are skipped
    lines = [line.strip() for line in fileObj if line.strip()]
for line in lines:
    # split each line on ';' and append the parts to sentences
    sentences += re.split(';', line)
for sentence in sentences:
    # split() with no arguments splits on whitespace, so len() gives the word count
    print(sentence + ' ' + str(len(sentence.split())))
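
If you want the counts grouped per line (for example 3 and 8 for the third sentence), a minimal sketch could look like the following. The helper name `count_words_per_part` and the hard-coded sample lines are assumptions for illustration, not part of the original code. It also shows why `len()` appeared not to work: calling it on the list returned by `re.split` gives the number of parts, and calling it on a string gives the number of characters, so each part has to be split on whitespace first.

```python
import re


def count_words_per_part(line, sep=";"):
    """Split one line on `sep` and return the word count of each part.

    Hypothetical helper: strip() drops the trailing newline, lower()
    mirrors the lowercasing done in the question.
    """
    parts = re.split(re.escape(sep), line.strip().lower())
    # str.split() with no arguments splits on any run of whitespace,
    # so leading spaces after the semicolon do not create empty words.
    return [len(part.split()) for part in parts]


# Two lines taken from the sample text in the question.
sample_lines = [
    '"What does Bessie say I have done?" I asked.\n',
    'Be seated somewhere; and until you can speak pleasantly, remain silent."\n',
]

for line in sample_lines:
    print(count_words_per_part(line))
```

Note that punctuation stays attached to the words here ("pleasantly," counts as one word), which matches the counts asked for in the question. If punctuation should be handled differently, the whitespace split would need to be replaced with something stricter.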
