Finding the longest word in a .txt file without punctuation

Problem description

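I am working through Python file I/O exercises. I have made good progress on one that asks for the longest word on each line of a .txt file, but I cannot get rid of the punctuation.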

Here is my code:

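with open("original-3.txt", 'r') as file1:
    # Read all lines while the file is still open
    lines = file1.readlines()
for line in lines:
    # Skip empty lines, where max() would raise ValueError
    if not line == "\n":
        print(max(line.split(), key=len))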

This is the output I get

This is original-3.txt, the file I am reading the data from:

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"

He took his vorpal sword in hand:
Long time the manxome foe he sought,
So rested he by the Tumtum tree,
And stood a while in thought.

And, as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One two! One two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

"And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!"
"Oh frabjous day! Callooh! Callay!"
He chortled in his joy.

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

As you can see, the results still include punctuation marks such as "," ";" "?" "!".

How can I get just the words on their own?

Thanks

Tags: python, parsing, text-processing, text-parsing

Solution


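A regular expression makes it easy to get the length of the longest word on each line. The pattern \w+ matches runs of letters, digits, and underscores, so punctuation is left out: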

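import re

with open("original-3.txt", "r") as file1:
    lines = file1.readlines()

for line in lines:
    found_strings = re.findall(r'\w+', line)
    if found_strings:  # skip blank lines, where max() would raise ValueError
        print(max(len(txt) for txt in found_strings))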
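
The snippet above prints a length, while the question asks for the word itself. A minimal sketch of one way to return the word follows. It assumes that a "word" is a run of ASCII letters, optionally joined by apostrophes or hyphens, so that "snicker-snack" counts as one word. The names WORD_RE and longest_word are hypothetical helpers introduced for illustration, and the sample list stands in for the lines read from original-3.txt.

```python
import re

# Assumption: a word is a run of ASCII letters, optionally joined by
# apostrophes or hyphens (so "snicker-snack" stays in one piece).
WORD_RE = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def longest_word(line):
    """Return the longest word in line, or None if the line has no words."""
    words = WORD_RE.findall(line)
    # max() raises ValueError on an empty list, so guard blank lines.
    return max(words, key=len) if words else None


# A few lines from the poem stand in for the contents of the file.
sample = [
    "'Twas brillig, and the slithy toves",
    "",
    "The vorpal blade went snicker-snack!",
    '"Beware the Jabberwock, my son!',
]

for line in sample:
    word = longest_word(line)
    if word is not None:
        print(word)
```

Using key=len makes max() return the word rather than its length. If hyphenated words should be split instead, the original \w+ pattern can be passed to re.findall in the same way.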
