Extracting lines from a txt file in Python

Question

Imagine I have this text in a txt file:

bla bla bla
bla bla bla
Title Lorem ipsum dolor sit amet, consectetur adipiscing elit
, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Ut enim ad minim veniam,
condition
bla bla bla
bla bla
Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem
accusantium doloremque laudantium, totam rem aperiam,
eaque ipsa quae ab illoinventre veritatis
condition
bla bla bla

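From text with the structure above (hundreds of lines), I want to extract each line that starts with "Title", together with the lines that follow it, up to the next line that starts with "condition". So the result would look like this:

Title Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,

Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illoinventre veritatis

With this code I can select the first line of each such block, but I don't know how to add the following lines until the word "condition" is found. Could you help me, please?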

import re

outF = open("myOutFile.txt", "w")
hand = open('doubt.txt', encoding="utf8")
for line in hand:
    line = line.rstrip()
    if re.search('^Title', line):
        outF.write(line + "\n")
        outF.write("\n")
outF.close()

Tags: python, text-extraction

Answer

If you want all the title lines up to the first condition line, you need to break out of the loop:

for line in hand:
    line = line.rstrip()
    if line.startswith("Title"):
        outF.write(line + "\n")  # rstrip() removed the newline, so add it back
    if line.startswith("condition"):
        break  # stop at the first condition line

outF.close()
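
The fragment above relies on the `hand` and `outF` objects from the question. As a minimal sketch of the same idea that runs on its own, the version below works on any iterable of lines and returns the matches as a list. The function name `titles_until_first_condition` and the sample lines are hypothetical, chosen only for illustration.

```python
def titles_until_first_condition(lines):
    """Return the lines starting with "Title" that occur before the
    first line starting with "condition"."""
    titles = []
    for line in lines:
        line = line.rstrip()
        if line.startswith("condition"):
            break  # stop at the first condition line
        if line.startswith("Title"):
            titles.append(line)
    return titles


# Hypothetical sample input, shaped like the text in the question.
sample = [
    "bla bla bla\n",
    "Title first heading\n",
    "continuation\n",
    "condition\n",
    "Title second heading\n",
]
print(titles_until_first_condition(sample))  # ['Title first heading']
```

Because an open file is itself an iterable of lines, passing `hand` in place of `sample` should behave the same way.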

If you want to write every line from a title onward, until the next condition line appears:

write = False
writelines = []

for line in hand:
    line = line.rstrip()

    if line.startswith("condition"):
        # end of a block: stop collecting and terminate the output line
        write = False
        writelines.append("\n")

    if line.startswith("Title"):
        # start of a block: collect this line and the ones that follow
        write = True

    if write:
        writelines.append(line + " ")

outF.writelines(writelines)
outF.close()
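
To get closer to the exact output asked for in the question, one possible variation is to collect each block in a list and join it once the condition line is reached. This is only a sketch: the names `extract_blocks` and `sample_text` are hypothetical, the sample is a shortened version of the question's text, and the `" ,"` replacement is an assumption made to rejoin continuation lines that begin with a comma. A block still open at the end of the input is emitted as well, which mirrors what the loop above would write.

```python
def extract_blocks(lines, start="Title", stop="condition"):
    """Yield one joined string per block, from a line starting with
    `start` up to (not including) the next line starting with `stop`."""
    block = None  # None means we are currently outside a block
    for line in lines:
        line = line.rstrip()
        if line.startswith(stop):
            if block is not None:
                yield " ".join(block).replace(" ,", ",")
            block = None
        elif line.startswith(start) and block is None:
            block = [line]
        elif block is not None and line:
            block.append(line)  # continuation line inside a block
    if block is not None:
        # Assumption: a block with no closing condition line is still emitted.
        yield " ".join(block).replace(" ,", ",")


sample_text = """bla bla bla
Title Lorem ipsum dolor sit amet, consectetur adipiscing elit
, sed do eiusmod tempor
magna aliqua
condition
bla bla
Title Sed ut perspiciatis
accusantium doloremque
condition
bla bla bla
"""

blocks = list(extract_blocks(sample_text.splitlines()))
for block in blocks:
    print(block)
    print()
```

With the objects from the question, `outF.write("\n\n".join(extract_blocks(hand)))` would write the blocks separated by a blank line.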
