How to extract a specific paragraph from a text file using Python?

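Problem description

I have to extract a specific paragraph that starts with "Substitution of Trustee" and ends with "under said Deed of Trust".

  1. Because fields repeat elsewhere in the file, I need to look up the data only within this paragraph.

  2. The data may be things like dates, document numbers, etc.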

sample.txt
Inst #: 2021
Fees: $42.00

06/24/2021 06:54:48 AM
Receipt #: 4587188

Requestor:
FINANCIAL CORPORATION OF
After recording return to: Src: MAIL

Mail Tax Statements to:

SUBSTITUTION OF TRUSTEE
AND DEED OF RECONVEYANCE

The undersigned, Financial Corporation of Nevada, a Nevada Corporation, as the Owner and
Holder of the Note secured by Deed of Trust dated March 1, 2013 made by Elvia Bello, Trustor, to
Official Records -- HEREBY substitutes Financial Corporation of Nevada, a Nevada Corporation,
as Trustee in lieu of the Trustee therein.


Said Note, together with all other indebtedness secured by said Deed of Trust, has been fully paid 
satisfied; and as successor Trustee, the undersigned does hereby RECONVEY WITHOUT
WARRANTY TO THE PERSON OR PERSONS LEGALLY ENTITLED THERETO, all the estate now
held by it under said Deed of Trust.
This JNO aay of June 2021,
Financial Corporation
wy luo Rtn rae
import re
mylines = []

pattern = re.compile(r"SUBSTITUTION OF TRUSTEE", re.IGNORECASE)
with open(r'sample.txt', 'rt', encoding='utf-8') as myfile:
    for line in myfile:
        mylines.append(line)
    for line in mylines:
        if(line == "SUBSTITUTION OF TRUSTEE "):
            print(line)
            break
        else:
            mylines.remove(line)
    
    print("my lines",mylines)

Tags: python, string

Solution


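You can first check whether each line starts with the substring "substitution of trustee" and, once it is found, set a flag variable to True. While the flag is true, keep appending lines to the mylines list. Then, once you reach a line containing "under said deed of trust", stop adding lines and return the result: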

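mylines = []
flag = False  # becomes True once the start marker has been seen
with open(r'sample.txt', 'rt', encoding='utf-8') as myfile:
    for line in myfile:
        # Start collecting at the heading line (case-insensitive match)
        if line.strip().upper().startswith("SUBSTITUTION OF TRUSTEE"):
            flag = True
        if flag:
            mylines.append(line)
            # Stop right after the line that contains the end marker
            if "under said deed of trust" in line.lower():
                break

print("".join(mylines))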


Output:

SUBSTITUTION OF TRUSTEE
AND DEED OF RECONVEYANCE

The undersigned, Financial Corporation of Nevada, a Nevada Corporation, as the Owner and
Holder of the Note secured by Deed of Trust dated March 1, 2013 made by Elvia Bello, Trustor, to
Official Records -- HEREBY substitutes Financial Corporation of Nevada, a Nevada Corporation,
as Trustee in lieu of the Trustee therein.


Said Note, together with all other indebtedness secured by said Deed of Trust, has been fully paid 
satisfied; and as successor Trustee, the undersigned does hereby RECONVEY WITHOUT
WARRANTY TO THE PERSON OR PERSONS LEGALLY ENTITLED THERETO, all the estate now
held by it under said Deed of Trust.
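
Since the question already imports re, here is a possible regex-based variant, offered as a sketch rather than a definitive solution. It reads the whole text at once, cuts out the span between the two markers, and then searches for dates only inside that span, so repeated fields elsewhere in the file are ignored. The names SECTION_RE, DATE_RE, extract_section and find_dates are hypothetical, the date pattern is an assumption that only covers the two date styles visible in sample.txt, and sample_text is just a few lines copied from sample.txt so the example runs on its own.

```python
import re

# Start and end markers from the question. \s+ tolerates a marker that is
# wrapped across lines; DOTALL lets "." match newlines; ".*?" stops at the
# first end marker that follows the start marker.
SECTION_RE = re.compile(
    r"SUBSTITUTION\s+OF\s+TRUSTEE.*?under\s+said\s+Deed\s+of\s+Trust\.?",
    re.IGNORECASE | re.DOTALL,
)

# Assumed date formats: "06/24/2021" or "March 1, 2013". Adjust as needed.
DATE_RE = re.compile(r"\b\d{2}/\d{2}/\d{4}\b|\b[A-Z][a-z]+ \d{1,2}, \d{4}\b")


def extract_section(text):
    """Return the text from the start marker through the end marker, or None."""
    match = SECTION_RE.search(text)
    return match.group(0) if match else None


def find_dates(text):
    """Return every date-like string found in the given text."""
    return DATE_RE.findall(text)


# A few lines copied from sample.txt, used here instead of reading the file.
sample_text = """Inst #: 2021
06/24/2021 06:54:48 AM

SUBSTITUTION OF TRUSTEE
AND DEED OF RECONVEYANCE

Holder of the Note secured by Deed of Trust dated March 1, 2013 made by Elvia Bello, Trustor, to
held by it under said Deed of Trust.
This JNO aay of June 2021,
"""

section = extract_section(sample_text)
if section is not None:
    print(section)
    # Only dates inside the extracted paragraph are reported here.
    print(find_dates(section))
```

To run this against the real file, replace sample_text with the result of open('sample.txt', 'rt', encoding='utf-8').read(). Unlike the line-by-line loop, this variant does not depend on either marker sitting on a single line, at the cost of loading the whole file into memory.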
