Matching an email legal disclaimer with a regular expression

Question

I just need to remove the email disclaimer from the following text. It seems like an easy task, but I am not getting there.

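String:

Hello. Please see attached. Thanks. CONFIDENTIAL AND PROPRIETARY * The content of this email, including all header and footer information, is the "CONFIDENTIAL AND PROPRIETARY INFORMATION" of The Cleaning house, LLC. ("Franchisor") and is protected under the applicable franchise agreement ("applicable Franchise Agreement") between Franchisor and each of its franchisees ("Franchisee(s)"). Accordingly, each Franchisee who receives this email must, both during and after the term of the applicable Franchise Agreement, maintain the absolute confidentiality of the content of this email and may disclose the content of this email only to its employees and agents and only to the extent necessary for the operation of its Franchised Business (as defined in the applicable Franchise Agreement) in accordance with the applicable Franchise Agreement. None of the Franchisees who receive this email may use (or permit any other natural or legal person to use) the content of this email in any other business or in any way not authorized by Franchisor in writing. © 2009 The Cleaning house, LLC. All rights reserved.{}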

Desired output:

Hello. Please see attached. Thanks. .{}

My attempts:

CONFIDENTIAL AND PROPRIETARY[\n|.]*(?=reserved)

CONFIDENTIAL AND PROPRIETARY.*All rights reserved

Edit: the string may also contain line breaks and all sorts of odd characters. I was hoping .* would handle that.

Tags: regex, python-3.x

Solution

You are probably not using the regex engine the right way.

import re

email = "Hello. Please see attached. Thanks. CONFIDENTIAL AND PROPRIETARY * The content of this email, including all header and footer information, is the CONFIDENTIAL AND PROPRIETARY INFORMATION of The Cleaning house, LLC. (Franchisor) and is protected under the applicable franchise agreement (applicable Franchise Agreement) between Franchisor and each of its franchisees (Franchisee(s)). Accordingly, each Franchisee who receives this email must, both during and after the term of the applicable Franchise Agreement, maintain the absolute confidentiality of the content of this email and may disclose the content of this email only to its employees and agents and only to the extent necessary for the operation of its Franchised Business (as defined in the applicable Franchise Agreement) in accordance with the applicable Franchise Agreement. None of the Franchisees who receive this email may use (or permit any other natural or legal person to use) the content of this email in any other business or in any way not authorized by Franchisor in writing. © 2009 The Cleaning house, LLC. All rights reserved.{}"

cleaned_email = re.sub(r'CONFIDENTIAL AND PROPRIETARY[\s\S]*All rights reserved', '', email)

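Note the change from .* to [\s\S]*.

By default, . (dot) matches any character except a newline, whereas \s matches Unicode whitespace characters and \S matches any character that is not whitespace. Taken together, [\s\S] therefore matches any character at all, newlines included.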
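The difference can be checked with a minimal sketch. The shortened sample string below is hypothetical (it is not the full email from the question) and has a line break placed inside the disclaimer. The sketch also compares the standard re.DOTALL flag as a possible alternative to [\s\S], and shows why a class such as [\n|.] from the first attempt probably cannot span the text: inside square brackets the dot and the pipe are literal characters.

```python
import re

# Hypothetical shortened sample with a line break inside the disclaimer.
email = (
    "Hello. Please see attached. Thanks. "
    "CONFIDENTIAL AND PROPRIETARY * The content of this email\n"
    "is protected. All rights reserved.{}"
)

start, end = "CONFIDENTIAL AND PROPRIETARY", "All rights reserved"

# By default '.' stops at '\n', so this pattern never reaches the end marker
# and the string comes back unchanged.
dot_only = re.sub(start + r".*" + end, "", email)

# [\s\S] matches every character, newlines included.
char_class = re.sub(start + r"[\s\S]*" + end, "", email)

# re.DOTALL lets '.' match newlines too; '*?' stops at the first end marker.
dotall = re.sub(start + r".*?" + end, "", email, flags=re.DOTALL)

# Inside [...] the dot and the pipe are literal, so [\n|.] only matches
# a newline, a '|' or a '.', never an ordinary letter.
literal_class = re.search(r"[\n|.]", "abc")

print(dot_only == email)
print(char_class)
print(dotall)
print(literal_class)
```

Whether the greedy [\s\S]* or the lazy .*? is the better fit depends on the data: if the end marker could appear more than once, the greedy form removes everything up to its last occurrence, while the lazy form stops at the first.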

