regex - Delete everything before first match
Question
I'm really struggling with this one. I need a regular expression that removes the Subject/To/From/Date fields at the top of an e-mail while leaving all earlier Subject/To/From/Date entries in the mail chain intact. For example:
Subject: RE: Test mail
From: test@stackoverflow.com
To: test@test.com
Date: 22/06/2018 10:00:00
This is the body of e-mail #3.
Subject: RE: Test mail
From: test@test.com
To: test@stackoverflow.com
Date: 22/06/2018 09:55:00
This is the body of e-mail #2.
Subject: Test mail
From: test@stackoverflow.com
To: test@test.com
Date: 22/06/2018 09:50:00
This is the body of e-mail #1.
I'd like the regular expression to remove simply the top five lines to give:
This is the body of e-mail #3.
Subject: RE: Test mail
From: test@test.com
To: test@stackoverflow.com
Date: 22/06/2018 09:55:00
This is the body of e-mail #2.
Subject: Test mail
From: test@stackoverflow.com
To: test@test.com
Date: 22/06/2018 09:50:00
This is the body of e-mail #1.
Unfortunately, I can't write anything that specifically deletes the first five lines, as there may also be a CC field, which means it could potentially be six lines.
It therefore needs to match the first instance of "Date:" until the end of the line and delete everything before. Any ideas would be hugely appreciated; the closest I've got is the below which unfortunately matches both instances of "Date:".
[\s\S]*.*Date:.*[\s\S]
Solution
The regex should be constructed as follows:
- Start from the start of the string.
- Accept any content up to a line starting with "Date: ".
- Accept the rest of this line.
- Accept any number of following \n chars (the end of this line and any following empty lines).
- No g (global) option, since you want to perform only a single match.
So one possible solution is:
/\A.+?^Date: [^\n]+\n+/ms
Details:
- m option - multi-line (^ and $ also match at the start / end of each line).
- s option - single-line (. also matches \n).
- \A - Start of the whole string.
- .+? - One or more of any chars (due to the s option, including \n), as few as possible.
- ^ - Start of a line (due to the m option).
- "Date: " - Start of the "Date" line.
- [^\n]+ - One or more chars other than \n - the actual date field.
- \n+ - The end of the line and any following empty lines.
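The lazy quantifier in .+? is what makes the match stop at the first "Date:" line rather than the last one. A small sketch in Python (an assumption, since the question names no host language; the shortened sample text is made up for illustration) compares the lazy and greedy forms:

```python
import re

# Made-up, shortened mail chain with two "Date:" lines.
text = "From: x\nDate: a\nbody 3\nFrom: y\nDate: b\nbody 2\n"

flags = re.MULTILINE | re.DOTALL  # the m and s options

# Lazy .+? expands only as far as needed: stops at the first "Date:" line.
lazy = re.compile(r"\A.+?^Date: [^\n]+\n+", flags)
# Greedy .+ grabs everything, then backtracks: stops at the last "Date:" line.
greedy = re.compile(r"\A.+^Date: [^\n]+\n+", flags)

lazy_result = lazy.sub("", text, count=1)
greedy_result = greedy.sub("", text, count=1)

print(repr(lazy_result))
print(repr(greedy_result))
```

With the lazy form only the top header block is removed; with the greedy form everything up to and including the last "Date:" line disappears.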
As you specified neither the host language nor the regex flavor, I assumed PCRE, which supports all the features used.
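For illustration, here is how the same pattern might look in Python, whose re module also supports \A, lazy quantifiers, and the multi-line and dot-all flags. This is only a sketch: the function name strip_top_headers and the CC address are hypothetical, and the sample mail is a shortened version of the one in the question.

```python
import re

# Python port of /\A.+?^Date: [^\n]+\n+/ms
# re.MULTILINE is the m option, re.DOTALL is the s option.
HEADER_RE = re.compile(r"\A.+?^Date: [^\n]+\n+", re.MULTILINE | re.DOTALL)


def strip_top_headers(mail: str) -> str:
    """Remove everything up to and including the first 'Date:' line."""
    # count=1 mirrors "no g option": at most one replacement is made.
    return HEADER_RE.sub("", mail, count=1)


# Shortened sample with an extra (hypothetical) CC line in the top block.
mail = (
    "Subject: RE: Test mail\n"
    "From: test@stackoverflow.com\n"
    "To: test@test.com\n"
    "CC: someone@test.com\n"
    "Date: 22/06/2018 10:00:00\n"
    "This is the body of e-mail #3.\n"
    "Subject: RE: Test mail\n"
    "From: test@test.com\n"
    "To: test@stackoverflow.com\n"
    "Date: 22/06/2018 09:55:00\n"
    "This is the body of e-mail #2.\n"
)

result = strip_top_headers(mail)
print(result)
```

Because the pattern keys on the "Date:" line rather than on a line count, the optional CC line needs no special handling. One caveat: .+? requires at least one character before the "Date:" line, so if "Date:" could be the very first line of the input, .*? would be needed instead.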