How do I remove a block of text from a string using PowerShell?

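Question

I am building an email that includes the section below, which I sometimes need to remove. I want to use -replace to strip everything from #HOUSESTART through #HOUSEEND, but it isn't working.

$body contains this section along with the rest of the HTML email: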

"<p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
    normal'><b><u><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
    "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri'>#HOUSESTART<o:p></o:p></span></u></b></p>

    <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
    normal'><b><u><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
    "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri'>PLEASE
    NOTE</span></u></b><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
    "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri'>:&nbsp;
    As a house manager, you have two email addresses.&nbsp; Your secondary email
    address is #EMAIL.&nbsp; The only place you will need to use this email address
    is when you are enrolling any device in the Targeted Threat Protection.<span
    style='mso-spacerun:yes'>  </span><b><span style='background:yellow;mso-highlight:
    yellow'>(Only used in manager welcome emails.)</span><o:p></o:p></b></span></p>

    <p class=MsoNormal style='margin-bottom:0in;margin-bottom:.0001pt;line-height:
    normal'><b><span style='mso-ascii-font-family:Calibri;mso-fareast-font-family:
    "Times New Roman";mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri'>#HOUSEEND</span></b><span
    style='mso-ascii-font-family:Calibri;mso-fareast-font-family:"Times New Roman";
    mso-hansi-font-family:Calibri;mso-bidi-font-family:Calibri'><o:p></o:p></span></p>

I am using this command to try to remove everything between #HOUSESTART and #HOUSEEND, but nothing is removed.

$body = $body -replace "#HOUSESTART.*#HOUSEEND"," "

Any help would be greatly appreciated.

Tags: powershell

Answer


By default, the . metacharacter in .NET regular expressions matches any character except a newline.

Therefore, if you want .* to match across multiple lines, i.e., to match newlines too, you must use the inline regex option s (place (?s) at the very start of the regex):

$body = $body -replace '(?s)#HOUSESTART.*#HOUSEEND', ' '
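To see the difference in isolation, here is a minimal sketch. The $sample string is a made-up stand-in for the real $body, not taken from the question, and the variable names are only for illustration:

```powershell
# Hypothetical sample input: a short multi-line string standing in for the real $body.
$sample = "before`n#HOUSESTART`nline one`nline two`n#HOUSEEND`nafter"

# Without (?s), '.' stops at each newline, so the pattern never matches
# and the string comes back unchanged.
$withoutS = $sample -replace '#HOUSESTART.*#HOUSEEND', ' '

# With (?s), '.' also matches newlines, so the whole marked block is replaced.
$withS = $sample -replace '(?s)#HOUSESTART.*#HOUSEEND', ' '

$withoutS -eq $sample            # True: nothing was removed
$withS -eq "before`n `nafter"    # True: the block became a single space
```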

Note:
* I'm using '...' (single quotes, i.e. verbatim strings) rather than "..." (expandable, i.e. interpolating, strings), to avoid confusion between what PowerShell may interpret up front and what the regex engine will see.
* .* matches greedily, so everything up to the input's last instance of #HOUSEEND is matched; if there can be multiple instances and you want to match only through the next one, use the non-greedy .*? instead.
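The greedy versus non-greedy point can be illustrated with a small sketch. The input below is hypothetical (two marked sections on one line, with brackets added only to make the results easy to read):

```powershell
# Hypothetical input: two marked sections with text between them that should survive.
$sample = 'A[#HOUSESTART one #HOUSEEND]keep[#HOUSESTART two #HOUSEEND]Z'

# Greedy: matches from the first #HOUSESTART through the LAST #HOUSEEND,
# so "keep" is swallowed along with both sections.
$greedy = $sample -replace '(?s)#HOUSESTART.*#HOUSEEND', ''

# Non-greedy: each match stops at the NEXT #HOUSEEND, so "keep" survives.
$lazy = $sample -replace '(?s)#HOUSESTART.*?#HOUSEEND', ''

$greedy   # A[]Z
$lazy     # A[]keep[]Z
```

If the email only ever contains one marked section, both forms give the same result.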

Note that $body must be a single, multi-line string for this to work.

For instance, if you set $body with something like $body = Get-Content file.txt, you get an array of strings (one per line), and -replace is applied to each element separately, so a pattern that spans lines never matches. In that case, use the -Raw switch to read the file as a single, multi-line string:
$body = Get-Content -Raw file.txt
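A rough way to check this for yourself, assuming PowerShell 3.0 or later (where -Raw is available). The temporary file and its contents are invented for the demonstration and stand in for file.txt:

```powershell
# Hypothetical temp file standing in for file.txt.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value "intro`n#HOUSESTART`nsecret`n#HOUSEEND`noutro"

$pattern = '(?s)#HOUSESTART.*#HOUSEEND'

# Without -Raw: an array of lines. -replace runs on each line separately,
# and no single line contains both markers, so nothing is removed.
$lines   = Get-Content $path
$perLine = $lines -replace $pattern, ' '

# With -Raw: one multi-line string, so the pattern can span the lines.
$raw   = Get-Content -Raw $path
$whole = $raw -replace $pattern, ' '

$lines.Count                  # 5
$perLine -contains 'secret'   # True: the block is still there
$whole -match 'secret'        # False: the block was removed

Remove-Item $path
```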

