Regular expressions and replacement in Python

Problem description

I have a content string such as:

content = """
the patient monitoring system shall perform a daily device check from 1:30 am to 4:30 am (patient local time). if a device malfunction is detected, the daily device check shall send the malfunction to the clinician. if a patient health alarm is detected, the daily device check shall turn into full interrogation as specified in srs-3003. if no device or patient health issue identified, the daily device check shall end without further notification to the clinicians or patient. if a scheduled interrogation happens on the same day, the daily device check shall be skipped. if any device issue detected during the daily device check, the patient monitoring system shall alarm the patient with red urgent light. . if any patient health issue detected during the daily device check, the patient monitoring system shall alarm the patient with yellow warning light. . if a daily device check fails, it should be retried in 15 minutes up to 3 times. if a daily device check still fails after 3 times, the patient monitoring system shall end the interrogation and notify patient of the failed device check at 8 am that morning. there are 3 types of interrogations as below:
1. scheduled interrogation.
2. daily device check
3. patient initiated interrogation. an interrogation could fail due to the following reasons:
1. failed to establish communication.
2. communication lost.
3. failed to obtain a key data from the implanted device.
"""

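I want to replace the list markers such as 1. 2. 3., but without touching numbers that belong to the actual content, such as srs-3003.

If I use the following regex: re.findall("\d{1}\.", content), the result is ['3.', '1.', '2.', '3.', '1.', '2.', '3.'], and the '3.' at the end of srs-3003. would then be removed from the content in the next step: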

num_dot = re.findall("\d+\.", content)
for num in num_dot:
    content = content.replace(num, "")

How should I proceed?

Tags: python, regex, replace

Solution


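Your regex is almost there. To avoid matching the 3. in srs-3003., you can add the ^ anchor, so that only numbers at the start of a line match. Something like: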

^\d+\.

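Explanation of the above regex:

  • ^ - matches the start of a line. In Python this requires the re.MULTILINE flag; without it, ^ matches only at the start of the whole string.
  • \d+ - matches one or more digits.
  • \. - matches a literal dot. If you also want to remove the whitespace that follows each number, append \s+ to the pattern, i.e. ^\d+\.\s+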
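
To make the effect of the anchor and the flag concrete, here is a minimal sketch. The short `sample` string is an illustrative stand-in for the full text in the question, not the original data.

```python
import re

# Short illustrative sample (an assumption, not the full text from the question).
sample = "as specified in srs-3003. types:\n1. scheduled\n2. daily check"

# Unanchored: also catches the "3." at the end of "srs-3003."
unanchored = re.findall(r"\d\.", sample)

# Anchored, but without re.MULTILINE: ^ only matches at the start of the string
anchored_no_flag = re.findall(r"^\d+\.", sample)

# Anchored with re.MULTILINE: ^ matches at the start of every line
anchored_multiline = re.findall(r"^\d+\.", sample, flags=re.MULTILINE)

print(unanchored)          # ['3.', '1.', '2.']
print(anchored_no_flag)    # []
print(anchored_multiline)  # ['1.', '2.']
```

Only the last call picks up the list markers and nothing else, which is why the flag matters as much as the anchor.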



Sample implementation in Python:

import re

regex = r"^\d+\."

test_str = ("the patient monitoring system shall perform a daily device check from 1:30 am to 4:30 am (patient local time). if a device malfunction is detected, the daily device check shall send the malfunction to the clinician. if a patient health alarm is detected, the daily device check shall turn into full interrogation as specified in srs-3003. if no device or patient health issue identified, the daily device check shall end without further notification to the clinicians or patient. if a scheduled interrogation happens on the same day, the daily device check shall be skipped. if any device issue detected during the daily device check, the patient monitoring system shall alarm the patient with red urgent light. . if any patient health issue detected during the daily device check, the patient monitoring system shall alarm the patient with yellow warning light. . if a daily device check fails, it should be retried in 15 minutes up to 3 times. if a daily device check still fails after 3 times, the patient monitoring system shall end the interrogation and notify patient of the failed device check at 8 am that morning. there are 3 types of interrogations as below:\n"
    "1. scheduled interrogation.\n"
    "2. daily device check\n"
    "3. patient initiated interrogation. an interrogation could fail due to the following reasons:\n"
    "1. failed to establish communication.\n"
    "2. communication lost.\n"
    "3. failed to obtain a key data from the implanted device.")

subst = ""

# count=0 replaces every match; pass a positive count to limit the number of replacements
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)

if result:
    print(result)
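
If the space after each number should go as well, the same idea can be wrapped in a small helper. This is only a sketch: the name `strip_list_markers` and the sample text are hypothetical, and `[ \t]*` is used here instead of `\s+` (a deliberate swap) so that a marker sitting alone on a line does not swallow the newline after it.

```python
import re

# Hypothetical helper for illustration; not part of the original answer.
# [ \t]* removes spaces/tabs after the marker but never a newline.
LIST_MARKER = re.compile(r"^\d+\.[ \t]*", flags=re.MULTILINE)


def strip_list_markers(text: str) -> str:
    """Remove a leading 'N.' marker (and the spaces after it) from each line."""
    return LIST_MARKER.sub("", text)


# Illustrative sample (an assumption, shortened from the question's text).
sample = (
    "see srs-3003. there are 2 types:\n"
    "1. scheduled interrogation.\n"
    "2. daily device check"
)
cleaned = strip_list_markers(sample)
print(cleaned)
```

The reference srs-3003. in the first line is left alone because it does not sit at the start of a line.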
